Tuesday — July 1, 2025
AI context engineering redefines task-solving, Apple explores Anthropic and OpenAI models for Siri's upgrade, and a Clojure hack contest pushes LLMs to evolve self-modifying game code.
News
The new skill in AI is not prompting, it's context engineering
Context engineering is an emerging concept in the AI world: rather than just crafting the perfect prompt, it involves providing all the context a large language model (LLM) needs to make a task solvable. It is about designing and building dynamic systems that supply the right information and tools, in the right format, at the right time, and it is becoming a crucial factor in building powerful and reliable AI agents.
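The idea can be made concrete with a small sketch: instead of a static prompt, a function assembles the documents, tools, and history relevant to the task at hand. All names and providers below are illustrative, not from any particular framework.

```python
# Minimal sketch of context engineering: dynamically assembling the
# information, tools, and history an LLM needs before it is called.
# All function names and providers here are illustrative stubs.

def build_context(task, retrieve_docs, get_tools, format_history):
    """Gather the right information, tools, and history for one task."""
    parts = [
        "## Relevant documents",
        *retrieve_docs(task),          # e.g. retrieval over a knowledge base
        "## Available tools",
        *get_tools(task),              # descriptions of tools the model may call
        "## Conversation so far",
        *format_history(),
    ]
    return "\n".join(parts) + f"\n## Task\n{task}"

# Example with stub providers standing in for real retrieval/tooling:
ctx = build_context(
    "Summarize Q2 sales",
    retrieve_docs=lambda t: ["Q2 revenue was up 12%."],
    get_tools=lambda t: ["sql_query: run a read-only SQL query"],
    format_history=lambda: ["User asked about Q1 last week."],
)
print(ctx)
```

The point is that each provider can change per task and per turn, which is what distinguishes context engineering from writing one fixed prompt.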
There are no new ideas in AI, only new datasets
AI has made significant progress over the last 15 years, with some researchers proposing a "Moore's Law for AI" where capabilities increase exponentially with time. However, despite continuous improvement, some argue that progress is slowing down, with recent models only showing marginal improvements, and major breakthroughs, such as deep neural networks and transformers, being relatively rare and often based on existing ideas.
Entry-level jobs down by a third since launch of ChatGPT
The UK job market has seen a cautious recovery, with annual vacancy growth and rising wages, but entry-level opportunities have taken a significant hit, dropping by nearly a third since the advent of widely available generative AI tools at the end of 2022. Average advertised salaries have risen for the twelfth month in a row, reaching £42,403, but graduate job postings have dropped by 28.4% compared to the same time last year, and entry-level roles now make up just 25% of all jobs advertised in the UK.
Apple weighs using Anthropic or OpenAI to power Siri
Apple is considering replacing Siri's AI with technology from either Anthropic or OpenAI, potentially using their large language models to power a new version of Siri. The company has discussed using Anthropic's Claude or OpenAI's ChatGPT, and has asked both companies to train versions of their models that could run on Apple's cloud infrastructure for testing.
If AI Lets Us Do More in Less Time–Why Not Shorten the Workweek?
As AI increases productivity and enables people to do more in less time, there is a growing debate about shortening the workweek, with some companies and politicians, such as Senator Bernie Sanders, proposing a four-day workweek. The idea is not to cut jobs, but to share the gains from new technologies and give workers some of their time back, easing fears about automation and encouraging people to use AI productively.
Research
Small language models are the future of agentic AI
Small language models (SLMs) are sufficiently powerful and economical for many applications in agentic AI systems, making them a suitable alternative to large language models (LLMs) for specialized tasks. The adoption of SLMs could have a significant operational and economic impact on the AI industry, and the authors propose a shift towards heterogeneous agentic systems that utilize multiple models, including SLMs, to achieve efficient and effective AI solutions.
Transformers Are Graph Neural Networks
Transformers can be seen as a type of Graph Neural Network (GNN) that operates on fully connected graphs of tokens, using self-attention to capture relationships between tokens and positional encodings to understand sequential structure. This connection to GNNs reveals that Transformers are expressive set processing networks, and their efficiency on modern hardware is due to their implementation via dense matrix operations, giving them a significant advantage over traditional sparse message passing GNNs.
Embodied AI Agents: Modeling the World
Researchers are developing AI agents that interact with their environments and users through various forms, such as virtual avatars, wearable devices, and robots, allowing them to learn and act in a more human-like way. The development of "world models" is key to these agents' ability to understand and predict their environment, user intentions, and social contexts, enabling them to perform complex tasks autonomously and collaborate effectively with humans.
Survey on Evaluation of LLM-Based Agents
The emergence of LLM-based agents has led to a significant advancement in AI, enabling autonomous systems to interact with dynamic environments in complex ways, and this paper provides a comprehensive survey of evaluation methodologies for these agents. The survey analyzes various evaluation benchmarks and frameworks, revealing emerging trends and critical gaps in areas such as cost-efficiency, safety, and robustness, and proposes directions for future research to address these limitations.
Large Language Model-Powered Agent for C to Rust Code Translation
The C programming language's manual memory management can lead to memory safety issues, prompting the development of Rust as a memory-safe alternative, and researchers are exploring the use of large language models (LLMs) to automate the translation of legacy C code to Rust. A new approach, using a Virtual Fuzzing-based equivalence Test (VFT) and an LLM-powered Agent for C-to-Rust code translation (LAC2R), has been proposed to address the challenges of C-to-Rust translation and has shown effectiveness in large-scale, real-world benchmarks.
Code
Show HN: TokenDagger – A tokenizer faster than OpenAI's Tiktoken
TokenDagger is a high-performance implementation of OpenAI's tiktoken, designed for large-scale text processing, offering 2x throughput and 4x faster code sample tokenization. It is a drop-in replacement, fully compatible with OpenAI's tiktoken tokenizer, and can be installed from PyPI with the command pip install tokendagger.
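Since TokenDagger is billed as a drop-in replacement, adopting it can be little more than an import swap. The helper below only picks whichever library is installed; the actual encode/decode surface is assumed (not verified here) to mirror tiktoken's.

```python
# Hedged sketch: TokenDagger advertises full tiktoken compatibility, so
# switching can be a one-line import change. This helper prefers
# TokenDagger when installed and falls back to tiktoken otherwise.
import importlib
import importlib.util

def pick_tokenizer():
    """Return the first available tokenizer module, or None."""
    for name in ("tokendagger", "tiktoken"):
        if importlib.util.find_spec(name) is not None:
            return importlib.import_module(name)
    return None  # neither library is installed

lib = pick_tokenizer()
print(lib.__name__ if lib else "neither tokenizer is installed")
```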
Show HN: Local LLM Notepad – run a GPT-style model from a USB stick
Local LLM Notepad is a portable, open-source app that allows users to run large language models locally on any Windows PC without installation, internet access, or admin rights. The app can be run from a USB drive, features a clean UI, and includes functionalities such as source-word underlining, save/load chats, and hotkeys for easy navigation and interaction with the model.
Show HN: C.O.R.E – Opensource, user owned, shareable memory for Claude, Cursor
C.O.R.E (Contextual Observation & Recall Engine) is a private, portable, and user-owned memory system for Large Language Models (LLMs) that allows users to store and manage their context, facts, and preferences. It can be run locally or used as a hosted version, and provides features such as dynamic temporal knowledge graphs, full transparency, and auditability, enabling users to track changes and access relevant information across multiple tools and applications.
Show HN: Semantic-dictionary – A Python dictionary with semantic lookup
The Semantic Dictionary is a Python class that uses semantic similarity for key matching instead of exact matches, allowing for flexible and robust dictionary lookups. It can be installed via PyPI and supports various embedding models, including sentence-transformers, Hugging Face, and OpenAI, with features like drop-in replacement for standard dictionaries and adjustable similarity thresholds.
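The core idea, similarity-based key matching with a threshold, can be sketched in a few lines. The toy bag-of-words embedding below stands in for a real model (sentence-transformers, Hugging Face, OpenAI); the class and its API are illustrative, not the library's actual interface.

```python
# Minimal sketch of semantic key lookup: exact match first, then the
# nearest stored key by cosine similarity above a threshold. The toy
# bag-of-words "embedding" stands in for a real embedding model.
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a real model returns dense vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticDict(dict):
    def __init__(self, threshold=0.5):
        super().__init__()
        self.threshold = threshold   # adjustable similarity cutoff

    def __getitem__(self, key):
        if key in self:                        # exact match wins
            return super().__getitem__(key)
        best, score = None, self.threshold     # else nearest similar key
        for k in self:
            s = cosine(embed(key), embed(k))
            if s > score:
                best, score = k, s
        if best is None:
            raise KeyError(key)
        return super().__getitem__(best)

d = SemanticDict(threshold=0.4)
d["capital of France"] = "Paris"
print(d["France capital"])  # similar enough to match -> Paris
```

Raising the threshold makes lookups stricter; dissimilar keys still raise KeyError, preserving normal dictionary behavior.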
First Hack Contest for LLMs:)
The contest challenges Large Language Models to create a self-modifying Clojure program that evolves into a working Conway's Game of Life GUI, with the first model to achieve this goal declared the winner. The program reads its own source code, sends it to the LLM for improvement, and overwrites itself with the new code, repeating this process until it successfully displays a functional Conway's Game of Life.
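The contest's read-improve-overwrite loop is simple enough to sketch. The version below is in Python rather than Clojure, with a stub standing in for the real LLM call and a termination check standing in for "displays a working Game of Life GUI"; all names are illustrative.

```python
# Sketch of the contest's self-modifying loop (Python, not Clojure):
# read own source, ask an LLM for an improved version, overwrite the
# file, and repeat until a success condition holds.

def improve_with_llm(source):
    """Stub: a real entry would send `source` to an LLM and return new code."""
    return source.replace("GENERATION = 0", "GENERATION = 1")

def evolve(path, done):
    """Loop: read source at `path`, improve it, overwrite, until done(source)."""
    while True:
        with open(path) as f:
            source = f.read()
        if done(source):                 # e.g. "renders a working Game of Life"
            break
        new_source = improve_with_llm(source)
        with open(path, "w") as f:       # the program overwrites itself
            f.write(new_source)
        # A real entry would now re-execute the rewritten file, e.g.
        # os.execv(sys.executable, [sys.executable, path])
```

In the actual contest the success condition is functional (a working GUI appears), so the loop's open question is whether successive LLM rewrites converge rather than drift.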