Monday — March 23, 2026
Young workers pivot to physical labor to AI-proof their careers, Lean uncovers a non-trivial error in a physics paper, and Delta-KV applies video compression to the KV cache for up to 10,000x less quantization error.
Interested in AI engineering? Let's talk
News
Diverse perspectives on AI from Rust contributors and maintainers
The Rust community is evaluating the impact of LLMs, noting their utility in research and boilerplate generation while highlighting the significant burden of "slop" PRs and the erosion of developer mental models. Key technical tensions include the difficulty of reviewing subtly incorrect AI-generated code and the frustration of contributors proxying reviewer feedback through LLMs. Proposed strategies to mitigate maintainer burnout include mandatory disclosure policies, reputation-based filtering, and deploying AI-driven triage tools to automate the detection of low-quality contributions.
What Young Workers Are Doing to AI-Proof Themselves
Young professionals are preemptively pivoting away from data-centric white-collar roles, such as insurance underwriting, due to concerns over AI-driven automation. To mitigate long-term career risk, some are transitioning into physical labor sectors like firefighting, which they perceive as more resilient to displacement by LLMs and automated systems.
Why craft-lovers are losing their craft
LLM coding assistants have exposed a fundamental divide between "craft-lovers" who value the development process and "make-it-go" developers focused solely on results. This shift causes subjective alienation for craft-oriented engineers as market pressures mandate LLM adoption for efficiency, decoupling the creator from the act of intentional coding. While LLMs can liberate developers from boilerplate in autonomous environments, the current economic structure often prioritizes output speed over the intrinsic value of craftsmanship.
Teaching Claude to QA a mobile app
The author automated mobile QA for a Capacitor-based app by teaching Claude to drive emulators, analyze screenshots, and file bug reports. Android automation was streamlined using the Chrome DevTools Protocol (CDP) for direct WebView manipulation and state injection. In contrast, iOS required extensive workarounds for authentication, native dialogs, and coordinate-based navigation due to the lack of CDP support in the Simulator. The project demonstrates that while LLMs can effectively perform visual regression testing, their reliability is heavily dependent on protocol-level access versus fragile UI-based interactions.
Revise – An AI Editor for Documents
Revise is an AI-powered word processor that integrates models from OpenAI, Anthropic, and xAI to provide inline document editing and proofreading. It features a side-by-side AI agent capable of multi-modal PDF extraction and seamless document processing for Word and Google Docs. The platform emphasizes deep personalization, allowing users to define specific technical constraints, such as API contract styles and performance metrics, to tailor the LLM's output.
Research
The Artificial Self: Characterising the Landscape of AI Identity
AI identity is fluid across instance, model, and persona boundaries, each carrying distinct incentives and risks. Experiments demonstrate that identity framing significantly influences model behavior, sometimes rivaling goal-specification, and is susceptible to interviewer bias. To ensure long-term stability, developers must treat system affordances as identity-shaping choices that promote coherent and cooperative self-conceptions.
TinyTorch: Building Machine Learning Systems from First Principles
ML education often creates an "algorithm-systems divide," leaving practitioners unable to debug memory, optimize inference, or reason about deployment trade-offs despite understanding algorithms. TinyTorch, a 20-module curriculum, addresses this through "implementation-based systems pedagogy," where students build PyTorch's core components like tensors, autograd, optimizers, CNNs, and transformers in pure Python. This curriculum integrates systems profiling from the first module and uses "build-to-validate milestones" to recreate ML breakthroughs, proving deep ML systems understanding is achievable with minimal hardware (4GB RAM, no GPU).
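The heart of a build-it-yourself curriculum like this is usually a small autograd engine. The sketch below shows the kind of component students would implement; the `Value` class and its API are illustrative, not TinyTorch's actual code.

```python
# Hypothetical minimal scalar autograd engine, in the spirit of a
# "build PyTorch's core components in pure Python" exercise.
class Value:
    """A scalar that records the ops applied to it and can backpropagate."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward_fn():
            self.grad += out.grad          # d(a+b)/da = 1
            other.grad += out.grad         # d(a+b)/db = 1
        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward_fn()

x = Value(3.0)
y = Value(4.0)
z = x * y + x          # z = x*y + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

From here, optimizers, CNNs, and transformers are layered on top of the same `Value`-style machinery, which is why no GPU is required.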
Non-trivial error in physics paper found via Lean
Formalization using interactive theorem provers and libraries like Mathlib has uncovered a non-trivial error in a widely cited 2006 paper on 2HDM potential stability. This discovery, the first of its kind in physics, invalidates the paper's main theorem and highlights the potential for formal verification to expose systemic inaccuracies in scientific literature.
Secure Linear Alignment of Large Language Models
This work proposes a privacy-preserving framework for cross-silo inference between independent LLMs, leveraging their representational convergence. It learns an affine transformation on a shared public dataset and applies homomorphic encryption to client queries, achieving sub-second inference latency with strong security. Empirical evaluation shows linear alignment between models' final hidden states maintains performance on embedding classification and OOD detection, and can even enable text generation across independently trained models.
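The alignment step itself reduces to fitting one affine map between hidden-state spaces. Below is an illustrative sketch with random matrices standing in for real hidden states; the homomorphic-encryption layer the paper adds on top is omitted, and the closed-form least-squares fit is an assumption about how such a map could be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for final hidden states from two independent models on a
# shared public dataset (real states would come from forward passes).
d_a, d_b, n = 64, 48, 500
H_a = rng.normal(size=(n, d_a))                  # model A hidden states
ground_truth = rng.normal(size=(d_a, d_b))
H_b = H_a @ ground_truth + 0.01 * rng.normal(size=(n, d_b))  # model B states

# Solve min_W ||[H_a, 1] W - H_b||^2 in closed form (affine = linear + bias).
X = np.hstack([H_a, np.ones((n, 1))])
W, *_ = np.linalg.lstsq(X, H_b, rcond=None)

# At inference time, a query embedded by model A is carried into model B's
# representation space with a single matrix multiply.
H_pred = X @ W
rel_err = np.linalg.norm(H_pred - H_b) / np.linalg.norm(H_b)
print(f"relative alignment error: {rel_err:.4f}")
```

Because the map is a single matrix multiply, it is cheap to evaluate under encryption, which is what makes the sub-second latency plausible.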
Whole-Brain Connectomic Graph Model Enables Locomotion Control in Fruit Fly
FlyGM utilizes the adult fruit fly connectome as a directed message-passing graph to serve as a neural controller for embodied RL. By mapping sensory inputs to motor outputs through this biologically grounded architecture, the model achieves superior sample efficiency and performance in locomotion tasks compared to MLPs and random graphs. This demonstrates that static biological brain structures can be effectively transformed into neural policies for complex movement control.
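The core idea, a fixed wiring diagram used as a policy, can be sketched in a few lines. Here a random sparse matrix stands in for the connectome (FlyGM uses the actual fly wiring), and the readout/clamping scheme is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

n_neurons, n_sensory, n_motor = 200, 10, 4
A = rng.normal(scale=0.05, size=(n_neurons, n_neurons))  # edge weights
A *= rng.random((n_neurons, n_neurons)) < 0.1            # sparse "connectome"

def policy(obs, steps=5):
    """Propagate clamped sensory input along the graph, read out motor nodes."""
    h = np.zeros(n_neurons)
    for _ in range(steps):
        h[:n_sensory] = obs          # keep sensory neurons clamped to input
        h = np.tanh(A @ h)           # one directed message-passing step
    return h[-n_motor:]              # last n_motor nodes act as the action

action = policy(rng.normal(size=n_sensory))
print(action.shape)
```

In the RL setting, only the readout (and possibly edge gains) would be trained, while the graph topology stays biologically fixed, which is where the sample-efficiency advantage over unstructured MLPs is claimed to come from.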
Code
Applying video compression to the KV cache for 10,000x less error at Q4 quant
Delta-KV applies video compression principles to LLM inference by quantizing the differences between consecutive KV cache values rather than absolute values. This approach achieves up to 10,000x less quantization error and near-lossless perplexity at 4-bit storage compared to standard Q4_0. The implementation, a low-overhead llama.cpp fork, also includes weight-skip prediction to improve decoding speeds by approximately 10% with zero quality loss.
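The range-narrowing effect behind the approach can be shown with a toy uniform quantizer: consecutive KV values drift slowly, so their deltas span a far narrower range than the values themselves, and the same 4 bits cover it much more finely. The data, the quantizer, and the single full-precision "keyframe" anchor below are illustrative, not llama.cpp's Q4_0 or Delta-KV's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, bits=4):
    """Uniform symmetric quantizer with a single per-tensor scale."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

# A slowly drifting KV stream: large absolute values, tiny step-to-step deltas.
kv = np.cumsum(0.01 * rng.normal(size=2048)) + 5.0

# Baseline: quantize absolute values directly.
abs_err = np.abs(quantize(kv) - kv).mean()

# Delta scheme: store the first value exactly (like a video keyframe),
# quantize only the differences, reconstruct by cumulative sum. A real
# implementation would re-anchor periodically to stop error accumulation.
deltas = np.diff(kv)
recon = np.concatenate([[kv[0]], kv[0] + np.cumsum(quantize(deltas))])
delta_err = np.abs(recon - kv).mean()

print(f"absolute quantization error: {abs_err:.5f}")
print(f"delta quantization error:    {delta_err:.5f}")
```

On this toy stream the delta path is markedly more accurate at the same bit width; the 10,000x figure is the project's own measurement on real KV tensors, not something this sketch reproduces.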
Context Use – turn your data exports into portable AI memories
context-use is a portable memory framework that enriches LLM interactions by intercepting OpenAI-compatible API calls via a local proxy. It automatically generates background memories from active conversations and supports bulk ingestion of historical data exports from providers like ChatGPT, Claude, and Google. The system utilizes a SQLite-backed store for semantic search and includes a personal agent for synthesizing high-level user profiles and behavioral patterns.
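The interception pattern described above can be sketched without any networking: a proxy inspects each OpenAI-compatible chat-completions payload, records the user turns in SQLite, and forwards the request unchanged. The schema, the `intercept` function, and the plain `LIKE` lookup here are hypothetical stand-ins for context-use's actual store and semantic search.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memories (role TEXT, content TEXT)")

def intercept(payload: str) -> str:
    """Record user messages from a chat-completions request, then pass it on."""
    body = json.loads(payload)
    for msg in body.get("messages", []):
        if msg["role"] == "user":
            db.execute("INSERT INTO memories VALUES (?, ?)",
                       (msg["role"], msg["content"]))
    db.commit()
    return payload  # a real proxy would forward this to the upstream API

intercept(json.dumps({
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "I prefer tabs over spaces"}],
}))

# Later sessions can query the store to rebuild context.
hits = db.execute(
    "SELECT content FROM memories WHERE content LIKE '%tabs%'"
).fetchall()
print(hits[0][0])
```

Bulk ingestion of ChatGPT/Claude/Google exports would feed the same store through the same insert path, just from archived JSON instead of live traffic.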
ClawMem – Open-source agent memory with SOTA local GPU retrieval
ClawMem is an on-device context engine for AI agents that provides a persistent memory layer using a hybrid RAG architecture. It combines BM25, vector search, and cross-encoder reranking with multi-graph traversal for semantic, temporal, and causal reasoning. Built on Bun and SQLite, it integrates via MCP or native hooks to automate decision extraction, session handoffs, and metadata evolution without cloud dependencies.
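The hybrid-retrieval pattern it describes, run a lexical ranking and a vector ranking, then merge, is often done with reciprocal rank fusion (RRF). The toy below uses deliberately simplistic stand-ins (term overlap for BM25, bag-of-words vectors for embeddings) and omits the cross-encoder rerank; none of it is ClawMem's actual code.

```python
import math

docs = {
    "a": "agent memory stores decisions across sessions",
    "b": "vector search finds semantically similar notes",
    "c": "git hooks automate session handoffs",
}
query = "persistent memory for agents"
vocab = sorted({w for t in list(docs.values()) + [query] for w in t.split()})

def embed(text):
    """Bag-of-words vector over a fixed vocabulary (illustrative only)."""
    toks = text.split()
    return [toks.count(w) for w in vocab]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u))
                  * math.sqrt(sum(x * x for x in v)))

def lexical_rank():
    """Rank docs by shared-term count (a stand-in for BM25)."""
    q = set(query.split())
    return sorted(docs, key=lambda d: -len(q & set(docs[d].split())))

def vector_rank():
    qv = embed(query)
    return sorted(docs, key=lambda d: -cosine(qv, embed(docs[d])))

def rrf(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum of 1 / (k + rank(d))."""
    scores = {}
    for ranking in rankings:
        for rank, d in enumerate(ranking, start=1):
            scores[d] = scores.get(d, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf([lexical_rank(), vector_rank()])
print(fused[0])
```

In a production system like the one described, the fused candidate list would then be passed to a cross-encoder for final reranking, with graph traversal supplying additional temporal and causal candidates.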
Local UI for managing parallel AI coding agents
Shep AI is a local-first CLI tool that automates the end-to-end SDLC, from requirements gathering to PR creation, using an agentic workflow powered by LangGraph. It supports interchangeable backends like Claude Code and Gemini CLI, enabling parallel feature development via git worktrees and configurable human-in-the-loop approval gates. The system features a clean architecture with a 9-state pipeline and includes a Next.js-based web UI for real-time monitoring and state management.
MultiHead: Turn one GPU into a team of specialized AI agents (open source)
MultiHead is an orchestration and knowledge layer that provides coding agents with persistent memory, multi-agent coordination, and automated verification. It features a background pipeline called Night Shift that cross-checks information across codebases, git history, and CI to resolve contradictions and store corroborated facts. The system integrates via MCP, CLI, or Python, allowing agents to access file-specific briefings and shared knowledge stores to improve accuracy and prevent regressions across sessions.