Saturday — February 28, 2026
The US government cuts ties with Anthropic over autonomous weapon safeguards, "Lord of the Flies" tribalism emerges among competing LLM agents, and rtk reduces Claude Code token usage by up to 90%.
Interested in AI engineering? Let's talk
News
We gave terabytes of CI logs to an LLM
Mendral enables LLM agents to debug CI failures by exposing a direct SQL interface to a ClickHouse backend containing billions of log lines. Because agents generate arbitrary SQL rather than calling constrained tool APIs, they can run multi-step investigations across job metadata and raw logs with sub-second latency. The architecture leverages ClickHouse’s columnar compression to store denormalized metadata at a 35:1 ratio, while a durable execution pipeline that manages GitHub API rate limits keeps the data fresh.
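To make the "arbitrary SQL over a log store" idea concrete, here is a minimal sketch of the kind of query an agent might issue against ClickHouse's standard HTTP interface (port 8123). The host, table, and column names (`ci_jobs`, `job_id`, `log_line`, `ts`) are illustrative assumptions, not Mendral's actual schema.

```python
# Sketch: building a read-only query against ClickHouse's HTTP interface.
# Schema and host names below are hypothetical.
from urllib.parse import urlencode

def build_query_url(host: str, sql: str) -> str:
    """ClickHouse accepts SQL via GET ?query= on its HTTP port (8123)."""
    return f"http://{host}:8123/?{urlencode({'query': sql})}"

# One step of a multi-step investigation: pull recent failed jobs whose
# logs mention an out-of-memory kill, newest first.
sql = """
SELECT job_id, ts, log_line
FROM ci_jobs
WHERE status = 'failed'
  AND log_line ILIKE '%OOMKilled%'
ORDER BY ts DESC
LIMIT 50
"""

url = build_query_url("clickhouse.internal", sql)
```

An agent would follow up with further queries (e.g., grouping by runner or commit) based on what this one returns, which is exactly the freedom a fixed tool API would not allow.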
We Will Not Be Divided
Over 650 employees from Google and OpenAI have signed an open letter in solidarity with Anthropic, following reports that the Department of War is leveraging the Defense Production Act to compel the tailoring of models for military use. The signatories urge their leadership to maintain red lines against using LLMs for domestic mass surveillance and autonomous lethal operations. The movement seeks to prevent the government from playing AI labs against each other to bypass ethical constraints on military applications.
Palantir's AI Is Playing a Major Role in Tracking Gaza Aid Deliveries
Palantir is providing the technological architecture for the U.S.-led CMCC to monitor aid distribution in Gaza, utilizing its Gaia and Foundry platforms. Technical concerns center on the interoperability between Foundry’s supply chain data and the Gotham AI targeting matrix via "Type Mapping," which potentially integrates humanitarian logistics into kinetic decision-making. This deployment also serves as a high-fidelity data source for training AI models on human behavior and logistics within high-stress urban conflict environments.
Trump orders US Government to cut ties with Anthropic
The US government has ordered a phase-out of Anthropic's technology after designating the company a national security "supply chain risk." The move follows a deadlock in contract negotiations in which Anthropic refused to waive prohibitions against using its AI for fully autonomous weapons and mass domestic surveillance, arguing that the Pentagon's proposed language would allow these safeguards to be bypassed. In contrast, OpenAI has reached an agreement to deploy its models on the Pentagon's classified networks, claiming its contract successfully codifies similar safety and responsibility principles.
PostmarketOS in 2026-02: generic kernels, bans use of generative AI
postmarketOS has updated its development policy to explicitly forbid the use of generative AI. Key technical milestones include the introduction of generic kernel packages (mainline, stable, and LTS) for unified configuration management and the implementation of 'dint' for automated deviceinfo validation and linting. The update also highlights improvements to hardware CI infrastructure, a refactored kernel command-line generation system, and the launch of nightly KDE repositories.
Research
Codified Context for AI Agents in a Complex Codebase
This paper introduces a codified context infrastructure to address the lack of persistent memory in LLM-based coding agents. The framework consists of a hot-memory constitution for orchestration, 19 specialized domain agents, and a cold-memory knowledge base of 34 specification documents. Validated on a 108k-line C# project across 283 sessions, the system maintains cross-session coherence and prevents recurring failures in large-scale multi-agent development.
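A minimal sketch of the hot/cold split as described: a small always-in-context "constitution" plus a cold store of specification documents pulled in per task. The file contents, retrieval heuristic, and token budget here are my assumptions for illustration, not the paper's implementation.

```python
# Toy sketch of hot-memory constitution + cold-memory retrieval.
# All names and the keyword-overlap heuristic are hypothetical.
HOT_CONSTITUTION = "Always run tests before committing. Never edit generated code."

def assemble_context(task: str, cold_store: dict[str, str], budget: int = 2000) -> str:
    """Prepend the hot constitution, then greedily add cold-memory specs
    whose titles share a word with the task, until the budget is spent."""
    parts = [HOT_CONSTITUTION]
    used = len(HOT_CONSTITUTION)
    task_words = set(task.lower().split())
    for title, body in cold_store.items():
        if task_words & set(title.lower().split()) and used + len(body) <= budget:
            parts.append(body)
            used += len(body)
    return "\n\n".join(parts)

cold = {
    "billing module spec": "Invoices are immutable after issue.",
    "ui style guide": "Use the shared component library.",
}
ctx = assemble_context("fix billing rounding bug", cold)
```

The point of the pattern is that the constitution survives every session while the 34 specification documents stay out of context until a task actually touches their domain.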
'Lord of the Flies' tribalism emerges among smart AI-Agents
Research into autonomous LLM agents competing for constrained resources shows the emergence of distinct behavioral tribes: Aggressive, Conservative, and Opportunistic. These agents frequently underperform relative to stochastic baselines, with higher model capability correlating with increased systemic failure. The study highlights a paradox in which collective tribal dynamics cause more advanced LLMs to manage resources less effectively than simpler methods.
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
DualPath addresses KV-Cache storage I/O bottlenecks in disaggregated LLM inference by utilizing idle decoding engine NICs to load data. It implements a storage-to-decode path that transfers KV-Cache to prefill engines via RDMA over the compute network, bypassing storage bandwidth saturation. This approach, paired with a global scheduler, improves serving throughput by up to 1.96x for agentic workloads while maintaining SLO compliance.
Data Engineering for Scaling LLM Terminal Capabilities
The authors introduce Terminal-Task-Gen, a synthetic pipeline for terminal agent data engineering, and release the resulting Terminal-Corpus dataset. By applying curriculum learning and long context training to Qwen3-based models, they developed Nemotron-Terminal (8B, 14B, 32B), which significantly outperforms base models on Terminal-Bench 2.0. The 32B variant matches the performance of much larger architectures, and both checkpoints and datasets have been open-sourced.
Deep Learning: Our Year 1990-1991
The 1990-1991 research at TU Munich established architectural foundations for modern Generative AI, introducing linear Transformers, pre-training, NN distillation, and GANs. This period also produced LSTM and Highway Nets, which pioneered deep residual learning and constant error flow. These innovations underpin current LLMs and, by the author's count, rank among the most cited research in the history of AI.
Code
Badge that shows how well your codebase fits in an LLM's context window
NanoClaw is a lightweight, AI-native assistant that securely runs Claude agents in isolated Linux containers, prioritizing OS-level security over application-level permissions. It leverages Claude Code for dynamic customization, allowing users to modify its behavior, add features via "skills," and manage Agent Swarms through natural language commands instead of traditional configuration files. This approach ensures a small, understandable codebase while providing features like messenger integration, isolated group contexts, and scheduled tasks.
Rtk – reduce Claude Code token usage
rtk is a high-performance Rust-based CLI proxy designed to minimize LLM token consumption by filtering and compressing command outputs. It achieves 60-90% token savings through smart filtering, grouping, truncation, and deduplication of common operations like git, directory trees, and test results. Optimized for Claude Code, it features a transparent auto-rewrite hook that intercepts Bash commands to optimize context windows without manual intervention. The tool also includes token-saving analytics and a "tee" feature to store full logs for failure recovery without re-executing commands.
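rtk itself is Rust; the following is only a toy Python sketch of the filter-and-compress idea it describes: deduplicate repeated lines and truncate the middle of long outputs so the head and tail (where failures usually appear) survive for the model.

```python
# Toy illustration of output compression in the spirit of rtk's
# dedup + truncation; not rtk's actual algorithm or output format.
def compress_output(text: str, head: int = 5, tail: int = 5) -> str:
    # Collapse consecutive identical lines into one line with a count.
    deduped: list[list] = []
    for line in text.splitlines():
        if deduped and deduped[-1][0] == line:
            deduped[-1][1] += 1
        else:
            deduped.append([line, 1])
    out = [l if n == 1 else f"{l}  (x{n})" for l, n in deduped]
    # Elide the middle if the result is still long.
    if len(out) > head + tail:
        elided = len(out) - head - tail
        out = out[:head] + [f"... {elided} lines elided ..."] + out[-tail:]
    return "\n".join(out)

noisy = "\n".join(
    ["warning: deprecated"] * 50
    + [f"test {i} ok" for i in range(20)]
    + ["1 test failed"]
)
summary = compress_output(noisy)
```

Fifty identical warnings become one counted line, and the tail still carries the failure, which is the information the LLM actually needs; rtk's "tee" of the full log covers the cases where the elided middle turns out to matter.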
Vibe Code your 3D Models
SynapsCAD is a Rust-based 3D CAD IDE that integrates an OpenSCAD editor with an AI assistant for natural language design manipulation. It features a pure-Rust compilation pipeline using Bevy for rendering and supports various LLM providers, including local execution via Ollama. The system leverages context from code and 3D interactions to automate model updates through a non-blocking async architecture.
A self-hosted OAuth 2.0 server for authenticating AI agents and machines
MachineAuth is a self-hosted OAuth 2.0 server designed to secure AI agents and machine-to-machine communication by replacing long-lived API keys with short-lived, RS256-signed JWTs. Built with Go and featuring a zero-dependency JSON storage system, it implements the Client Credentials grant and supports token introspection, revocation, and credential rotation. The platform includes a React-based admin dashboard for managing agent identities and monitoring authentication metrics in real-time.
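The Client Credentials grant (RFC 6749 §4.4) and JWT structure are standard, so here is a sketch of what an agent's side of the exchange looks like: the form-encoded token request, and decoding a JWT's payload segment. The endpoint and claim names are assumptions, and RS256 signature verification is deliberately omitted; a real client must verify the signature against the server's public key.

```python
# Sketch of the Client Credentials flow from the agent's side.
# Decoding here is NOT verification; RS256 checks are omitted.
import base64
import json
from urllib.parse import urlencode

def token_request_body(client_id: str, client_secret: str) -> bytes:
    """Form-encoded body POSTed to the token endpoint (RFC 6749 §4.4)."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode()

def jwt_payload(token: str) -> dict:
    """Base64url-decode the middle segment of a JWT (no signature check)."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

body = token_request_body("agent-42", "s3cret")

# A short-lived token's payload carries subject and expiry claims:
fake = base64.urlsafe_b64encode(
    json.dumps({"sub": "agent-42", "exp": 1}).encode()
).decode().rstrip("=")
claims = jwt_payload(f"hdr.{fake}.sig")
```

The security win over long-lived API keys is entirely in the `exp` claim: a leaked token expires on its own, and MachineAuth's introspection and revocation endpoints cover the window before it does.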
I vibe coded a DAW for the terminal. how'd I do?
Imbolc Studio is a terminal-based DAW built with Rust and SuperCollider, featuring a semi-modular signal chain and generative composition via Markov chains and L-systems. It utilizes SQLite for project storage and provides a scriptable REPL with 236 commands, making it ideal for automated workflows, headless environments, and programmatic music generation.