Sunday, March 22, 2026

Tinybox enables offline 120B parameter models, Interwhen improves reasoning accuracy by 15% through verifiable feedback, and developers successfully run an LLM on a PlayStation 2.

Interested in AI engineering? Let's talk

News

Blocking Internet Archive Won't Stop AI, but Will Erase Web's Historical Record

News publishers are implementing technical blocks against the Internet Archive to prevent content scraping for LLM training, a move the EFF warns will irreversibly damage the web's historical record. While publishers aim to control data used for AI models, the EFF argues that nonprofit archiving is a protected fair use distinct from commercial AI development. This trend threatens the availability of verifiable primary sources and longitudinal data used by researchers and journalists.

Tinybox – Offline AI device for 120B-parameter models

tinygrad is a minimalist neural network framework that decomposes complex architectures into three fundamental OpTypes: Elementwise, Reduce, and Movement (copy-free via ShapeTracker). It utilizes lazy tensors and custom kernel compilation to enable aggressive operation fusion and shape specialization, aiming to outperform PyTorch in speed and simplicity. The ecosystem includes the tinybox, a high-performance hardware platform optimized for cost-effective deep learning training and inference.
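The laziness-plus-fusion idea can be illustrated with a toy sketch (this is not tinygrad's real API; the class and method names here are invented for illustration): elementwise ops record a chain of functions instead of executing immediately, and realization fuses the whole chain into a single pass over the data rather than one loop and one intermediate buffer per op.

```python
class LazyTensor:
    """Toy sketch of lazy elementwise fusion (illustrative only, not tinygrad's API)."""
    def __init__(self, data, ops=None):
        self.data, self.ops = data, ops or []

    def _elementwise(self, f):
        # record the op; nothing is computed yet
        return LazyTensor(self.data, self.ops + [f])

    def mul(self, k): return self._elementwise(lambda v: v * k)
    def add(self, k): return self._elementwise(lambda v: v + k)
    def relu(self):   return self._elementwise(lambda v: max(v, 0.0))

    def realize(self):
        # one fused loop over the data instead of one loop per op
        out = []
        for v in self.data:
            for f in self.ops:
                v = f(v)
            out.append(v)
        return out

t = LazyTensor([-1.0, 2.0]).mul(3.0).add(1.0).relu()
print(t.realize())  # [0.0, 7.0]
```

Because no op runs until realize(), the chain is visible as a whole at compile time, which is what makes the aggressive fusion and shape specialization described above possible.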

Atuin v18.13 – better search, a PTY proxy, and AI for your shell

Atuin v18.13 introduces atuin ai, an English-to-bash utility leveraging frontier LLMs and man-page datasets for command generation with integrated safety guardrails and granular privacy controls. The update also features a daemon-managed, in-memory search index using a modified nucleo engine and a lightweight PTY proxy called hex for improved TUI rendering without clearing terminal output.

Senior European journalist suspended over AI-generated quotes

Mediahuis suspended senior journalist Peter Vandermeersch after he published dozens of fabricated quotes generated by LLMs including ChatGPT, Perplexity, and NotebookLM. Vandermeersch admitted to relying on AI-generated summaries without verifying their accuracy, letting hallucinated quotes reach print. The incident underscores the necessity of human oversight in AI-assisted editorial workflows to prevent the spread of synthetic misinformation.

The Impact of AI on Game Dev Jobs: The Open-to-Work Crisis

The tech industry has transitioned from a pandemic-era hiring bubble to an AI-driven contraction where LLMs and tools like Cursor have drastically increased individual developer velocity. This shift has led to "jobs lost that were never had," as generalists leverage AI to perform specialized tasks previously requiring additional headcount. Despite the rise of "vibe-coding," human-generated content remains a critical bottleneck for model training and user engagement, and the main guard against degradation into AI slop.

Research

Dissociating Direct Access from Inference in AI Introspection

Replicating the Lindsey et al. (2025) paradigm, this study identifies two distinct mechanisms for thought injection detection in LLMs: probability-matching of prompt anomalies and content-agnostic direct access to internal states. While models can detect internal state anomalies, they struggle to identify semantic content and frequently confabulate high-frequency concepts. These findings suggest that AI introspection is consistent with established psychological and philosophical theories of internal monitoring.

Interwhen: A Generalizable Framework for Verifiable Reasoning

Reasoning models require robust test-time verification, but current methods either miss early errors or incur high compute costs. interwhen introduces a single-trajectory verification framework that steers model behavior by providing feedback on intermediate verifiable properties. It extracts intermediate solutions by periodically polling the reasoning trace without imposing predefined structure and runs verifiers asynchronously to minimize latency, interrupting only on error. This design improves accuracy by up to 15 percentage points over standard chain-of-thought execution within 1.5x token compute cost, achieving Pareto-optimal performance.
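The poll-verify-interrupt loop described above can be sketched as follows (a minimal illustration assuming a generic setup; the function names and polling scheme are mine, not the paper's API): the trace grows chunk by chunk, intermediate solutions are extracted every few chunks and handed to verifiers running on a thread pool, and generation is interrupted only when a finished verifier reports an error.

```python
import concurrent.futures

def run_with_verification(chunks, extract, verify, poll_every=2):
    """Sketch of single-trajectory verification with async verifiers.
    chunks: iterable of reasoning-trace pieces; extract: trace -> candidate
    solution or None; verify: solution -> bool. Illustrative only."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    trace, pending = [], []
    try:
        for i, chunk in enumerate(chunks):
            trace.append(chunk)
            if (i + 1) % poll_every == 0:
                sol = extract("".join(trace))
                if sol is not None:  # no predefined structure imposed on the trace
                    pending.append(pool.submit(verify, sol))
            # check only verifiers that have already finished, so polling
            # adds no latency to the generation loop
            finished = [f for f in pending if f.done()]
            pending = [f for f in pending if not f.done()]
            if any(not f.result() for f in finished):
                return "".join(trace), False  # interrupt: error fed back to the model
        # drain any verifiers still in flight before accepting the trace
        return "".join(trace), all(f.result() for f in pending)
    finally:
        pool.shutdown(wait=True)
```

The asynchronous hand-off is the key design choice: verification cost overlaps with generation, so the trajectory only pays for it when an error actually forces an interruption.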

Code

AI SDLC Scaffold, repo template for AI-assisted software development

The AI SDLC Scaffold is a repository template for AI-first software development, enabling AI agents to manage the entire SDLC from objectives to deployment under human supervision. It promotes an "everything-in-repo" model, storing all project knowledge, decisions, and instructions alongside code, optimized for context-window efficiency through hierarchical structures and artifact indexing. AI agents utilize built-in skills to automate tasks across phases, ensuring traceability and consistent decision capture.

Rover – turn any web interface into an AI agent with one script tag

Rover is an open-source framework that transforms websites into actionable AI interfaces by operating directly on the DOM and accessibility tree. It enables autonomous agents to execute multi-step tasks like navigation and form-filling with millisecond latency, bypassing the need for RAG pipelines or vision-based VMs. Developers can integrate Rover via a script tag or npm, while AI agents can interact with enabled sites through the Agent Task Protocol (ATP) via a standardized task API.

Yeah: LLM-powered yes/no CLI tool

yeah is a Go-based CLI tool that evaluates yes/no queries using LLMs, returning results via exit codes (0 for true, 1 for false) for seamless integration into shell scripts. It supports Anthropic and OpenAI models and implements security measures like macOS sandbox-exec and Linux landlock to restrict file system access during execution.
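The exit-code convention makes the tool composable from any language, not just shell. A small Python wrapper might look like this (a sketch: it assumes the yeah binary is on PATH and follows the 0-true / 1-false convention described above; the stand-in command in the demo exists only so the pattern can run without the binary installed):

```python
import subprocess
import sys

def ask(question: str, cmd=None) -> bool:
    """Map an exit-code-style yes/no command to a Python bool.
    By default invokes the `yeah` CLI (assumed installed); `cmd` lets a
    caller substitute any command following the same 0/1 convention."""
    cmd = cmd or ["yeah", question]
    rc = subprocess.run(cmd).returncode
    if rc not in (0, 1):
        raise RuntimeError(f"unexpected exit code {rc}")
    return rc == 0

# stand-in commands that always exit 0 ("yes") or 1 ("no"), for demonstration
yes_cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
no_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
assert ask("is water wet?", cmd=yes_cmd) is True
assert ask("is water dry?", cmd=no_cmd) is False
```

In a shell script the same convention reads naturally as a conditional, which is the integration the exit-code design is built for.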

I ran a language model on a PS2

The PS2 LLM Demo enables transformer inference on the PlayStation 2's Emotion Engine by streaming quantized weights from CD-ROM matrix-by-matrix to bypass the 32 MB RAM constraint. By keeping only activations and the KV cache in memory, the engine can run models like brandon-tiny-10m (Q8) and TinyLlama (Q4) using a custom PSNT binary format. The project features a C-based inference engine and a conversion pipeline supporting ternary, Q4, and Q8 quantization.
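The streaming trick can be shown with a toy stand-in (illustrative only; the serialization here is invented and is not the project's PSNT format): each weight matrix is read from the stream row by row at the moment it is needed, so only the current row and the activation vector are resident, never a full layer's weights.

```python
import io
import struct

def write_matrix(buf, rows):
    """Serialize a float32 matrix as [n_rows, n_cols, data...]
    (a toy stand-in for an on-disk weight format)."""
    buf.write(struct.pack("<II", len(rows), len(rows[0])))
    for row in rows:
        buf.write(struct.pack(f"<{len(row)}f", *row))

def stream_matvec(buf, x):
    """Read one matrix from the stream and multiply by x: only one weight
    row plus the activations are ever held in memory at a time."""
    n_rows, n_cols = struct.unpack("<II", buf.read(8))
    assert n_cols == len(x)
    out = []
    for _ in range(n_rows):
        row = struct.unpack(f"<{n_cols}f", buf.read(4 * n_cols))
        out.append(sum(w * a for w, a in zip(row, x)))
    return out

# two "layers" streamed back to back, the way the PS2 engine streams from CD-ROM
buf = io.BytesIO()
write_matrix(buf, [[1.0, 0.0], [0.0, 2.0]])  # layer 1 weights
write_matrix(buf, [[1.0, 1.0]])              # layer 2 weights
buf.seek(0)
h = stream_matvec(buf, [3.0, 4.0])  # -> [3.0, 8.0]
y = stream_matvec(buf, h)           # -> [11.0]
```

Swapping weight residency for I/O this way trades bandwidth for memory, which is exactly the trade the 32 MB constraint forces: activations and the KV cache fit in RAM, while the (much larger) weights never need to.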

CopySpeak – A lightweight tool for quick AI text-to-speech

CopySpeak is a lightweight Windows desktop application built with Rust (Tauri v2) and Svelte 5 that provides AI-driven TTS for clipboard content. It supports local inference via ONNX (Kitten), Piper, and Kokoro, alongside cloud APIs like OpenAI and ElevenLabs, triggered through a double-copy mechanism or hotkeys. Key features include text normalization, real-time waveform visualization, and persistent history management for streamlined audio generation.