Saturday January 3, 2026

California's AB316 eliminates AI autonomy as a legal defense, deep learning models achieve 95% accuracy in acoustic keyboard attacks, and IQuest-Coder outperforms Claude Sonnet 4.5 and GPT 5.1.

Interested in AI engineering? Let's talk

News

Why do Americans hate A.I.?

Despite record-breaking adoption rates in sectors like software development and law, AI faces a growing public backlash driven by concerns over job displacement and model opacity. Key technical and systemic anxieties include the "black box" nature of LLMs, the propensity for hallucinations, and the massive energy demands of data centers. This skepticism is further amplified by the concentration of industry power and the potential for an AI-driven economic bubble.

Everyone's Watching Stocks. The Real Bubble Is AI Debt

The AI investment landscape is shifting from cash-funded growth by major tech firms to a phase increasingly characterized by significant debt accumulation. Howard Marks warns that this reliance on leverage suggests the AI bull market has matured into a potential bubble. While initial LLM development was supported by strong balance sheets, the current trajectory of AI debt poses a systemic risk that may outweigh stock market volatility.

I built a clipboard tool to strip/keep specific formatting like Italics

CustomPaste is a local Windows utility that automates clipboard text transformation through customizable "recipes" applied during the paste operation. It is particularly useful for cleaning LLM-generated text by fixing smart quotes, em-dashes, and inconsistent whitespace that can cause syntax errors in code snippets. The tool ensures data privacy by processing all transformations 100% offline, preventing sensitive prompts or data from being transmitted to external servers.
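A minimal sketch of such a paste "recipe" in Python. The function name and the exact substitution rules are illustrative, not CustomPaste's actual API; the idea is simply to straighten smart quotes, replace em-dashes, and collapse inconsistent whitespace before pasting:

```python
import re

def clean_llm_text(text: str) -> str:
    """Apply a simple cleanup recipe to LLM-generated text:
    straighten smart quotes, replace em-dashes, normalize
    non-breaking spaces, and collapse runs of spaces/tabs."""
    replacements = {
        "\u201c": '"', "\u201d": '"',   # curly double quotes
        "\u2018": "'", "\u2019": "'",   # curly single quotes
        "\u2014": "--",                  # em-dash
        "\u00a0": " ",                   # non-breaking space
    }
    for src, dst in replacements.items():
        text = text.replace(src, dst)
    # Collapse horizontal whitespace but leave newlines intact.
    return re.sub(r"[ \t]+", " ", text)
```

Running such a transform on paste is what keeps curly quotes out of code snippets, where they read as syntax errors.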

AB316: No AI Scapegoating Allowed

California's AB316 eliminates "AI autonomy" as a legal defense, holding developers and users liable for harms caused by AI systems. This places the burden of unpredictable LLM outputs directly on the deploying entity, regardless of the model's autonomous nature. The law's ambiguity regarding liability distribution between model providers and third-party integrators is expected to accelerate the adoption of AI guardrails and specialized insurance.

AI Futures Model: Dec 2025 Update

The AI Futures Model provides updated timelines for milestones including Automated Coder (AC), Superhuman AI Researcher (SAR), and ASI. Median forecasts for AC have shifted to 2031–2032, reflecting more conservative modeling of AI R&D automation and diminishing returns in software efficiency compared to previous iterations. The framework extrapolates METR coding time horizons while accounting for compute constraints and the potential for a "taste-only singularity" to drive fast takeoff speeds post-automation.

Research

Epistemological Fault Lines Between Human and Artificial Intelligence

LLMs are stochastic pattern-completion systems characterized as walks on high-dimensional graphs rather than epistemic agents with internal world models. The authors identify seven epistemic fault lines—grounding, parsing, experience, motivation, causal reasoning, metacognition, and value—that distinguish LLM outputs from human cognition. This divergence creates "Epistemia," a condition where linguistic plausibility substitutes for epistemic evaluation, necessitating new frameworks for AI governance and literacy.

Acoustic Side Channel Attack on Keyboards (2023)

Researchers implemented a deep learning model to execute acoustic side-channel attacks against laptop keyboards using smartphone microphones and Zoom recordings. The classifier achieved 95% accuracy on local recordings and 93% via Zoom, establishing new benchmarks for keystroke inference without the use of a language model. The study demonstrates the feasibility of these attacks using off-the-shelf hardware and standard algorithms while proposing potential mitigation strategies.
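The front end of such a classifier is typically a log-power spectrogram computed per keystroke. The sketch below shows that feature-extraction step only; frame length and hop size are illustrative choices, not the paper's exact configuration:

```python
import numpy as np

def keystroke_features(audio, frame_len=256, hop=128):
    """Slice a keystroke recording into windowed frames and compute
    a log-power spectrogram, the kind of 2-D feature a CNN-style
    keystroke classifier would consume."""
    frames = []
    for start in range(0, len(audio) - frame_len + 1, hop):
        frame = audio[start:start + frame_len] * np.hanning(frame_len)
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        frames.append(np.log1p(spectrum))
    # Shape: (num_frames, frame_len // 2 + 1)
    return np.stack(frames)
```

Each isolated keystroke becomes a small image-like array, which is why off-the-shelf image classifiers transfer so readily to this attack.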

Exploiting Silent Delivery Receipts to Monitor Mobile Instant Messengers

Researchers have identified a side-channel vulnerability in messaging apps like WhatsApp and Signal where delivery receipts can be silently triggered to leak user metadata. This exploit allows attackers to infer activity status, device counts, and OS types, or launch resource exhaustion attacks without user notification. The study calls for protocol-level design changes to mitigate these privacy risks.

Measuring Agents in Production

A large-scale study of 306 practitioners reveals that production AI agents favor simple, controllable architectures, with 70% relying on prompting off-the-shelf models and 68% limiting autonomous execution to under 10 steps. Reliability and evaluation remain the primary bottlenecks, leading 74% of organizations to depend on human evaluation for quality control. Despite these challenges, straightforward deployment patterns are already delivering significant impact across diverse industries.
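The dominant deployment pattern the study describes, prompting an off-the-shelf model inside a hard-capped loop, can be sketched as follows. All names here are illustrative: `call_model` stands in for an LLM API and `execute_tool` for a tool dispatcher.

```python
def run_agent(task, call_model, execute_tool, max_steps=10):
    """Minimal agent loop with a hard step cap, mirroring the
    reported practice of limiting autonomous execution to under
    10 steps before handing control back."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = call_model(history)
        if action["type"] == "final":
            return action["content"]
        # Run the requested tool and feed the result back.
        result = execute_tool(action["tool"], action["args"])
        history.append({"role": "tool", "content": result})
    return None  # step budget exhausted; escalate to a human
```

The `None` return is the point where the 74% of organizations relying on human evaluation would route the task to a person.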

System Falsification for Efficient Cyber-Kinetic Vulnerability Detection

RampoNN is a framework for detecting kinetic vulnerabilities in CPS by integrating control code analysis with neural network abstractions of physical dynamics. It leverages Deep Bernstein neural networks and high-precision reachability algorithms to prune safe execution paths and guide a falsification engine toward potential STL specification violations. Evaluation shows the approach accelerates vulnerability discovery by up to 98.27% compared to state-of-the-art methods.

Code

IQuest-Coder: A new open-source code model beats Claude Sonnet 4.5 and GPT 5.1 [pdf]

IQuest-Coder-V1 is a family of code LLMs ranging from 7B to 40B parameters, featuring a "code-flow" training paradigm that models repository evolution and commit transitions. The suite includes specialized Instruct and Thinking variants, alongside a Loop architecture that uses recurrent transformer layers for improved parameter efficiency. With a native 128K context length and grouped-query attention (GQA), the models achieve SOTA performance on benchmarks such as SWE-Bench Verified (81.4%) and BigCodeBench.
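Grouped-query attention, the general technique the model card mentions, lets several query heads share one key/value head, shrinking the KV cache. A minimal NumPy sketch of the mechanism (shapes and names are illustrative, not IQuest-Coder's implementation):

```python
import numpy as np

def gqa_attention(q, k, v, num_kv_groups):
    """Grouped-query attention: query heads share a smaller set of
    key/value heads.

    q: (num_q_heads, seq, d); k, v: (num_kv_groups, seq, d)."""
    num_q_heads, _, d = q.shape
    heads_per_group = num_q_heads // num_kv_groups
    out = np.empty_like(q)
    for h in range(num_q_heads):
        g = h // heads_per_group  # KV head shared by this query head
        scores = q[h] @ k[g].T / np.sqrt(d)
        # Numerically stable softmax over the key axis.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[g]
    return out
```

With `num_kv_groups` equal to the number of query heads this reduces to standard multi-head attention; with fewer groups, the KV cache shrinks proportionally, which matters at 128K context.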

I wrote the manual Karpathy said was missing for agentic AI

Morphic Programming is a first-principles manual for building agentic AI systems using CLI agents like Claude Code. It defines nine core principles—including morphability, recursion, and token efficiency—to navigate the shift toward agent-driven abstractions and context engineering. The guide provides a framework for system design, E2E autonomy, and managing the stochastic nature of LLM-integrated workflows.

I mapped System Design concepts to AI Prompts to stop bad code

This curriculum provides a comprehensive guide to system design tailored for AI-augmented development and "vibecoding." It bridges the gap between fundamental architecture principles and LLM prompting, enabling developers to provide precise technical constraints and validate AI-generated designs. The resource spans 71 chapters across 10 levels, covering everything from distributed systems to senior-level trade-off analysis and interview preparation.

Anything: Import anything into Python (generated by LLM)

anything is a satirical Python library that dynamically generates classes and implements their methods on-the-fly using OpenAI. It leverages LLMs to provide just-in-time code execution for arbitrary calls, though it is intentionally designed as a dangerous proof-of-concept rather than a production tool.
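The core trick can be sketched with `__getattr__`: resolve undefined methods at call time by asking a code generator for an implementation and executing it immediately. Here `generate_source` is a stub standing in for the OpenAI call the real library makes; the class and its interface are illustrative, not the library's API.

```python
class LLMObject:
    """Resolve missing methods at call time by generating,
    compiling, and executing source code on the fly. This mirrors
    the 'anything' proof-of-concept and is dangerous by design:
    arbitrary generated code runs with full privileges."""

    def __init__(self, generate_source):
        self._generate_source = generate_source

    def __getattr__(self, name):
        def method(*args, **kwargs):
            # Ask the generator for a function body, exec it into a
            # fresh namespace, and call the result immediately.
            source = self._generate_source(name)
            namespace = {}
            exec(source, namespace)
            return namespace[name](*args, **kwargs)
        return method
```

`__getattr__` only fires for attributes Python cannot find normally, which is exactly the hook that makes "import anything, call anything" possible, and exactly why it should never leave a sandbox.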

Client-side image optimizer built with Next.js

Image Optimizer is a client-side Next.js application that utilizes browser-native APIs like HTMLCanvasElement and FileReader to perform local image compression and JPEG re-encoding. By executing all processing within the browser runtime, it ensures data privacy and eliminates server-side overhead, making it an efficient utility for preparing visual assets or datasets. The architecture features a TypeScript-driven implementation with bulk processing capabilities and real-time quality adjustments.