Thursday — October 23, 2025
Amazon allegedly replaced its AWS DevOps team with AI before a crash, a new LLM framework is built in just 100 lines, and a paper claims a homological proof for P != NP.
News
AI assistants misrepresent news content 45% of the time
A large-scale EBU/BBC study evaluating over 3,000 responses from ChatGPT, Copilot, Gemini, and Perplexity found that these AI assistants systematically misrepresent news content. The research identified significant issues in 45% of all answers, including serious sourcing problems (31%) and major inaccuracies or hallucinations (20%). Gemini performed the worst, with significant issues in 76% of its responses, primarily due to poor sourcing. These systemic, cross-platform failures erode public trust in both the LLMs and the original news sources they cite, a growing concern as AI becomes a primary news gateway.
Look, Another AI Browser
The author criticizes the recent wave of "AI browsers" from companies like OpenAI and Perplexity, dismissing them as uninnovative Chromium wrappers with bolted-on AI features. The piece notes the irony of a company pursuing AGI shipping a simple reskin, and suggests that the monumental engineering challenge of building a browser from scratch remains a significant barrier even for leading AI labs.
Amazon Allegedly Replaced 40% of AWS DevOps Workers with AI Days Before Crash
An unverified report alleges that Amazon replaced 40% of its AWS DevOps team with an AI system days before a major service outage. The AI was reportedly tasked with autonomously managing IAM permissions, VPC configs, and Lambda rollbacks. While the article stresses the claim is unconfirmed, it highlights the suspicious timing of the events.
Chezmoi introduces ban on LLM-generated contributions
The developer guide for chezmoi, a dotfile manager written in Go, now includes a strict contribution policy: any contributor who uses an LLM to assist with a contribution is immediately and permanently banned. The rest of the guide covers the project's build and test procedures.
Chess engines didn't replace Magnus Carlsen, and AI won't replace you
The text analogizes using LLM coding assistants to how Magnus Carlsen uses chess engines for post-game analysis. It argues that developers should not treat LLMs as autopilots but as sparring partners, where the generated code is a starting point for rigorous code review. This human-in-the-loop process shifts the developer's role towards critical judgment, using the review to learn new patterns, catch subtle errors, and ultimately augment their own expertise rather than replacing it.
Research
The Dragon Hatchling: The missing link between the transformer and brain models
Dragon Hatchling (BDH) is a new, biologically-inspired LLM architecture based on a scale-free network of locally-interacting neurons. As an attention-based state space model, it demonstrates Transformer-like performance and scaling laws, rivaling GPT-2 at similar parameter counts. The architecture is designed for inherent interpretability with sparse, positive activations and monosemanticity, while its working memory relies on synaptic plasticity with Hebbian learning.
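The Hebbian working-memory idea can be illustrated with a toy recurrent update (this is an illustrative sketch, not BDH's actual rule or dimensions): activations are kept sparse and positive, and synaptic weights strengthen when pre- and post-synaptic activity co-occur, so the weight matrix itself carries short-term state.

```python
import numpy as np

# Toy Hebbian plasticity sketch (not BDH's actual update rule):
# non-negative activations, and weights that strengthen on co-activation,
# so the synaptic state W acts as a working memory.

rng = np.random.default_rng(0)
n = 8                            # toy neuron count
eta = 0.1                        # Hebbian learning rate
W = 0.1 * rng.random((n, n))     # non-negative synaptic state

def relu(x):
    return np.maximum(x, 0.0)

def step(x, W):
    """One recurrent step: sparse positive activations + Hebbian update."""
    y = relu(W @ x)                  # post-synaptic activity (non-negative)
    W = W + eta * np.outer(y, x)     # Hebbian: dW[i, j] grows with y[i] * x[j]
    return y, W

x = relu(rng.standard_normal(n))
for _ in range(3):
    x, W = step(x, W)
```

Because inputs, activations, and updates are all non-negative, W stays non-negative throughout, mirroring the sparse positive activations the architecture is designed around.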
The Free Transformer
This work extends the decoder Transformer by conditioning its generative process on unsupervised, variationally-learned latent variables. This conditioning mechanism is shown to yield substantial improvements on downstream tasks.
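The conditioning mechanism can be sketched in the style of a standard VAE (a conceptual toy, not the paper's architecture; all shapes and weight names here are made up): an inferred latent z is sampled via the reparameterization trick and injected into the decoder's hidden states.

```python
import numpy as np

# Conceptual sketch of conditioning a decoder on a variationally-learned
# latent (toy code, not the Free Transformer's actual design).

rng = np.random.default_rng(0)
d_model, d_z, L = 16, 4, 6

h = rng.standard_normal((L, d_model))        # decoder hidden states (toy)
W_mu = rng.standard_normal((d_model, d_z))
W_logvar = rng.standard_normal((d_model, d_z))
W_cond = rng.standard_normal((d_z, d_model))

# Variational posterior q(z | x): pooled encoding -> mean and log-variance.
pooled = h.mean(axis=0)
mu, logvar = pooled @ W_mu, pooled @ W_logvar

# Reparameterization trick: z = mu + sigma * eps, so sampling stays differentiable.
z = mu + np.exp(0.5 * logvar) * rng.standard_normal(d_z)

# Condition generation: inject z into every decoder position.
h_cond = h + z @ W_cond
```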
Large Language Models Inference Engines Based on Spiking Neural Networks
This work proposes NeurTransformer, a methodology for converting pre-trained transformers into efficient SNNs for inference to address the quadratic complexity of standard models. The approach replaces the self-attention mechanism with a spike-based version (SSA), converts the feed-forward blocks, and then fine-tunes only the SSA component. Applied to GPT-2, this method showed a minimal performance drop, including a 9.7% perplexity reduction on the small model, while achieving a 64-85% estimated energy reduction for the self-attention block on digital hardware.
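The general ANN-to-SNN conversion idea behind such approaches can be illustrated with rate coding (a hedged toy, not the paper's SSA formulation): real-valued queries and keys become Bernoulli spike trains over T timesteps, and attention scores are accumulated from binary co-activations instead of a dense float dot-product.

```python
import numpy as np

# Toy rate-coded "spiking attention" (illustrative only, not the paper's
# SSA): spikes replace float activations, accumulation replaces softmax.

rng = np.random.default_rng(1)
T, L, d = 16, 4, 8               # timesteps, sequence length, head dim

Q = rng.random((L, d))
K = rng.random((L, d))
V = rng.random((L, d))

def rate_code(X, T, rng):
    """Bernoulli rate coding: spike probability proportional to value."""
    p = X / X.max()
    return (rng.random((T, *X.shape)) < p).astype(float)

q_s, k_s = rate_code(Q, T, rng), rate_code(K, T, rng)

# Accumulate binary co-activations over timesteps and feature dims;
# row normalization stands in for softmax.
scores = np.einsum('tld,tmd->lm', q_s, k_s) / T
attn = scores / (scores.sum(axis=1, keepdims=True) + 1e-9)
out = attn @ V
```

The energy argument is that each score update is an accumulate gated by a binary spike rather than a multiply-accumulate on dense floats.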
Query Decomposition for RAG
This work formulates RAG subquery decomposition and document retrieval as an exploitation-exploration problem, using bandit learning methods to dynamically select the most informative sub-queries. By sequentially retrieving documents to update beliefs about a sub-query's utility, the system decides whether to continue exploiting a query or explore an alternative. This approach, which estimates document relevance using rank information and human judgments, yielded a 35% gain in document-level precision and a 15% increase in α-nDCG, improving downstream generation performance.
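The exploit/explore loop can be sketched with Thompson sampling over sub-queries (a hedged illustration, not the paper's estimator; the sub-queries and relevance rates below are simulated): each decomposed sub-query is a bandit arm with a Beta posterior over how likely its next retrieved document is to be relevant.

```python
import random

# Toy bandit over RAG sub-queries (illustrative; relevance is simulated
# rather than coming from rank information or human judgments).

random.seed(0)

sub_queries = ["ingredients of X", "history of X", "usage of X"]  # hypothetical decomposition
true_relevance = {"ingredients of X": 0.7, "history of X": 0.2, "usage of X": 0.5}
posterior = {q: [1, 1] for q in sub_queries}    # Beta(alpha, beta) priors

def retrieve_and_judge(q):
    """Stand-in for retrieval + relevance judgment (simulated here)."""
    return random.random() < true_relevance[q]

for _ in range(200):
    # Thompson sampling: draw from each arm's posterior, retrieve for the max.
    draws = {q: random.betavariate(*posterior[q]) for q in sub_queries}
    q = max(draws, key=draws.get)
    if retrieve_and_judge(q):
        posterior[q][0] += 1    # relevant doc: update alpha
    else:
        posterior[q][1] += 1    # irrelevant doc: update beta

best = max(sub_queries, key=lambda q: posterior[q][0] / sum(posterior[q]))
```

Arms whose documents keep turning up relevant get exploited; uncertain arms keep getting occasional exploratory pulls, which is the continue-vs-switch decision the summary describes.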
A Homological Proof of P != NP: Computational Topology via Categorical Framework
This paper claims a proof for P ≠ NP using a novel approach from homological algebra and category theory. It introduces a computational homology theory where problems in P have trivial homology, while NP-complete problems like SAT possess non-trivial homology, thereby separating the classes. The proof is formally verified in Lean 4, establishing computational topology as a new paradigm for complexity analysis.
Code
Ovi: Twin backbone cross-modal fusion for audio-video generation
Ovi is an audio-video generation model featuring a twin backbone cross-modal fusion architecture to simultaneously generate synchronized video and audio. It accepts text or text+image inputs to produce 5-second, 24 FPS videos. The model, which builds upon Wan2.2 for its video branch and introduces a new 5B audio branch, can generate at higher resolutions and variable aspect ratios despite being trained only at 720x720. The 11B model, inference code, and quantized fp8/qint8 versions are available, allowing it to run with as little as 24GB of VRAM.
Show HN: Timeplus Proton 3.0 – First vectorized streaming SQL engine
Timeplus Proton is a high-performance stream processing engine in a single C++ binary, serving as a lightweight alternative to Flink or ksqlDB. It extends the ClickHouse engine with true stream processing capabilities, using SQL for complex real-time tasks like streaming ETL, materialized views, and multi-stream JOINs. This enables the creation of low-latency data pipelines for observability and feeding live, transformed data into AI systems.
Show HN: SerenDB – A Neon PostgreSQL fork optimized for AI agent workloads
SerenDB is a serverless, open-source fork of Neon that serves as an alternative to AWS Aurora Postgres. It aims to reduce database branching time to under 100ms and introduces new security features specifically for AI-Agent context storage, such as context finger-printing. The architecture separates storage and compute, using stateless PostgreSQL nodes with a custom storage engine composed of a Pageserver and Safekeepers for the WAL service.
Show HN: 100-Line LLM Framework
Pocket Flow is a minimalist LLM framework implemented in just 100 lines of code with zero dependencies or vendor lock-in. It uses a graph as its core abstraction to build common patterns like multi-agent systems, workflows, and RAG, positioning itself as a lightweight alternative to bloated frameworks. The project also promotes an "Agentic Coding" paradigm where AI agents assist in development.
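The graph-as-core-abstraction idea can be shown in a few lines (a hypothetical minimal API in the same spirit, not Pocket Flow's actual classes): each node performs one step and returns an action name, and the flow walks the graph by following the edge registered for that action.

```python
# Minimal sketch of a graph-based LLM workflow abstraction
# (hypothetical API, not Pocket Flow's actual implementation).

class Node:
    def __init__(self, fn, successors=None):
        self.fn = fn                        # the work this node performs
        self.successors = successors or {}  # action name -> next Node

    def run(self, state):
        return self.fn(state)               # returns an action string

class Flow:
    def __init__(self, start):
        self.start = start

    def run(self, state):
        node = self.start
        while node is not None:
            action = node.run(state)
            node = node.successors.get(action)  # unknown action ends the flow
        return state

# Toy two-step RAG-like pipeline wired as a graph.
def retrieve(state):
    state["docs"] = ["doc about " + state["query"]]
    return "generate"

def generate(state):
    state["answer"] = f"Based on {len(state['docs'])} doc(s): ..."
    return "done"

gen = Node(generate)
ret = Node(retrieve, {"generate": gen})
result = Flow(ret).run({"query": "graph frameworks"})
```

Agents, workflows, and RAG then differ only in which nodes exist and how their action edges are wired, which is why a graph suffices as the single abstraction.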
Show HN: MCP server that teaches LLMs to write production grade Postgres SQL
Tiger CLI provides an MCP server that goes beyond a simple API wrapper by using expert-written prompt templates and RAG over versioned documentation. This enables LLMs to generate production-ready Postgres code and design schemas with best practices. The server also exposes tools for direct database management and query execution.
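The template-plus-RAG pattern described above can be sketched generically (hypothetical helpers and doc chunks, not Tiger CLI's actual implementation): retrieved chunks from version-specific documentation are spliced into an expert prompt template before the LLM is asked for SQL.

```python
# Generic sketch of pairing a prompt template with RAG over versioned docs
# (toy stand-ins; not Tiger CLI's actual templates or index).

TEMPLATE = """You are a Postgres expert. Target version: {version}.
Relevant documentation:
{context}

Task: {task}
Return production-ready SQL with comments."""

DOCS = {  # toy stand-in for a retrieval index keyed by Postgres version
    "16": [
        "Use IDENTITY columns instead of serial.",
        "Prefer text over varchar(n) unless a limit is required.",
    ],
}

def retrieve(version, task, k=2):
    """Naive retrieval stand-in: return the top-k chunks for the version."""
    return DOCS.get(version, [])[:k]

def build_prompt(version, task):
    context = "\n".join("- " + chunk for chunk in retrieve(version, task))
    return TEMPLATE.format(version=version, context=context, task=task)

prompt = build_prompt("16", "Design a users table")
```

Keying retrieval on the target version is what lets the generated SQL track best practices for the exact Postgres release in use.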