Friday November 28, 2025

The White House is accused of bailing out the AI industry, DeepSeek releases a self-verifying math model, and humans beat LLMs in a competitive coding tournament.

News

AI CEO – Replace your boss before they replace you

The text satirically pitches an "AI CEO" as a cost-effective and efficient replacement for human executives. It critiques corporate culture by training the AI on cynical pillars like "growth at all costs" and "maximum output with minimum empathy." The concept also mocks the tendency of LLMs to confidently generate nonsensical "thought leadership."

TPUs vs. GPUs and why Google is positioned to win AI race in the long term

Google's TPUs are ASICs built around a systolic-array architecture to efficiently handle the matrix multiplications central to AI, offering better performance per dollar and power efficiency than general-purpose GPUs for suitable workloads. While wider adoption is hindered by the dominance of Nvidia's CUDA ecosystem and by the chips' exclusivity to Google Cloud, TPUs give Google a key strategic advantage by enabling a vertically integrated, cost-effective AI hardware stack. The latest TPU generations perform on par with top-tier Nvidia offerings, solidifying their role in powering models like Gemini and positioning GCP competitively against other cloud providers.
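As a rough intuition for why a systolic array suits this workload, the toy sketch below computes a matrix product the way a grid of multiply-accumulate (MAC) cells would: each output cell accumulates partial products as operands stream past it. This illustrates only the dataflow idea, not actual TPU microarchitecture.

```python
def systolic_matmul(A, B):
    """Toy model of a systolic-array matrix multiply.

    At step t, the operand pair (A[i][t], B[t][j]) flows through
    cell (i, j), which accumulates one partial product per step.
    A real TPU pipelines this across a 2-D grid of MAC units.
    """
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for t in range(k):          # one "wavefront" of operands per step
        for i in range(n):
            for j in range(m):
                C[i][j] += A[i][t] * B[t][j]
    return C
```

The point of the hardware layout is that these accumulations happen in parallel across all cells, with operands passed neighbor-to-neighbor instead of re-fetched from memory.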

The current state of the theory that GPL propagates to AI models

The theory that copyleft licenses like the GPL propagate from training data to an AI model remains legally unresolved. Ongoing lawsuits, such as Doe v. GitHub and GEMA v. OpenAI, keep this theory relevant, with a German court notably ruling that a model's memory can constitute a legal "reproduction" of training data. Despite strong legal, technical, and practical counterarguments against propagation, the issue remains a significant legal risk pending further judicial and legislative outcomes.

Has the bailout of generative AI begun?

The White House's "Genesis" AI program, announced via Executive Order, involves large-scale government purchases of chips and services from AI companies. The initiative is being widely interpreted as a government bailout for the financially overextended AI industry. The author questions whether the program will produce genuine scientific impact with LLMs or is simply the first of many subsidies for the sector.

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

DeepSeekMath-V2 is an LLM designed for self-verifiable mathematical reasoning, moving beyond the limitations of final-answer-based rewards. The approach trains an LLM-based verifier for theorem proving, which then acts as a reward model for a proof generator. This generator is incentivized to self-correct its reasoning, while the verifier is continuously improved by labeling new, difficult proofs. The model demonstrates strong performance on benchmarks like IMO-ProofBench and recent math competitions.
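A minimal sketch of the verifier-as-reward loop described above, with `generate` and `verify` as stand-ins for LLM calls; the scoring scale, threshold, and self-labeling rule are assumptions for illustration, not DeepSeek's actual recipe.

```python
def training_round(generate, verify, problems, threshold=0.9):
    """One sketched iteration of the verifier-as-reward loop.

    Returns (problem, proof, reward) triples for the generator update,
    plus the confidently scored proofs that would become new training
    labels for the verifier.
    """
    rewards, labels = [], []
    for problem in problems:
        proof = generate(problem)
        score = verify(problem, proof)            # rigor score in [0, 1]
        rewards.append((problem, proof, score))   # RL reward = verifier score
        # proofs the verifier scores with high confidence are kept as
        # new labels, so the verifier improves on harder proofs over time
        if score >= threshold or score <= 1 - threshold:
            labels.append((problem, proof, round(score)))
    return rewards, labels
```

The key design choice is the feedback loop: the generator is rewarded for proofs the verifier accepts, while the verifier is retrained on new, difficult proofs it can label confidently.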

Research

Can Vibe Coding Beat Graduate CS Students? An LLM vs. Human Coding Tournament

Existing LLM code generation benchmarks, focused on unit-test pass rates, are insufficient for real-world problems requiring strategic reasoning. A new multi-agent benchmark built around a competitive logistics-optimization problem was introduced to test these capabilities. In a large-scale tournament, human-coded agents consistently outperformed 40 LLM-coded agents, most of which failed to beat simple baselines. The results highlight a significant gap in LLMs' ability to synthesize competitively viable code for complex, reasoning-driven tasks; even the best LLM made a top human solution worse when asked to improve it.

Refrag: Rethinking RAG Based Decoding

RAG systems suffer from high latency and memory overhead due to the long contexts created by retrieved documents. The proposed REFRAG framework exploits the observation that attention patterns in RAG are sparse, since most retrieved passages are irrelevant to the query. By eliminating these unnecessary computations during decoding, REFRAG achieves a 30.85x time-to-first-token (TTFT) speedup and a 16x context-length extension without loss in perplexity or accuracy on various long-context tasks.
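The sparsity observation can be illustrated with a toy relevance filter: score each retrieved passage against the query and keep only the top-k before decoding. The bag-of-words scoring below is purely illustrative; REFRAG's actual mechanism works on attention patterns, not word overlap.

```python
import math

def prune_context(query, passages, top_k=2):
    """Keep only the top-k passages most relevant to the query.

    Mimics the intuition that attention over retrieved context is
    sparse: most passages contribute nothing and can be skipped.
    Scoring is length-normalized word overlap (illustrative only).
    """
    def score(q, p):
        q_words, p_words = set(q.split()), set(p.split())
        return len(q_words & p_words) / math.sqrt(len(p_words) + 1)

    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:top_k]
```

Shrinking the decoded context this way is what buys both the latency reduction and the effective context-length extension.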

Bayesian Neural Networks (2018) [pdf]

Bayesian Neural Networks (BNNs) integrate probabilistic models with neural networks to combine the function approximation capabilities of NNs with the uncertainty modeling of stochastic methods. This fusion enables BNNs to output a full posterior distribution for predictions, providing robust uncertainty quantification. Additionally, BNNs learn a distribution over the model's parameters, offering insights into the parameter space and making them attractive for both theoretical and practical applications.
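The core idea can be sketched with a one-parameter "network": instead of a point estimate for the weight, hold a posterior distribution over it and average predictions across posterior samples, which yields both a mean prediction and an uncertainty estimate. The Gaussian posterior and its parameters below are illustrative.

```python
import random
import statistics

def bnn_predict(x, weight_mu=0.8, weight_sigma=0.3, n_samples=1000, seed=0):
    """Monte Carlo prediction for a toy one-weight Bayesian 'network'.

    The weight w has posterior N(weight_mu, weight_sigma); sampling w
    and predicting w * x many times gives a predictive distribution,
    summarized here by its mean and standard deviation.
    """
    rng = random.Random(seed)
    preds = [rng.gauss(weight_mu, weight_sigma) * x for _ in range(n_samples)]
    return statistics.mean(preds), statistics.stdev(preds)
```

A frequentist network would return only `weight_mu * x`; the spread returned here is the uncertainty quantification that makes BNNs attractive.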

The Iceberg Index: Measuring Workforce Exposure Across the AI Economy

Project Iceberg uses Large Population Models to simulate the human-AI labor market, addressing the failure of traditional metrics to capture pre-adoption disruption. It introduces the Iceberg Index, a skills-based metric measuring the wage value of occupational tasks that AI can technically perform. The model reveals that AI's technical capability for cognitive automation in administrative and professional services (~$1.2T) is fivefold larger and more geographically distributed than the visible adoption in the tech sector, enabling proactive identification of exposure hotspots.
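In the spirit of the Iceberg Index, a skills-based exposure score might be the share of total wage value tied to tasks AI can technically perform; the formula and task data below are guesses from the summary, not the paper's actual methodology.

```python
def exposure_share(tasks):
    """Fraction of total wage value attached to automatable tasks.

    `tasks` is a list of (wage_value, ai_can_perform) pairs; the
    structure is hypothetical, for illustration only.
    """
    total = sum(wage for wage, _ in tasks)
    exposed = sum(wage for wage, automatable in tasks if automatable)
    return exposed / total if total else 0.0
```

Measuring technical capability rather than observed adoption is what lets the index surface the "below the waterline" exposure the summary describes.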

Game Theory in Cosmology

Cosmological Teleodynamics is a new game-theoretic statistical framework that recasts the dark sector as an emergent property of the universe's nonlocal memory and structural organization. By applying a maximum-caliber weight and a bias functional to cosmic histories, the model derives modified physical equations that explain cosmic acceleration and clustering while helping to resolve the $H_0$ and $S_8$ tensions. The framework posits that the universe operates like a giant potential game, evolving towards a continuous form of Nash equilibrium.
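Generically, a maximum-caliber construction weights whole histories rather than instantaneous states; a sketch in notation that is not necessarily the paper's:

```latex
P[\gamma] \;\propto\; \exp\!\left(-\mathcal{A}[\gamma] + \lambda\, B[\gamma]\right)
```

where $\gamma$ is a cosmic history, $\mathcal{A}[\gamma]$ its standard action weight, and $B[\gamma]$ a bias functional rewarding structural organization; extremizing the path entropy (the "caliber") under such a bias is what would yield modified effective equations of motion.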

Code

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning [pdf]

DeepSeekMath-V2 introduces a self-verifiable approach to mathematical reasoning, moving beyond traditional RL methods that only reward correct final answers. The system trains an LLM-based verifier to assess the rigor of step-by-step theorem proving, which then serves as a reward model for a proof generator. To continuously improve, the verifier uses scaled compute to automatically label new, difficult proofs, creating a data feedback loop. This method has resulted in strong performance on advanced math competitions like IMO, CMO, and Putnam.

Show HN: Runprompt – run .prompt files from the command line

runprompt is a single-file Python CLI tool for executing templated .prompt files against various LLM providers. It supports structured JSON output via schemas and allows for chaining prompts by piping I/O. Configuration is handled in the file's frontmatter and can be overridden with CLI flags or environment variables.
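Based on the summary's description of chaining via piped I/O, usage might look like the following; the filenames and invocation shown are illustrative assumptions, not runprompt's documented interface:

```shell
# hypothetical: chain two .prompt files by piping one's output into the next
cat article.txt | runprompt summarize.prompt | runprompt translate.prompt
```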

Show HN: Era – Open-source local sandbox for AI agents

ERA Agent is a sandbox for safely executing untrusted or AI-generated code inside lightweight microVMs. It leverages krunvm and buildah to provide strong security isolation with a container-like developer experience and ~200ms launch times. A Go CLI manages local development, with an optional Cloudflare Worker-based control plane for remote orchestration and API access.

Show HN: Whole-home VPN router with hardware kill switch (OpenWrt and WireGuard)

This project details building a network-level VPN gateway on a Raspberry Pi or mini PC using OpenWrt, WireGuard, and its obfuscated fork AmneziaWG. It provides whole-home privacy: a robust, firewall-based kill switch prevents traffic leaks, while AmneziaWG's obfuscation helps bypass DPI. The entire stack is explicitly designed to be deployed and managed by AI coding agents like Claude and GPT, with detailed instructions for guided setup.
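The kill-switch idea can be sketched in OpenWrt's UCI firewall config: give the LAN a forwarding rule only into the VPN zone, so when the tunnel is down traffic has no path out. This excerpt is illustrative (zone and interface names are assumptions), not the project's actual configuration:

```
config zone
        option name 'vpn'
        list network 'wg0'
        option masq '1'
        option mtu_fix '1'

config forwarding
        option src 'lan'
        option dest 'vpn'

# deliberately no 'config forwarding' from lan to wan:
# if wg0 drops, LAN traffic has nowhere to be forwarded
```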

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

DeepSeekMath-V2 introduces a self-verifiable approach to mathematical reasoning, moving beyond rewards based on final answers. It trains an LLM-based verifier to assess the rigor of each reasoning step, which then serves as a reward model for a proof generator. The system uses a feedback loop to scale verification and continuously improve the verifier on harder problems. The resulting model demonstrates strong theorem-proving capabilities on advanced math competitions like IMO, CMO, and Putnam.
