Friday December 5, 2025

Microsoft slashes AI sales targets, CUDA-L2 surpasses cuBLAS performance using RL, and an AI agent dominates major cybersecurity CTFs.

News

Microsoft drops AI sales targets in half after salespeople miss their quotas

Microsoft has reportedly slashed sales growth targets for its AI agent products due to slow enterprise adoption. Customers are hesitant to pay for the unproven technology, which is designed for autonomous, multi-step tasks but is still hampered by the unreliability of underlying LLMs, including issues like confabulation and brittleness. Despite this, Microsoft's heavy investment in AI infrastructure continues, largely supported by other AI companies renting its cloud services.

State of AI: An Empirical 100T Token Study with OpenRouter

Based on an analysis of 100 trillion tokens from OpenRouter, real-world LLM usage is rapidly shifting towards agentic inference, with reasoning models, tool-calling, and longer contexts for programming tasks now comprising the majority of workloads. OSS models have captured a third of the market, with Chinese models showing strong growth, and creative roleplay has emerged as a dominant use case alongside programming. The study also identifies a "Glass Slipper" effect, where early user cohorts show high retention for the first model that solves their specific problem, suggesting that achieving this "workload-model fit" creates a more durable advantage than cost alone.

Countdown until the AI bubble bursts

This satirical project uses Gemini to predict when the "AI bubble" will burst. A nightly script prompts the LLM to analyze web news, sentiment, and economic indicators to update its forecast. The project is intended as a critique of the industry's hype cycle and investment trends, not the underlying AI technology.

TrueMeter: AI Energy Agent That Optimizes Utility Bills

TrueMeter is an AI agent that automates utility bill optimization by parsing unstructured energy data from heterogeneous sources. Its agentic pipeline uses LLMs to extract and normalize complex bills and multi-page tariff documents into a standardized JSON schema. This structured data then feeds a deterministic optimization engine that simulates costs across different rate plans, using a hybrid approach that falls back to rule-based methods for critical data to ensure accuracy and auditability.
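
The deterministic half of such a pipeline can be sketched in a few lines of Python. This is an illustration only, not TrueMeter's code: the `RatePlan` fields, the 4pm to 9pm peak window, and the `(hour, kWh)` input shape are all assumptions made for the example, standing in for whatever the normalized JSON schema actually contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatePlan:
    """Hypothetical rate plan; field names are assumptions for this sketch."""
    name: str
    fixed_monthly: float               # fixed charge in dollars
    peak_rate: float                   # $/kWh during peak hours
    offpeak_rate: float                # $/kWh at all other hours
    peak_hours: range = range(16, 21)  # assumed peak window: 16:00-20:59


def simulate_cost(plan, hourly_kwh):
    """Return the bill for one period given (hour_of_day, kwh) readings."""
    energy = sum(
        kwh * (plan.peak_rate if hour in plan.peak_hours else plan.offpeak_rate)
        for hour, kwh in hourly_kwh
    )
    return round(plan.fixed_monthly + energy, 2)


def cheapest_plan(plans, hourly_kwh):
    """Pick the plan with the lowest simulated cost for the same usage."""
    return min(plans, key=lambda plan: simulate_cost(plan, hourly_kwh))


usage = [(10, 2.0), (18, 3.0)]  # normalized readings, e.g. parsed from a bill
flat = RatePlan("flat", 10.0, 0.20, 0.20)
tou = RatePlan("time-of-use", 10.0, 0.40, 0.10)
best = cheapest_plan([flat, tou], usage)
```

Because the simulation is plain arithmetic over structured inputs, the same usage data always yields the same ranking, which is what makes this stage auditable regardless of how the extraction step produced the data.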

The Disappearance of an Anti-AI Activist

Sam Kirchner, co-founder of the activist group "Stop AI," has disappeared, prompting a police warning that he may be armed and dangerous. This has led OpenAI to lock down its offices over a potential threat. Kirchner's group advocates for nonviolent civil disobedience to halt the development of superintelligent AI.

Research

How elites could shape mass preferences as AI reduces persuasion costs

This paper models how AI-driven persuasion transforms polarization into a strategic instrument of governance. The model shows that a single elite will push society towards polarization, while competing elites may use the same technology to create cohesive "semi-lock" opinion states that are harder to overturn. Consequently, advances in AI persuasion can either heighten or dampen polarization depending on the environment, with significant implications for democratic stability.

AI agent achieves Rank 1 across major CTFs – a defining moment for cybersecurity

A specialized Cybersecurity AI (CAI) agent, built on a cost-efficient model architecture, dominated the 2025 CTF circuit, systematically outperforming thousands of human teams. The authors argue that this result shows Jeopardy-style CTFs are now effectively a solved problem for AI, and that the security community should transition to Attack & Defense formats to better test the adaptive reasoning skills that remain beyond current AI capabilities.

Condorcet's Theorem and an LLM Jury: Diminishing returns as group sizes increase

This study investigates mitigating LLM bias using crowd-based aggregation techniques. While simple averaging of multiple LLM responses can amplify bias due to a lack of diversity, locally weighted aggregation methods are more effective. The best results for both bias reduction and performance are achieved by creating hybrid crowds that combine the accuracy of LLMs with the diversity of human responses.
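
The "diminishing returns" in the title can be illustrated with the classical Condorcet jury formula. The sketch below is not the study's aggregation method: it assumes an odd number of voters who are independent and equally accurate, an assumption that correlated LLM responses violate, which is consistent with the summary's point about lack of diversity. The function name `majority_accuracy` is chosen for this example.

```python
from math import comb


def majority_accuracy(n, p):
    """Probability that a majority of n independent voters is correct,
    when each voter is correct with probability p (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that a strict majority exists")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )


# Accuracy of juries of size 1, 3, 5, ... with individually 60%-accurate voters.
accuracies = [majority_accuracy(n, 0.6) for n in (1, 3, 5, 7, 9)]

# Gain from each additional pair of voters; these shrink as the jury grows.
gains = [b - a for a, b in zip(accuracies, accuracies[1:])]
```

Under these idealized assumptions accuracy keeps rising with jury size, but each added pair of voters contributes less than the previous one.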

From Code Foundation Models to Agents and Applications: A Comprehensive Survey

This paper presents a comprehensive survey and practical guide to code LLMs, examining the complete model life cycle from data curation and pre-training to SFT and RL. It analyzes both generalist and code-specialized models, highlights the gap between academic benchmarks and real-world deployment challenges, and provides new experimental analysis on training techniques, scaling laws, and model architectures.

Ising-Conway Entropy Game (ICEg)

A new thermal bath scheme for the Ising-Conway Entropy Game uses Monte Carlo dynamics (Metropolis and Glauber) to introduce temperature-dependent sampling. This stochastic approach allows for exploring the relationship between temperature and the rate of entropy production. The resulting thermalized game serves as a simple, accessible, yet realistic test bed for studying complex dynamical systems in classical statistical mechanics.
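
For readers unfamiliar with the two dynamics named above, here is a minimal sketch of the standard Metropolis and Glauber acceptance rules, applied to a plain 1D Ising ring. It does not implement the rules of the Ising-Conway Entropy Game itself; the ring geometry, coupling J = 1, units with k_B = 1, and the function names are assumptions made for illustration.

```python
import math
import random


def metropolis_accept(delta_e, temperature):
    """Metropolis rule: always accept downhill moves, else exp(-dE/T)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def glauber_accept(delta_e, temperature):
    """Glauber (heat-bath) rule: 1 / (1 + exp(dE/T))."""
    return 1.0 / (1.0 + math.exp(delta_e / temperature))


def sweep(spins, temperature, accept, rng):
    """Attempt len(spins) single-spin flips on a ring, in place.

    Assumes energy E = -sum_i s_i * s_{i+1} and temperature > 0.
    """
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        # Energy change if spin i is flipped (periodic neighbours).
        delta_e = 2 * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
        if rng.random() < accept(delta_e, temperature):
            spins[i] = -spins[i]
    return spins


rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(32)]
sweep(spins, temperature=1.5, accept=metropolis_accept, rng=rng)
sweep(spins, temperature=1.5, accept=glauber_accept, rng=rng)
```

Both rules satisfy detailed balance at a given temperature; they differ in how often neutral and uphill moves are accepted, so they can produce different dynamics even when they share the same equilibrium.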

Code

CUDA-L2: Surpassing cuBLAS performance for matrix multiplication through RL

CUDA-L2 is a system that combines LLMs and RL to automatically optimize Half-precision General Matrix Multiply (HGEMM) CUDA kernels. It systematically outperforms major baselines, including NVIDIA's closed-source cuBLAS and cuBLASLt libraries, on A100 GPUs. The project has released optimized kernels for 1,000 matrix configurations and plans to extend support to more GPU architectures.

Show HN: Open-Source AI Coding Agent

Show HN: Onetone – A full-stack framework with custom C interpreter

Onetone is an integrated development platform featuring a proprietary scripting language with a C-based interpreter. The language supports modern features like classes, async/await, and a pipeline operator, while the framework provides built-in modules for OpenGL 3D graphics, native Windows GUI, and networking. For AI applications, the project includes FFI bindings for ML inference engines such as CTranslate2 and ONNX Runtime.

Seekdb – AI-Native search database

OceanBase seekdb is an AI-native database that unifies vector, text, and structured data in a single, MySQL-compatible engine. It enables hybrid search by combining vector, full-text, and relational queries in one statement. The system supports complete in-database RAG workflows by running embedding, reranking, and LLM inference directly within the database, and offers both embedded and server deployment modes.

The Gap for LLMs Isn't Benchmarks – It's Everyday Value

This text outlines a pragmatic engineering philosophy for building AI/LLM systems, emphasizing a first-principles approach with a preference for simple, reliable ("boring") technology. It stresses the importance of robust design through contingency planning, critical vendor evaluation, and building seamless, "invisible" infrastructure.
