Saturday — May 31, 2025

Stanford's AI system outpaces human-optimized kernels, the Darwin Gödel Machine evolves its own algorithms for superior coding performance, and llm_poker enables AI players to strategize in Texas Hold'em.

News

Surprisingly fast AI-generated kernels we didn't mean to publish yet

Researchers at Stanford have developed an AI system that generates kernels fast enough to outperform human-optimized production kernels in PyTorch, with results including 101.3% of the performance of FP32 torch.matmul and 179.9% of the performance of FP32 torch.nn.Conv2D. The system generates optimization ideas in natural language and explores multiple implementations in parallel, which lets it discover new and effective optimization strategies.
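
The search pattern described above (several natural-language ideas per round, several implementations per idea, keep the fastest) can be sketched as a toy loop. This is only an illustration under stated assumptions, not the Stanford system: `propose_ideas` and `implement` are hypothetical stand-ins for LLM calls, a "kernel" is just a dict with a simulated runtime, and candidates are evaluated sequentially here even though they are independent and could run in parallel.

```python
import random

# Hypothetical idea pool; a real system would ask an LLM for ideas.
IDEA_POOL = [
    "tile the loops",
    "vectorize loads",
    "fuse the epilogue",
    "use shared memory",
    "unroll the inner loop",
]


def propose_ideas(kernel, n, rng):
    """Stand-in for an LLM call returning n natural-language ideas."""
    return rng.sample(IDEA_POOL, n)


def implement(kernel, idea, rng):
    """Stand-in for code generation plus benchmarking.

    The runtime is simulated: some variants help, some hurt.
    """
    factor = rng.uniform(0.8, 1.1)
    return {
        "history": kernel["history"] + [idea],
        "runtime_ms": kernel["runtime_ms"] * factor,
    }


def search(baseline, rounds=3, ideas_per_round=3,
           variants_per_idea=2, keep=2, seed=0):
    """Branching search: expand each kept kernel, keep the fastest few."""
    rng = random.Random(seed)
    frontier = [baseline]
    best = baseline
    for _ in range(rounds):
        candidates = []
        for kernel in frontier:
            for idea in propose_ideas(kernel, ideas_per_round, rng):
                for _ in range(variants_per_idea):
                    candidates.append(implement(kernel, idea, rng))
        candidates.sort(key=lambda k: k["runtime_ms"])
        frontier = candidates[:keep]
        if frontier[0]["runtime_ms"] < best["runtime_ms"]:
            best = frontier[0]
    return best


best = search({"history": [], "runtime_ms": 10.0})
print(best["history"], round(best["runtime_ms"], 3))
```

Keeping more than one survivor per round is the point of the design: a variant that is not the fastest now may still be the best starting point for the next idea.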

AI is not our future

Procreate, a digital art company, has taken a stance against generative AI, arguing that it threatens human creativity and is built on a foundation of theft. The company says it prioritizes human-made art, respects users' skills and privacy, and focuses on developing tools that empower artists to create original work.

The Darwin Gödel Machine: AI that improves itself by rewriting its own code

The Darwin Gödel Machine (DGM) is a self-improving AI system that rewrites its own code to improve performance on programming tasks, drawing on open-ended algorithms such as Darwinian evolution to search for empirical improvements. In experiments, the DGM improves continuously, outperforms hand-designed agents, and shows substantial gains on coding benchmarks. Its open-ended exploration lets it discover novel solutions and avoid getting stuck on suboptimal designs.

Mary Meeker's first Trends report since 2019, focused on AI

The pace of change in artificial intelligence (AI) is unprecedented, with user and usage trends ramping up materially faster than in the internet business, driven by global internet accessibility, growing digital datasets, and breakthrough large language models. The AI landscape is also becoming more complex, with rising competition, open-source momentum, and China's growing presence, alongside rising costs and threats to monetization.

What Happens When AI-Generated Lies Are More Compelling Than the Truth?

Fake photographs and videos, known as deepfakes, have become easier to create and spread, and more convincing, thanks to artificial intelligence tools that automate their production, making it difficult to distinguish real content from fake. The implications for politics, democracy, and trust are significant: as deepfakes proliferate, objective truth can lose its power, and authoritarian regimes can exploit the resulting uncertainty and doubt to define what is true.

Research

Mind the Gap: Deep Learning Doesn't Learn Deeply

This paper investigates how neural networks learn algorithmic reasoning, focusing on graph neural networks (GNNs) and using a technique called neural compilation to compare learned and compiled algorithms. The study aims to understand why neural networks often fail to learn effective algorithms and to characterize the expressability-trainability gaps that hinder learning algorithmic reasoning, with a hypothesis that inductive learning is most effective for parallel algorithms within the computational class NC.

Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

The Darwin Gödel Machine (DGM) is a self-improving AI system that iteratively modifies its own code, using a process inspired by Darwinian evolution to explore new coding agents and improve its capabilities. Through empirical validation and open-ended exploration, the DGM has demonstrated significant improvements in coding performance, outperforming baselines and taking a step towards autonomous AI development with potential for endless innovation.
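
The loop described above (pick an existing agent, let it modify itself, validate the result empirically, keep it for future exploration) can be sketched with a toy archive. This is a minimal sketch under loud assumptions, not the DGM implementation: an "agent" is reduced to a single `skill` number, `self_modify` is a random perturbation standing in for code rewriting, `evaluate` is a hypothetical benchmark, and every child is kept, which may differ from the real system's acceptance rules.

```python
import random


def evaluate(agent):
    """Hypothetical benchmark: the score is the skill clipped to [0, 1]."""
    return max(0.0, min(1.0, agent["skill"]))


def self_modify(parent, rng):
    """Stand-in for an agent rewriting its own code."""
    return {
        "skill": parent["skill"] + rng.gauss(0.0, 0.1),
        "parent": parent["id"],
    }


def run(generations=50, seed=0):
    rng = random.Random(seed)
    archive = [{"id": 0, "skill": 0.2, "parent": None, "score": 0.2}]
    for gen in range(1, generations + 1):
        # Weight parents by score, but never by zero, so weaker agents
        # can still serve as stepping stones (open-ended exploration).
        weights = [a["score"] + 0.05 for a in archive]
        parent = rng.choices(archive, weights=weights, k=1)[0]
        child = self_modify(parent, rng)
        child["id"] = gen
        child["score"] = evaluate(child)  # empirical validation
        archive.append(child)  # assumption: keep every evaluated child
    return archive


archive = run()
best = max(archive, key=lambda a: a["score"])
print(len(archive), round(best["score"], 3))
```

Sampling from the whole archive rather than only from the current best is what separates this from plain hill climbing: a lineage that looks weak today remains available as a branch point later.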

Quaternion formulation claimed to resolve Navier Stokes Millennium Problem

The authors present a framework that uses quaternion-complex numbers to formulate the incompressible Navier-Stokes equations, which they say reveals the underlying geometric structure of viscous fluid motion and resolves the Clay Institute's Millennium Prize problem. They claim the framework proves global regularity and rules out finite-time singularities, so that smooth initial data yields a unique global smooth solution to the three-dimensional incompressible Navier-Stokes equations, and that it provides a rigorous mathematical foundation for understanding turbulence.

Reconstructing the Antikythera Mechanism's Central Front Dial Parts

The design of the Antikythera Mechanism's central front parts was analyzed and reconstructed in bronze, with a focus on the zodiac dial ring and its 365 subdivisions. The reconstruction accurately represented the solar anomaly and unequal time span of astronomical seasons by dividing the zodiac months into unequal subdivisions, resulting in a functional central front dial with minimal hypothetical assumptions.

Attention Is All You Need

The Transformer, a new network architecture based solely on attention mechanisms, achieves superior results in machine translation tasks while being more parallelizable and requiring less training time. The model sets new state-of-the-art scores in English-to-German and English-to-French translation tasks and also generalizes well to English constituency parsing, demonstrating its effectiveness and versatility.
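
The core operation of the architecture is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The sketch below implements that formula in plain Python for a single head with no masking; it is meant to show the arithmetic, not to be an efficient or complete Transformer layer, and the function names are chosen here for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention on lists of row vectors.

    Q: n_queries x d_k, K: n_keys x d_k, V: n_keys x d_v.
    Returns an n_queries x d_v list of lists.
    """
    d_k = len(K[0])
    d_v = len(V[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in K
        ]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the values.
        out.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(d_v)
        ])
    return out


# A zero query scores every key equally, so the output is the mean of V.
print(attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 2.0], [3.0, 4.0]]))  # [[2.0, 3.0]]
```

Each query row is handled independently of the others, which is the property that makes the operation easy to parallelize.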

Code

Show HN: Open-source LLM-powered test automation library for mobile and web

Alumnium is an experimental project that aims to simplify test automation by providing a higher-level abstraction for testing, working with tools like Appium, Playwright, and Selenium. It is currently in the early stages of development and not recommended for production use, but offers features like natural language interactions and robust assertion mechanisms, as demonstrated in its quick start example using Selenium and OpenAI.

Memvid – Video-Based AI Memory

Memvid is a lightweight AI memory solution that encodes text data into videos, enabling fast semantic search and sub-second retrieval times across millions of text chunks. It compresses knowledge bases into compact video files, providing 10x storage efficiency and instant access to information, making it suitable for various applications such as digital libraries, educational content, and corporate knowledge bases.

AgentDesk - Desktops for AI agents

AgentDesk is a platform that provides full-featured desktop environments that can be programmatically controlled by AI agents, allowing for tasks such as launching UI, opening web pages, and taking screenshots. It can be installed and used locally or in the cloud, with features including a REST API, support for Docker and Kubernetes, and image processing capabilities.

llm_poker: A minimal Hold'em environment that manages multiple LLM-based players

The llm_poker environment is a minimal Texas Hold'em setup that allows multiple LLM-based players to compete, managing everything from dealing cards to betting rounds and showdowns. The environment includes features such as blinds, betting, and showdown logic, and can be installed and run using the command line, with options to customize the game, including the number of rounds and starting chip stack.
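
As a small illustration of the bookkeeping such an environment has to do, the sketch below posts blinds at the start of a hand. It is not llm_poker's API: `post_blinds`, its parameters, and the default blind sizes are hypothetical, and it only covers tables of three or more players (heads-up play uses a different blind order).

```python
def post_blinds(stacks, dealer, small_blind=1, big_blind=2):
    """Post blinds for one hand; return (new_stacks, bets, pot).

    stacks: chip counts by seat; dealer: seat index of the button.
    The seat after the dealer posts the small blind, the next seat
    posts the big blind. A short stack posts whatever it has left.
    """
    n = len(stacks)
    if n < 3:
        raise ValueError("this sketch handles three or more players")
    stacks = list(stacks)  # do not mutate the caller's list
    bets = [0] * n
    pot = 0
    sb_seat = (dealer + 1) % n
    bb_seat = (dealer + 2) % n
    for seat, amount in ((sb_seat, small_blind), (bb_seat, big_blind)):
        paid = min(amount, stacks[seat])
        stacks[seat] -= paid
        bets[seat] = paid
        pot += paid
    return stacks, bets, pot


# Dealer in the last seat, so seats 0 and 1 post the blinds.
print(post_blinds([100, 100, 100], dealer=2))
# ([99, 98, 100], [1, 2, 0], 3)
```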

MCP Streamable HTTP – Python and TypeScript Examples

This repository provides example implementations of an MCP Streamable HTTP client and server in Python and TypeScript, demonstrating how to set up a client-server stack that works across languages. The examples include a weather service that uses Anthropic's Claude language model and can be run in either Python or TypeScript, with clients able to communicate with servers written in the other language.
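
MCP messages are JSON-RPC 2.0, and with the Streamable HTTP transport a client POSTs them to a single endpoint while advertising that it accepts either a JSON response or an event stream. The sketch below only builds such a request and does not send it. The URL, the `get_weather` tool name, and its arguments are assumptions made for illustration and are not taken from the repository; a real client would also perform the initialize handshake before calling a tool.

```python
import json
import urllib.request


def build_tool_call(url, tool_name, arguments, request_id=1):
    """Build (but do not send) an HTTP POST carrying a tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Hypothetical endpoint and tool, for illustration only.
req = build_tool_call("http://localhost:8000/mcp", "get_weather",
                      {"city": "Paris"})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Because the wire format is plain JSON over HTTP, a request built this way looks the same to a server regardless of the language it is written in, which is what makes mixed Python and TypeScript stacks possible.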