Saturday July 12, 2025

ETH Zurich and EPFL are set to release an LLM developed on public infrastructure, a new LLM Inference Handbook provides comprehensive guidance for engineers, and researchers have introduced dynamic chunking for end-to-end hierarchical sequence modeling.

News

ETH Zurich and EPFL to release an LLM developed on public infrastructure

ETH Zurich and EPFL plan to release a large language model developed on public infrastructure. The source page yielded only the ETH Zurich site navigation (news, education, research, and similar menu items), so no article details are available to summarize.

LLM Inference Handbook

This handbook provides comprehensive guidance on LLM inference in production, covering core concepts, performance metrics, optimization techniques, and operational best practices to help engineers deploy, scale, and operate LLMs efficiently. It is a continuously updated resource that brings together fragmented knowledge on LLM inference for engineers seeking to improve the speed, cost, and reliability of their LLM deployments.

Recovering from AI addiction

Internet and Technology Addicts Anonymous (ITAA) is a Twelve-Step fellowship that supports individuals recovering from internet and technology addiction, including AI addiction, the compulsive and harmful use of AI-powered applications. AI addiction, a subset of internet addiction disorder, can lead to changes in the brain that impair decision-making, cognitive function, and emotional processing, and it is associated with various mental and physical health problems, including anxiety, depression, and increased risk of cardiometabolic disease and mortality.

AI agent benchmarks are broken

Current AI agent benchmarks are often unreliable, with 8 out of 10 popular benchmarks found to have severe issues, leading to misestimation of agents' capabilities by up to 100%. To address this, researchers have developed the AI Agent Benchmark Checklist (ABC), a 43-item checklist to help build more rigorous and trustworthy benchmarks that accurately evaluate AI agents' abilities.

Show HN: RULER – Easily apply RL to any agent

RULER (Relative Universal LLM-Elicited Rewards) is a new general-purpose reward function that simplifies the process of adapting reinforcement learning (RL) to new tasks, eliminating the need for labeled data, hand-crafted reward functions, and human feedback. RULER has been shown to outperform traditional methods in four realistic agentic applications, and its implementation is fully open-sourced as part of the ART agent-training framework.
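The relative-scoring idea can be sketched in a few lines. This is a minimal illustration, not RULER's or ART's actual API: a judge scores several trajectories for the same task against each other, and the scores are then centered within the group so that only relative quality matters. The within-group normalization is an assumption borrowed from group-relative RL methods, and `relative_rewards` and `toy_judge` are hypothetical names; in practice the judge would be an LLM call.

```python
from statistics import mean, pstdev
from typing import Callable, List


def relative_rewards(
    trajectories: List[str],
    judge: Callable[[List[str]], List[float]],
) -> List[float]:
    """Score a group of trajectories jointly, then normalize within the group."""
    scores = judge(trajectories)  # one score per trajectory, e.g. in [0, 1]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:  # all trajectories judged equal: no learning signal
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def toy_judge(trajectories: List[str]) -> List[float]:
    """Stand-in for an LLM judge (assumption): prefers shorter answers."""
    longest = max(len(t) for t in trajectories)
    return [1.0 - len(t) / (longest + 1) for t in trajectories]


rewards = relative_rewards(
    ["short", "a much longer answer", "medium one"], toy_judge
)
```

Because the rewards are centered on the group mean, no labeled data or absolute scale is needed; the judge only has to rank attempts at the same task consistently.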

Research

Human-Like Forgetting Curves in Deep Neural Networks

Researchers have developed a framework to measure information retention in neural networks, finding that they exhibit human-like forgetting curves, with knowledge becoming more robust through scheduled reviews. This discovery suggests that neural networks naturally emulate human memory decay, and can inform the development of continual learning algorithms to mitigate forgetting and improve training efficiency.

WatchWitch: Interoperability, Privacy, and Autonomy for the Apple Watch

Researchers have reverse-engineered the Apple Watch's wireless protocols, discovering security issues and creating a custom Android reimplementation called WatchWitch that allows for interoperability and enhanced privacy controls. This breakthrough enables users to break free from Apple's closed ecosystem, providing more consumer choice and control over their smartwatch data and devices.

Global Warming in the Pipeline

The Earth's climate sensitivity is estimated at around 1.2°C per W/m², which suggests that today's greenhouse gas levels could lead to global warming of 10°C, reduced to 8°C by aerosols. To avoid catastrophic consequences, including piercing the 1.5°C and 2°C ceilings in the near future, a drastic change in approach is needed, including a global price on emissions, international cooperation, and intervention in Earth's radiation imbalance to reduce human-made climate transformation.

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

Researchers have introduced a new technique called dynamic chunking, which enables language models to learn content- and context-dependent segmentation strategies, allowing for true end-to-end models that eliminate the need for pre-processing steps like tokenization. The resulting hierarchical network (H-Net) outperforms traditional token-based models, demonstrating improved performance, robustness, and data efficiency, particularly in languages and modalities with weaker tokenization heuristics.
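As an illustration of content-dependent segmentation, the sketch below starts a new chunk wherever adjacent vectors are dissimilar. This is a simplified assumption about how a learned boundary predictor might behave, not the H-Net implementation: the cosine-based score, the 0.5 threshold, and the names `boundary_probs` and `chunk` are illustrative, and the toy vectors stand in for learned representations.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def boundary_probs(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Map each position to a boundary score in [0, 1]."""
    probs = [1.0]  # the first position always opens a chunk
    for prev, cur in zip(vectors, vectors[1:]):
        # Dissimilar neighbours (cosine near -1) give a score near 1.
        probs.append(0.5 * (1.0 - cosine(prev, cur)))
    return probs


def chunk(items: Sequence[str], vectors: Sequence[Sequence[float]],
          threshold: float = 0.5) -> List[List[str]]:
    """Group items into chunks, splitting where the score crosses the threshold."""
    chunks: List[List[str]] = []
    for item, p in zip(items, boundary_probs(vectors)):
        if p >= threshold or not chunks:
            chunks.append([item])
        else:
            chunks[-1].append(item)
    return chunks


items = list("abcd")
vectors = [(1.0, 0.0), (0.9, 0.1), (-1.0, 0.0), (-0.9, -0.1)]
chunks = chunk(items, vectors)  # [['a', 'b'], ['c', 'd']]
```

Here the split falls between "b" and "c" because their vectors point in nearly opposite directions; nothing about the boundary is fixed in advance by a tokenizer.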

Potential Danger to Satellites from a 2032 Lunar Impact by Asteroid 2024 YR4

The asteroid 2024 YR4 has a 4% chance of impacting the Moon on December 22, 2032, which could release a significant amount of energy and eject lunar material into space. If the asteroid does impact the Moon, some of the ejected material could accrete to Earth and pose a risk to satellites in near-Earth space for up to a decade after the impact.

Code

Show HN: Vibe Kanban – Kanban board to manage your AI coding agents

Vibe Kanban is a tool that streamlines the process of working with AI coding agents, such as Claude Code and Codex, by enabling easy switching, orchestration, and review of tasks. It can be installed by running npx vibe-kanban in the terminal, and its documentation and user guides are available on the Vibe Kanban website.

Show HN: An Improvisational Web Server

Ginprov is an improvisational web server that generates HTML pages and images in real-time using Google's Gemini AI, allowing any URL path to become a content prompt. It can be run locally for free by obtaining a Gemini API key and following the provided installation instructions, with the project licensed under the MIT License.

Convert Pixel-Art-Style Images from GPT-4o into Usable Assets

Proper Pixel Art is a tool that converts noisy, high-resolution pixel-art-style images into true pixel resolution assets, working with images generated by LLMs like GPT-4o or screenshots of pixel art assets. The tool uses a multi-step algorithm involving edge detection, morphological closing, Hough transform, and color quantization to recover the original pixel art with "true" resolution and colors.
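The final recovery step can be sketched in plain Python under strong simplifying assumptions: the pixel grid is already known (here a fixed `cell` size, which the tool's edge-detection and Hough-transform steps presumably serve to estimate), and color quantization is reduced to a majority vote within each cell. The function `downsample_by_majority` is hypothetical and not part of Proper Pixel Art.

```python
from collections import Counter
from typing import List, Tuple

Color = Tuple[int, int, int]


def downsample_by_majority(image: List[List[Color]], cell: int) -> List[List[Color]]:
    """Collapse each cell x cell block to its most common color."""
    rows = len(image) // cell
    cols = len(image[0]) // cell
    out: List[List[Color]] = []
    for r in range(rows):
        out_row: List[Color] = []
        for c in range(cols):
            # Gather every pixel that falls inside this grid cell.
            block = [
                image[y][x]
                for y in range(r * cell, (r + 1) * cell)
                for x in range(c * cell, (c + 1) * cell)
            ]
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out


RED, BLUE, NOISE = (255, 0, 0), (0, 0, 255), (250, 5, 5)
noisy = [
    [RED, RED, BLUE, BLUE],
    [RED, NOISE, BLUE, BLUE],
]
pixels = downsample_by_majority(noisy, cell=2)  # [[RED, BLUE]]
```

The off-color pixel is outvoted by its neighbours, so the 4x2 input collapses to a clean 2x1 asset with one color per true pixel.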

Show HN: Built emergent AI reasoning in 9 hours

Cogency is a Python library for building multi-step reasoning agents with minimal configuration and setup. It provides a simple, extensible way to create agents that can perform tasks such as calculations and web searches, and includes features like auto-discovery and clean tracing, with Python supported today and JavaScript support upcoming.

Show HN: OS Yamato – A gentle digital OS inspired by wabi-sabi

OS Yamato is an operating system that embodies a philosophy of impermanence, care, and simplicity, where digital memories and data are designed to fade and disappear over time, encouraging users to focus on meaningful moments. The system features a range of tools, including a diary, chat, and photo storage, whose contents are designed to "wilt" and eventually be deleted after a year if left untouched, promoting a mindful and seasonal approach to digital interaction.