Saturday August 16, 2025

Codeberg's Anubis challenge is being solved by AI crawlers, researchers propose the Fairy±i framework for 2-bit complex LLMs, and the Nabu Android app combines Text-to-Speech and chat capabilities with on-device large language models.

News

It seems like the AI crawlers learned how to solve the Anubis challenges

Codeberg suffered a period of extreme slowness under an influx of AI crawlers; after adjusting its protections, performance recovered and the platform kept up with the additional load of new users. Codeberg uses a tool called Anubis, which requires browsers to complete a proof-of-work challenge before accessing the site, to deter AI crawlers, but some crawlers now appear able to solve the challenge, and some users have raised concerns about its compatibility and its impact on legitimate visitors.
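At its core, an Anubis-style gate is a hashcash-style proof of work: the server hands the browser a challenge string and a difficulty, and the browser must find a nonce whose hash meets the target before being let through. A minimal sketch of that scheme, not Anubis's exact protocol or parameters:

```python
import hashlib

def solve(challenge: str, difficulty: int) -> int:
    """Find a nonce such that SHA-256(challenge + nonce) starts with
    `difficulty` zero hex digits (hashcash-style proof of work)."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Verification costs the server only a single hash."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce = solve("example-challenge", 4)
assert verify("example-challenge", nonce, 4)
```

The asymmetry is the point: verifying costs the server one hash while solving costs the client many. A crawler operator willing to spend the compute, as these reports suggest, passes the same check as a human's browser.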

Best Practices for Building Agentic AI Systems

The author has been experimenting with adding AI agents to their feedback platform, UserJot, to analyze customer feedback at scale and auto-generate changelog entries. They found that a two-tier model works best: primary agents handle conversation and context, while subagents perform specific tasks. From this they distilled several principles and patterns for building effective agent systems, including stateless subagents, task decomposition, and structured communication protocols.
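The two-tier pattern can be sketched as a primary agent that decomposes work and dispatches stateless subagents, each receiving a self-contained request and returning a structured result. The subtask names and handlers below are illustrative, not UserJot's actual implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class SubagentRequest:
    task: str     # one narrow, self-contained instruction
    context: str  # everything the subagent needs; it holds no state

@dataclass(frozen=True)
class SubagentResult:
    task: str
    output: str
    ok: bool

def summarize_feedback(req: SubagentRequest) -> SubagentResult:
    # Placeholder for an LLM call; stateless by construction.
    return SubagentResult(req.task, f"summary of: {req.context}", True)

def classify_sentiment(req: SubagentRequest) -> SubagentResult:
    # Placeholder for an LLM call returning a fixed label.
    return SubagentResult(req.task, "positive", True)

SUBAGENTS: dict[str, Callable[[SubagentRequest], SubagentResult]] = {
    "summarize": summarize_feedback,
    "sentiment": classify_sentiment,
}

def primary_agent(feedback: str) -> list[SubagentResult]:
    """Decompose the job into independent subtasks, dispatch each to a
    stateless subagent, and collect structured results."""
    plan = ["summarize", "sentiment"]
    return [SUBAGENTS[t](SubagentRequest(t, feedback)) for t in plan]
```

Because each subagent sees only its request, runs can be retried or parallelized freely, and the primary agent stays the single owner of conversation state.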

Using AI to secure AI

Anthropic's Claude Code security-review feature uses a specialized prompt to identify and fix security issues in code, but its effectiveness is limited, and it should be combined with other security measures such as human code review and testing. The author tested the feature on their own code and found it useful but not foolproof, recommending it as one part of a larger security toolkit rather than the sole guarantee of code security.

AI-induced dehumanization (2024)

Recent technological advancements have enabled nonhuman objects, such as virtual assistants and robots, to emulate human intelligence and behavior, potentially blurring the lines between human and AI. This shift may lead to a phenomenon where people perceive autonomous agents as more humanlike, but in doing so, they may also perceive actual humans as less human, a process known as dehumanization, which can have significant consequences for how people interact with and treat each other.

Some users report their Firefox browser is scoffing CPU power

Some Firefox users are reporting that the browser is consuming excessive CPU power, which appears to be caused by an "inference engine" built into recent versions of Firefox, likely related to the browser's AI-enhanced tab groups feature. Mozilla has acknowledged a performance bug and is working to improve the feature, which processes information privately on-device.

Research

iFairy: The First 2-bit Complex LLM with All Parameters in {±1, ±i}

Researchers propose Fairy±i, a 2-bit quantization framework for complex-valued large language models (LLMs) that surpasses the accuracy ceiling of existing methods by leveraging the complex domain to boost full-precision accuracy. The framework achieves state-of-the-art results in terms of perplexity and downstream tasks while maintaining strict storage and compute efficiency, opening a new direction for building highly accurate LLMs under low-bit constraints.
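The storage idea is that each complex weight fits in 2 bits by snapping it to the nearest element of {+1, −1, +i, −i} times a shared scale. A toy round-to-nearest sketch of that representation (the paper's actual quantization and training procedure is more involved):

```python
CODEBOOK = [1 + 0j, -1 + 0j, 0 + 1j, 0 - 1j]  # the four 2-bit codes {±1, ±i}

def quantize(weights: list[complex]) -> tuple[list[int], float]:
    """Round each complex weight to the nearest codebook entry.
    Returns 2-bit codes plus one per-tensor scale (mean magnitude)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    codes = [min(range(4), key=lambda k: abs(w / scale - CODEBOOK[k]))
             for w in weights]
    return codes, scale

def dequantize(codes: list[int], scale: float) -> list[complex]:
    """Reconstruct approximate complex weights from codes and scale."""
    return [scale * CODEBOOK[k] for k in codes]

codes, scale = quantize([0.9 + 0.1j, -1.1 + 0j, 0.05 + 0.8j, 0.1 - 1.2j])
```

Since every stored value is a unit on an axis, multiplying by a quantized weight reduces to sign flips and real/imaginary swaps, which is where the compute efficiency comes from.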

SiLQ: Simple Large Language Model Quantization-Aware Training

Large language models can be quantized to reduce memory and compute costs, but this typically comes with a loss of accuracy. A new quantization-aware training approach achieves state-of-the-art results with minimal accuracy loss and can be applied to various model architectures without requiring significant changes.
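The generic mechanism behind quantization-aware training is "fake quantization": the forward pass rounds values to the low-bit grid so the model trains against its own quantization error, while the backward pass treats the rounding as identity (the straight-through estimator). A minimal per-value sketch of the forward step, not SiLQ's specific recipe:

```python
def fake_quantize(x: float, bits: int, scale: float) -> float:
    """Forward pass: round x to a signed `bits`-bit grid and dequantize.
    During QAT the backward pass treats this as the identity function
    (the straight-through estimator)."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax - 1, min(qmax, round(x / scale)))
    return q * scale

# 4-bit grid with scale 0.1: values snap to multiples of 0.1 in [-0.8, 0.7]
print(fake_quantize(0.234, 4, 0.1))  # 0.2
```

Because the weights experience quantization during training rather than only afterward, the optimizer can steer them toward grid points, which is why QAT typically loses less accuracy than post-training quantization.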

Distillation Scaling Laws

A proposed distillation scaling law enables optimal allocation of compute budget between teacher and student models to maximize student performance, mitigating risks in large-scale distillation. The law provides guidelines for when distillation outperforms supervised learning, depending on factors such as the number of students and whether a teacher model already exists or needs to be trained.
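To make the budget-allocation question concrete, here is a toy illustration of the decision a distillation scaling law informs: given a fixed compute budget, how much to spend training the teacher versus distilling the student. The power-law constants below are invented for illustration and are not the paper's fitted law:

```python
def student_loss(student_compute: float, teacher_compute: float) -> float:
    """Toy stand-in for a distillation scaling law: student loss falls as
    a power law in its own compute, floored by teacher quality. The
    exponents and constants here are made up; the paper fits the real
    functional form to data."""
    teacher_loss = 3.0 * teacher_compute ** -0.1
    return teacher_loss + 2.0 * student_compute ** -0.2

def best_split(total: float, steps: int = 1000) -> float:
    """Grid-search the fraction of a fixed budget spent on the teacher
    that minimizes the (toy) student loss; the paper's law makes this
    allocation analytic instead."""
    fracs = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(fracs, key=lambda tf: student_loss(total * (1 - tf),
                                                  total * tf))
```

If a capable teacher already exists, its training compute is sunk and the comparison reduces to whether distillation beats supervised training at the same student budget, which is the regime the paper's guidelines address.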

D2F – We made dLLMs 2.5x faster than LLaMA3

This paper introduces discrete diffusion forcing (D2F), a strategy that enables diffusion large language models (dLLMs) to achieve faster inference than autoregressive language models of similar size. With D2F, dLLMs reach over 2.5 times the inference speed of comparable LLaMA3 models and up to 50 times that of vanilla dLLMs, while maintaining comparable output quality.

Towards Memory Specialization: A Case for Long-Term and Short-Term RAM

Memory technologies like SRAM and DRAM have stopped improving in terms of cost reduction, leading to memory dominating system costs. To address this, the paper proposes a shift towards specialized memory architectures, including new memory classes like long-term RAM and short-term RAM, which can be optimized for specific application workloads and integrated into future system designs.

Code

Show HN: Nabu (TTS Reader and LLM Playground on Android)

Nabu is an advanced Android app that combines Text-to-Speech (TTS) and chat capabilities, built upon the foundation of the Kokoro-82M Android demo, with features such as dynamic model management, multi-engine TTS support, and an advanced audio book reader. The app allows users to chat with on-device large language models, manage different chat models, and customize their TTS experience with various voice characteristics and settings.

Show HN: Orca – AI Game Engine

The Orca Engine is a modified version of the Godot Engine that integrates a chatbot with complete access to Godot, allowing for advanced project management, scene and node manipulation, script generation, and more. To use the Orca Engine, users must both build the Godot editor and set up the AI backend server, following platform-specific setup instructions for macOS, Windows, or Linux.

Show HN: I love ChatGPT Memory, so I built one

(No summary available: the project's README could not be retrieved.)

Show HN: Agentic Sync – AI-Native Task Management Platform

Agentic Sync is a task management platform designed for developers and AI agents to collaborate, featuring a Getting Things Done (GTD) implementation with built-in AI agent integration and a user-friendly interface. The platform offers various deployment options, including web, desktop, and local database, and provides features such as instant UI feedback, task management, initiative tracking, and MongoDB integration.

Show HN: Open-Source Character.ai Alternative with Ollama and Groq API Support

ChaiChat is a simple chat client for Ollama and Groq API, built with Electron, Vite, React, and SQLite, allowing users to chat with AI models. The app is available for download on various platforms, including Windows, macOS, and Linux, with platform-specific installation files provided.
