Tuesday November 25, 2025

LLMs are stunned by thousands of invisible Unicode characters, the experimental Pulse-Field O(N) architecture claims to be 12x faster than Transformers, and RL now controls biohybrid robots with living muscle.

News

Shai-Hulud Returns: Over 300 NPM Packages Infected

The list of over 300 infected npm packages reveals a diverse technology stack in which AI/LLM tooling has become a standard dependency. Alongside typical web development and API packages, the compromised set includes the Voiceflow conversational AI platform, prompt engineering utilities, and libraries for interacting with models like Claude. This shows how deeply AI components are now integrated into modern application development, and how far a supply-chain worm like Shai-Hulud can reach.

Show HN: I built an interactive HN Simulator

The text is a generated feed from an AI-powered Hacker News simulator, showcasing the model's ability to mimic the platform's style. The output is a chaotic mix of plausible tech articles, such as a paper on GNNs, alongside satirical, nonsensical, and offensive headlines. This highlights the varied and unfiltered nature of the data used to train such LLMs.

Show HN: Stun LLMs with thousands of invisible Unicode characters

A text obfuscation tool inserts invisible, zero-width Unicode characters between letters, making text readable to humans but incomprehensible to LLMs. This technique effectively disrupts major models like ChatGPT, Gemini, and Meta AI, causing them to crash, error out, or ignore the input. It functions as an anti-plagiarism or anti-scraping method by breaking tokenization and wasting context window tokens.
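The core trick can be sketched in a few lines. This is a minimal illustration of the technique, not the linked tool's actual code; the character set and insertion density are assumptions.

```python
# Interleave zero-width Unicode characters between visible letters.
# The text renders identically to humans but tokenizes very differently.
import random

ZERO_WIDTH = [
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
]

def obfuscate(text: str, density: int = 3) -> str:
    """Insert `density` random zero-width characters after each visible character."""
    out = []
    for ch in text:
        out.append(ch)
        out.extend(random.choices(ZERO_WIDTH, k=density))
    return "".join(out)

sample = obfuscate("hello")
print(len("hello"), len(sample))  # prints: 5 20
```

Because zero-width characters are default-ignorable for rendering but not for tokenizers, a short sentence can balloon into hundreds of tokens, which is why the inflated input also wastes context window.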

Implications of AI to schools

Andrej Karpathy argues that since AI text detection is futile, educators must assume all out-of-class work is AI-assisted. He proposes shifting evaluations to proctored, in-class settings to ensure students learn the fundamentals. The goal is for students to use AI as a powerful tool, analogous to a calculator, while retaining the ability to work without it and to critically verify its fallible outputs, a point underscored by new multimodal models that can solve exam questions from images.

The Bitter Lesson of LLM Extensions

The evolution of LLM customization has oscillated between complex, structured APIs like ChatGPT Plugins and MCP, and simpler, natural language-based methods. The trend is moving towards the latter, with mechanisms like repo-native .cursorrules and "Agent Skills" proving more effective as models become more capable. The current winning strategy involves providing agents with general compute access and high-level instructions in markdown, allowing the LLM to use general-purpose tools to accomplish tasks rather than relying on rigid, pre-defined APIs.

Research

A Long-Tail Professional Forum-Based Benchmark for LLM Evaluation

LPFQA is a new benchmark proposed to address the limitations of existing LLM evaluations, which often overlook long-tail knowledge and real-world complexity. Derived from authentic professional forums across 20 fields, LPFQA introduces fine-grained evaluation dimensions and a hierarchical difficulty structure to test specialized knowledge and reasoning. Evaluations of 12 mainstream LLMs on this benchmark revealed significant performance disparities, particularly in specialized reasoning tasks, demonstrating its value as a more discriminative tool for assessing model capabilities.

Careless Whisper: Exploiting Silent Delivery Receipts to Monitor Users

Researchers demonstrate a significant privacy vulnerability in messaging apps like WhatsApp and Signal that exploits delivery receipt mechanisms. By sending crafted messages at high frequency, an attacker can silently ping a target to infer private information such as online/activity status, active device count, and OS. This side-channel attack requires only the victim's phone number and can also be used for resource exhaustion, such as draining battery or data, without generating any user-side notifications.

Inside VOLT: Designing an Open-Source GPU Compiler

The Vortex-Optimized Lightweight Toolchain (VOLT) is a new compiler framework designed to simplify code generation for emerging open-source GPU architectures. It utilizes a hierarchical design that centralizes core SIMT analyses and optimizations in its middle-end, enabling reusable support for diverse front-end languages and hardware back-ends. This modular approach makes the toolchain highly extensible and easily adaptable to evolving open GPU ISAs and runtime APIs.

Counterfactual World Models via Digital Twin-Conditioned Video Diffusion

This work introduces counterfactual world models to answer "what if" queries, addressing a limitation in traditional forward-simulation models. The proposed CWMDT framework first converts a video into a structured text representation, or digital twin. An LLM then reasons over this representation to predict how a counterfactual intervention propagates through time, and a video diffusion model is conditioned on the modified text to generate the resulting visual sequence, achieving SOTA performance.

RL Control of Exercise-Strengthened Biohybrid Robots in Simulation

This work uses reinforcement learning to control and co-design biohybrid robots with living muscle actuators, which present a challenge as their force output changes with use. An RL agent successfully coordinated 42 muscles on a worm-like robot for targeted steering, demonstrating that adaptive agents outperform non-adaptive ones in rewards and training time. The RL framework also serves as a co-design tool by identifying which muscles are most critical for specific tasks.

Code

AI has a deep understanding of how this code works

(Summary unavailable: the project's README could not be retrieved.)

Show HN: Pulse-Field – O(N) AI Architecture (12x faster than Transformers)

Pulse-Field v4.0 is an experimental O(N) neuro-symbolic architecture that replaces the Transformer's O(N^2) attention with an event-driven routing mechanism integrated with SSMs. This physics-based approach claims to drastically reduce parameters and compute, citing up to 27,000x fewer FLOPs than GPT-2 at a 4k context length. Despite being implemented in unoptimized Python, it reportedly achieves lower latency than an optimized Transformer on CPU for contexts over 2k tokens.
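The asymptotic argument is easy to reproduce with back-of-the-envelope arithmetic. The constants below are illustrative assumptions, not Pulse-Field's real numbers; the point is only that a quadratic attention term dwarfs any linear-time pass as context grows.

```python
# Rough FLOP counts per layer: self-attention scales with N^2, while an
# event-driven / SSM-style pass touches each token a constant number of times.

def attention_flops(n: int, d: int = 768) -> int:
    # QK^T plus attention-weighted V: about 4 * n^2 * d multiply-adds (illustrative)
    return 4 * n * n * d

def linear_routing_flops(n: int, d: int = 768, c: int = 8) -> int:
    # c is an assumed constant cost per token for the linear-time mechanism
    return c * n * d

for n in (512, 2048, 4096, 16384):
    ratio = attention_flops(n) // linear_routing_flops(n)
    print(f"N={n:>6}: attention / linear = {ratio:,}x")
```

With these assumptions the ratio is simply N/2, so the advantage doubles every time the context doubles, which is consistent with the claim that the gap becomes dramatic at 4k tokens and beyond.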

Show HN: I built a CLI tool to map your codebase for LLMs

codemap is a command-line tool that generates a compact, token-efficient "brain map" of a codebase to provide LLMs with instant architectural context. It intelligently clusters files, flattens directories, and ignores noise to create a single, pasteable block for prompts. This eliminates the need to burn tokens on manual project structure explanations or repeatedly paste directory trees.
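The general shape of such a tool is straightforward to sketch. This is a deliberately minimal stand-in, not codemap's implementation: it only groups files by top-level directory and skips common noise, whereas the real tool does smarter clustering and flattening.

```python
# Walk a repo and emit one compact, pasteable summary: one line per
# top-level directory listing the files it contains.
import os
from collections import defaultdict

IGNORE = {".git", "node_modules", "__pycache__", ".venv", "dist"}

def brain_map(root: str) -> str:
    groups = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them
        dirnames[:] = [d for d in dirnames if d not in IGNORE]
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        groups[top].extend(f for f in filenames if not f.startswith("."))
    return "\n".join(
        f"{top}/: {', '.join(sorted(files))}" for top, files in sorted(groups.items())
    )
```

Calling `brain_map(".")` on a project root yields a block of a few hundred tokens instead of a multi-kilobyte directory tree, which is the trade-off the tool is optimizing.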

Show HN: Pg-aiguide – Write better PostgreSQL code with AI

pg-aiguide is a tool that improves PostgreSQL code generation from LLMs by providing version-aware RAG over official documentation and curated best-practice "skills". It integrates with AI coding agents as a public MCP server or a dedicated Claude Code plugin. This approach helps LLMs generate more robust and performant schemas by incorporating modern PG features, better indexing strategies, and proper constraints.

Show HN: DataTalk CLI, Query CSV and Excel in Plain English Using LLM and DuckDB

DataTalk CLI is a tool for querying local CSV, Excel, and Parquet files using natural language from the terminal. It leverages an LLM to generate SQL queries that are executed by an embedded DuckDB engine, ensuring data privacy by only sending the schema to cloud models. The tool integrates with over 100 models via LiteLLM and supports fully offline operation with local Ollama models, while also allowing inspection of the generated SQL for transparency.
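The privacy-preserving split described above hinges on sending the model only a schema, never the rows. A minimal sketch of that step, with illustrative names rather than the tool's real API, looks like this (the generated SQL would then run locally against DuckDB):

```python
# Build the schema-only prompt context for a CSV file: column names plus
# naively inferred types from the first data row. The data itself never
# leaves the machine.
import csv
import io

def infer_type(value: str) -> str:
    for cast, name in ((int, "BIGINT"), (float, "DOUBLE")):
        try:
            cast(value)
            return name
        except ValueError:
            pass
    return "VARCHAR"

def schema_for_prompt(csv_text: str, table: str = "data") -> str:
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    first_row = next(reader, [""] * len(header))
    cols = ", ".join(f"{h} {infer_type(v)}" for h, v in zip(header, first_row))
    return f"Table {table}({cols})"

sample = "name,age,score\nada,36,9.5\n"
print(schema_for_prompt(sample))
# prints: Table data(name VARCHAR, age BIGINT, score DOUBLE)
```

The LLM sees only that one-line description, produces SQL against `data`, and the local engine executes it, which also makes it easy to surface the generated SQL for inspection before running it.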
