Wednesday November 26, 2025

A mathematical ceiling may limit generative AI to amateur creativity, a new tool hands off web bugs to AI agents, and a Brain-Computer Interface protects human teams from misleading AI.

News

FLUX.2: Frontier Visual Intelligence

Black Forest Labs has released FLUX.2, an image generation model family designed for professional creative workflows. It features multi-reference support for up to 10 images, image editing at up to 4MP, and significantly improved text rendering and complex prompt adherence. The architecture combines a Mistral-3 24B VLM with a rectified flow transformer and a new VAE for an improved latent space. Following an open-core model, FLUX.2 is available both through production APIs and as open-weight models, including the 32B FLUX.2 [dev] checkpoint on Hugging Face.

Show HN: A WordPress plugin that rewrites image URLs for near-zero-cost delivery

This WordPress plugin leverages Cloudflare R2 and Workers to create an image CDN, rewriting image URLs to serve them from the edge with zero egress fees. It provides a one-click managed service or a free, self-hosted option by deploying the open-source Worker to your own Cloudflare account. The system is designed for delivery, not optimization, and works alongside existing caching and image processing plugins.
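
As a rough illustration of the URL-rewriting idea (not the plugin's actual code, which is PHP running inside WordPress), the sketch below swaps the host of image URLs in an HTML fragment for a CDN host and leaves everything else alone. The hostnames, the extension list, and the function names are all hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of extensions treated as images.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp", ".avif")

def rewrite_image_url(url, origin_host, cdn_host):
    """Point an image URL at the CDN host; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc != origin_host:
        return url
    if not parts.path.lower().endswith(IMAGE_EXTENSIONS):
        return url
    return urlunsplit((parts.scheme, cdn_host, parts.path, parts.query, parts.fragment))

def rewrite_html(html, origin_host, cdn_host):
    """Rewrite every double-quoted src attribute that points at an origin image."""
    pattern = re.compile(r'(src=")([^"]+)(")')
    return pattern.sub(
        lambda m: m.group(1) + rewrite_image_url(m.group(2), origin_host, cdn_host) + m.group(3),
        html,
    )

html = (
    '<img src="https://example.com/wp-content/uploads/a.png">'
    '<script src="https://example.com/app.js"></script>'
)
rewritten = rewrite_html(html, "example.com", "cdn.example.net")
print(rewritten)
```

Only the image URL changes host here; the script URL is untouched, which matches the "delivery, not optimization" scope described above.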

Show HN: We cut RAG latency ~2× by switching embedding model

MyClone.is migrated its RAG pipeline from OpenAI's 1536-dim text-embedding-3-small to Voyage AI's 512-dim Voyage-3.5-lite to address latency and cost bottlenecks. Because the new model is trained with Matryoshka Representation Learning, it maintained retrieval quality despite the lower dimensionality. The switch resulted in a ~66% storage reduction, ~2× faster retrieval, and a 15-20% decrease in end-to-end voice latency, improving both user experience and unit economics for their digital persona product.
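
The storage figure follows directly from the dimension counts, and the Matryoshka idea can be sketched as "keep the leading components, then renormalize". The snippet below is a minimal illustration under those assumptions: the vector is synthetic, the helper names are hypothetical, and in practice an embedding API may return the shorter vector directly rather than requiring client-side truncation.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-renormalize the result."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]

def storage_reduction(old_dim, new_dim):
    """Fraction of per-vector storage saved, assuming equal bytes per component."""
    return 1.0 - new_dim / old_dim

# Synthetic stand-in for a 1536-dim embedding (not real model output).
full = [math.sin(i + 1) for i in range(1536)]
short = truncate_embedding(full, 512)

print(len(short))                              # 512
print(f"{storage_reduction(1536, 512):.1%}")   # 66.7%
```

Going from 1536 to 512 components keeps one third of the numbers, which is where the roughly 66% saving comes from.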

Eggroll: Novel general-purpose machine learning algorithm provides 100x speed

Researchers introduce EGGROLL, a novel Evolution Strategies (ES) algorithm for backprop-free optimization of large models. It uses low-rank matrix perturbations to generate high-rank parameter updates, drastically reducing the computational and memory costs of naïve ES. This method yields a hundredfold increase in training throughput, nearly matching the speed of batched inference and effectively closing the gap between training and inference. Experiments show EGGROLL is competitive with GRPO for LLM reasoning tasks and enables training novel architectures, such as a pure integer RNN without activation functions.
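
To make the "low-rank perturbations, high-rank update" idea concrete, here is a toy sketch of an Evolution Strategies step in which each population member perturbs a weight matrix by a rank-r product of two thin Gaussian matrices, and the update is the fitness-weighted average of those perturbations. This is an illustration under assumed details (the 1/sqrt(r) scaling, mean-fitness baseline, quadratic toy objective, and all hyperparameters are choices made for this example), not the authors' implementation.

```python
import random

def low_rank_perturbation(m, n, r, rng):
    """Sample E = (1/sqrt(r)) * A @ B^T with Gaussian A (m x r) and B (n x r)."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(m)]
    B = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(n)]
    scale = r ** -0.5
    return [[scale * sum(A[j][k] * B[c][k] for k in range(r)) for c in range(n)]
            for j in range(m)]

def loss(W, target):
    """Squared Frobenius distance between W and the target matrix."""
    return sum((w - t) ** 2 for rw, rt in zip(W, target) for w, t in zip(rw, rt))

def es_step(W, target, rng, pop=64, r=1, sigma=0.05, lr=0.05):
    """One backprop-free update: evaluate perturbed copies, then average."""
    m, n = len(W), len(W[0])
    perturbations, fitnesses = [], []
    for _ in range(pop):
        E = low_rank_perturbation(m, n, r, rng)
        candidate = [[W[j][c] + sigma * E[j][c] for c in range(n)] for j in range(m)]
        perturbations.append(E)
        fitnesses.append(-loss(candidate, target))  # higher fitness is better
    baseline = sum(fitnesses) / pop
    step = lr / (pop * sigma)
    # A sum of many rank-r matrices is, in general, of much higher rank than r.
    return [[W[j][c] + step * sum((f - baseline) * E[j][c]
                                  for f, E in zip(fitnesses, perturbations))
             for c in range(n)] for j in range(m)]

rng = random.Random(0)
target = [[(j - c) / 2 for c in range(4)] for j in range(4)]
W = [[0.0] * 4 for _ in range(4)]
initial_loss = loss(W, target)
for _ in range(200):
    W = es_step(W, target, rng)
final_loss = loss(W, target)
print(initial_loss, final_loss)
```

The toy materializes each perturbation matrix for clarity; the efficiency argument in the summary rests on never needing to do so for large layers, since a rank-r perturbation can be applied as two thin matrix products.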

A mathematical ceiling limits generative AI to amateur-level creativity

A theoretical analysis suggests LLMs have a mathematical ceiling on creativity, limiting them to an amateur level. The study posits that the probabilistic next-token prediction mechanism creates a fundamental trade-off between effectiveness and novelty. As a model optimizes for effectiveness by selecting high-probability tokens, it inherently sacrifices novelty. This inverse relationship mathematically caps the combined creativity score, preventing LLMs from simultaneously achieving the high levels of both metrics required for expert-level work under current architectures.
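
One simple way to see how such a cap can arise: suppose, purely as an illustrative assumption not spelled out in the summary above, that effectiveness is the token probability p, novelty is 1 - p, and the creativity score is their product. The product then peaks at p = 0.5 and can never exceed 0.25.

```python
def creativity(p):
    """Toy score: effectiveness (p) times novelty (1 - p). Assumed form."""
    effectiveness = p
    novelty = 1.0 - p
    return effectiveness * novelty

# Evaluate on a coarse grid of probabilities from 0.0 to 1.0.
scores = {i / 10: creativity(i / 10) for i in range(11)}
best_p = max(scores, key=scores.get)

print(best_p, scores[best_p])  # 0.5 0.25
```

Under this assumed form, pushing p toward 1 (safe, likely tokens) or toward 0 (surprising tokens) both drive the score to zero, which is the trade-off the analysis describes.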

Research

Human-AI Decision Making Costs in Synthetic Teams

A collaborative Brain-Computer Interface (cBCI) can protect human-AI teams from catastrophic failure caused by misleading AI feedback under high cognitive load. In a VR surveillance task, a team decision aggregated from pre-response EEG signals maintained 98% accuracy despite AI deception, while a traditional majority vote collapsed to 44%. This resilience is achieved through a neuro-behavioural decoupling, where the BCI learns to interpret neural signatures of effortful deliberation, providing an accurate signal even when the operator's behavior is biased by the faulty AI.

Pre-Cache: A Microarchitectural Solution to Prevent Meltdown and Spectre

This paper presents a microarchitecture-based solution for Meltdown and Spectre that addresses the root vulnerabilities of speculative and out-of-order execution. Unlike inefficient software patches, this approach prevents flushed instructions from exposing data to the cache and other memory structures. The solution is shown to thwart new attack variants and restore secure execution with minimal performance overhead.

3I/Atlas spectrophotometric evidence: metal-bearing, carbonaceous, pristine

Analysis of the interstellar comet 3I/ATLAS indicates it is a primitive, metal-rich carbonaceous object. Its unusual coma and chemical products are attributed to Fischer-Tropsch reactions, triggered by the corrosion of fine-grained metal in the presence of abundant water ice. This process, uncommon in typical Solar System comets, makes 3I/ATLAS a valuable analog for studying pristine Trans-Neptunian and Oort Cloud objects.

NSan: A Floating-Point Numerical Sanitizer

NSan is a new LLVM sanitizer for detecting floating-point numerical issues. It uses compile-time instrumentation to check computations against a higher-precision shadow value at runtime. This approach is 1-4 orders of magnitude faster than existing methods, making it practical for routine use in unit tests and large production applications.
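
The shadow-value idea can be mimicked in a few lines: carry each single-precision value alongside a double-precision twin, apply every operation to both, and compare them when a result is consumed. The sketch below is only a conceptual model (the class name and the relative-error check are hypothetical); the real tool works by instrumenting compiled code, not by wrapping values in objects.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

class Shadowed:
    """A float32 value paired with a float64 shadow of the same computation."""

    def __init__(self, value, shadow=None):
        self.value = f32(value)
        self.shadow = float(value) if shadow is None else shadow

    def __add__(self, other):
        return Shadowed(self.value + other.value, self.shadow + other.shadow)

    def __sub__(self, other):
        return Shadowed(self.value - other.value, self.shadow - other.shadow)

    def relative_error(self):
        """Relative gap between the low-precision value and its shadow."""
        if self.shadow == 0.0:
            return abs(self.value)
        return abs(self.value - self.shadow) / abs(self.shadow)

# Catastrophic cancellation: float32 cannot represent 100000001 exactly.
big = Shadowed(1e8)
one = Shadowed(1.0)
result = (big + one) - big

print(result.value, result.shadow, result.relative_error())  # 0.0 1.0 1.0
```

A checker built on this idea would flag the result because the float32 answer (0.0) disagrees completely with the higher-precision shadow (1.0).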

When Video Detailed Captioners Evolve Themselves via Agentic Self-Reflection

VDC-Agent is a self-evolving framework for video captioning that requires no human annotations or larger teacher models. It employs a closed-loop agent that iteratively generates captions, scores them with principle-guided feedback, and refines its prompts to automatically create a preference dataset from unlabeled videos. An MLLM fine-tuned on this dataset using DPO achieves state-of-the-art performance on the VDC benchmark, surpassing specialized video captioners.

Code

Show HN: We built an open source, zero webhooks payment processor

Flowglad is a developer-focused billing SDK designed to simplify complex, usage-based pricing models for AI applications. It offers a stateless, webhook-free integration that uses your app's existing user IDs as the single source of truth for billing state, eliminating the need to manage separate customer IDs. The full-stack SDK provides simple methods for feature gating and usage metering on both the frontend and backend, streamlining the implementation of tiered plans and credit-based systems.

Show HN: A better way to handoff web bugs to AI agents

FlowLens MCP provides coding agents with full browser context for debugging and regression testing. It uses a Chrome extension to record comprehensive user flows, including network, console, and DOM events, which are then shared with the agent via an MCP server. This allows the LLM to analyze issues directly from the recording, eliminating the need for manual reproduction and saving tokens.

Feedback on an open source Ruby – LLM project

Show HN: Open-Source Email Verifier

Email Verifier is a zero-dependency Node.js library for comprehensive email validation. It uses a pipeline of RFC 5322 format checks, DNS lookups for MX records, and non-intrusive SMTP probing via RCPT TO to determine validity. The library includes advanced features like catch-all detection, mail provider identification, caching, and a built-in token-bucket rate limiter with exponential backoff, ultimately returning a confidence score for each email.
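
For readers unfamiliar with the rate-limiting terms, here is a minimal sketch of a token bucket plus a capped exponential backoff schedule. It is written in Python for illustration, whereas the library itself is Node.js; the class, function names, and parameter values are hypothetical and do not reflect the library's API. The clock is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_sec` tokens/second."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = now

    def try_acquire(self, now):
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def backoff_delays(base, factor, attempts, cap):
    """Delay before each retry: base * factor**i, never exceeding `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]

bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
results = [bucket.try_acquire(t) for t in (0.0, 0.0, 0.0, 1.0)]
delays = backoff_delays(base=0.5, factor=2, attempts=5, cap=4.0)

print(results)  # [True, True, False, True]
print(delays)   # [0.5, 1.0, 2.0, 4.0, 4.0]
```

The third probe is rejected because the burst allowance is spent; one second later a token has refilled, so the fourth succeeds.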

Show HN: I vibe-coded a tool to decode a legacy system nobody understood

CodeCompass is an open-source intelligence platform for analyzing and modernizing legacy codebases. It synthesizes knowledge from code (via AST parsing), databases, and runtime data, storing structured information in PostgreSQL and embeddings in a Weaviate vector database. This enables semantic search and natural language querying to provide deep, system-level context, which can be used to enhance RAG for LLM-based developer tools.
