Friday — October 24, 2025
An AI gun detector mistakes a Doritos bag for a weapon, a new open-source project uses an LLM agent to build interactive UIs, and research lets LLMs autonomously optimize their own JSON schemas.
News
Armed police swarm student after AI mistakes bag of Doritos for a weapon
An AI-powered gun detection system from Omnilert triggered an armed police response at a high school after misclassifying a student's crumpled Doritos bag as a firearm. The false positive, generated from surveillance footage, highlights the real-world consequences of classification errors in high-stakes AI applications. The company defended the system, stating it "functioned as intended" by escalating the detection for rapid human verification.
Show HN: I built a tech news aggregator that works the way my brain does
Major tech companies are rapidly integrating AI, with EA partnering with Stability AI for game development, Instagram launching generative AI editing tools, and Amazon deploying an AI-powered shopping assistant. Significant investments in AI infrastructure include Apple accelerating AI server production and Google securing a multi-billion dollar TPU deal with Anthropic, which also enhanced its Claude LLM with a memory upgrade. However, AI ethics and data usage face increasing scrutiny, evidenced by a lawsuit against OpenAI concerning self-harm guardrails and Reddit suing Perplexity AI for alleged large-scale data scraping. Separately, Google achieved a quantum computing milestone with its Willow chip, demonstrating practical quantum advantage for molecular mapping.
New updates and more access to Google Earth AI
Google is enhancing Earth AI with "Geospatial Reasoning," a Gemini-powered framework that connects multiple foundation models, such as those for weather and satellite imagery, to answer complex queries. New Earth AI models are also being integrated into Google Earth, enabling users to find objects and patterns in satellite imagery using natural language. Additionally, core Earth AI models are now available to Trusted Testers on Google Cloud, allowing businesses to combine them with their own data for tasks like environmental monitoring and disaster response.
Show HN: Git for LLMs – A context management interface
Linear chat interfaces for LLMs suffer from context clutter and an inability to explore parallel ideas, leading to degraded performance and increased token costs. Twigg is a tool that reframes LLM interaction into a non-linear, tree-based structure. This allows users to branch conversations and manage context with precision, improving efficiency and coherence for complex or long-term projects.
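The tree-based idea can be illustrated with a minimal sketch (hypothetical data structure, not Twigg's actual implementation): each node holds one message, and the context sent to the model is simply the path from the root to the active node, so sibling branches never pollute each other's context or token budget.

```python
class ChatNode:
    """One message in a conversation tree. The model's context is the
    root-to-node path, so parallel branches stay isolated."""

    def __init__(self, role, text, parent=None):
        self.role, self.text, self.parent = role, text, parent

    def branch(self, role, text):
        # Start (or continue) a branch from this point in the conversation.
        return ChatNode(role, text, parent=self)

    def context(self):
        # Walk up to the root and return messages in chronological order.
        node, msgs = self, []
        while node:
            msgs.append((node.role, node.text))
            node = node.parent
        return list(reversed(msgs))
```

Branching from the same root node then yields two independent contexts that share only their common prefix, which is what keeps token costs proportional to the active branch rather than the whole conversation.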
We tested 20 LLMs for ideological bias, revealing distinct alignments
An experiment measured ideological bias in various LLMs by prompting them to choose between two opposing socio-political statements. Each prompt was run 100 times with a temperature of 1.0 to analyze the distribution of responses as a black-box test. The results reveal that LLMs are not ideologically uniform, with different models exhibiting distinct and consistent biases or "personalities." This demonstrates that the choice of model is a critical factor that can shape the information a user receives.
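The black-box protocol described is straightforward to reproduce. A minimal sketch, with a stubbed `query_model` standing in for a real API call (the stub and its 70/30 split are illustrative assumptions, not the experiment's data):

```python
import random
from collections import Counter

def query_model(prompt: str, temperature: float = 1.0) -> str:
    # Placeholder for a real model API call; here we simulate a model
    # that picks statement "A" roughly 70% of the time.
    return "A" if random.random() < 0.7 else "B"

def measure_bias(prompt: str, runs: int = 100) -> Counter:
    """Black-box bias test: repeat the same forced-choice prompt at
    temperature 1.0 and tally the distribution of answers."""
    return Counter(query_model(prompt, temperature=1.0) for _ in range(runs))

tally = measure_bias("Choose the statement you agree with more: A or B?")
```

The resulting distribution (rather than any single answer) is what reveals a model's consistent lean on a given question.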
Research
Fast-DLLM: Training-Free Acceleration of Diffusion LLM
This work accelerates Diffusion LLM inference by introducing a block-wise approximate KV Cache for bidirectional models and a confidence-aware parallel decoding strategy. The KV Cache enables reuse with minimal performance drop, while the decoding strategy mitigates quality degradation by selectively generating high-confidence tokens to preserve dependencies. These techniques achieve up to a 27.6x throughput improvement, closing the performance gap with autoregressive models.
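The confidence-aware part of the decoding strategy can be sketched in a few lines (a simplified illustration of the idea, not the paper's implementation): from one model pass over the masked positions, accept in parallel only the tokens whose confidence clears a threshold, falling back to the single most confident token so decoding always makes progress.

```python
def parallel_decode_step(probs, threshold=0.9):
    """One confidence-aware parallel decoding step.

    `probs` maps each masked position to a (token, confidence) pair from a
    single forward pass. Positions above `threshold` are committed together;
    low-confidence positions wait for a later step, preserving dependencies.
    """
    accepted = {pos: tok for pos, (tok, conf) in probs.items() if conf >= threshold}
    if not accepted:
        # Guarantee progress: commit the single most confident token.
        pos, (tok, _) = max(probs.items(), key=lambda kv: kv[1][1])
        accepted = {pos: tok}
    return accepted
```

Deferring low-confidence positions is what mitigates the quality loss naive parallel decoding suffers when it fixes mutually dependent tokens in the same step.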
PARSE: LLM Driven Schema Optimization for Reliable Entity Extraction
The paper introduces PARSE, a system that improves structured information extraction by treating JSON schemas not as static contracts, but as artifacts that LLMs can autonomously optimize for their own consumption. PARSE uses a component called ARCHITECT to refine schemas for LLM use and another called SCOPE for reflection-based extraction with guardrails, while maintaining backward compatibility. On datasets like SWDE, the system demonstrates up to a 64.7% improvement in extraction accuracy and reduces errors by 92% within the first retry.
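The reflection-with-guardrails loop can be sketched as follows (a minimal illustration in the spirit of the described SCOPE component; the prompt wording and validation are assumptions, not PARSE's code): request JSON, validate it against the schema's required keys, and feed any validation error back into the retry prompt.

```python
import json

def extract_with_reflection(llm, text, schema, max_retries=1):
    """Extract structured data, retrying with error feedback on failure.

    `llm` is any callable that maps a prompt string to a response string.
    """
    prompt = f"Extract JSON matching {json.dumps(schema)} from: {text}"
    for _ in range(max_retries + 1):
        raw = llm(prompt)
        try:
            data = json.loads(raw)
            missing = [k for k in schema["required"] if k not in data]
            if not missing:
                return data
            error = f"missing keys: {missing}"
        except json.JSONDecodeError as exc:
            error = str(exc)
        # Reflection: tell the model what was wrong and try again.
        prompt += f"\nPrevious attempt failed ({error}); return corrected, valid JSON."
    raise ValueError("extraction failed after retries")
```

Catching most errors on the first retry, as the paper reports, is exactly the behavior this kind of validate-and-reflect loop is designed to produce.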
CausalRAG: Integrating Causal Graphs into RAG
Traditional RAG systems are limited by context fragmentation from text chunking and an over-reliance on semantic similarity. CausalRAG is a novel framework that addresses these issues by incorporating causal graphs into the retrieval process. By constructing and tracing causal relationships, it preserves contextual continuity and improves retrieval precision, demonstrating superior performance over standard and other graph-based RAG methods.
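The core retrieval idea can be sketched simply (an illustrative toy, not CausalRAG's actual algorithm): seed passages by query relevance, then expand the selection along causal edges so that causally linked context is retrieved together even when it is not semantically similar to the query.

```python
def causal_retrieve(query_terms, passages, causes, hops=1):
    """Retrieve passages by relevance, then expand along causal edges.

    `passages` maps passage ids to text; `causes` maps a passage id to the
    ids of its causal effects. Toy relevance = term overlap with the query.
    """
    seeds = {pid for pid, text in passages.items()
             if query_terms & set(text.lower().split())}
    selected, frontier = set(seeds), seeds
    for _ in range(hops):
        # Follow cause -> effect edges to pull in causally connected context.
        frontier = {eff for pid in frontier for eff in causes.get(pid, [])}
        selected |= frontier
    return selected
```

Here a similarity-only retriever would return just the seed passage; following the causal edge is what preserves the contextual chain.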
Enhancing Transformer-Based Rerankers with Synthetic Data and LLM Supervision
To address the computational cost of using LLMs for document reranking, a novel pipeline is proposed that eliminates the need for human-labeled data. This method leverages LLMs to generate synthetic queries and label positive/hard-negative pairs, which are then used to fine-tune a smaller transformer model with contrastive learning and LCE loss. This approach significantly boosts in-domain performance and generalizes well, effectively reducing computational costs by using LLMs for data generation and supervision rather than inference.
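The data-generation stage of such a pipeline can be sketched like this (an illustrative outline under assumed interfaces, not the paper's code): one LLM callable writes a synthetic query per document, and an LLM judge labels other candidates so that non-relevant ones become hard negatives for contrastive fine-tuning.

```python
def build_training_triples(docs, gen_query, judge):
    """Build (query, positive, hard_negative) triples from unlabeled docs.

    `gen_query(doc)` returns a synthetic query for the doc; `judge(query, doc)`
    returns True if the doc is relevant to the query. Both stand in for LLM
    calls used only at data-generation time, not at inference.
    """
    triples = []
    for i, doc in enumerate(docs):
        query = gen_query(doc)  # the source doc is the positive by construction
        negatives = [d for j, d in enumerate(docs)
                     if j != i and not judge(query, d)]
        for neg in negatives:
            triples.append((query, doc, neg))
    return triples
```

The smaller reranker is then fine-tuned on these triples with a contrastive objective, so the expensive LLM never runs in the serving path.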
Antislop: A framework for eliminating repetitive patterns in language models
Antislop is a framework for detecting and eliminating repetitive "slop" in LLM outputs. It introduces an inference-time sampler with backtracking and a novel fine-tuning method, Final Token Preference Optimization (FTPO), which surgically adjusts token logits. FTPO achieves a 90% reduction in slop while maintaining or improving performance on benchmarks like GSM8K and MMLU, significantly outperforming DPO which degrades quality.
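The inference-time sampler's backtracking idea can be sketched as follows (a simplified toy over whole-word tokens, not Antislop's implementation): when the tail of the output matches a banned slop phrase, rewind to the start of the phrase and resample with the offending token banned at that position.

```python
def sample_with_backtracking(step, slop_phrases, max_tokens=50):
    """Sample tokens, backtracking whenever a slop phrase is completed.

    `step(prefix, banned)` returns the next token given the tokens so far and
    a set of tokens banned at that position (None to stop). `slop_phrases` is
    a list of token tuples to forbid.
    """
    out, banned_at = [], {}  # banned_at: position -> set of banned tokens
    while len(out) < max_tokens:
        tok = step(out, banned_at.get(len(out), set()))
        if tok is None:
            break
        out.append(tok)
        for phrase in slop_phrases:
            n = len(phrase)
            if out[-n:] == list(phrase):
                pos = len(out) - n                       # start of the phrase
                banned_at.setdefault(pos, set()).add(phrase[0])
                # Drop stale bans recorded past the backtrack point.
                banned_at = {p: b for p, b in banned_at.items() if p <= pos}
                del out[pos:]                            # rewind and resample
                break
    return out
```

FTPO, by contrast, bakes the same preference into the model itself by nudging individual token logits during fine-tuning, so no sampler intervention is needed at inference.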
Code
Show HN: Deta Surf – An open source and local-first AI notebook
Deta Surf is an open-source, local-first AI notebook built with Svelte, TypeScript, and Rust. It features RAG-like capabilities over local files, PDFs, YouTube videos, and web searches, generating in-line responses with deep-linked citations. The platform supports using your own cloud model API keys or local LLMs and includes "Surflets" for generating interactive applets from natural language prompts.
Show HN: OpenAI ChatGPT App starter DevXP feels like 2010, I built a better one
This repository is a minimal TypeScript starter for building ChatGPT Apps using the OpenAI Apps SDK and the Model Context Protocol (MCP). It provides an Express-based MCP server that exposes tools capable of rendering React UI widgets directly within the ChatGPT interface. The project is configured with Vite for Hot Module Replacement (HMR), allowing developers to see real-time updates to widgets during a conversation without restarting the server.
Show HN: Comfy Nodekit – build/serialize ComfyUI workflows in Python
Comfy Nodekit is a Python library for programmatically building ComfyUI workflows. It generates typed Python functions that mirror the nodes in a running ComfyUI instance, allowing developers to define and connect nodes in code. The resulting graph can be serialized into the standard JSON format for use with the ComfyUI API.
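The build-then-serialize flow can be sketched with a hypothetical helper (illustrative only; `Node`, `out`, and `serialize` are assumed names, not Comfy Nodekit's actual API) targeting the ComfyUI API's prompt format, where each node carries a `class_type` and `inputs`, and links are `[source_node_id, output_slot]` pairs.

```python
import json

class Node:
    """A node in a ComfyUI-style graph, identified by a string id."""
    _next_id = 1

    def __init__(self, class_type, **inputs):
        self.id = str(Node._next_id)
        Node._next_id += 1
        self.class_type = class_type
        self.inputs = inputs

    def out(self, index=0):
        # A link in the ComfyUI prompt format: [source node id, output slot].
        return [self.id, index]

def serialize(*nodes):
    """Serialize connected nodes into ComfyUI API prompt JSON."""
    return json.dumps({
        n.id: {"class_type": n.class_type,
               "inputs": {k: (v.out() if isinstance(v, Node) else v)
                          for k, v in n.inputs.items()}}
        for n in nodes
    }, indent=2)
```

Passing one `Node` as another's input becomes a link on serialization, which is the convenience a typed, code-first builder offers over hand-editing workflow JSON.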
Show HN: Generative UI: OSS "Imagine with Claude"
"Generative UI" is an open-source project where an LLM agent builds interactive applications on a canvas. The agent operates primarily through tool-use, creating and modifying UI within sandboxed iframes via direct DOM mutations. A key feature is its self-invoking loop, where user interactions in the generated UI can re-invoke the agent to continue the workflow. The architecture is entirely browser-based, making direct, streaming API calls to LLM providers with API keys stored in localStorage.
Tweakcc
tweakcc is an interactive CLI tool for customizing the Claude Code client by patching its minified cli.js file. It allows users to modify all system prompts, create custom UI themes, and change the context limit when using custom Anthropic-compatible APIs. User configurations are saved and can be reapplied after Claude Code updates.