Wednesday — November 5, 2025
Amazon demands Perplexity stop its AI agent from making purchases, a new platform uses multi-model consensus to read MRIs, and Cache-to-Cache enables direct semantic communication between LLMs.
News
Codemaps: Understand Code, Before You Vibe It
Cognition's Windsurf Codemaps is a new tool that addresses the problem of AI coding assistants reducing developer understanding. It generates AI-annotated, structured maps of a codebase in response to a task-specific prompt, using models like SWE-1.5 and Claude Sonnet 4.5. This is designed to accelerate a developer's ability to build a mental model for complex tasks like debugging or onboarding. The resulting codemaps can also be used as highly specific, human-readable context to improve the performance of other AI agents.
Server DRAM prices surge 50% as AI-induced memory shortage hits hyperscalers
The massive demand for AI hardware is causing a severe server DRAM shortage, with prices surging up to 50% and order fulfillment for hyperscalers dropping to 70%. Manufacturers are reallocating production capacity from conventional DDR5 to AI-specific memory like HBM to meet demand. This supply constraint is expected to persist into 2026, with smaller buyers facing even lower fulfillment rates.
YouTube AI error costs creator his channel over alleged link to Japanese account
A glitch in YouTube's AI moderation system caused the termination of multiple creator channels, including the popular tech channel Enderman. The automated system incorrectly linked these channels to an unrelated Japanese account that had received copyright strikes, triggering the terminations. YouTube later acknowledged the error and reinstated the falsely terminated accounts.
Lessons from interviews on deploying AI Agents in production
A survey of agentic AI startups reveals that the primary obstacles to enterprise deployment are non-technical, focusing on workflow integration, the human-agent interface, and employee trust. Successful strategies involve a "Think Small" approach, targeting low-risk, easily verifiable tasks with clear ROI and positioning agents as co-pilots. The ecosystem remains nascent, with over half of startups building their agentic infrastructure in-house. While accuracy is generally high, enterprises often limit agent autonomy, favoring human-in-the-loop configurations.
Amazon Demands Perplexity Stop AI Agent from Making Purchases
Amazon is suing Perplexity AI to block its AI browser agent, Comet, from making purchases on its e-commerce platform. The lawsuit alleges computer fraud and a violation of Amazon's terms of service, claiming the agent fails to disclose it is an automated system acting on a user's behalf. This legal challenge highlights a significant conflict over the operational boundaries of agentic AI interacting with established web services.
Research
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity
This work identifies "typicality bias" in preference data—a cognitive tendency for human annotators to favor familiar text—as a fundamental driver of mode collapse in aligned LLMs. To circumvent this, the authors introduce Verbalized Sampling (VS), a training-free prompting strategy that asks the model to generate a set of responses and their corresponding probabilities. Experiments show VS significantly improves diversity and performance across creative and open-ended tasks without sacrificing factual accuracy or safety, with more capable models showing greater benefit.
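A minimal sketch of the prompting pattern, assuming a generic `call_llm` callable that returns the model's text; the paper's exact template may differ, but the idea is to ask for a verbalized distribution of candidates and sample from it:

```python
# Sketch of Verbalized Sampling: instead of asking for one answer, the prompt
# asks the model to verbalize several candidate responses with probabilities,
# and one is then drawn from that set. `call_llm` is a placeholder for any
# chat-completion client that returns a text string.
import json
import random

def verbalized_sample(call_llm, task: str, k: int = 5) -> str:
    prompt = (
        f"{task}\n\n"
        f"Generate {k} different responses to the request above. "
        "Return a JSON list of objects with fields 'response' and 'probability', "
        "where the probabilities reflect how likely each response would be and sum to 1."
    )
    candidates = json.loads(call_llm(prompt))  # [{"response": ..., "probability": ...}, ...]
    weights = [float(c["probability"]) for c in candidates]
    chosen = random.choices(candidates, weights=weights, k=1)[0]
    return chosen["response"]

# Usage: verbalized_sample(my_client, "Write an opening line for a story about rain.")
```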
Google Quantum AI revived a decades-old concept known as quantum money
This paper introduces a single-use quantum money construction that addresses the privacy and practicality limitations of existing schemes. It features a user-auditable procedure to detect tracking by the issuing authority and employs classical bill validation, removing the need for long-term quantum memory or communication. The protocol has potential applications beyond currency, including anonymous one-time pads and voting.
Cache-to-Cache: Direct Semantic Communication Between Large Language Models
This work introduces Cache-to-Cache (C2C), a paradigm for multi-LLM systems that bypasses slow, lossy text-based communication. C2C enables direct semantic transfer by using a neural network to project and fuse the KV-cache from a source model into a target model's layers. This method outperforms traditional text communication by 3-5% in accuracy and achieves an average 2x reduction in latency by avoiding intermediate token generation.
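An illustrative PyTorch sketch of the core idea, not the authors' implementation: a learned projector maps a source model's keys and values into the target model's KV space and fuses them with a gate; matching sequence lengths and hidden sizes are assumed for simplicity:

```python
# Toy C2C-style fusion module: project source KV-cache into the target's
# representation space, then gate how much of it to inject per dimension.
import torch
import torch.nn as nn

class KVProjector(nn.Module):
    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        self.proj_k = nn.Linear(src_dim, tgt_dim)
        self.proj_v = nn.Linear(src_dim, tgt_dim)
        self.gate = nn.Sequential(nn.Linear(2 * tgt_dim, tgt_dim), nn.Sigmoid())

    def forward(self, src_k, src_v, tgt_k, tgt_v):
        # Project source keys/values into the target model's KV space.
        k_hat, v_hat = self.proj_k(src_k), self.proj_v(src_v)
        # Gated fusion between the projected cache and the target's own cache.
        g = self.gate(torch.cat([tgt_k, k_hat], dim=-1))
        fused_k = g * k_hat + (1 - g) * tgt_k
        fused_v = g * v_hat + (1 - g) * tgt_v
        return fused_k, fused_v
```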
The Physics of News, Rumors, and Opinions
This review proposes a statistical physics framework for analyzing complex dynamics in modern socio-technological systems. It covers foundational concepts like epidemic and spin models on complex networks to model the collective dynamics of information spreading and opinion formation. The framework is applied to understand phenomena such as misinformation cascades and polarization, bridging theoretical models with empirical data analysis.
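As a toy illustration of the class of models the review surveys, the sketch below runs an SIR-style rumor-spreading process on a scale-free network with networkx; all parameters are invented:

```python
# Toy rumor-spreading dynamics: S = unaware, I = spreading, R = stifler.
import random
import networkx as nx

def simulate_rumor(n=1000, m=6, beta=0.2, gamma=0.1, steps=50, seed=0):
    rng = random.Random(seed)
    g = nx.barabasi_albert_graph(n, m, seed=seed)
    state = {v: "S" for v in g}
    state[0] = "I"                              # seed the rumor at one node
    history = []
    for _ in range(steps):
        new_state = dict(state)
        for v in g:
            if state[v] == "I":
                for u in g.neighbors(v):
                    if state[u] == "S" and rng.random() < beta:
                        new_state[u] = "I"      # neighbor hears the rumor
                if rng.random() < gamma:
                    new_state[v] = "R"          # spreader loses interest
        state = new_state
        history.append(sum(s == "I" for s in state.values()))
    return history                              # number of active spreaders per step
```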
Continuous Autoregressive Language Models
Continuous Autoregressive Language Models (CALM) are introduced to overcome the token-by-token generation bottleneck in LLMs. Instead of predicting the next discrete token, CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector, enabling next-vector prediction. This paradigm, supported by a novel likelihood-free framework, reduces the number of generative steps by a factor of K and demonstrates a significantly improved performance-compute trade-off over strong discrete baselines.
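A schematic PyTorch sketch of the setup with invented dimensions, omitting the paper's likelihood-free training objective: an autoencoder maps each chunk of K token embeddings to one continuous vector, and an autoregressive backbone predicts the next chunk vector:

```python
# Schematic CALM-style components (not the paper's code).
import torch
import torch.nn as nn

K, VOCAB, D_TOK, D_VEC = 4, 32000, 256, 512

class ChunkAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_TOK)
        self.enc = nn.Linear(K * D_TOK, D_VEC)      # K tokens -> 1 continuous vector
        self.dec = nn.Linear(D_VEC, K * VOCAB)      # 1 vector -> K token logits

    def encode(self, token_ids):                    # (batch, K)
        x = self.embed(token_ids).flatten(1)
        return self.enc(x)                          # (batch, D_VEC)

    def decode(self, z):
        return self.dec(z).view(-1, K, VOCAB)       # (batch, K, VOCAB)

class NextVectorModel(nn.Module):
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(D_VEC, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(D_VEC, D_VEC)

    def forward(self, chunk_vectors):               # (batch, T, D_VEC)
        h = self.backbone(chunk_vectors)            # causal masking omitted for brevity
        return self.head(h[:, -1])                  # predicted next chunk vector
```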
Code
Show HN: Concierge - Framework for Building Agentic Web Interfaces
Concierge is a declarative framework for exposing services to LLM agents. It allows developers to define structured workflows as a graph of stages, tasks, and valid transitions, effectively creating a state machine to guide agent interactions. This enforces rules and prerequisites for complex, multi-step processes while maintaining a persistent state.
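Concierge's actual API is not reproduced here; the sketch below illustrates the underlying idea of declaring stages, allowed tasks, and valid transitions, then enforcing them as a state machine around agent actions:

```python
# Minimal state-machine sketch of a declarative agent workflow (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class Workflow:
    stages: dict                      # stage name -> set of allowed task names
    transitions: dict                 # stage name -> set of stages reachable next
    state: str = "start"
    history: list = field(default_factory=list)

    def run_task(self, task: str):
        if task not in self.stages[self.state]:
            raise ValueError(f"task '{task}' not allowed in stage '{self.state}'")
        self.history.append((self.state, task))

    def advance(self, next_stage: str):
        if next_stage not in self.transitions[self.state]:
            raise ValueError(f"cannot move from '{self.state}' to '{next_stage}'")
        self.state = next_stage

# Example: a checkout flow the agent must follow in order.
flow = Workflow(
    stages={"start": {"search"}, "cart": {"add_item", "remove_item"}, "checkout": {"pay"}},
    transitions={"start": {"cart"}, "cart": {"checkout"}, "checkout": set()},
)
flow.run_task("search"); flow.advance("cart"); flow.run_task("add_item")
```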
Fantasy – Build AI agents with Go. Multi-provider, multi-model, one API
Fantasy is a new Go library from Charm for building AI agents behind a unified, multi-provider, multi-model API. It supports dedicated providers such as OpenRouter, includes a generic layer for OpenAI-compatible services, lets developers equip agents with custom tools, and compiles to native machine code. The project is currently in preview and does not yet support multi-modal models, file uploads, or built-in provider tools.
Show HN: DeepShot – NBA game predictor with 70% accuracy using ML and stats
DeepShot is an open-source NBA game predictor that uses an XGBoost model trained on scraped historical data. Its key technical feature is the use of exponentially weighted moving averages (EWMA) for feature engineering to capture recent team momentum. The project includes a local web application built with NiceGUI for visualizing predictions and key statistical differences.
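A hedged sketch of that feature-engineering step with invented column names; DeepShot's real pipeline scrapes and prepares its own data:

```python
# EWMA "momentum" features per team, then a gradient-boosted classifier.
import pandas as pd
from xgboost import XGBClassifier

def ewma_features(games: pd.DataFrame, span: int = 10) -> pd.DataFrame:
    # games: one row per team per game, ordered by date, with raw box-score stats.
    stats = ["points", "rebounds", "assists", "turnovers"]
    rolling = (
        games.sort_values("date")
             .groupby("team")[stats]
             .transform(lambda s: s.ewm(span=span).mean().shift(1))  # exclude current game
    )
    return rolling.add_prefix("ewma_")

# Training sketch: X holds home-minus-away EWMA differences, y is 1 if the home team won.
# model = XGBClassifier(n_estimators=400, max_depth=4, learning_rate=0.05)
# model.fit(X_train, y_train)
```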
Show HN: ReadMyMRI DICOM native preprocessor with multi model consensus/ML pipes
ReadMyMRI is a medical imaging platform featuring a multi-agent AI system that uses a consensus mechanism between models like GPT-4V and Claude 3 to improve analysis accuracy and reduce false positives. Built on a high-performance FastAPI streaming architecture, it is designed to be resistant to protocol mismatches so it can robustly handle real-world DICOM data. The system also provides HIPAA-compliant PHI removal and generates structured, radiology-grade reports from the multi-agent analysis.
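The sketch below shows a simple voting-style consensus step between several vision-language model wrappers (placeholders, not ReadMyMRI's code), keeping only findings that multiple models agree on:

```python
# Toy consensus step: each model wrapper returns normalized finding labels,
# and only findings with at least `min_votes` agreeing models survive,
# which is the mechanism used to suppress single-model false positives.
from collections import Counter

def consensus_findings(image_bytes: bytes, analyze_fns, min_votes: int = 2):
    votes = Counter()
    for analyze in analyze_fns:                 # e.g. wrappers around GPT-4V, Claude 3, ...
        for finding in analyze(image_bytes):
            votes[finding] += 1
    return [f for f, n in votes.items() if n >= min_votes]
```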