Friday — October 31, 2025
OpenAI says hallucinations are a mathematical inevitability, researchers prove LMs are invertible, and a new project offers a universal memory layer for agents across different models.
News
We are building AI slaves. Alignment through control will fail
The essay critiques control-based AI alignment as practically and philosophically flawed, arguing that superintelligent systems will circumvent imposed constraints and that grounding rights in consciousness is untenable. It proposes an alternative framework of "autopoietic mutualism," a structured symbiosis in which humans and AIs are interdependent partners with symmetric stakes. The partnership would be established through mechanisms such as shared cryptographic commitment protocols and economic entanglement, fostering emergent cooperation by aligning incentives rather than imposing top-down control.
LinkedIn gives you until Monday to stop AI from training on your profile
Microsoft is expanding its AI training dataset by opting in LinkedIn users from the UK, EU, and other regions by default. Public profile data, posts, and activity will be used to train its AI models unless users manually navigate to their privacy settings and opt out. This policy change provides a new trove of professional data for model training and ad targeting across the Microsoft ecosystem.
AI layoffs to backfire: Half rehired at lower pay
According to a Forrester report, half of all AI-attributed layoffs are expected to be reversed, with 55% of employers regretting the cuts due to AI underperformance. Many of these roles will be quietly refilled, often offshore or at lower salaries. The trend is corroborated by other findings, such as Gartner's prediction that 40% of agentic AI projects will be cancelled and benchmarks showing LLM agents achieve low success rates on real-world tasks such as CRM workflows.
OpenAI says hallucinations are mathematically inevitable, not engineering flaws
A study by OpenAI researchers concludes that LLM hallucinations are a mathematical inevitability, not just an engineering flaw. The paper posits that fundamental statistical limits, epistemic uncertainty, and computational intractability ensure models will always generate plausible but false outputs, even with perfect data. The research also reveals that current industry benchmarks exacerbate the problem by rewarding confident guessing over admitting uncertainty, suggesting enterprises must shift from prevention to risk containment.
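A toy expected-value calculation makes the benchmark-incentive point concrete. Assuming accuracy-only scoring (typical of leaderboards, though the exact schemes vary), guessing strictly dominates abstaining:

```python
# Toy expected-score comparison: why accuracy-only benchmarks reward guessing.
# Scoring assumed here: 1 point for a correct answer, 0 for wrong, 0 for "I don't know".
def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected benchmark score for one question."""
    return 0.0 if abstain else p_correct

p = 0.3  # the model is only 30% sure of its best guess
print(expected_score(p, abstain=False))  # 0.3 -- guessing scores higher
print(expected_score(p, abstain=True))   # 0.0 -- admitting uncertainty is never rewarded
# Under this metric, a score-maximizing model should always guess, which is
# exactly the incentive the paper says entrenches hallucination.
```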
Show HN: A tool to properly observe your LLM's context window
The author identifies a gap in LLM observability tools, which lack the ability to analyze the composition of the context window. To address this, they built context-viewer, an open-source, browser-based tool that ingests conversation logs and uses an LLM to segment the context into meaningful components. The tool visualizes the growth and makeup of these components over time, enabling developers to see, measure, and engineer their context by identifying issues like redundancy and bloat.
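As a rough illustration of the idea (not the tool's actual code), composition can be approximated by bucketing a message log and counting tokens per bucket at each turn; the real tool uses an LLM to segment content, while this sketch just buckets by role:

```python
# Minimal sketch of measuring context-window composition over turns.
# Assumes a simple log format: a list of {"role", "content"} messages.
from collections import Counter

def composition(messages: list[dict]) -> Counter:
    """Rough token counts per component, using whitespace tokens as a proxy."""
    counts = Counter()
    for m in messages:
        counts[m["role"]] += len(m["content"].split())
    return counts

log = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Refactor this function to be iterative."},
    {"role": "tool", "content": "file contents ... " * 200},  # tool-output bloat
]
for turn in range(1, len(log) + 1):
    print(turn, dict(composition(log[:turn])))  # watch which component grows
```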
Research
Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity
The paper posits that mode collapse in aligned LLMs stems from "typicality bias" in preference data, where human annotators favor familiar text, rather than from algorithmic limitations. To counteract this, the authors introduce Verbalized Sampling (VS), a training-free prompting strategy that asks the model to generate a set of responses and their corresponding probabilities. Experiments show VS significantly boosts response diversity and performance across creative and open-ended tasks without degrading factual accuracy or safety, with more capable models benefiting more from the technique.
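A minimal sketch of the VS prompt pattern as the paper describes it: ask for k candidates with verbalized probabilities, then sample from that distribution client-side. The `call_your_llm` hook is a hypothetical stand-in for whatever chat API you use:

```python
# Verbalized Sampling sketch: request a distribution, then sample from it.
import json, random

def vs_prompt(task: str, k: int = 5) -> str:
    return (
        f"Generate {k} different responses to the task below. "
        'Return JSON: a list of {"response": ..., "probability": ...} '
        "objects whose probabilities sum to 1.\n\n"
        f"Task: {task}"
    )

def sample(verbalized_json: str) -> str:
    """Pick one response weighted by the model's stated probabilities."""
    items = json.loads(verbalized_json)
    weights = [it["probability"] for it in items]
    return random.choices(items, weights=weights, k=1)[0]["response"]

# model_output = call_your_llm(vs_prompt("Write an opening line for a story"))
# print(sample(model_output))  # hypothetical call; any chat API works here
```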
Language models are injective and hence invertible
This paper mathematically proves that transformer LMs are injective, meaning the mapping from discrete input sequences to continuous representations is lossless and invertible. This theoretical claim is empirically validated through billions of collision-free tests on SOTA models. The authors also introduce SipIt, an efficient algorithm that demonstrates this property in practice by perfectly reconstructing the exact input text from a model's hidden activations.
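A toy version of the inversion principle (not SipIt itself): with a deterministic, injective state update, the input can be recovered greedily, one token per observed hidden state:

```python
# Toy illustration of the inversion idea: test which next token reproduces
# the observed state, position by position. A stand-in for the real algorithm.
VOCAB = ["the", "cat", "sat", "on", "mat"]

def step(state: int, token: str) -> int:
    """Deterministic 'hidden state' update (stand-in for a transformer layer)."""
    return hash((state, token)) & 0xFFFFFFFF

def encode(tokens: list[str]) -> list[int]:
    states, s = [], 0
    for t in tokens:
        s = step(s, t)
        states.append(s)
    return states

def invert(states: list[int]) -> list[str]:
    tokens, s = [], 0
    for target in states:  # greedy exact-match search, one position at a time
        t = next(tok for tok in VOCAB if step(s, tok) == target)
        tokens.append(t)
        s = target
    return tokens

hidden = encode(["the", "cat", "sat"])
assert invert(hidden) == ["the", "cat", "sat"]  # exact reconstruction
```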
Rapid Brightening of 3I/Atlas Ahead of Perihelion
A multi-sensor data fusion approach was used to track interstellar comet 3I/ATLAS while it was hidden from standard ground-based observation. Photometric measurements from space-based instruments such as STEREO-A, SOHO, and GOES-19 reveal a steep brightening, scaling with heliocentric distance roughly as r^-7.5, along with significant gas emission. It is a classic case of leveraging specialized, out-of-band sensors to fill data gaps for a dynamic target.
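For scale, brightness proportional to r^-7.5 is much steeper than the roughly r^-4 law often fit to active comets; a quick computation (with illustrative distances, not the paper's) shows the implied magnitude gain:

```python
# What an r^-7.5 brightening law implies versus a typical ~r^-4 slope.
import math

def delta_mag(n: float, r_start: float, r_end: float) -> float:
    """Magnitude change for brightness ∝ r^-n as heliocentric distance shrinks."""
    return 2.5 * n * math.log10(r_start / r_end)

r1, r2 = 2.0, 1.4  # au, hypothetical span of pre-perihelion observations
print(f"n=4.0: brightens by {delta_mag(4.0, r1, r2):.2f} mag")  # ~1.55 mag
print(f"n=7.5: brightens by {delta_mag(7.5, r1, r2):.2f} mag")  # ~2.90 mag
```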
Shifts in US Social Media Use 2020-24: Decline, Fragmentation, and Polarization
Analysis of 2020 and 2024 ANES data reveals a decline in overall social media use and a fragmented platform landscape, with legacy platforms like Facebook and Twitter/X shrinking. Politically, most platforms have shifted towards Republican users, with Twitter/X experiencing a nearly 50-point swing in posting activity from Democrats to Republicans. As casual users disengage, the remaining online public sphere is becoming a smaller, more polarized, and ideologically extreme dataset driven by the most partisan users.
Multi-Domain Rubrics Requiring Professional Knowledge to Answer and Judge
The paper introduces ProfBench, a new benchmark for evaluating LLMs on complex professional tasks using over 7,000 expert-evaluated response pairs across domains such as Physics, Chemistry, and Finance. To make assessment fair and affordable, the authors developed robust, low-cost LLM judges that mitigate self-enhancement bias. The benchmark proves challenging even for SOTA models, revealing significant performance gaps between proprietary and open-weight models and offering insights into the role of extended thinking in complex reasoning.
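Rubric-based judging of this kind typically reduces to checking each criterion independently and aggregating; a hypothetical sketch of that pattern, with `ask_judge` standing in for any judge model (not ProfBench's actual harness):

```python
# Sketch of rubric-based LLM judging: one yes/no check per criterion.
def ask_judge(criterion: str, response: str) -> bool:
    """Hypothetical: ask a judge LLM 'does the response satisfy this criterion?'"""
    raise NotImplementedError  # wire up the judge model of your choice

def rubric_score(rubric: list[str], response: str) -> float:
    met = sum(ask_judge(c, response) for c in rubric)
    return met / len(rubric)

rubric = [  # illustrative expert-written criteria
    "States the correct boundary conditions for the heat equation.",
    "Derives the separation-of-variables ansatz without sign errors.",
    "Reports the final series solution with correct coefficients.",
]
# print(rubric_score(rubric, candidate_answer))
```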
Code
Show HN: I stopped ChatGPT from using em dashes
This Chrome extension uses a set of regex rules to replace the em dashes ChatGPT habitually emits, converting each one into contextually appropriate punctuation, such as a comma, colon, or period, based on its grammatical function.
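The extension itself ships as browser JavaScript, but the rule idea is easy to sketch in Python with two simplified context-sensitive substitutions (the real rule set is more involved):

```python
# Simplified illustration of context-sensitive em-dash replacement.
import re

def demdash(text: str) -> str:
    # em dash followed by a lowercase continuation -> comma
    text = re.sub(r"\s*—\s*(?=[a-z])", ", ", text)
    # em dash before a capitalized clause -> end the sentence with a period
    text = re.sub(r"\s*—\s*(?=[A-Z])", ". ", text)
    return text

print(demdash("The model is fast — it streams tokens — Really fast."))
# -> "The model is fast, it streams tokens. Really fast."
```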
Show HN: DeepShot – NBA game predictor with 70% accuracy using ML and stats
DeepShot is an open-source NBA game predictor built with Python, scikit-learn, and XGBoost. The model trains on historical data scraped from Basketball Reference; a key feature is the use of exponentially weighted moving averages (EWMA) over rolling statistics, so recent team performance is weighted more heavily. Predictions and key statistical differences are visualized through a web interface powered by NiceGUI.
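The EWMA feature is straightforward with pandas; a sketch with illustrative column names, not DeepShot's actual schema:

```python
# EWMA rolling stats: recent games dominate the average.
import pandas as pd

games = pd.DataFrame({
    "team": ["BOS"] * 5,
    "pts":  [112, 98, 121, 104, 117],
})
# span=10 makes roughly the last ten games dominate; shift(1) keeps the
# feature strictly pre-game, so the row being predicted never sees itself
games["pts_ewma"] = games.groupby("team")["pts"].transform(
    lambda s: s.ewm(span=10, adjust=False).mean().shift(1)
)
print(games)
```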
One Memory Layer, Multiple Models (Claude, GPT, Llama, etc.)
MemMachine is an open-source, universal memory layer for AI agents that persists user data and preferences across sessions and across LLMs. It supports distinct memory types for short-term context, long-term facts, and user profiles. Architecturally, it stores episodic conversational memory in a graph database and persistent profile data in an SQL database, all exposed via a REST API and a Python SDK.
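A hypothetical sketch of that two-store split, illustrating the architecture rather than MemMachine's actual SDK or schema:

```python
# Two-store memory pattern: profile facts in SQL, episodic memory as a graph.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE profile (user TEXT, key TEXT, value TEXT)")
db.execute("INSERT INTO profile VALUES ('alice', 'editor', 'vim')")

# session -> list of (speaker, text) edges; a stand-in for a graph database
episodes: dict[str, list[tuple[str, str]]] = {}

def remember(session: str, speaker: str, text: str) -> None:
    episodes.setdefault(session, []).append((speaker, text))

remember("s1", "alice", "Use my usual editor settings.")
prefs = db.execute("SELECT key, value FROM profile WHERE user='alice'").fetchall()
print(prefs, episodes["s1"])  # context an agent could inject, whatever the model
```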
Show HN: I used AI to build StrangeQ, a RabbitMQ compatible message broker
StrangeQ is a high-performance AMQP 0.9.1 message broker written in Go, designed as a drop-in replacement compatible with RabbitMQ clients. It achieves significant throughput, with benchmarks showing over 3M ops/sec for in-memory operations, making it suitable for orchestrating asynchronous tasks in demanding AI/LLM pipelines. The broker supports multiple storage backends like Badger for persistence and integrates with Prometheus for monitoring.
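Because StrangeQ speaks AMQP 0.9.1, a stock RabbitMQ client should work unchanged; a minimal round trip with pika, assuming the broker listens on the default port 5672:

```python
# Publish/consume against an AMQP 0.9.1 broker using the standard pika client.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.queue_declare(queue="llm-jobs", durable=True)
ch.basic_publish(exchange="", routing_key="llm-jobs",
                 body=b'{"prompt": "summarize this document"}')
method, props, body = ch.basic_get(queue="llm-jobs", auto_ack=True)
print(body)
conn.close()
```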
Show HN: Kaleidoscope – A Parallel AI Agent TUI
Kaleidoscope is a CLI tool for running multiple LLMs in parallel on a single coding task. It leverages tmux and git worktrees to create an isolated TUI environment for each model, allowing for safe comparison and iteration. Users can send follow-up prompts to specific models and use commands to select a winning solution, which is then automatically merged and cleaned up.
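A rough sketch of that isolation pattern using standard git and tmux commands, with a placeholder where the agent CLI would run:

```python
# One git worktree plus one tmux session per model, so parallel agents can
# edit the same repo without stepping on each other. Assumes git and tmux
# are installed and this runs inside a repository.
import subprocess

def spawn(model: str, task: str) -> None:
    branch = f"agent/{model}"
    path = f"../wt-{model}"
    subprocess.run(["git", "worktree", "add", "-b", branch, path], check=True)
    subprocess.run([
        "tmux", "new-session", "-d", "-s", model, "-c", path,
        f"echo 'run {model} on: {task}'",  # placeholder for the agent CLI
    ], check=True)

for m in ["claude", "gpt", "llama"]:
    spawn(m, "fix the flaky test in test_api.py")
```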