Tuesday — October 14, 2025
SOTA LLMs solve multi-layered ciphers, Andrej Karpathy releases nanochat for training a ChatGPT-like model for ~$100, and a security audit finds leaked API keys in arXiv source files.
News
America is getting an AI gold rush instead of a factory boom
A significant economic divergence is occurring in the US, with the AI industry experiencing an unprecedented investment boom while the manufacturing sector enters a slump. The AI boom is driving massive spending on data centers, high-end chips, and power systems, but experts question its long-term job creation potential compared to traditional industries. Concerns are also rising about a potential AI investment bubble, as current revenue generation may not yet justify the hype, and the buildout could represent a short-term net loss for productivity.
America's future could hinge on whether AI slightly disappoints
Despite several negative economic indicators, the US economy is being propped up by a massive AI investment boom, which some economists believe is masking underlying weaknesses. This reliance creates a significant risk of an "industrial bubble" that could burst if the technology merely disappoints optimistic expectations. Potential triggers for such a crash include diminishing returns from scaling laws, low enterprise ROI on AI initiatives, and infrastructure constraints like power consumption.
LLMs are getting better at character-level text manipulation
Recent LLMs like GPT-5 and Gemini 2.5 demonstrate a significant generational improvement in character-level manipulation, a traditional weakness caused by tokenization. Experiments show these SOTA models can now reliably perform complex character substitution and solve multi-layered ciphers like Base64 followed by ROT20, often without explicit reasoning. This suggests models are developing a more generalized, algorithmic understanding of text operations, moving beyond simple memorization of common language patterns.
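To make the task concrete, here is a minimal Python sketch of the kind of layered cipher described above: the plaintext is Base64-encoded, then ROT20 is applied to the result (assuming ROT20 rotates only letters and leaves digits, "+", "/", and "=" untouched). Solving it requires undoing both layers in the right order, which is what the models are being asked to do from the ciphertext alone.

```python
# Minimal sketch of a Base64 + ROT20 layered cipher (assumption: ROT20 shifts
# only A-Z/a-z and leaves other Base64 characters unchanged).
import base64

def rot_n(text: str, n: int) -> str:
    """Shift alphabetic characters by n positions, preserving case."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + n) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + n) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def encode(plaintext: str) -> str:
    b64 = base64.b64encode(plaintext.encode()).decode()  # layer 1: Base64
    return rot_n(b64, 20)                                # layer 2: ROT20

def decode(ciphertext: str) -> str:
    b64 = rot_n(ciphertext, -20)                         # undo ROT20 first
    return base64.b64decode(b64).decode()                # then Base64

print(decode(encode("hello world")))  # -> "hello world"
```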
AI and the Future of American Politics
The article details how various political actors are operationalizing AI beyond simple misinformation. Professional campaigners are using LLMs to scale traditional tactics like fundraising, ad variant generation, and microtargeting. In contrast, organizers are exploring more novel applications, including AI-facilitated deliberation, "sensemaking" for policy formation, and building "Public AI" alternatives. Citizens are leveraging AI as a force multiplier both for grassroots activism, such as automated voter challenges and disinformation detection, and for broader civic engagement. The article concludes that minimal regulation will make the impact of these diverse uses highly unpredictable.
AI Is Too Big to Fail
The current AI investment climate is not a simple bubble but a geopolitically driven arms race against China, framed as a national security imperative. This narrative justifies massive, seemingly irrational capital expenditures, creating a "too big to fail" dynamic where key AI players are implicitly backstopped by the US government. The urgency is fueled by China's long-term structural advantages in energy and robotics, forcing a high-stakes bet on a rapid US victory. The author concludes that if AI fails to deliver a transformative economic revolution, the resulting capital misallocation will trigger a catastrophic economic collapse.
Research
Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks
Current LLM defense evaluations are flawed because they use static, non-adaptive attacks. The paper argues for evaluating against adaptive attackers who specifically optimize against a given defense's design. By applying scaled optimization techniques like gradient descent and RL, the authors bypassed 12 recent defenses with over 90% success, demonstrating that future defenses must benchmark against such strong, adaptive attacks to make credible robustness claims.
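For intuition, here is a hypothetical sketch of the adaptive-attack loop the paper argues evaluations need: the attacker repeatedly mutates a jailbreak prompt and keeps whichever candidate scores best against the defended system. The paper's actual attacks use stronger tools (gradient descent, RL); this shows only the simplest black-box hill-climbing variant, and `defended_model`, `judge_score`, and `mutate` are placeholders rather than the authors' implementation.

```python
# Hypothetical greedy adaptive attack: optimize a prompt against a specific defense.
def adaptive_attack(seed_prompt, defended_model, judge_score, mutate, steps=200):
    """Keep whichever candidate prompt best defeats the defended system."""
    best_prompt = seed_prompt
    best_score = judge_score(defended_model(best_prompt))
    for _ in range(steps):
        candidate = mutate(best_prompt)                 # e.g. token swaps or paraphrases
        score = judge_score(defended_model(candidate))  # how non-refused / harmful the reply is
        if score > best_score:                          # the attacker adapts to this defense
            best_prompt, best_score = candidate, score
    return best_prompt, best_score
```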
AI Where It Matters: Where, Why, and How Devs Want AI Support in Daily Work
A large-scale study of 860 developers maps task perceptions to AI adoption patterns, revealing distinct needs. Developers show strong AI use for core work like coding and testing, high demand for reducing toil in documentation and operations, and clear limits for relationship-centric tasks like mentoring. Responsible AI priorities are context-dependent: reliability and security for systems-facing tasks, transparency and steerability for maintaining control, and fairness for human-facing work.
Barbarians at the Gate: How AI Is Upending Systems Research
The paper proposes AI-Driven Research for Systems (ADRS), an approach that leverages reliable verifiers, like performance benchmarks, to automate algorithm discovery. Using an iterative generate-and-evaluate loop, an ADRS framework discovered novel algorithms for tasks like load balancing and MoE inference that significantly outperform state-of-the-art human designs. The authors argue this will shift the human researcher's role from algorithm design to problem formulation and strategic guidance for the AI.
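The core loop is simple enough to sketch. Below is a minimal illustration of ADRS-style search under the assumption that a reliable verifier (here, a benchmark function) can score every candidate; `llm_propose` and `run_benchmark` are placeholders, not the paper's framework.

```python
# Sketch of a generate-and-evaluate loop: an LLM proposes candidate algorithms,
# a verifier (benchmark) scores them, and the results steer the next proposal.
def adrs_search(problem_spec, llm_propose, run_benchmark, iterations=50):
    best_code, best_perf = None, float("-inf")
    history = []
    for _ in range(iterations):
        candidate = llm_propose(problem_spec, history)  # new candidate, conditioned on past results
        perf = run_benchmark(candidate)                 # verifier: e.g. load-balancer throughput
        history.append((candidate, perf))               # feedback for the next iteration
        if perf > best_perf:
            best_code, best_perf = candidate, perf
    return best_code, best_perf
```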
Modern iOS Security Features – A Deep Dive into SPTM, TXM, and Exclaves
This analysis details the shift of Apple's XNU kernel from a monolithic design to a compartmentalized, microkernel-like architecture. The paper examines the SPTM security mechanism, which uses memory retyping to create isolated trust domains that wall off sensitive functionality such as TXM. This foundation enables the new Exclaves feature, which communicates via mechanisms like xnuproxy and the Tightbeam IPC framework. By moving sensitive components out of XNU's direct reach, these changes significantly increase system security and mitigate the impact of a kernel compromise.
LaTeXpOsEd: A Systematic Analysis of Information Leakage in Preprint Archives
A large-scale security audit of 100,000 arXiv submissions reveals that unsanitized source files leak sensitive data, including PII, API keys, and private credentials. The researchers introduced LaTeXpOsEd, a framework that leverages LLMs to uncover hidden disclosures within LaTeX comments and non-referenced files. To evaluate this capability, they also created LLMSec-DB, a new benchmark for testing LLMs on secret detection, and found thousands of leaks, highlighting a significant security risk in preprint repositories.
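For a sense of what such an audit involves, here is a rough illustration (not the LaTeXpOsEd pipeline itself) of scanning LaTeX sources for credential-like strings and noting whether they sit inside comments; the regex patterns are illustrative examples of common key formats, and real secret detection needs far broader coverage.

```python
# Illustrative secret scan over LaTeX sources; patterns and logic are simplified.
import re
from pathlib import Path

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access tokens
]

def scan_tex_sources(root: str):
    """Return (file, line number, pattern, in_comment) for suspicious strings."""
    findings = []
    for path in Path(root).rglob("*.tex"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            for pat in SECRET_PATTERNS:
                match = pat.search(line)
                if match:
                    # Flag hits preceded by '%' as sitting inside a LaTeX comment.
                    in_comment = "%" in line[: match.start()]
                    findings.append((str(path), lineno, pat.pattern, in_comment))
    return findings

if __name__ == "__main__":
    for finding in scan_tex_sources("."):
        print(finding)
```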
Code
NanoChat – The best ChatGPT that $100 can buy
nanochat, by Andrej Karpathy, is a minimal, full-stack implementation of a ChatGPT-like LLM in a single, hackable codebase. It covers the entire pipeline from tokenization and pretraining to inference with a web UI. The project is designed to run end-to-end on a single 8xH100 node, enabling the training of a small model in about 4 hours for around $100.
Show HN: docker/model-runner – an open-source tool for local LLMs
Docker Model Runner (DMR) is a tool for managing and running LLMs within the Docker ecosystem, pulling models from any OCI-compliant registry. It operates on a client-server model with a model-runner daemon and a model-cli client, using llama.cpp as the serving backend. DMR exposes a REST API, including an OpenAI-compatible chat completions endpoint, and provides a Prometheus metrics endpoint for monitoring. Experimental Kubernetes support is available via a Helm chart.
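Since the API is OpenAI-compatible, a standard client should work against it. The sketch below is a hedged example only: the base URL, port, and model name are assumptions about a typical local setup, so check the DMR documentation for the actual values on your machine.

```python
# Hedged example: calling an OpenAI-compatible local endpoint with the openai client.
# The base_url and model name are assumptions, not confirmed DMR defaults.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # assumed local DMR host/port
    api_key="not-needed-locally",                  # a local server typically ignores the key
)

resp = client.chat.completions.create(
    model="ai/llama3.2",  # example model name pulled via the model CLI
    messages=[{"role": "user", "content": "Summarize what Docker Model Runner does."}],
)
print(resp.choices[0].message.content)
```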
Show HN: No-Code REST APIs (and LLM Tools/MCPs) for Postgres
QueryDeck is a no-code platform for generating REST APIs and LLM agent tools directly from a Postgres database. Its visual builder allows for creating complex SQL queries with deep joins and nested inserts without writing code. The platform features instant deployment, automated schema analysis, RBAC, and the ability to export the generated API as a standalone Node.js application to a GitHub repository.
Show HN: AI Chat Watch – analyze what AI say about brands
AI Chat Watch (AICW) is a free, open-source CLI tool for tracking and analyzing entity mentions across various LLMs like ChatGPT, Claude, and Gemini. It runs locally, executing a multi-stage pipeline that queries models with user-defined questions, performs entity extraction, and calculates metrics like influence scores and historical trends. The tool generates interactive HTML reports to visualize how brand and topic positioning changes over time within LLM responses.
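As a rough picture of that pipeline, the sketch below shows the general idea under simplified assumptions: ask several models the same question, extract entity mentions from each answer, and turn mention counts into a crude influence share. Function names (`query_model`, `extract_entities`) are hypothetical stand-ins, not AICW's API.

```python
# Simplified, hypothetical sketch of a brand-mention tracking pipeline.
from collections import Counter

def track_mentions(question, models, query_model, extract_entities):
    """models: list of model IDs; query_model/extract_entities are placeholders."""
    counts = Counter()
    for model_id in models:
        answer = query_model(model_id, question)   # e.g. ChatGPT, Claude, Gemini
        for entity in extract_entities(answer):    # brand / topic extraction
            counts[entity] += 1
    # Share of total mentions per entity serves as a crude influence signal.
    total = sum(counts.values()) or 1
    return {entity: n / total for entity, n in counts.items()}
```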
Show HN: Tweaks – Copy → Tweak – 2 MB macOS AI Clipboard App (MIT License)
Tweaks is an open-source macOS utility that uses a global hotkey to rewrite clipboard text with a local LLM. It integrates with Osaurus, a native MLX-based server, for optimized performance on Apple Silicon. The app features configurable prompts and uses streaming to paste results for minimal perceived latency.