Tuesday — September 30, 2025
California's governor signs an AI transparency bill into law, researchers find that large language models can hallucinate critical problem features, and developers release DeepSeek-V3.2-Exp, an experimental model with improved training and inference efficiency.
News
Claude Code 2.0
@anthropic-ai/claude-code is a terminal-based coding tool that helps developers code faster by executing routine tasks, explaining complex code, and handling git workflows through natural language commands. The package can be installed via npm using the command npm install -g @anthropic-ai/claude-code, and provides features such as code completion, debugging, and project management, with data collection and usage policies outlined in the official documentation.
F-Droid and Google’s developer registration decree
F-Droid, a platform for distributing free and open-source Android apps, is under threat from Google's new requirement that developers register centrally, which could end F-Droid and other alternative app stores. This move is seen as an attempt to consolidate power and control over the Android ecosystem, rather than a genuine effort to improve security, and could deprive users of a safe and trustworthy source of apps.
California governor signs AI transparency bill into law
Governor Newsom has signed the Transparency in Frontier Artificial Intelligence Act, which sets new requirements for artificial intelligence developers in California, organized around transparency, innovation, safety, accountability, and responsiveness, with the aim of building public trust and safety while supporting the AI industry. The law is presented as making California a leader in responsible and ethical AI, building on the state's position as a global hub for technology and innovation, with 32 of the top 50 AI companies worldwide based in the state.
Claude Sonnet 4.5
Claude Sonnet 4.5 is a state-of-the-art coding model that excels in building complex agents, using computers, and demonstrating substantial gains in reasoning and math. The model is now available, along with major upgrades to related products, and has shown significant improvements in domain-specific knowledge, coding performance, and alignment, making it a powerful tool for developers and users across various industries.
Vercel CEO meets with Netanyahu to discuss AI education
Guillermo Rauch had a discussion with Israeli Prime Minister Netanyahu about the importance of AI education and literacy in keeping free societies ahead, and expressed optimism for peace and greatness for Israel and its neighbors. The conversation highlighted the potential of AI to empower people to build software and drive progress, with Rauch sharing a photo of the meeting on X.
Research
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science
The Science of Science (SoS) is being transformed by the integration of artificial intelligence (AI), which enables the automation of large-scale pattern discovery and provides valuable insights for enhancing scientific efficiency and innovation. This integration offers numerous advantages over traditional methods, but also presents challenges that can be addressed through proposed pathways and innovative applications, such as multi-agent systems that simulate research societies.
Reasoning LLM Errors Arise from Hallucinating Critical Problem Features
Large language models, even those trained with chain-of-thought strategies, can still make mistakes, with a notable issue being the "hallucination" of unspecified information, such as graph edges. This problem persists across different models and complexity levels, and may be a broader issue with how these models represent problem specifics, highlighting a need for design changes to mitigate this weakness.
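As a rough illustration of the failure mode described above, the sketch below compares the edges a model cites in its reasoning against the edges actually given in the problem and reports any that were never specified. The function name, the edge representation, and the sample data are all hypothetical; the paper's own evaluation method is not described in this summary.

```python
def hallucinated_edges(spec_edges, cited_edges, directed=False):
    """Return edges cited by the model that do not appear in the problem spec.

    Edges are (u, v) pairs. For undirected graphs, (u, v) and (v, u)
    are treated as the same edge.
    """
    def normalize(edge):
        u, v = edge
        return (u, v) if directed else tuple(sorted((u, v)))

    allowed = {normalize(e) for e in spec_edges}
    seen = set()
    extras = []
    for edge in cited_edges:
        key = normalize(edge)
        # Report each unspecified edge once, in the order it was first cited.
        if key not in allowed and key not in seen:
            seen.add(key)
            extras.append(key)
    return extras


# Hypothetical example: the problem states two edges, the model cites two.
spec = [("A", "B"), ("B", "C")]
cited = [("B", "A"), ("A", "C")]
print(hallucinated_edges(spec, cited))  # [('A', 'C')]
```

Extracting the cited edges from free-form model output is the hard part in practice and is not attempted here.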
SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly
The Counterfactual Regret Minimization algorithm has limitations in multiplayer poker games, prompting the development of SpinGPT, a Large Language Model tailored to three-player Spin & Go games. SpinGPT, trained on expert decisions and solver-generated hands, achieves promising results, matching a solver's actions in 78% of decisions and performing competitively against a strong opponent in heads-up games.
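An agreement figure like the one quoted above can be read as the share of decision points where the model picks the same action as the solver. The sketch below shows that calculation on made-up data; the function name and the action labels are assumptions, and the paper may define a match differently (for example, by bucketing bet sizes).

```python
def action_agreement(model_actions, solver_actions):
    """Fraction of decisions where the model's action equals the solver's."""
    if len(model_actions) != len(solver_actions):
        raise ValueError("both sequences must cover the same decisions")
    if not model_actions:
        raise ValueError("at least one decision is required")
    matches = sum(m == s for m, s in zip(model_actions, solver_actions))
    return matches / len(model_actions)


# Hypothetical decisions, not data from the paper.
model = ["fold", "call", "raise", "call"]
solver = ["fold", "call", "call", "call"]
print(action_agreement(model, solver))  # 0.75
```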
Ten Principles of AI Agent Economics
The rapid advancement of AI-based autonomous agents is transforming society and economies, posing questions about their integration, ethics, and safety as they increasingly exhibit human-like intelligence and participate in social and economic systems. A framework of ten principles of AI agent economics is presented to understand and address these challenges, providing a foundation for responsible integration and highlighting the need for future research into trustworthiness, ethics, and regulation.
What Is Artificial General Intelligence?
Artificial general intelligence (AGI) has been subject to hype and speculation, but its meaning and development will be settled through long-term scientific investigation. A comprehensive approach to AGI combines foundational tools such as search and approximation with meta-approaches that maximize resources, simplicity, or functionality. Its development is expected to be a fusion of these tools and approaches, with current bottlenecks including sample and energy efficiency.
Code
DeepSeek-V3.2-Exp
DeepSeek-V3.2-Exp is an experimental version of the DeepSeek model, introducing DeepSeek Sparse Attention to improve training and inference efficiency in long-context scenarios. The model achieves substantial improvements in efficiency while maintaining similar performance to its predecessor, V3.1-Terminus, and is available for use through various platforms, including HuggingFace, SGLang, and vLLM.
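To give a feel for why sparse attention helps with long contexts, here is a generic top-k sparse attention sketch for a single query: only the k highest-scoring keys contribute to the output. This is not DeepSeek Sparse Attention, whose selection mechanism is not described in this summary; the function name and the top-k rule are assumptions chosen for illustration.

```python
import math


def topk_sparse_attention(query, keys, values, k):
    """Attend to only the k best-matching keys for one query vector.

    Returns the attention output and the sorted indices of the keys used.
    """
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    scale = 1.0 / math.sqrt(len(query))
    # Scaled dot-product score of the query against every key.
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    # Keep the k highest-scoring positions and ignore the rest.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected positions.
    peak = max(scores[i] for i in top)
    weights = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(weights.values())
    output = [0.0] * len(values[0])
    for i, w in weights.items():
        for d, component in enumerate(values[i]):
            output[d] += (w / total) * component
    return output, sorted(top)


keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(topk_sparse_attention([1.0, 0.0], keys, values, k=1))  # ([1.0, 2.0], [0])
```

Note that this sketch still scores every key before selecting, so it saves work only in the softmax and value aggregation. Efficiency gains in a real system depend on choosing the candidate positions cheaply.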
Macintosh System 7 Ported To x86 With LLM Help in 3 days
System 7 is an open implementation of Apple Macintosh System 7 for modern hardware, featuring a classic Mac OS interface, desktop icons, and QuickDraw graphics, and it boots via GRUB2/Multiboot2. The project is currently in iteration 2, with many features already implemented, including PS/2 input support, event handling, and Finder integration. Its roadmap for future development includes dropdown menus, window dragging, and file system integration.
Show HN: Cap'n-rs – Rust implementation of Cloudflare's Cap'n Web protocol
This is a Rust implementation of the Cap'n Web protocol, a capability-based RPC system with support for promise pipelining and multi-transport protocols, including HTTP batch, WebSocket, and WebTransport. The implementation is divided into several crates, including capnweb-core, capnweb-transport, capnweb-server, and capnweb-client, and provides a production-ready server and client with comprehensive error handling, zero-panic code, and support for complex data structures and JavaScript interoperability.
Show HN: Reddit browser for MCP clients – works with any AI assistant
Reddit MCP Buddy is a server that enables AI assistants like Claude Desktop to browse Reddit, search posts, and analyze user activity without requiring API keys. It offers a three-tier authentication system, allowing for up to 100 requests per minute, and provides various tools for browsing subreddits, searching posts, analyzing user profiles, and explaining Reddit terms.
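A cap such as 100 requests per minute is commonly enforced with a sliding-window limiter. The sketch below shows one way to do that; it is not taken from Reddit MCP Buddy, the class name and parameters are hypothetical, and the per-tier limits are not modeled because the summary gives only the top figure.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the limiter can be tested
        self._stamps = deque()       # timestamps of accepted requests

    def allow(self):
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False


# Usage with a fake clock so the behavior is deterministic.
fake_time = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0,
                               clock=lambda: fake_time[0])
print(limiter.allow(), limiter.allow(), limiter.allow())  # True True False
fake_time[0] = 60.0
print(limiter.allow())  # True
```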
Tile Language: DSL for High-Performance GPU/CPU/Accelerators Kernels
Tile Language is a domain-specific language designed to streamline the development of high-performance GPU/CPU kernels, allowing developers to focus on productivity without sacrificing low-level optimizations. It has been tested and validated on various devices, including NVIDIA and AMD GPUs, and provides building blocks to implement a wide variety of operators, with examples including matrix multiplication, dequantization GEMM, and flash attention.
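The tiling idea behind kernels like the matrix multiplication example can be sketched in plain Python: the output is computed block by block so that each block of the inputs is reused while it is "hot". This is not Tile Language syntax, and the function name and tile size are assumptions; on a GPU the tiles would map to shared memory and registers rather than Python lists.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (n x k) and b (k x m) one tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b must match")
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):            # tile rows of the output
        for j0 in range(0, m, tile):        # tile columns of the output
            for k0 in range(0, k, tile):    # tiles along the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc       # partial sum carried across k-tiles
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(tiled_matmul(a, b))  # [[58.0, 64.0], [139.0, 154.0]]
```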