Wednesday — September 17, 2025

CrowdStrike's npm packages are compromised in the Shai-Hulud malware attack, AI Code Detector can identify AI-generated code with 95% accuracy, and researchers develop the Atlas of Human-AI Interaction to systematically map human-AI interaction findings.

News

Shai-Hulud malware attack: Tinycolor and over 40 NPM packages compromised

Multiple CrowdStrike npm packages were compromised as part of an ongoing malicious supply chain campaign known as the "Shai-Hulud attack", which had previously affected over 40 other packages. The malware, which is designed to steal sensitive data, was quickly removed by the npm registry after its discovery.

Generative AI as Seniority-Biased Technological Change

Researchers analyzed US résumé and job posting data from 2015-2025, finding evidence that generative AI adoption has a seniority-biased impact on the labor market, with junior employment declining sharply in firms that adopt AI, while senior employment continues to rise. The decline in junior employment is driven primarily by slower hiring, with mid-tier graduates being the most affected, suggesting that AI adoption may exacerbate existing labor market inequalities.

Show HN: AI Code Detector – detect AI-generated code with 95% accuracy

AI Code Detector uses a machine learning model to detect AI-generated code with 95% accuracy, allowing engineering leaders to track the use of AI coding tools and their impact on code quality and team productivity. The tool can be used as part of a larger developer intelligence platform to monitor metrics such as code quality, adoption rates, and defect rates, providing data-driven insights to inform AI transformation strategies.

September 15, 2025: The Day the Industry Admitted AI Subscriptions Don't Work

Cursor and Kiro, two AI coding tools, have rolled out changes to their pricing models, abandoning "unlimited" access and introducing variable token costs. Cursor's auto mode now counts toward monthly usage at "competitive token rates", while Kiro has introduced a complex pricing model based on "spec requests" and "vibe requests". These changes have led to user backlash, highlighting the unsustainable nature of offering unlimited access to expensive AI models at consumer price points and the need for transparent, usage-based pricing.

Will I run Boston 2026?

The blog's authors have updated their model for predicting the cutoff time for the 2026 Boston Marathon. Based on 33,267 applicants and an assumption of 24,000 time-qualified runners, they predict a buffer of 5 minutes and 16 seconds, with a range of 4:12 to 6:21. They have also created an interactive tool that lets users explore how different numbers of time-qualified runners would affect the predicted cutoff time.

Research

"My Boyfriend Is AI": Computational Analysis of Human-AI Companionship

Researchers have developed the Atlas of Human-AI Interaction, an interactive web interface that systematically maps empirical findings from over 1,000 papers on human-AI interaction, using AI-powered knowledge extraction to identify causal relationships. The atlas provides a navigable knowledge graph of 2,037 empirical findings, revealing research clusters, themes, and gaps, and has been shown to be effective in discovering research gaps through evaluation with 20 researchers.

Human+AI loops stay stable even with quantization

A rigorous framework is developed for analyzing fixed points of nonexpansive maps in $L^1(\mu)$, considering quantization errors and proving the fixed point property for measure-compact subsets. The framework is then applied to a human-in-the-loop co-editing system, demonstrating the existence of a stable consensus artefact that remains an approximate fixed point even under quantization errors.
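To see why a fixed point can survive quantization as an approximate fixed point, here is a short illustrative derivation. It is a sketch under assumptions that are not taken from the paper: $T$ is a nonexpansive map on a subset $C \subseteq L^1(\mu)$ with a fixed point $x^*$, and $Q$ is a hypothetical quantizer whose error is uniformly bounded by $\varepsilon$. The paper's actual hypotheses and statements may differ.

$$
\begin{aligned}
&\text{Assume } T x^* = x^*, \quad \|Tx - Ty\|_1 \le \|x - y\|_1, \quad \|Q(x) - x\|_1 \le \varepsilon \ \text{ for all } x, y \in C. \\
&\text{Then } \|Q(T x^*) - x^*\|_1 = \|Q(x^*) - x^*\|_1 \le \varepsilon, \\
&\text{and for any } x \in C: \quad \|Q(Tx) - x^*\|_1 \le \|Q(Tx) - Tx\|_1 + \|Tx - Tx^*\|_1 \le \varepsilon + \|x - x^*\|_1 .
\end{aligned}
$$

Under these assumptions, $x^*$ is an $\varepsilon$-approximate fixed point of the quantized map $Q \circ T$, and a single quantized step moves an iterate away from $x^*$ by at most $\varepsilon$ beyond its previous distance.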

Fundamental Trade-Off Between Certainty and Scope in Symbolic and Generative AI

A conjecture has been introduced that formalizes the trade-off between an AI system's ability to provide guarantees of correctness and its capacity to handle complex data, suggesting that systems with narrow, pre-structured domains can offer error-free outputs, while those that can process high-dimensional data must accept some risk of error. The conjecture has significant implications for AI engineering, epistemology, and governance, and its proof or refutation is crucial for the development of trustworthy AI systems.

A Survey on Retrieval and Structuring Augmented Generation with LLMs

Large Language Models (LLMs) face challenges in real-world applications due to limitations such as hallucination and outdated knowledge, which can be addressed by integrating dynamic information retrieval and structured knowledge representations through Retrieval And Structuring (RAS) Augmented Generation. This approach combines retrieval mechanisms, text structuring techniques, and knowledge integration methods to enhance LLMs, and research opportunities exist in areas such as multimodal retrieval and interactive systems.

FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics

Scientists have developed a foundation model for experimental particle physics, training it on a large dataset of over 11 million particle collision events, and demonstrated its ability to scale and generalize across various tasks. The model outperforms baseline models and exhibits robust data-efficient adaptation, with its representations being task-agnostic yet specializable for different downstream tasks through a single linear mapping.

Code

Launch HN: Rowboat (YC S24) – Open-source IDE for multi-agent systems

Rowboat Labs offers a platform that enables users to build multi-agent workflows using AI in minutes, with features such as natural language integration, one-click tool connections, and automated workflow deployment. The platform provides a range of resources, including demos, documentation, and community support, to help users get started with building AI-powered agents and automating workflows.

Forget RAG? Introducing KIP, a Protocol for a Living AI Brain

The Knowledge Interaction Protocol (KIP) is a specification aimed at building a bridge between the neural core of Large Language Models (LLMs) and a symbolic core, enabling a two-way cognitive symbiosis that allows AI agents to have persistent and cumulative long-term memory. KIP provides a standardized set of instructions and data structures for efficient and reliable knowledge exchange, with the goal of building a unified memory brain with metabolic capabilities for AI agents, enabling them to learn, self-improve, and adapt to changing environments.

Chronon: A data platform for serving for AI/ML applications

Chronon is a data platform for AI/ML applications that abstracts away the complexity of data computation and serving, allowing users to define features as transformations of raw data and perform batch and streaming computation, scalable backfills, and low-latency serving. The platform provides a range of features, including online serving, backfills, observability, and monitoring tools, and supports complex transformations and windowed aggregations, enabling users to utilize all their organizational data to power AI/ML projects without worrying about complex orchestration.
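As a rough illustration of what a windowed aggregation feature means here, the sketch below computes a per-user sum over a trailing time window at a chosen "as of" timestamp. It is plain Python and does not use Chronon's API; the function name `windowed_sum`, the event fields `user`, `ts`, and `amount`, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def windowed_sum(events, key, as_of, window):
    """Sum `amount` for `key` over events with as_of - window < ts <= as_of.

    Hypothetical helper for illustration only; not part of Chronon.
    """
    start = as_of - window
    return sum(
        e["amount"]
        for e in events
        if e["user"] == key and start < e["ts"] <= as_of
    )


# Made-up raw events; in practice these would come from a log or table.
events = [
    {"user": "u1", "ts": datetime(2025, 9, 1), "amount": 20.0},
    {"user": "u1", "ts": datetime(2025, 9, 15), "amount": 5.0},
    {"user": "u2", "ts": datetime(2025, 9, 16), "amount": 7.5},
    {"user": "u1", "ts": datetime(2025, 9, 16), "amount": 2.5},
]

as_of = datetime(2025, 9, 17)
spend_7d = windowed_sum(events, "u1", as_of, timedelta(days=7))
spend_30d = windowed_sum(events, "u1", as_of, timedelta(days=30))
print(spend_7d, spend_30d)
```

Evaluating the same definition at a historical `as_of` corresponds to a backfill, while evaluating it at the current time corresponds to online serving; the sketch ignores the scale and streaming concerns that a real platform handles.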

Show HN: Ghostpipe – Connect files in your codebase to user interfaces

Ghostpipe is a tool that connects files in a codebase to user interfaces, allowing apps to see only explicitly shared files, with data living in the codebase and under version control. It uses Yjs and WebRTC to connect codebase files with applications, and can be used with various interfaces such as Excalidraw and Swagger, with features including diff mode and configuration-based usage.

Show HN: Ggplot2 chart playground – in the browser with WebAssembly

The WebR ggplot2 Playground is a web application that allows users to run R code directly in their browser, featuring live code editing, CSV upload, and interactive plots using ggplot2. It can be accessed online or installed locally with Node.js and pnpm, providing a sandboxed environment for data visualization and exploration.