Tuesday — July 8, 2025
A serial startup founder uninstalls AI coding assistants due to creative dissatisfaction, researchers propose AsyncFlow for efficient LLM post-training, and a new Git-based tool called AI-docs manages AI-generated memory files.
News
I am uninstalling AI coding assistants from my personal computer
The author, a serial startup founder, has spent the past five months using AI coding assistants to build a proof of concept for a social network, and while the tools initially boosted productivity, the experience ultimately left them feeling empty and discontented. They've come to realize that relying on AI tools to generate code has taken away the creative fulfillment and sense of personal expression that comes from writing code themselves, leading them to uninstall the tools for personal projects.
AI cameras change driver behavior at intersections
Cities in the US are adopting AI-powered camera systems to monitor intersections and enforce traffic laws, with the goal of eliminating traffic fatalities and severe injuries through the "Vision Zero" strategy. Companies like Stop for Kids and Obvio.ai are developing these systems, which use computer vision to detect and issue citations for violations such as rolling stops, speeding, and failure to yield, with the aim of changing driver behavior and making roads safer.
Grinding down open source maintainers with AI
Open source maintainers are being targeted with AI-generated spam bug reports, which are designed to be emotionally manipulative and time-consuming to deal with. These fake reports, often filled with excessive emojis and vague descriptions of issues, are intended to grind down maintainers and waste their time, rather than genuinely seeking help or reporting legitimate problems.
Massive study detects AI fingerprints in millions of scientific papers
A massive study analyzing over 15 million biomedical abstracts has detected the presence of AI-generated content in millions of scientific papers, with at least 13.5% of papers published in 2024 showing evidence of AI processing. The study found a significant shift in word choice patterns after the emergence of large language models, with an increase in the use of stylistic and flowery words, suggesting that AI-generated content is becoming increasingly prevalent in academic writing.
Learn how to disable Gemini AI on Android
Google's AI model, Gemini, will soon be able to access and control various apps on Android devices, including WhatsApp and phone services, even if users had previously turned off tracking for Gemini Apps Activity. To maintain privacy, users can disable Gemini in their Android settings, uninstall the app, or turn off Gemini Apps Activity, although Google will still store activity data for up to 72 hours, even with this setting turned off.
Research
AsyncFlow: An Asynchronous Streaming RL Framework for LLM Post-Training
AsyncFlow is a proposed asynchronous streaming reinforcement learning framework designed to address scalability and efficiency challenges in post-training large language models. It achieves an average 1.59x throughput improvement over state-of-the-art baselines through its distributed data storage, automated pipeline overlapping, and dynamic load balancing, while also offering a modular and customizable architecture.
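The core idea of streaming rollouts into the trainer, rather than alternating whole generation and update phases, can be sketched as a producer/consumer pipeline. This is a minimal illustration of pipeline overlapping under a bounded queue, not AsyncFlow's actual API; all names here are hypothetical:

```python
import queue
import random
import threading
import time

def rollout_worker(q, n):
    # Producer: streams trajectories out as soon as each is ready
    # (stands in for asynchronous LLM rollout generation)
    for i in range(n):
        time.sleep(0.001)  # simulated generation latency
        q.put({"id": i, "reward": random.random()})
    q.put(None)  # sentinel: no more trajectories

def trainer(q, results):
    # Consumer: takes trajectories off the stream and "trains" on them
    # without waiting for the full rollout batch to finish
    while True:
        traj = q.get()
        if traj is None:
            break
        results.append(traj["reward"])  # stand-in for a gradient step

q = queue.Queue(maxsize=8)  # bounded queue gives simple backpressure/load balancing
results = []
producer = threading.Thread(target=rollout_worker, args=(q, 100))
consumer = threading.Thread(target=trainer, args=(q, results))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(results))  # all 100 trajectories consumed
```

Because generation and consumption overlap in time, the trainer never idles waiting for a full batch; the bounded queue keeps the producer from running unboundedly ahead.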
Measuring AI Ability to Complete Long Tasks
Researchers have proposed a new metric, the 50%-task-completion time horizon, to quantify AI capabilities in terms of human capabilities, finding that current frontier models can complete, with a 50% success rate, tasks that take skilled humans around 50 minutes, a horizon that has been doubling approximately every seven months. If this trend continues, it is predicted that within five years AI systems will be able to automate many software tasks that currently take humans a month to complete.
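The five-year projection can be sanity-checked with a few lines of arithmetic. The 50-minute horizon and seven-month doubling time are the figures quoted above; the 167-hour working month used for the conversion is an assumption:

```python
def projected_horizon_minutes(current_minutes=50.0, doubling_months=7.0,
                              years_ahead=5.0):
    """Extrapolate the 50%-task-completion time horizon, assuming it
    keeps doubling every `doubling_months` months."""
    doublings = years_ahead * 12.0 / doubling_months
    return current_minutes * 2.0 ** doublings

minutes = projected_horizon_minutes()
# Convert to working months (~167 working hours/month, an assumption here)
print(round(minutes / 60.0 / 167.0, 1))  # → 1.9
```

Five years is roughly 8.6 doublings, taking the horizon from 50 minutes to about 19,000 minutes, i.e. close to two working months of human effort, which matches the month-long-task claim.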
Segmentation and Representation Trade-Offs in Chemistry-Aware RAG
This study evaluates various chunking strategies and embedding models for Retrieval-Augmented Generation (RAG) systems in the chemistry domain, finding that recursive token-based chunking and retrieval-optimized embeddings outperform other approaches. The results provide guidelines for building effective and efficient chemistry-aware RAG systems, with the study releasing its datasets, framework, and benchmarks to support future development in this area.
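Recursive token-based chunking splits text on progressively finer separators (paragraphs, then lines, then sentences, then words) until every chunk fits a token budget, merging small pieces back together so chunks stay close to the budget. A minimal sketch, using whitespace word counts as a crude stand-in for a real tokenizer; this is an illustration of the general technique, not the paper's implementation:

```python
def recursive_chunk(text, max_tokens=64, separators=("\n\n", "\n", ". ", " ")):
    # Base case: the text already fits the budget
    if _tokens(text) <= max_tokens:
        return [text] if text.strip() else []
    # Fall through to a finer separator if the coarsest isn't present
    if not separators or separators[0] not in text:
        if separators:
            return recursive_chunk(text, max_tokens, separators[1:])
        # No separators left: hard-split on words
        words = text.split()
        return [" ".join(words[i:i + max_tokens])
                for i in range(0, len(words), max_tokens)]
    sep = separators[0]
    pieces = []
    for part in text.split(sep):
        if _tokens(part) > max_tokens:
            pieces.extend(recursive_chunk(part, max_tokens, separators[1:]))
        else:
            pieces.append(part)
    # Greedily merge adjacent pieces back up to the token budget
    chunks, cur = [], ""
    for p in pieces:
        cand = cur + sep + p if cur else p
        if _tokens(cand) <= max_tokens:
            cur = cand
        else:
            if cur.strip():
                chunks.append(cur)
            cur = p
    if cur.strip():
        chunks.append(cur)
    return chunks

def _tokens(text):
    # Crude token proxy: whitespace-separated words; a real system would
    # count tokens with the embedding model's own tokenizer
    return len(text.split())
```

Swapping `_tokens` for a real tokenizer count and tuning `max_tokens` per embedding model is where the trade-offs studied in the paper come in.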
Mercury: Ultra-fast language models based on diffusion
Mercury Coder is a new generation of large language models based on diffusion, designed for coding applications, and comes in two sizes: Mini and Small, which achieve state-of-the-art throughputs of 1109 tokens/sec and 737 tokens/sec, respectively. These models run up to 10x faster than comparable speed-optimized models while maintaining comparable quality, and have been validated by developers on Copilot Arena, where they rank second in quality and are the fastest model overall.
The Lifespan of our Universe
The Dark Energy Survey and Dark Energy Spectroscopic Instrument measurements suggest that the dark energy equation of state is not equal to -1, which can be explained by the axion Dark Energy model. This model predicts a high probability of a negative cosmological constant, leading to a big crunch and a universe lifespan of approximately 33 billion years.
Code
Pangu's Sorrow: The Sorrow and Darkness of Huawei's Noah Pangu LLM R&D Process
A former employee of Huawei's Pangu Large Model Team has come forward to share their experiences and frustrations with the development process of the Pangu large model, citing issues with the model's tokenizer, excessive work pressure, and a shift from research-oriented to delivery-oriented goals. The whistleblower also alleges that the team's work has been affected by controversy surrounding plagiarism and that they have been intimidated into silence, but have decided to speak out against the company's actions.
Show HN: AI-docs (Git-based workflow to manage AI-generated memory files)
AI Docs CLI is a Go-based tool that isolates AI-generated "memory" files on a dedicated Git branch, checked out via a worktree for easy syncing, and supports multiple AI agents with configuration via YAML, JSON, or TOML. The tool provides a one-command workflow to manage AI memory files, including initialization, pushing and pulling changes, and cleanup, with features like automatic updates to .gitignore and separate pull/push commands for a flexible workflow.
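The underlying Git mechanics such a tool automates look roughly like this; the branch and directory names are illustrative, and these are plain git commands, not AI-docs' actual CLI:

```shell
# Set up a throwaway repo (identity config so commits succeed anywhere)
git init demo && cd demo
git config user.email "dev@example.com" && git config user.name "Dev"
git commit --allow-empty -m "initial commit"

# Check a dedicated memory branch out into a side directory via worktree
git worktree add -b ai-memory .ai-docs

# Keep that directory out of the main branch's history
echo ".ai-docs/" >> .gitignore
git add .gitignore && git commit -m "ignore AI memory worktree"

# Memory files now live on their own branch; syncing is an ordinary
# commit (and push) made inside the worktree directory
echo "project notes" > .ai-docs/CLAUDE.md
git -C .ai-docs add CLAUDE.md
git -C .ai-docs commit -m "update AI memory"
```

The worktree gives both branches a live checkout at once, so agent memory files can evolve on their own history without cluttering the main branch's commits.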
Show HN: Doc81 – tech documentation tool designed in AI-native mind
Doc81 is a tech documentation tool that utilizes AI to help create, manage, and use document templates, offering both local and server modes for accessing templates. It provides features such as scalable document templates, MCP integration for AI assistant compatibility, and an API server, allowing general developers, dev writers, and security-conscious software engineers to work efficiently with technical documentation.
Curated list of language modeling research for code, plus related datasets
This repository provides a comprehensive survey of language models for code, covering various topics such as base LLMs, pretraining strategies, and downstream tasks. The survey includes a list of recommended readings and a collection of papers on language models for code, with works in each category ordered chronologically.
An LLM Plugin for OpenAI TTS
The llm-tts plugin uses the OpenAI API to provide text-to-speech functionality, offering an "llm tts" subcommand and a "tts" tool. It can be installed from a Git repository and used to convert text to speech, with options for customizing the output, such as speaking a short poem in a poetic tone.