Saturday — July 19, 2025
The FDNY ran facial recognition software to help the NYPD identify a protester despite the NYPD's own restrictions on the technology, a new project called Toy LLM Daydreaming generates novel connections between random concepts using OpenAI models, and researchers are investigating AI "scheming" and its potential to pursue misaligned goals.
News
NYPD bypassed facial recognition ban to ID pro-Palestinian student protester
The New York City Fire Department (FDNY) used facial recognition software to help the NYPD identify a pro-Palestinian protester, Zuhdi Ahmed, at Columbia University, despite the NYPD being banned from using the technology under its own policies. The FDNY's use of Clearview AI software to identify Ahmed, who was accused of hurling a rock at a pro-Israeli protester, has raised concerns about government surveillance and the potential for law enforcement to circumvent policies and laws regulating the use of facial recognition technology.
AI capex is so big that it's affecting economic statistics
The author discusses the large amounts being spent on AI datacenters, with estimates suggesting the spending could reach around 2% of US GDP in 2025, and argues that it is coming at the expense of other investments, such as manufacturing and infrastructure. The author also touches on the Federal Reserve's costly building renovations, which have drawn criticism from administration officials, including Trump, who expressed surprise at the expense even though renovations are a common and expected part of maintaining large buildings.
How I keep up with AI progress
Generative AI is a rapidly evolving technology that is often misunderstood, with many people either underestimating or overestimating its capabilities, leading to negative consequences. To stay informed, it's essential to follow trustworthy sources, such as official announcements from AI labs, blogs from experts like Simon Willison and Andrej Karpathy, and news from reputable outlets, to build a clear understanding of the technology and its evolving capabilities.
Exposed MCP servers across the internet
Knostic's research team conducted a study to locate and map exposed MCP servers on the internet, identifying 1,862 servers that revealed their capabilities without authentication. The team used Shodan and custom Python tools to fingerprint and map the servers, then manually verified a sample of 119; every server in the sample granted access to its internal tool listings without authentication, highlighting significant security concerns.
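As a rough illustration of what "access to internal tool listings" means here: MCP is built on JSON-RPC 2.0, and a client asks a server for its tools with the `tools/list` method. The sketch below only builds that request and parses a reply; it does no networking and is not Knostic's tooling. The helper names `build_tools_list_request` and `exposed_tool_names` are hypothetical, and a real session would normally begin with an `initialize` handshake over whatever transport the server uses.

```python
import json
from typing import List


def build_tools_list_request(request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP `tools/list` method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def exposed_tool_names(response_text: str) -> List[str]:
    """Return the tool names found in a `tools/list` reply, or [] if none.

    A non-empty list returned to an unauthenticated caller is the kind of
    exposure the study describes. (Hypothetical helper for illustration.)
    """
    try:
        response = json.loads(response_text)
    except json.JSONDecodeError:
        return []
    if not isinstance(response, dict):
        return []
    result = response.get("result")
    if not isinstance(result, dict):
        return []
    tools = result.get("tools")
    if not isinstance(tools, list):
        return []
    return [
        tool["name"]
        for tool in tools
        if isinstance(tool, dict) and isinstance(tool.get("name"), str)
    ]
```

A JSON-RPC error reply (for example, one demanding credentials) has no `result` field, so it yields an empty list.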
Exhausted man defeats AI model in world coding championship
Przemysław Dębiak, a Polish programmer, has defeated an advanced AI model from OpenAI in a 10-hour coding competition, the AtCoder World Tour Finals 2025 Heuristic contest, despite being "completely exhausted" from the marathon. Dębiak's victory represents a human expert pushing themselves to their physical limits to prove that human skill still matters in an age of advancing AI, earning him the top spot and leaving the AI model in second place.
Research
Lessons from a Chimp: AI "Scheming" and the Quest for Ape Language
Researchers are investigating whether current AI systems are developing the capacity for "scheming," or covertly pursuing misaligned goals, and are drawing lessons from historical research on non-human primates' ability to master natural language. To advance this research in a scientifically rigorous manner, researchers should avoid overattributing human traits and relying on anecdotes, and should instead establish a strong theoretical framework and take concrete steps to ensure productive investigation.
Architectural Backdoors in Deep Learning: A Survey of Vulnerabilities, Detection
Architectural backdoors in deep neural networks pose a significant threat as they embed malicious logic into a model's computational graph, evading standard mitigation techniques and persisting even after retraining. Researchers have made progress in detecting and defending against these backdoors, but scalable and practical defenses remain elusive, highlighting the need for further research to strengthen supply-chain security and develop comprehensive defenses.
Two-photon 3D printing of functional microstructures inside living cells [pdf]
Researchers have successfully used two-photon polymerization to create custom-shaped microstructures directly inside living cells, achieving submicron resolution and stability. This breakthrough technique, which involves injecting a biocompatible photoresist into cells and selectively polymerizing it with a laser, could lead to new applications in intracellular sensing, drug delivery, and bioelectronics, and potentially enable the engineering of cellular properties.
BeePL: Correct-by-Compilation Kernel Extensions
eBPF is a technology that allows developers to extend kernel functionality safely, but its existing verifier has limitations, being both overly conservative and unsound in some cases. BeePL is a new domain-specific language for eBPF that addresses these challenges with a formally verified type system, ensuring key safety properties and providing a foundation for an end-to-end verifiable toolchain for safe kernel extensions.
A Survey of Context Engineering for Large Language Models
The performance of Large Language Models is determined by the contextual information provided, and Context Engineering is a discipline that optimizes this information to improve model performance. A comprehensive survey of over 1300 research papers reveals a critical research gap, where models excel at understanding complex contexts but struggle to generate sophisticated, long-form outputs, highlighting a key area for future research.
Code
Show HN: Toy LLM Daydreaming
This project, "Toy LLM Daydreaming", uses OpenAI models to generate novel connections between two random concepts pulled from Wikipedia article titles, with the goal of finding deep, non-obvious, and potentially groundbreaking relationships. The process involves a synthesizer generating a hypothesis and then a critic evaluating the hypothesis based on criteria such as novelty, coherence, and usefulness.
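The synthesizer and critic steps described above can be sketched as a simple loop. This is a minimal sketch and not the project's actual code: `daydream`, `Idea`, and the `synthesize` and `critique` callables are hypothetical names, and the callables stand in for model calls so the example runs without any API access. Keeping a hypothesis only when its average score clears a threshold is also an assumption.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Idea:
    concepts: Tuple[str, str]
    hypothesis: str
    scores: Dict[str, float]


def daydream(
    concepts: Sequence[str],
    synthesize: Callable[[str, str], str],
    critique: Callable[[str], Dict[str, float]],
    rounds: int,
    threshold: float,
    rng: random.Random,
) -> List[Idea]:
    """Pair random concepts, propose a link, and keep what the critic rates well."""
    if len(concepts) < 2:
        raise ValueError("need at least two concepts to pair")
    kept: List[Idea] = []
    for _ in range(rounds):
        # Pick two distinct concepts (e.g. Wikipedia article titles).
        a, b = rng.sample(list(concepts), 2)
        # Synthesizer: propose a hypothesis connecting the pair.
        hypothesis = synthesize(a, b)
        # Critic: score the hypothesis on criteria such as novelty,
        # coherence, and usefulness.
        scores = critique(hypothesis)
        average = sum(scores.values()) / len(scores)
        if average >= threshold:
            kept.append(Idea((a, b), hypothesis, scores))
    return kept
```

In a real setup, `synthesize` and `critique` would each wrap a prompt to a language model; splitting them keeps the generator from grading its own output in the same call.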
Show HN: Benchstreet – the stock prediction AI benchmark
Benchstreet is a collection of time series prediction models, including transformers, feedforward neural networks, convolutional neural networks, recurrent neural networks, and statistical models, trained on 20 years of S&P 500 daily closing prices to evaluate and compare their performance in one-shot and long-term financial data forecasting. The top-performing model is N-BEATS, which achieves high accuracy with extremely low training time, making it a notable choice among the models in the collection.
Show HN: AI File Sorter: Organize Files and Folders with AI (Local LLMs)
AI File Sorter is a cross-platform desktop application that uses AI integration to automate file organization, categorizing and sorting files and folders based on their names and extensions. The app features a user-friendly interface, local and remote language models, and customizable sorting rules, and is available for Windows, macOS, and Linux, with installation instructions and requirements provided for each platform.
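To make the idea of sorting by name and extension concrete, here is a minimal sketch that plans destination paths without touching the filesystem. It is not the app's implementation: the app uses language models to choose categories, whereas this sketch swaps in a fixed extension table as a stand-in. `fallback_category`, `plan_moves`, and the category names are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable

# Stand-in rule table; the real app asks a language model instead.
EXTENSION_CATEGORIES: Dict[str, str] = {
    ".jpg": "Images",
    ".png": "Images",
    ".pdf": "Documents",
    ".txt": "Documents",
    ".mp3": "Audio",
}


def fallback_category(name: str) -> str:
    """Choose a category from the file extension, ignoring case."""
    extension = PurePosixPath(name).suffix.lower()
    return EXTENSION_CATEGORIES.get(extension, "Other")


def plan_moves(
    names: Iterable[str],
    categorize: Callable[[str], str] = fallback_category,
) -> Dict[str, str]:
    """Map each file name to a proposed destination path, without moving anything."""
    return {name: str(PurePosixPath(categorize(name)) / name) for name in names}
```

Passing a different `categorize` callable is where a local or remote model could be plugged in, and returning a plan rather than moving files leaves room for the user to review it first.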
I built a GH Action that uses AI to manually QA your PR using Magnitude/Claude
The PR Test Generator GitHub Action automatically generates and runs end-to-end tests for Pull Requests using the Claude API and the Magnitude testing framework, analyzing PR changes and repository context to create comprehensive tests. Setup consists of adding a workflow to the repository, configuring secrets, and opening a Pull Request, which triggers the action to analyze the changes, generate and execute the tests, and post the results as a comment on the PR.
Show HN: Cursor Autopilot – Control your Cursor chat via Telegram and more
Cursor Autopilot is an extension that allows remote control of Cursor AI coding sessions via Telegram, Gmail, and Feishu, enabling users to receive chat summaries and inject replies to continue or stop the coding session. The extension can be installed from the Extensions Marketplace or manually, and its configuration involves setting up adapter settings in a .autopilot.json file to enable remote communication.