Wednesday — June 4, 2025
Builder.ai's collapse reveals it was human-run, TLOB's dual attention improves financial predictions despite market efficiency, and robots swearing in error scenarios amuse users.
News
Deep learning gets the glory, deep fact checking gets ignored
A research paper using a deep learning model to predict enzyme functions was published in Nature Communications, receiving significant attention, but was later found to contain hundreds of erroneous predictions. A follow-up paper posted on bioRxiv, which identified the errors, received significantly less attention, highlighting the challenges of evaluating AI results in biology and the flaws in current publishing incentives.
Builder.ai Collapses: $1.5B 'AI' Startup Exposed as 'Indians'?
Builder.ai, a $1.5 billion 'AI' startup, has collapsed and is seeking bankruptcy protection after it was revealed that the company's supposedly AI-powered operations were actually being run by human developers in India pretending to be bots. The company's downfall has exposed a lack of substance behind the hype of AI startups, with major investor Qatar Investment Authority losing $250 million in the process.
AI makes the humanities more important, but also weirder
The integration of artificial intelligence (AI) into the humanities is a transformative and unavoidable reality, with AI language models already being used for tasks such as data mining and the translation of both modern and archaic languages. Humanistic skills, such as understanding the relationship between language and culture, are also becoming increasingly important in AI research itself, as engineers need to grapple with these topics to fix and improve AI systems.
A deep dive into self-improving AI and the Darwin-Gödel Machine
Most AI systems are limited by fixed architectures and cannot evolve autonomously; self-improving systems such as the Darwin-Gödel Machine (DGM) aim instead to learn and improve their own capabilities without human intervention. The DGM combines Darwinian evolution with Gödelian self-improvement, letting an AI agent iteratively modify its own code, test its performance in a real-world environment, retain beneficial changes, and discard harmful ones.
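The paper evaluates this loop with real coding agents on real benchmarks; as a minimal sketch of the cycle it describes, where `propose_patch`, `evaluate`, and the archive are hypothetical placeholders rather than the DGM's actual components:

```python
import random

# Hypothetical stand-ins: in the real DGM the "agent" is a coding agent's own
# source, mutations are self-written code edits, and evaluation runs against
# software-engineering benchmarks rather than a random score.
def propose_patch(agent_code: str) -> str:
    """Ask the current agent to rewrite part of its own code (placeholder)."""
    return agent_code + f"\n# self-edit {random.randint(0, 9999)}"

def evaluate(agent_code: str) -> float:
    """Score the candidate agent empirically on a benchmark (placeholder)."""
    return random.random()

seed_code = "# initial agent implementation"
# Archive of (code, score) pairs: keeping diverse ancestors around is the
# Darwinian part; greedy hill-climbing would track only the current best.
archive = [(seed_code, evaluate(seed_code))]

for generation in range(100):
    parent_code, parent_score = random.choice(archive)  # sample a parent from the archive
    child_code = propose_patch(parent_code)             # the agent modifies its own code
    child_score = evaluate(child_code)                   # test the change empirically, no proof required
    if child_score >= parent_score:                      # retain beneficial changes only
        archive.append((child_code, child_score))

best_code, best_score = max(archive, key=lambda entry: entry[1])
print(f"best variant scored {best_score:.3f}; archive holds {len(archive)} agents")
```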
Yoshua Bengio Launches LawZero: A New Nonprofit Advancing Safe-by-Design AI
Yoshua Bengio, a renowned AI researcher, has launched LawZero, a nonprofit organization dedicated to developing safe-by-design AI systems, in response to the potential dangers of current frontier AI models. LawZero is pioneering a new approach called Scientist AI, a non-agentic AI system that prioritizes understanding the world over taking actions, with the goal of creating a safer and more secure alternative to current AI systems.
Research
TLOB: Dual Attention Transformer Predicts Price Trends from Order Book Data
Researchers have developed a transformer-based model called TLOB that uses a dual attention mechanism to predict price trends in financial markets, outperforming existing state-of-the-art methods across various datasets and horizons. The study also finds that stock price predictability has declined over time, reflecting growing market efficiency, and underscores the difficulty of translating trend classification into profitable trading strategies.
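TLOB's published architecture has its own layers and hyperparameters; the PyTorch sketch below only illustrates the dual-attention idea (one attention pass along the time axis and one along the feature axis of order-book snapshots), with all dimensions and the fusion step chosen arbitrarily for the example:

```python
import torch
import torch.nn as nn

class DualAttentionBlock(nn.Module):
    """Illustrative dual attention for limit-order-book sequences:
    one self-attention pass across time steps (which snapshots matter)
    and one across feature channels (which price/volume levels matter)."""

    def __init__(self, num_features: int, seq_len: int, dim: int = 64, heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(num_features, dim)                     # embed each LOB snapshot
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.feature_attn = nn.MultiheadAttention(seq_len, 1, batch_first=True)
        self.classifier = nn.Linear(dim, 3)                           # up / stationary / down logits

    def forward(self, lob: torch.Tensor) -> torch.Tensor:
        # lob: (batch, seq_len, num_features) raw order-book snapshots
        x = self.embed(lob)                                           # (B, T, dim)
        x, _ = self.temporal_attn(x, x, x)                            # attend across time steps
        f = x.transpose(1, 2)                                         # (B, dim, T): tokens are feature channels
        f, _ = self.feature_attn(f, f, f)                             # attend across feature channels
        x = x + f.transpose(1, 2)                                     # fuse the two attention views
        return self.classifier(x.mean(dim=1))                         # pooled trend prediction

model = DualAttentionBlock(num_features=40, seq_len=100)              # e.g. 10 LOB levels x 4 fields
logits = model(torch.randn(8, 100, 40))                               # batch of 8 windows
print(logits.shape)                                                   # torch.Size([8, 3])
```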
What do software developers need to know to succeed in an age of AI?
A study of 21 developers working at the cutting edge of AI-assisted software development found that using generative AI effectively requires a combination of technical and soft skills across four domains, applied throughout a six-step task workflow. To prepare developers for an AI-driven future, education and training programs should focus on reskilling and upskilling in these areas to prevent deskilling and ensure long-term success.
How much do language models memorize?
Researchers propose a method to measure a language model's capacity by separating memorization into unintended memorization and generalization, estimating that GPT-style models have a capacity of approximately 3.6 bits per parameter. By training language models on datasets of varying sizes, they find that models memorize until their capacity is filled, after which they begin to generalize, and establish scaling laws relating model capacity and data size to membership inference.
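Taking the roughly 3.6 bits-per-parameter figure at face value, a back-of-the-envelope comparison (the model size, token count, and bits-per-token below are illustrative assumptions, not the paper's numbers) shows when a training set should overflow a model's raw memorization capacity:

```python
# Back-of-the-envelope use of the ~3.6 bits/parameter capacity estimate.
BITS_PER_PARAM = 3.6

def capacity_bits(num_params: float) -> float:
    return BITS_PER_PARAM * num_params

def dataset_bits(num_tokens: float, bits_per_token: float = 16.0) -> float:
    # Treat each token as carrying ~16 bits; the true entropy per token is lower.
    return num_tokens * bits_per_token

params = 125e6   # a GPT-2-small-sized model, for illustration
tokens = 1e9     # a 1B-token training set, for illustration

print(f"model capacity : {capacity_bits(params):.3e} bits")
print(f"dataset size   : {dataset_bits(tokens):.3e} bits")
print("dataset exceeds raw capacity: expect memorization to saturate and generalization to take over"
      if dataset_bits(tokens) > capacity_bits(params)
      else "dataset fits within raw capacity: heavy memorization plausible")
```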
Oh fuck! How do people feel about robots that leverage profanity?
Researchers investigated the use of curse words by robots in error scenarios to improve social perceptions, and found that humans generally didn't mind or were even amused by robots using expletives. The studies, which included online and in-person experiments, showed that while verbal error acknowledgment was beneficial, there was little difference between robots using non-expletive and expletive language, suggesting a potential new design space for robot character development.
Why Academics Are Leaving Twitter for Bluesky
An analysis of roughly 300,000 academics' accounts found that 18% of scholars migrated from Twitter to Bluesky between 2023 and 2025, with rates varying by discipline, politics, and Twitter engagement. The study revealed that information sources, rather than audience, drove migration, and that scholars who rebuilt their Twitter networks on Bluesky remained more active and engaged, providing new insights into platform migration and network externalities.
Code
Show HN: Ephe – A minimalist open-source Markdown paper for today
Ephe is an ephemeral markdown paper that helps organize daily tasks and thoughts in a simple and clean way using plain Markdown. It provides a single, distraction-free page to focus on your day, offering an alternative to traditional overwhelming todo apps.
Show HN: Controlling 3D models with voice and hand gestures
The 3D Model Playground is an interactive web app that allows users to control 3D models in real time with hand gestures and voice commands, built with technologies such as Three.js, MediaPipe, and the Web Speech API. The app can be accessed through a live demo, and its source code is available on GitHub with setup instructions for development and an MIT license.
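The project itself runs in the browser on Three.js, MediaPipe, and the Web Speech API; as a rough Python analogue of just the gesture-tracking half, using MediaPipe's Python bindings and OpenCV (the hand-position-to-rotation mapping is an invented example, not the app's logic):

```python
import cv2
import mediapipe as mp

# Track one hand with MediaPipe and map its horizontal position to a yaw angle.
# The printout stands in for the Three.js scene updates the real app performs.
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        wrist = results.multi_hand_landmarks[0].landmark[0]  # normalized [0, 1] coordinates
        yaw_degrees = (wrist.x - 0.5) * 180                  # hand position -> model rotation
        print(f"rotate model to yaw {yaw_degrees:+.1f} deg")
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
```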
Show HN: Localize React apps without rewriting code
(No summary available: the project's README could not be retrieved.)
Show HN: LLMFeeder – Browser extension to extract clean content for LLM context
LLMFeeder is a browser extension that converts web page content to clean Markdown format and copies it to the clipboard with a single click, making it perfect for feeding content to Large Language Models (LLMs). The extension is available for both Chrome and Firefox, operates fully client-side with zero backend dependencies, and prioritizes user privacy and security by not transmitting any data to external servers.
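The extension does all of this client-side in the browser; a rough server-side analogue in Python, using the `requests` and `markdownify` libraries (both assumptions here, not part of LLMFeeder), might look like:

```python
import requests
from markdownify import markdownify

# Fetch a page, convert its HTML to Markdown, and build an LLM prompt from it,
# mirroring the extension's page -> clean Markdown -> LLM-context workflow.
def page_to_markdown(url: str) -> str:
    html = requests.get(url, timeout=10).text
    return markdownify(html, strip=["script", "style"])  # drop non-content tags

if __name__ == "__main__":
    markdown = page_to_markdown("https://example.com")
    prompt = f"Summarize the following page:\n\n{markdown}"
    print(prompt[:500])  # preview what would be handed to the LLM
```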
Show HN: Page Magic: Use AI to customize any web page
Page Magic is a Chrome extension that uses AI to customize the appearance of any web page via the Anthropic API, for which users must supply their own API key. After installing the extension and configuring their settings, users describe the change they want on a page and click "Apply".
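The extension wraps this flow in the browser; as a loose sketch of the kind of request such a tool could make with the official `anthropic` Python SDK (the prompt, the model name, and the idea of returning a stylesheet are illustrative assumptions, not Page Magic's implementation):

```python
import anthropic

# Ask a model for CSS implementing a natural-language change to a page snippet.
client = anthropic.Anthropic(api_key="YOUR_API_KEY")

def css_for_change(instruction: str, page_html_snippet: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": (
                "Return only a CSS stylesheet implementing this change to the page below.\n"
                f"Change: {instruction}\n\nPage HTML:\n{page_html_snippet}"
            ),
        }],
    )
    return response.content[0].text  # the generated stylesheet

print(css_for_change("make all headings dark blue", "<h1>Demo</h1>"))
```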