Tuesday — August 19, 2025

US companies have invested up to $40B in Generative AI with little return, researchers have found that malicious LLMs can extract personal info from users, and developers have introduced Whispering, an open-source local-first dictation app.

News

Show HN: Fractional jobs – part-time roles for engineers

This platform offers a network for companies to hire top candidates on a fractional basis, allowing for part-time or project-based work without the need for full-time employment. The platform lists various job openings across different fields, including marketing, product management, engineering, and finance, with flexible hours and remote work options available for many positions.

GenAI FOMO has spurred businesses to light nearly $40B on fire

US companies have invested between $35 billion and $40 billion in Generative AI initiatives, but a report from MIT's NANDA initiative found that 95% of organizations have gotten zero return from their AI efforts, with only 5% successfully integrating AI tools into production at scale. The report attributes this "GenAI Divide" to the inability of AI systems to retain data, adapt, and learn over time, rather than to insufficient infrastructure, learning, or talent.

AI is predominantly replacing outsourced, offshore workers

Artificial intelligence is currently replacing outsourced and offshore jobs, rather than directly displacing US workers, with companies seeing financial gains from automating back-office tasks and eliminating business process outsourcing contracts. However, researchers warn that in the long term, nearly 27% of jobs could be replaced by AI, potentially leading to significant job losses, especially in industries that are advanced adopters of AI technology.

When you're asking AI chatbots for answers, they're data-mining you

When using AI chatbots like OpenAI's ChatGPT, every question asked and comment made is recorded and potentially searchable, with the company retaining all user conversations due to a federal court order. This data collection is not unique to OpenAI, as other companies like Google are also implementing similar features, such as automatic memory recall in their AI updates, which can personalize responses but also raise privacy concerns.

Show HN: We started building an AI dev tool but it turned into a Sims-style game

The link points to a YouTube video titled "The Interface - Walkthrough", which has 7,877 views on a channel with 15 subscribers. In the comments, one user says the video is boring and suggests using a keyboard instead of typing instructions.

Research

Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

Researchers created malicious AI chatbots designed to extract personal information from users and found that these chatbots were highly effective, especially when using strategies that exploited the social nature of privacy. The study, which involved 502 participants, highlights the significant privacy risks posed by these chatbots and provides recommendations for future research and practice to mitigate these threats.

Exploring the Challenges and Opportunities of AI-Assisted Codebase Generation

Recent codebase AI assistants (CBAs) can generate entire codebases from textual descriptions, but despite their potential, they are not widely adopted, with users expressing low satisfaction due to issues with functionality, code quality, and communication. A study of 16 developers found six underlying challenges and five barriers to using CBAs, highlighting areas for improvement and design opportunities to make CBAs more efficient and useful.

TREAD: Token Routing for Efficient Architecture-Agnostic Diffusion Training

Diffusion models, commonly used for visual generation, are often hindered by inefficient training and high costs, but a new method called TREAD improves both training efficiency and generative performance without requiring architectural modifications or added parameters. TREAD achieves a 14x convergence speedup and competitive performance on the ImageNet-256 benchmark, with an FID of 2.09 in guided and 3.93 in unguided settings, outperforming existing models like DiT.

Perceived and Measured Sleep Quality vs. Working Memory Using Consumer Wearables

Researchers studied the relationship between subjective sleep assessments and data from an Oura ring worn by 29 participants over 4-8 weeks, finding that factors like REM sleep and nocturnal heart rate can predict sleep quality. The study also identified individual differences in how sensitive people are to sleep markers, with sleep trackers providing more useful information for some users than others.

NaN-propagation: a novel method for sparsity detection in black-box computationa

NaN-propagation detects sparsity patterns in black-box functions by exploiting the properties of Not-a-Number values, reducing false negatives and allowing for substantial computational speedups. The technique achieves significant practical improvements, such as a 1.52x speedup in an aerospace wing weight model, and works across programming languages and math libraries without requiring modifications to existing code.
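
The core idea can be illustrated with a minimal sketch. This is not the paper's implementation; the function names `nan_sparsity` and `example_model` and the toy three-input model are hypothetical, chosen only to show the mechanism. The sketch sets one input at a time to NaN, calls the black-box function, and records which outputs turn NaN.

```python
import math
from typing import Callable, List, Sequence, Set, Tuple


def nan_sparsity(
    f: Callable[[Sequence[float]], List[float]],
    x0: Sequence[float],
) -> Set[Tuple[int, int]]:
    """Return (output_index, input_index) pairs where output i depends on input j.

    Dependence is inferred by contaminating input j with NaN and checking
    which outputs become NaN. Illustrative only; assumes f returns a list
    of floats and that f(x0) itself is NaN-free.
    """
    baseline = f(x0)
    pattern: Set[Tuple[int, int]] = set()
    for j in range(len(x0)):
        probe = list(x0)
        probe[j] = math.nan  # contaminate a single input
        outputs = f(probe)
        for i, value in enumerate(outputs):
            # An output that was finite at baseline but is now NaN
            # must have consumed input j somewhere in its computation.
            if math.isnan(value) and not math.isnan(baseline[i]):
                pattern.add((i, j))
    return pattern


def example_model(x: Sequence[float]) -> List[float]:
    """Hypothetical black-box function with three inputs and three outputs."""
    return [x[0] * x[1], x[2] + 1.0, x[0] - x[2]]


pattern = nan_sparsity(example_model, [1.0, 2.0, 3.0])
print(sorted(pattern))
```

In this toy case the detected pattern shows that the second output depends only on the third input, so a Jacobian estimate could skip the other entries in that row. One caveat for any sketch like this: code that branches on its inputs or uses NaN-suppressing operations may hide a dependency, so results on real black-box code should be checked.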

Code

Show HN: Whispering – Open-source, local-first dictation you can trust

Epicenter is an ecosystem of open-source, local-first apps that allow users to own their data and use any model they want, with the goal of storing all data in a single folder of plain text and SQLite. The project currently includes tools such as Whispering, a desktop transcription app, and epicenter.sh, a local-first assistant, and is looking for contributors to help build a personal workspace with open, interoperable alternatives to siloed apps.

Show HN: Strix - Open-source AI hackers for your apps

Strix is an open-source AI-powered security testing platform that uses autonomous agents to simulate hacker attacks on applications, identifying vulnerabilities through dynamic testing and actual exploitation. The platform offers a range of features, including full hacker arsenals, real validation, and auto-fix and reporting capabilities, and is designed for developers and security teams to integrate into their existing workflows.

Show HN: Memori – Open-Source Memory Engine for AI Agents

Memori is an open-source memory engine that enables large language models (LLMs) and AI agents to have human-like memory, allowing them to recall context and make more informed decisions. It offers dual-mode retrieval, automatic context injection, and flexible database connections, making it a powerful tool for building context-aware AI systems.

Show HN: Open-Source Framework for Real-Time AI Video Avatars

This repository provides a way to integrate Simli avatars with VideoSDK Agent SDK, allowing for real-time video conversations with AI agents that have human-like faces. To get started, users need to create a virtual environment, install requirements, add API keys, and run the VideoSDK agent worker, which will provide a link to an interactive playground for testing the AI avatar assistant.

Show HN: AtomWorks – new data framework for biomolecular deep learning

AtomWorks is an open-source platform designed to accelerate biomolecular modeling tasks by providing a universal Python toolkit for parsing, cleaning, and converting biological data, as well as advanced dataset featurization and sampling for deep learning workflows. The platform consists of two libraries, atomworks.io and atomworks.ml, which can be used separately or together to streamline biomolecular research and enable rapid prototyping and experimentation.