Saturday — May 17, 2025
MIT withdraws AI paper over research integrity doubts, Solidis introduces high-performance Redis client for serverless environments, and innovative LLM framework enables secure chat-based cryptography.
News
After months of coding with LLMs, I'm going back to using my brain
The author, a software engineer, used AI tools like Claude and Gemini to rapidly build new infrastructure for their SaaS business, but soon found the resulting codebase disorganized and error-prone. Realizing that over-reliance on AI was dulling their own coding skills and mental sharpness, they stepped back to rewrite and reorganize the code themselves, now reserving AI for simpler tasks and defaulting to traditional methods like pen and paper for planning and coding.
MIT asks arXiv to withdraw preprint of paper on AI and scientific discovery
MIT has requested the withdrawal of a preprint paper on artificial intelligence over concerns about the integrity of the research, stating that it has no confidence in the data or the veracity of the findings. The paper's author, a former PhD student, was directed to submit a withdrawal request but has not done so, prompting MIT to take formal steps to mitigate the effects of potential misconduct and preserve research integrity.
Getting AI to write good SQL
Google's Gemini technology can generate SQL queries directly from natural language, boosting productivity for developers and analysts and letting non-technical users interact with data. However, text-to-SQL systems face challenges such as providing business-specific context, understanding user intent, and managing differences in SQL dialects, which Google addresses through techniques including intelligent retrieval, in-context learning, and data linking.
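The core pattern the article describes, supplying table schemas as context so the model can ground column names, can be sketched as a prompt builder. The schema, question, and prompt format below are illustrative, not Gemini's actual API:

```python
# Toy sketch of the text-to-SQL pattern: render table schemas plus a
# natural-language question into a single prompt for an LLM. The table
# names, columns, and prompt wording here are invented for illustration.

def build_sql_prompt(schema: dict[str, list[str]], question: str) -> str:
    """Combine table definitions and a question into one grounded prompt."""
    lines = ["Given these tables:"]
    for table, columns in schema.items():
        lines.append(f"  {table}({', '.join(columns)})")
    lines.append(f"Write a SQL query to answer: {question}")
    return "\n".join(lines)

prompt = build_sql_prompt(
    {"orders": ["id", "customer_id", "total"],
     "customers": ["id", "region"]},
    "What is the total order value per region?",
)
```

In a real system the prompt would go to the model API, and the techniques the article mentions (intelligent retrieval, in-context learning) decide which schemas and example queries to include.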
Will AI systems perform poorly due to AI-generated material in training data?
Beyond Text: On-Demand UI Generation for Better Conversational Experiences
Researchers have developed a prototype that lets large language models (LLMs) generate dynamic, interactive UI components on demand, transforming how users interact with AI systems and improving accessibility, efficiency, and user satisfaction. By generating appropriate UI components from the conversation context, the approach bridges conversational AI and traditional application interfaces, combining the flexibility and naturalness of conversation with the precision and efficiency of structured inputs.
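The idea can be sketched as the model returning a small declarative spec that the client renders as widgets instead of plain text. The spec schema below is invented for illustration; the prototype's real format will differ:

```python
import json

# Sketch of on-demand UI generation: the LLM emits a declarative component
# spec (here, JSON) and the client renders it. The "form"/"slider"/"select"
# schema is hypothetical; a real client would build actual widgets, so this
# renderer just produces an indented text outline as a stand-in.

SPEC = json.loads("""
{"component": "form", "children": [
  {"component": "slider", "label": "Budget", "min": 0, "max": 500},
  {"component": "select", "label": "Cuisine", "options": ["Thai", "Mexican"]}
]}
""")

def render(node: dict, depth: int = 0) -> list[str]:
    """Walk the spec tree and emit one outline line per component."""
    pad = "  " * depth
    lines = [f"{pad}[{node['component']}] {node.get('label', '')}".rstrip()]
    for child in node.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines

outline = render(SPEC)
```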
Research
DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
The rapid growth of large language models has exposed limitations in current hardware, but the DeepSeek-V3 model demonstrates how hardware-aware design can address these challenges, enabling efficient training and inference. The model's architecture incorporates innovations such as Multi-head Latent Attention and Mixture of Experts, and its development highlights the importance of hardware and model co-design in meeting the demands of AI workloads.
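The Mixture of Experts design mentioned above can be sketched minimally: a gating function scores every expert for each token, and only the top-k experts actually run, so compute stays roughly constant as the expert count grows. The scores and k below are illustrative, not DeepSeek-V3's actual gating:

```python
import math

# Minimal Mixture-of-Experts routing sketch: softmax the gate scores,
# keep the top-k experts, and renormalize their weights so they sum to 1.
# A real model computes gate scores from the token's hidden state.

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_scores: list[float], k: int = 2) -> list[tuple[int, float]]:
    """Return (expert index, normalized weight) for the top-k experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

experts = route([0.1, 2.3, -0.5, 1.8], k=2)  # only 2 of 4 experts fire
```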
AccLLM: Accelerating Long-Context LLM Inference via Algorithm-Hardware Co-Design
Large language models (LLMs) face significant challenges when deployed on edge devices due to their intensive computation, huge model sizes, and high memory and bandwidth demands. The proposed AccLLM framework addresses these challenges through algorithmic innovations, such as pruning and quantization, paired with a dedicated FPGA-based accelerator, achieving a 4.07x gain in energy efficiency and a 2.98x gain in throughput over state-of-the-art work.
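Quantization, one of the algorithmic levers named above, can be illustrated with a toy symmetric linear quantizer; AccLLM's actual scheme (bit widths, grouping, and so on) will differ:

```python
# Toy symmetric linear quantization: map floats to signed integers of
# `bits` width via a single scale factor, then reconstruct. The round-trip
# error is bounded by half the scale. Values below are arbitrary examples.

def quantize(values: list[float], bits: int = 8) -> tuple[list[int], float]:
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0  # avoid scale of 0
    return [round(v / scale) for v in values], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [x * scale for x in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize(weights, bits=8)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now needs 8 bits instead of 32, a 4x memory reduction, at the cost of a bounded reconstruction error.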
An LLM Framework for Cryptography over Chat Channels
Recent advancements in Large Language Models have led to the development of a novel cryptographic embedding framework that enables secure and covert communication over public chat channels, producing humanlike texts that are indistinguishable from normal conversations. This framework is versatile, allowing users to employ different local models and remaining effective against both current and potential future quantum computing threats, making it a viable alternative to traditional encryption methods.
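The underlying idea, hiding bits inside natural-looking word choices, can be shown with a deliberately tiny sketch. Real schemes like the framework above drive the candidate words from an LLM's token distribution; the fixed two-word slots here are purely illustrative:

```python
# Minimal steganographic sketch: at each slot in a sentence, two candidate
# words are equally plausible, and one secret bit selects which appears.
# The receiver, sharing the same table, reads the bits back out.
# This toy table stands in for an LLM's ranked next-token candidates.

SLOTS = [
    ("hey", "hi"),          # bit 0 -> "hey", bit 1 -> "hi"
    ("tonight", "later"),
    ("movie", "film"),
]

def encode(bits: list[int]) -> str:
    return " ".join(pair[b] for pair, b in zip(SLOTS, bits))

def decode(text: str) -> list[int]:
    return [pair.index(word) for pair, word in zip(SLOTS, text.split())]

msg = encode([1, 0, 1])  # -> "hi tonight film"
```

In the paper's framework the cover text is generated by a local model, which is what makes the output indistinguishable from ordinary chat.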
Biscuit: Scaffolding LLM-Generated Code with Ephemeral UIs in Notebooks
Programmers often struggle to understand and work with code generated by large language models (LLMs), so the authors developed BISCUIT, a JupyterLab extension that uses ephemeral UIs to give users an intermediate stage between prompts and code generation. A user study with 10 novices working through machine learning tutorials showed that BISCUIT helps them understand and explore LLM-generated code.
Generative Ghosts: Anticipating Benefits and Risks of AI Afterlives
The development of AI systems is enabling the creation of custom agents that can interact with people after their death, known as "generative ghosts," which can generate novel content rather than just repeating pre-existing information. This technology raises both practical and ethical implications, and researchers are exploring its potential impacts on individuals and society, with the goal of understanding the benefits and risks to inform the development of this emerging technology.
Code
Show HN: KVSplit – Run 2-3x longer contexts on Apple Silicon
KVSplit is a technique for reducing memory usage in large language models (LLMs) by applying different quantization precision to keys and values in the attention mechanism's KV cache, enabling up to 72% memory reduction with minimal quality loss. By using KVSplit, users can run larger context windows and heavier LLMs on their Macs, with some configurations even improving inference speed by 5-15% compared to the standard FP16 approach.
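The memory win from mixed K/V precision is simple arithmetic over the KV cache. The model dimensions below are illustrative, and KVSplit's exact savings (up to the 72% cited above) depend on the configuration chosen:

```python
# Back-of-envelope KV-cache sizing: the cache stores one key and one value
# vector per token, per layer, per head. Splitting precision (e.g. 8-bit
# keys, 4-bit values) shrinks it versus uniform FP16. Dimensions are
# hypothetical, roughly a 7B-class model, not any specific checkpoint.

def kv_cache_bytes(tokens, layers, heads, head_dim, key_bits, value_bits):
    per_token = layers * heads * head_dim * (key_bits + value_bits) / 8
    return int(tokens * per_token)

args = dict(tokens=8192, layers=32, heads=32, head_dim=128)
fp16 = kv_cache_bytes(**args, key_bits=16, value_bits=16)
k8v4 = kv_cache_bytes(**args, key_bits=8, value_bits=4)
saving = 1 - k8v4 / fp16  # fraction of cache memory saved
```

With these numbers the K8/V4 cache is 62.5% smaller than FP16, which is why the freed memory translates directly into longer context windows.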
Show HN: Rv, a Package Manager for R
rv is a tool for managing and installing R packages in a reproducible and declarative way, using a configuration file to specify project dependencies and settings. The tool has two primary commands, rv plan and rv sync, which allow users to preview and synchronize their library, configuration file, and lock file based on the specified project state.
Show HN: Workflow Use – Deterministic, self-healing browser automation (RPA 2.0)
Workflow Use is a tool for creating and executing deterministic workflows with variables, allowing users to record browser interactions once and replay them indefinitely with features like self-healing and automatic variable extraction. The project is in early development and aims to provide a reliable and scalable solution for automating workflows, with a roadmap that includes improvements to its core functionality, developer experience, and integration with other tools.
Show HN: Solidis – Tiny TS Redis client, no deps, for serverless
Solidis is a high-performance, SOLID-structured RESP client for Redis and other RESP-compatible servers, built with zero dependencies and enterprise-grade performance in mind. It supports both RESP2 and RESP3 protocols, offers advanced features like transactions, pipelines, and pub/sub functionality, and provides extensive configuration options for customization and optimization.
Show HN: Merliot – plugging physical devices into LLMs
Merliot Hub is an AI-integrated device hub that allows users to control and interact with physical devices, such as those built from Raspberry Pis, Arduinos, and sensors, using natural language from an LLM host. The hub uses a distributed architecture to ensure privacy, and it can be run locally or in the cloud, with features including a web app, AI integration, and cloud readiness.