Saturday — February 1, 2025
Google users can bypass AI summaries with a profane trick, "Tulu 3" achieves new heights in open model post-training, and LLPlayer revolutionizes language learning with AI-generated subtitles.
News
Add "fucking" to your Google searches to neutralize AI summaries
Internet users have discovered that adding an expletive to a Google search query suppresses the AI-generated summary and returns a plain list of results instead. The swear word lets searchers bypass Google's AI Overviews and get straight to the links, although Google will likely close this loophole in the future.
How to Train an AI Image Model on Yourself
The author spent a couple of hours training their own AI image model to generate pictures of themselves in various scenarios, such as dressed as Superman, using a base model called Flux and a training technique called LoRA. With minimal prior knowledge, they were able to create a functional model by uploading a set of training photos and using a service called Replicate to handle the training process, and were surprised by how easy and efficient the process was.
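For context, a workflow like the one described can be sketched with the Replicate Python client; the trainer identifier, trigger word, and destination model below are illustrative placeholders rather than the author's exact settings, and parameter names may differ between trainer versions.

```python
# Sketch of fine-tuning a Flux LoRA on personal photos via Replicate.
# Assumes the `replicate` Python client and a REPLICATE_API_TOKEN env var;
# the trainer slug, version hash, and inputs are illustrative placeholders.
import replicate

training = replicate.trainings.create(
    version="ostris/flux-dev-lora-trainer:<version-hash>",  # a hosted Flux LoRA trainer
    input={
        "input_images": "https://example.com/my-photos.zip",  # zip of ~10-20 photos of yourself
        "trigger_word": "TOK",  # token used in prompts to invoke the learned subject
        "steps": 1000,
    },
    destination="your-username/me-flux-lora",  # model the trained weights are published to
)
print(training.status)

# After training finishes, generate a picture of yourself as Superman.
output = replicate.run(
    "your-username/me-flux-lora:<trained-version-hash>",
    input={"prompt": "photo of TOK dressed as Superman, flying over a city"},
)
print(output)
```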
GenAI Art Is the Least Imaginative Use of AI Imaginable
The CEO of AI music company Suno claimed that most people don't enjoy making music, but this statement overlooks the fact that the process of creating music, despite its challenges, is often what makes it meaningful and fulfilling. The author, a professor of music and computer science, argues that using generative AI to bypass the creative process undermines the value of art and music, and that the point of creating is not just to produce a product, but to engage in the process itself.
Copyright Office suggests AI copyright debate was settled in 1965
The US Copyright Office has issued guidance stating that works entirely generated by AI cannot be protected by copyright, but works assisted by AI can be copyrighted if they contain human-authored elements. The office will review AI-generated works on a case-by-case basis to determine what parts are eligible for copyright protection, with prompting alone not considered sufficient to establish authorship, although this may change as AI technologies evolve to provide more human control over outputs.
Show HN: Simple to build MCP servers that easily connect with custom LLM calls
The MCP Server in Mirascope enables secure, controlled interactions between host applications and local services by exposing resources, tools, and prompts through a standardized protocol. It can back a variety of applications; for example, a book recommendation server can register tools, resources, and prompts and serve recommendations to connecting clients.
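To show the general shape of such a server, here is a minimal book-recommendation sketch written against the reference MCP Python SDK (FastMCP) rather than Mirascope's own API, which differs in detail; the tool, resource, and prompt names are illustrative.

```python
# Minimal MCP server sketch using the reference `mcp` Python SDK (FastMCP).
# Not Mirascope's API; names and URIs are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Book Recommendations")

@mcp.tool()
def recommend_book(genre: str) -> str:
    """Recommend a book for the given genre."""
    picks = {
        "fantasy": "The Name of the Wind",
        "scifi": "A Fire Upon the Deep",
    }
    return picks.get(genre.lower(), "The Master and Margarita")

@mcp.resource("books://genres")
def list_genres() -> str:
    """Expose the list of supported genres as a readable resource."""
    return "fantasy, scifi"

@mcp.prompt()
def recommendation_prompt(genre: str) -> str:
    """Prompt template a client can use to ask for a recommendation."""
    return f"Please recommend a good {genre} book and explain why."

if __name__ == "__main__":
    mcp.run()  # serves over stdio so an MCP host application can connect
```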
Research
Theoretical limitations of multi-layer Transformer
This work provides the first unconditional lower bound against multi-layer decoder-only transformers, proving that an L-layer model requires a model dimension polynomial in the input length to sequentially compose L functions, and establishing a depth-width trade-off for these models. The results also demonstrate a separation between encoder and decoder capabilities, and show that certain tasks become exponentially easier with the use of chain-of-thought, providing new insights into the computational power of transformers.
Propositional Interpretability in Artificial Intelligence
Mechanistic interpretability aims to explain AI systems by understanding their internal mechanisms, with a focus on propositional interpretability, which involves interpreting a system's behavior in terms of propositional attitudes like belief and desire. A key challenge in achieving this is "thought logging," or creating systems that can log and track an AI's propositional attitudes over time, which can be addressed through various methods, including probing, sparse auto-encoders, and philosophical interpretation techniques.
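As a concrete illustration of the probing idea mentioned above (a generic recipe, not the paper's method), one can fit a simple linear classifier on a model's hidden activations to test whether some proposition is linearly decodable from them; the sketch below assumes the activations have already been extracted and uses random data as a stand-in.

```python
# Toy linear probe: test whether a binary proposition label is linearly
# decodable from hidden activations. Shapes and labels are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 768))   # one hidden-state vector per input
labels = rng.integers(0, 2, size=1000)       # 1 if the proposition is taken to hold

X_train, X_test, y_train, y_test = train_test_split(
    activations, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")  # ~0.5 on random data
```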
"AI and the Opportunity for Shared Prosperity" by Jeff Dean, Hal Varian, et al.
Recent advancements in artificial intelligence (AI) have the potential to greatly assist people and businesses, increasing productivity and innovation, but the economic benefits will not come automatically, and the technology may exacerbate existing challenges unless these risks are addressed collectively. To realize AI's potential and mitigate its risks, a collective policy agenda is needed, involving various stakeholders, to harness its economic value and ensure it benefits society as a whole.
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Tulu 3 is a family of open, state-of-the-art post-trained models that outperform other models, including proprietary ones like GPT-4o mini and Claude 3.5 Haiku, using supervised finetuning, direct preference optimization, and reinforcement learning with verifiable rewards. The Tulu 3 project provides a comprehensive guide to modern post-training techniques, including open access to its data, code, training recipes, and evaluation methods, allowing for reproducibility and further adaptation to various domains.
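To give a flavor of the first stage of such a recipe, here is a minimal supervised-finetuning sketch using Hugging Face TRL; the base model and dataset identifiers are placeholders chosen for illustration, and this is not the project's released training code.

```python
# Minimal supervised-finetuning (SFT) sketch with Hugging Face TRL -- the first
# stage of a post-training pipeline like Tulu 3's. Identifiers are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# A small slice of an instruction-tuning mixture in chat ("messages") format.
dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train[:1000]")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",          # small stand-in for an 8B/70B base model
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="sft-demo",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
)
trainer.train()
```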
Learning to Plan and Reason for Evaluation with Thinking-LLM-as-a-Judge
LLM-as-a-Judge models use chain-of-thought sequences to evaluate responses, but the structure of effective reasoning traces is not well understood due to a lack of human-annotated data. The proposed EvalPlanner algorithm addresses this by learning to generate and optimize evaluation plans, leading to state-of-the-art performance on several benchmarks, including a score of 93.9 on RewardBench.
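A rough sketch of the plan-then-execute judging pattern (in the spirit of the paper, not its implementation) using the OpenAI Python client; the model name and prompts are illustrative.

```python
# Plan-then-execute LLM-as-a-Judge sketch: the judge first drafts an evaluation
# plan, then follows it to pick a winner. Model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()

def judge(instruction: str, response_a: str, response_b: str) -> str:
    # Step 1: generate an evaluation plan tailored to the instruction.
    plan = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Write a short, step-by-step plan for judging two "
                       f"responses to this instruction:\n{instruction}",
        }],
    ).choices[0].message.content

    # Step 2: execute the plan against both responses and output a verdict.
    verdict = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Evaluation plan:\n{plan}\n\nInstruction:\n{instruction}\n\n"
                       f"Response A:\n{response_a}\n\nResponse B:\n{response_b}\n\n"
                       f"Follow the plan step by step, then answer 'A' or 'B'.",
        }],
    ).choices[0].message.content
    return verdict
```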
Code
Show HN: Ldump – serialize any Lua data
Ldump is a Lua serializer that can handle complex data types, including circular references, tables as keys, and functions with upvalues, and outputs valid Lua code that can be deserialized using the load function. It supports various Lua versions, including Lua 5.1, 5.2, 5.3, 5.4, and LuaJIT, but warns that its deserialization function can load malicious code, so serialized data should be treated as arbitrary Lua code.
Show HN: Small LLM with Large Power
The Maximum-218M model is a transformer-based language model that uses Rotary Position Embeddings (RoPE) and GeGLU (GELU-gated linear unit) activations to enhance performance. With 218M parameters and trained on 3M tokens, this model aims to improve position encoding, gradient flow, and overall language understanding capabilities.
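For reference, GeGLU is the GLU variant that gates one linear projection with a GELU-activated second projection; a minimal PyTorch version of the feed-forward block (illustrative, not this repository's code) looks roughly like this.

```python
# Minimal GeGLU feed-forward block in PyTorch: the input projection is split
# into a value half and a GELU-activated gate half, then multiplied elementwise.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeGLU(nn.Module):
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.proj_in = nn.Linear(d_model, 2 * d_ff)   # value and gate in one matmul
        self.proj_out = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        value, gate = self.proj_in(x).chunk(2, dim=-1)
        return self.proj_out(value * F.gelu(gate))

x = torch.randn(2, 16, 512)              # (batch, seq_len, d_model)
print(GeGLU(512, 2048)(x).shape)         # torch.Size([2, 16, 512])
```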
Suggestion: Read the DeepSeek model license - note the Governing Jurisdiction
DeepSeek-V3 is a strong Mixture-of-Experts language model with 671B total parameters, achieving state-of-the-art performance while requiring only 2.788M H800 GPU hours for full training. The model pioneers an auxiliary-loss-free strategy for load balancing and a multi-token prediction training objective, and its performance is comparable to leading closed-source models, outperforming other open-source models on various benchmarks.
Show HN: LLPlayer – The media player for language learning, with AI subtitles
LLPlayer is a media player designed for language learning, offering features such as dual subtitles, AI-generated subtitles, real-time translation, and word lookup. The player supports over 100 languages and has a customizable interface, allowing users to personalize their learning experience with features like adjustable subtitle size and placement, customizable keyboard shortcuts, and a built-in cheat sheet.
AntiSlop Sampler for LLM Inference
The AntiSlop sampler is a tool that uses a backtracking mechanism to avoid generating disallowed words or phrases in text, allowing users to provide a list of "slop phrases" to avoid and adjust their probabilities. The sampler can be used with various models and interfaces, including koboldcpp and open-webui, and supports features like regex matching and JSON validation to enforce constraints and improve output quality.
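To illustrate the backtracking idea in isolation (a toy sketch, not the AntiSlop sampler's actual code), a sampler can roll generation back to the position where a banned phrase began and resample with that phrase's first token suppressed at that position.

```python
# Toy backtracking sampler that avoids "slop" phrases. `sample_next_token`
# stands in for a real language-model sampling step with banned tokens masked.
import random

SLOP_PHRASES = ["tapestry of", "delve into"]
VOCAB = ["a", "tapestry", "of", "delve", "into", "the", "ideas", "we", "explore"]

def sample_next_token(tokens: list[str], banned: set[str]) -> str:
    # Placeholder for a model call; here we just sample uniformly over allowed words.
    return random.choice([t for t in VOCAB if t not in banned])

def generate(max_tokens: int = 20) -> str:
    tokens: list[str] = []
    banned_at: dict[int, set[str]] = {}          # position -> tokens banned there
    while len(tokens) < max_tokens:
        pos = len(tokens)
        tokens.append(sample_next_token(tokens, banned_at.get(pos, set())))
        text = " ".join(tokens)
        for phrase in SLOP_PHRASES:
            if text.endswith(phrase):
                # Backtrack to the phrase start and ban its first token at that spot.
                start = len(tokens) - len(phrase.split())
                banned_at.setdefault(start, set()).add(phrase.split()[0])
                tokens = tokens[:start]
                break
    return " ".join(tokens)

print(generate())
```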