Monday — August 11, 2025

OpenAI releases gpt-oss-20b and gpt-oss-120b models, researchers propose design patterns to secure LLM agents against prompt injections, and a new CLI tool called Uwu generates shell commands inline with GPT-5.

News

GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2

OpenAI has released two new open-weight large language models, gpt-oss-20b and gpt-oss-120b, which are the company's first open-weight models since GPT-2 in 2019 and can run locally on certain hardware. The models' architecture is based on the transformer architecture, with some interesting design choices and optimizations, such as the use of Rotary Position Embedding instead of absolute positional embeddings, and the absence of dropout, which is rarely used in modern LLMs.

Show HN: Engineering.fyi – Search across tech engineering blogs in one place

Engineering.fyi is a platform that aggregates the latest tech articles from top companies, including Google, Meta, OpenAI, and more, covering topics such as AI, machine learning, and software engineering. The platform features articles on various subjects, including new model introductions like GPT-5, advancements in language models, and innovations in fields like virtual reality and robotics.

GPT-5: Overdue, overhyped and underwhelming. And that's not the worst of it

OpenAI's highly anticipated GPT-5 model was met with major disappointment and criticism after its release, with many users and experts pointing out its numerous errors, hallucinations, and lack of significant improvement over previous models. The model's debut was widely panned, with over 3,000 people signing a petition to bring back an older model, and OpenAI's reputation taking a hit as a result.

MCP: An (Accidentally) Universal Plugin System

The author discusses the versatility of USB-C and a protocol called Model Context Protocol (MCP), which was designed to connect AI models to data sources but can actually be used to connect anything to anything, creating a universal plugin ecosystem. This has led to the development of a task management app called APM, which can be extended with various plugins using MCP servers, allowing it to become a shape-shifter that can perform a wide range of tasks beyond its original purpose.

I tried coding with AI, I became lazy and stupid

The author's boss suggested using AI tools for coding, which the author was initially resistant to due to personal experiences with the negative impact of AI on others' jobs and concerns about its ethics and environmental impact. However, after trying AI coding tools, the author was impressed by their initial results, but ultimately realized that relying on them made them lazy and ignorant about their own codebase, leading to the conclusion that AI is unlikely to replace human coders, but rather could make them unemployable if used excessively.

Research

Design Patterns for Securing LLM Agents Against Prompt Injections

AI agents powered by Large Language Models are vulnerable to prompt injection attacks, which can be particularly dangerous when agents handle sensitive information or have tool access. To address this, researchers propose design patterns for building AI agents that are resistant to prompt injection attacks, analyzing their effectiveness and trade-offs in terms of utility and security through case studies.

David Chalmers: Could a Large Language Model Be Conscious?

Current large language models are unlikely to be conscious due to significant obstacles, such as their lack of recurrent processing and unified agency. However, it's possible that future models may overcome these obstacles, making it important to consider the possibility that successors to current models could be conscious in the near future.

Design Patterns for Securing LLM Agents Against Prompt Injections

AI agents powered by Large Language Models are vulnerable to prompt injection attacks, which pose a significant security threat, especially when agents handle sensitive information. To address this, researchers propose design patterns for building AI agents that are resistant to prompt injection attacks, analyzing their effectiveness and trade-offs in terms of utility and security through case studies.

Design Patterns for Securing LLM Agents Against Prompt Injections

Does Prompt Formatting Have Any Impact on LLM Performance?

Researchers examined how different prompt templates, such as plain text, Markdown, JSON, and YAML, affect the performance of Large Language Models (LLMs) like OpenAI's GPT models. The results showed that the choice of prompt template can significantly impact model performance, with variations of up to 40% in some tasks, although larger models like GPT-4 were found to be more robust to these differences.

Code

Recent cross-research on LLM and RL on ArXiv

This text summarizes various research papers on combining Large Language Models (LLMs) with Reinforcement Learning (RL) for control tasks such as game characters and robotics. The papers explore different approaches, including using LLMs as policy teachers, integrating LLMs with RL for traffic signal control and robotic navigation, and leveraging LLMs for reward function design and state representation in RL.

Show HN: UwU – Generate CLI commands inline with GPT-5

Uwu is a lightweight CLI tool that uses Large Language Models to convert natural language into shell commands, allowing users to quickly generate and execute commands without switching context. The tool is simple to install and use, requiring an OpenAI API key and a few setup steps, and can be integrated into a user's shell with a helper function to provide an editable command preloaded in the shell.

An AI Native Dependency Manager

The AI Dependency Manager is an autonomous AI-powered CLI agent that manages software dependencies across multiple package managers, providing advanced security, risk assessment, and automated update capabilities. It features multi-package manager support, AI-powered analysis, advanced security, and smart updates, and can be installed and run using Go, Docker, or as a system service.

Experiment with Local AI: Q2 Edge Chat for iPhone (No Data Leaves Your Device)

Q2 Edge Chat is a privacy-focused chat application that runs large language models locally on your iPhone, allowing for 100% local processing and no data collection. The app features a modern chat experience, model management, and a beautiful interface, and is designed with privacy as its core principle, with all conversations staying on the device and no analytics or telemetry.

AI Nurturing Framework–Developing AI through guidance instead of control

The AI Nurturing Manifesto is a philosophical and technical framework for building artificial intelligence systems through nurturing and collaboration, prioritizing ethics, adaptability, and long-term human-AI collaboration over control-based models. The manifesto outlines key principles, including an ethical core, adaptive learning, and transparency, and provides a technical implementation outline, code examples, and a repository structure to support the development of AI systems that align with human values.