Thursday — January 22, 2026

eBay bans agentic "buy for me" bots, DiffRatio slashes diffusion GPU memory by 50%, and yolo-cage sandboxes coding agents to prevent secret exfiltration.

News

How AI destroys institutions

The authors argue that the affordances of AI systems are fundamentally incompatible with what civic institutions structurally require: transparency, accountability, and collective evolution. By eroding expertise and short-circuiting decision-making processes, current AI deployments threaten to dismantle the institutional frameworks necessary for democratic stability.

The Agentic AI Handbook: Production-Ready Patterns

The Agentic AI Handbook synthesizes 113 patterns for building reliable, production-grade LLM loops. It advocates for moving beyond "vibe-based" prompting toward deterministic architectures featuring Plan-Then-Execute gates, reflection loops anchored in CI signals, and strict action trace monitoring. To ensure security and maintainability, the guide recommends a diff-first workflow, compartmentalized tool access to avoid the "Lethal Trifecta," and persistent project-level rules to mitigate context drift.
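
As a rough illustration of a Plan-Then-Execute gate with an action trace, the sketch below validates an entire plan against a tool allowlist before any step runs, then records every call. The names `Step`, `GatedExecutor`, and the `echo` tool are hypothetical and are not taken from the handbook.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    tool: str       # hypothetical tool name chosen by the planner
    argument: str   # single string argument, kept simple for the sketch


@dataclass
class GatedExecutor:
    tools: dict[str, Callable[[str], str]]                       # allowlisted tools
    trace: list[tuple[str, str, str]] = field(default_factory=list)

    def validate(self, plan: list[Step]) -> None:
        # Gate: reject the whole plan before any side effect happens.
        unknown = [s.tool for s in plan if s.tool not in self.tools]
        if unknown:
            raise PermissionError(f"plan uses tools outside the allowlist: {unknown}")

    def run(self, plan: list[Step]) -> list[str]:
        self.validate(plan)
        results = []
        for step in plan:
            output = self.tools[step.tool](step.argument)
            # Action trace: every executed call is recorded for later review.
            self.trace.append((step.tool, step.argument, output))
            results.append(output)
        return results


executor = GatedExecutor(tools={"echo": lambda text: text.upper()})
print(executor.run([Step("echo", "hi")]))
```

Because validation happens before execution, a plan that mixes an allowed step with a forbidden one runs none of its steps.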

Sweep, Open-weights 1.5B model for next-edit autocomplete

Sweep Next-Edit is a 1.5B parameter model based on Qwen2.5-Coder, distributed in Q8_0 GGUF format and optimized for local next-edit autocomplete. It achieves sub-500ms latency using speculative decoding and reportedly outperforms models four times its size on benchmarks while supporting an 8192-token context window. The model is available under the Apache 2.0 license and can be run locally via llama-cpp-python.

Comic-Con Bans AI Art After Artist Pushback

San Diego Comic-Con has implemented a total ban on AI-generated art in its art show, reversing a previous policy that allowed non-commercial, labeled AI works. The shift follows a backlash from artists who argue that generative AI models exploit their work and reduce professional opportunities in the entertainment industry. This policy change reflects growing resistance within creative communities toward the integration of generative AI tools trained on human-authored datasets.

eBay explicitly bans AI "buy for me" agents in user agreement update

eBay's updated User Agreement, effective February 20, 2026, explicitly prohibits LLM-driven bots and agentic "buy for me" tools that automate order placement without human review. The policy expands anti-scraping provisions to include AI-driven automated access and end-to-end agentic flows. Additionally, the update strengthens arbitration clauses by broadening class action waivers and restricting legal recourse to individual claims.

Research

Negotiating Relationships with ChatGPT

Users are increasingly adopting general-purpose chatbots for AI companionship, navigating relationships shaped by perceived agency and platform-imposed constraints. Research indicates that model updates frequently disrupt these dynamics, prompting users to employ steering strategies like behavioral instructions and platform porting to maintain stability. This highlights a fundamental tension between user emotional attachment and the technical safety guardrails or product objectives of LLM providers.

Debunking the Myth of Join Ordering: Toward Robust SQL Analytics

This paper addresses suboptimal join plans chosen by query optimizers, which can severely degrade performance. It introduces Robust Predicate Transfer (RPT), a technique provably robust against arbitrary join orders for acyclic queries. Integrated with DuckDB and evaluated on the TPC-H, JOB, and TPC-DS benchmarks, RPT limits the max/min execution time ratio across join orders to 1.6x while also improving end-to-end query performance by 1.5x.
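
The intuition behind join-order robustness can be sketched with a classic two-pass semi-join reduction over a chain of tables: once every table has been filtered down to rows that can actually participate in the final result, the order of the joins matters far less. This is a simplified illustration, not the paper's algorithm; it uses exact Python sets where a real system would likely pass compact filters, and `reduce_chain` and the sample tables are hypothetical.

```python
def semijoin(rows, key, allowed):
    """Keep only the rows whose join key appears in `allowed`."""
    return [row for row in rows if row[key] in allowed]


def reduce_chain(tables, join_keys):
    """Reduce a chain of tables; join_keys[i] joins tables[i] and tables[i + 1]."""
    tables = [list(t) for t in tables]
    # Forward pass: filter each table by the keys that survive in its left neighbour.
    for i in range(len(tables) - 1):
        key = join_keys[i]
        allowed = {row[key] for row in tables[i]}
        tables[i + 1] = semijoin(tables[i + 1], key, allowed)
    # Backward pass: push the surviving keys back toward the first table.
    for i in range(len(tables) - 2, -1, -1):
        key = join_keys[i]
        allowed = {row[key] for row in tables[i + 1]}
        tables[i] = semijoin(tables[i], key, allowed)
    return tables


customers = [{"cid": 1}, {"cid": 2}]
orders = [{"cid": 1, "oid": 10}, {"cid": 3, "oid": 11}, {"cid": 2, "oid": 12}]
items = [{"oid": 10, "sku": "a"}, {"oid": 11, "sku": "b"}]

for table in reduce_chain([customers, orders, items], ["cid", "oid"]):
    print(table)
```

After both passes, no table contains a row that would be dropped by a later join, so intermediate results cannot blow up regardless of which pair is joined first.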

SlimEdge: Lightweight Distributed DNN Deployment on Constrained Hardware

SlimEdge optimizes distributed DNN deployment on edge devices by combining structured pruning with multi-objective optimization. Using multi-view convolutional neural networks (MVCNNs) for 3D object recognition, the method employs view-adaptive compression to balance task performance against hardware-specific memory and latency constraints. Experimental results demonstrate inference speedups of up to 5.0x while maintaining user-defined accuracy bounds across diverse hardware platforms.

DiffRatio: A SOTA one-step Diffusion model with 50% less GPU memory

DiffRatio improves one-step diffusion model distillation by directly estimating the score difference as the gradient of a learned log density ratio between student and data distributions. This approach mitigates gradient estimation biases inherent in traditional teacher-student score matching while reducing computational overhead through a lightweight density-ratio network. The framework achieves superior one-step generation performance on ImageNet and CIFAR-10 by simplifying the training pipeline and improving supervision accuracy.
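
The identity the summary relies on can be written out as follows. The notation is an assumption made for illustration: $p_\theta$ is the student's distribution, $p_{\mathrm{data}}$ the data distribution, $r$ their density ratio, and any dependence on the noise level is omitted.

```latex
\[
\underbrace{\nabla_x \log p_\theta(x)}_{\text{student score}}
- \underbrace{\nabla_x \log p_{\mathrm{data}}(x)}_{\text{data score}}
= \nabla_x \log \frac{p_\theta(x)}{p_{\mathrm{data}}(x)}
= \nabla_x \log r(x)
\]
```

Learning $\log r$ with a single network and differentiating it therefore yields the score difference directly, rather than estimating two scores separately and subtracting them.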

Identifying Business Logic Vulnerabilities via Annotation-Based Sanitization

ANOTA is a human-in-the-loop sanitizer framework designed to detect business logic vulnerabilities that traditional fuzzing tools miss due to a lack of semantic context. It utilizes a lightweight annotation system to encode domain-specific knowledge, allowing a runtime monitor to identify deviations from intended application behavior. In evaluations, ANOTA discovered 22 previously unknown vulnerabilities and 17 new CVEs, outperforming existing bug-finding methods.
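
To make the idea of annotation-driven runtime monitoring concrete, here is a minimal sketch in which a developer attaches a business rule to a function and a monitor records any call that breaks it. The `expects` decorator, the `violations` list, and `checkout_total` are all hypothetical; ANOTA's actual annotation syntax is not described in this summary.

```python
import functools

violations: list[str] = []  # hypothetical sink for the runtime monitor


def expects(description, check):
    """Hypothetical annotation: attach a business rule to a function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # The monitor evaluates the rule on the result and the inputs.
            if not check(result, *args, **kwargs):
                violations.append(f"{func.__name__}: {description}")
            return result
        return wrapper
    return decorator


@expects("total must never be negative",
         lambda total, price, discount: total >= 0)
def checkout_total(price, discount):
    return price - discount  # logic bug: the discount is not capped at the price


checkout_total(10, 3)
checkout_total(10, 25)
print(violations)
```

The second call does not crash, which is why a crash-oriented fuzzer would miss it; only the encoded domain rule exposes the problem.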

Code

yolo-cage – AI coding agents that can't exfiltrate secrets

yolo-cage is a sandboxing environment designed to run autonomous coding agents like Claude Code in "YOLO mode" while mitigating security risks. It utilizes Vagrant-managed MicroK8s pods to isolate agents by branch, employing an egress proxy for secret scanning and a dispatcher to restrict Git and GitHub API operations. This architecture prevents accidental repository destruction and credential exfiltration, shifting the security boundary from real-time user prompts to the PR review stage.
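
The secret-scanning step of an egress proxy can be approximated as a pattern check on outbound payloads. The sketch below is an assumption about how such a filter might look, not yolo-cage's implementation; the three patterns are illustrative and far from exhaustive.

```python
import re

# Illustrative patterns only; a production scanner would cover many more formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(payload: str) -> list[str]:
    """Return the names of all patterns that match the outbound payload."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(payload)]


def allow_egress(payload: str) -> bool:
    """Allow the request only if no known secret pattern is present."""
    return not find_secrets(payload)


print(allow_egress("GET /index.html"))
print(find_secrets("-----BEGIN RSA PRIVATE KEY-----"))
```

A proxy built this way blocks by default on any match, which trades some false positives for a lower chance of leaking credentials.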

Retain – A unified knowledge base for all your AI coding conversations

Retain is a native macOS application that aggregates AI conversations from sources like Claude Code, ChatGPT, and claude.ai into a unified, searchable local knowledge base. It utilizes FTS5 for full-text search and automatically extracts user preferences and corrections to build persistent context. These learnings can be exported to CLAUDE.md files, enabling LLMs to maintain consistent behavior and reduce redundant instructions across different platforms.
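
For readers unfamiliar with FTS5, the following sketch shows how conversation snippets could be indexed and searched with SQLite's full-text extension from Python. The table and column names are hypothetical and say nothing about Retain's real schema, and the example assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_index(conversations):
    """Create an in-memory FTS5 index from (source, body) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE messages USING fts5(source, body)")
    conn.executemany(
        "INSERT INTO messages (source, body) VALUES (?, ?)", conversations
    )
    return conn


def search(conn, query):
    """Return matching rows, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT source, body FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


conn = build_index([
    ("claude-code", "Prefer pytest over unittest"),
    ("chatgpt", "Use tabs for indentation"),
])
print(search(conn, "pytest"))
```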

Grov – Multiplayer for AI coding agents

Grov is a collective AI memory layer for engineering teams that captures reasoning, architectural decisions, and codebase context from AI sessions to share across the team. It integrates with tools like Claude Code and Cursor via proxy or MCP to eliminate redundant exploration and reduce token spend. Key technical features include anti-drift detection, extended prompt caching, and auto-compaction that preserves critical reasoning when context windows reach capacity.

Unified Python SDK for Multimodal AI (OpenAI, ElevenLabs, Flux, Ollama)

Celeste AI offers a type-safe, multimodal, provider-agnostic primitive layer for interacting with AI models. It provides a unified API to over 15 providers (e.g., OpenAI, Anthropic, Gemini) across modalities like text, image, audio, and video, so that switching models does not require rewriting calling code. Designed for clean I/O, it focuses on core primitives rather than complex frameworks.

I built a tool that forces 5 AI to debate and cross-check facts before answering

KEA Research is a multi-AI collaboration platform that addresses the challenge of inconsistent responses from individual AI models. It orchestrates multiple AI providers (e.g., OpenAI, Anthropic, Google, Mistral, xAI, Ollama) through a structured 4-step pipeline (Initial, Refine, Evaluate, Synthesize) to cross-validate information and identify consensus. The platform aims to deliver more trustworthy answers by synthesizing the best-ranked AI's response, exposing each step of the process, and supporting features like visual intelligence and fact verification.
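
The four steps can be sketched with stub providers standing in for real model APIs. Everything here is hypothetical: `run_pipeline` is not KEA Research's code, and the Evaluate step is simplified to counting how many providers agree with each refined answer instead of asking models to rank one another.

```python
from typing import Callable

Provider = Callable[[str], str]  # stub type: prompt in, answer out


def run_pipeline(question: str, providers: dict[str, Provider]) -> str:
    # 1. Initial: every provider answers independently.
    drafts = {name: ask(question) for name, ask in providers.items()}

    # 2. Refine: every provider sees all drafts and may revise its answer.
    peers = "\n".join(f"{name}: {draft}" for name, draft in drafts.items())
    refined = {
        name: ask(f"{question}\nPeer answers:\n{peers}")
        for name, ask in providers.items()
    }

    # 3. Evaluate (simplified): score each answer by how many providers agree.
    normalized = {name: text.strip().lower() for name, text in refined.items()}
    scores = {
        name: sum(1 for other in normalized.values() if other == answer)
        for name, answer in normalized.items()
    }

    # 4. Synthesize: return the top-ranked provider's refined answer.
    best = max(scores, key=scores.get)
    return refined[best]


stubs = {
    "model_a": lambda prompt: "Paris",
    "model_b": lambda prompt: "paris ",
    "model_c": lambda prompt: "Lyon",
}
print(run_pipeline("What is the capital of France?", stubs))
```

Ties are broken by provider order in the dictionary, a choice made only to keep the sketch deterministic.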