Saturday — October 18, 2025

Amazon's Ring partners with AI surveillance network Flock, a Chromium fork packages an MCP server for native agents, and research finds LLMs have a forgery-resistant ellipse signature.

News

Amazon’s Ring to partner with Flock

Amazon's Ring is partnering with Flock, an AI-powered surveillance network used by law enforcement agencies like ICE. The partnership allows these agencies to request footage from Ring users to aid in investigations. Flock's platform uses AI to scan license plates and supports natural language searches on video footage to find individuals matching specific descriptions. Access to Ring footage significantly expands the potential data pool for these AI models.

AI has a cargo cult problem

This Financial Times article is behind a paywall, so no summary is available.

Asking AI to build scrapers should be easy right?

Skyvern, an AI browser automation tool, now generates and maintains its own Playwright code to make workflows faster, cheaper, and more reliable. It uses a two-phase model: an initial "explore" run where an LLM-powered agent learns a workflow and captures the intent behind each user action, not just the DOM selectors. Subsequent "replay" runs execute a deterministic Playwright script, using the captured intents as a robust fallback to handle UI changes, only invoking the LLM for recovery when needed. This approach resulted in automations that are 2.3x faster, 2.7x cheaper, and deterministic.
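
The replay-with-fallback idea can be sketched in a few lines. This is a minimal illustration, not Skyvern's implementation: the page is modelled as a plain dictionary instead of a real Playwright page, and `Step`, `replay`, and `keyword_recover` are hypothetical names. `keyword_recover` is a keyword-matching stand-in for the LLM recovery call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumption: a "page" is just a mapping from selector to visible label.
Page = Dict[str, str]


@dataclass
class Step:
    intent: str    # what the action was meant to do, captured during "explore"
    selector: str  # the selector recorded during "explore"


def replay(
    steps: List[Step],
    page: Page,
    recover: Callable[[str, Page], Optional[str]],
) -> List[str]:
    """Replay recorded steps, calling `recover` only when a selector is gone."""
    log = []
    for step in steps:
        if step.selector in page:
            # Deterministic fast path: no model call needed.
            log.append(f"fast:{step.selector}")
            continue
        # Slow path: use the stored intent to find a replacement element.
        new_selector = recover(step.intent, page)
        if new_selector is None:
            raise RuntimeError(f"could not recover step: {step.intent}")
        step.selector = new_selector  # cache the repair for later runs
        log.append(f"recovered:{new_selector}")
    return log


def keyword_recover(intent: str, page: Page) -> Optional[str]:
    """Stand-in for the LLM: match intent words against element labels."""
    words = set(intent.lower().split())
    for selector, label in page.items():
        if words & set(label.lower().split()):
            return selector
    return None


steps = [Step("open login form", "#login"), Step("submit credentials", "#submit")]
page_after_redesign = {"#login": "Login", "#send-btn": "Submit"}
print(replay(steps, page_after_redesign, keyword_recover))
# ['fast:#login', 'recovered:#send-btn']
```

Because the repaired selector is written back to the step, a second replay against the same page takes the fast path for every step.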

Migrating from AWS to Hetzner

A company migrated its data-intensive SaaS from managed services like AWS Fargate and DigitalOcean to a self-managed Kubernetes cluster on Hetzner, cutting costs by 76% while tripling compute capacity. This cost-optimization strategy for compute-heavy workloads involved adopting a stack with Talos Linux and CloudNativePG. The trade-off was increased operational complexity, requiring them to overcome challenges with Hetzner's network topology and rebuild deployment automation for Kubernetes.
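
The two headline numbers compound. As a rough worked example, assuming the whole bill scales with compute and the new capacity is fully used (neither is stated in the summary), the cost per unit of compute falls to about 8% of the original:

```python
def relative_cost_per_unit(cost_reduction: float, capacity_multiplier: float) -> float:
    """New cost per unit of compute, as a fraction of the old cost per unit."""
    remaining_cost = 1.0 - cost_reduction
    return remaining_cost / capacity_multiplier


# 76% lower bill, 3x the compute capacity (figures from the summary above).
ratio = relative_cost_per_unit(cost_reduction=0.76, capacity_multiplier=3.0)
print(f"cost per unit of compute: {ratio:.0%} of the original")
# cost per unit of compute: 8% of the original
```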

Making Every Windows 11 PC an AI PC

Windows 11 is deepening its integration of Copilot with new agentic capabilities, moving beyond chat to direct action within the OS. Key updates include a "Hey Copilot" voice activation feature and the global rollout of Copilot Vision, which can analyze on-screen content and full application context. For Windows Insiders, Microsoft is previewing Copilot Actions, an experimental general-purpose agent designed to operate on local files to perform tasks like sorting photos or extracting data from PDFs. The ecosystem is expanding with Copilot connectors for RAG across personal data in services like OneDrive and Google Drive, and with support for the Model Context Protocol (MCP) to enable secure third-party AI actions.

Research

Every Language Model Has a Forgery-Resistant Signature

This work demonstrates that LLM logprob outputs are geometrically constrained to a high-dimensional ellipse, which functions as a unique and naturally occurring model signature. This "ellipse signature" is hard to forge without parameter access and can be used to identify the source model from its outputs alone. While the authors propose a corresponding output verification protocol, they note that the technique for extracting the ellipse is currently infeasible for production-scale models.
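
To make the geometric claim concrete, here is a toy membership test for an ellipse. It is only an illustration under strong simplifications: the example is two-dimensional and axis-aligned, and the center and semi-axes are made-up values, whereas the signature described above lives in a high-dimensional space. `on_ellipse` is a hypothetical helper, not the paper's verification protocol.

```python
import math
from typing import Sequence


def on_ellipse(
    point: Sequence[float],
    center: Sequence[float],
    semi_axes: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """Check sum(((x_i - c_i) / a_i) ** 2) == 1 for an axis-aligned ellipse."""
    value = sum(((x - c) / a) ** 2 for x, c, a in zip(point, center, semi_axes))
    return abs(value - 1.0) <= tol


# Made-up "signature": center and semi-axes of the ellipse.
center = (1.0, -2.0)
semi_axes = (3.0, 0.5)

# A point generated from the ellipse passes the check.
t = 0.7
genuine = (center[0] + semi_axes[0] * math.cos(t), center[1] + semi_axes[1] * math.sin(t))
print(on_ellipse(genuine, center, semi_axes))

# A point produced without knowing the ellipse almost never lands on it.
forged = (4.0, -1.5)
print(on_ellipse(forged, center, semi_axes))
```

The first check prints `True` and the second `False`: a verifier who knows the ellipse can test any output cheaply, while a forger who does not know it has little chance of hitting the surface.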

An Introduction to Mars Terraforming, 2025 Workshop Summary

This document outlines a Mars terraforming strategy using a goal-oriented, backward-chaining approach. It begins with the desired planetary endpoint and traces back the necessary steps and technological prerequisites. The analysis also covers alternative pathways, critical unknowns, and research priorities for achieving the final state.

Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs

Mnemosyne is an unsupervised, human-inspired long-term memory architecture designed for edge-based LLMs. It uses a graph-structured memory with mechanisms like temporal decay, pruning, and a "core summary" to better handle longitudinal dialogues where standard RAG systems falter. In evaluations on healthcare dialogues, Mnemosyne significantly outperformed a baseline RAG in human preference tests for realism and memory. It also achieved SOTA scores on the LoCoMo benchmark for temporal reasoning and single-hop retrieval against similarly-sized models.
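
Temporal decay and pruning can be illustrated with a small sketch. This is not Mnemosyne's algorithm: the exponential half-life form, the 24-hour half-life, and the 0.2 threshold are all assumptions chosen for the example, and the graph edges between memories are omitted. `MemoryNode`, `decayed_strength`, and `prune` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MemoryNode:
    text: str
    strength: float   # importance at the time the memory was last reinforced
    last_seen: float  # timestamp in hours


def decayed_strength(node: MemoryNode, now: float, half_life_hours: float = 24.0) -> float:
    """Assumed decay rule: strength halves every `half_life_hours`."""
    elapsed = max(0.0, now - node.last_seen)
    return node.strength * 0.5 ** (elapsed / half_life_hours)


def prune(memory: Dict[str, MemoryNode], now: float, threshold: float = 0.2) -> Dict[str, MemoryNode]:
    """Keep only memories whose decayed strength is still above the threshold."""
    return {
        key: node
        for key, node in memory.items()
        if decayed_strength(node, now) >= threshold
    }


memory = {
    "allergy": MemoryNode("patient is allergic to penicillin", strength=1.0, last_seen=0.0),
    "weather": MemoryNode("it rained on Tuesday", strength=0.3, last_seen=0.0),
}
kept = prune(memory, now=48.0)
print(sorted(kept))
# ['allergy']
```

After two half-lives the important memory has decayed to 0.25 and survives, while the incidental one has dropped to 0.075 and is pruned.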

Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting

ECHO is a prompting framework that improves the sample efficiency of LM agents by adapting hindsight experience replay from RL. It uses the LLM to generate optimized counterfactual trajectories from failed interactions, effectively creating synthetic positive examples to learn from. On benchmarks like XMiniGrid, ECHO significantly outperforms both vanilla baselines and more sophisticated agent architectures like Reflexion and AWM by making more effective use of past experiences.
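
The hindsight idea can be sketched without a model in the loop. This is a simplified stand-in, not ECHO itself: in ECHO the LLM proposes the alternative goal and writes the optimized trajectory, whereas this sketch just truncates the failed run at the point where a sub-goal was reached. `Trajectory` and `hindsight_rewrite` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trajectory:
    goal: str
    actions: List[str]
    reached: Dict[str, int]  # sub-goal -> number of actions taken when it was reached
    success: bool


def hindsight_rewrite(traj: Trajectory) -> List[Trajectory]:
    """Turn a failed trajectory into successful examples for goals it did reach."""
    if traj.success:
        return []  # nothing to rewrite
    return [
        Trajectory(goal=sub_goal, actions=traj.actions[:n], reached={sub_goal: n}, success=True)
        for sub_goal, n in traj.reached.items()
    ]


failed = Trajectory(
    goal="open the chest",
    actions=["go north", "pick up key", "go east", "go east"],
    reached={"hold the key": 2},
    success=False,
)
for example in hindsight_rewrite(failed):
    print(example.goal, "->", example.actions)
# hold the key -> ['go north', 'pick up key']
```

The failed attempt at "open the chest" still yields one positive example for "hold the key", which the agent can reuse the next time that sub-goal comes up.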

Glass Flows: Transition Sampling for Alignment of Flow and Diffusion Models

Reward alignment algorithms for flow matching and diffusion models are often bottlenecked by inefficient SDE sampling. GLASS Flows is a new sampling paradigm that addresses this by simulating an "inner" flow matching model, extracted from a pre-trained model without retraining, to sample Markov transitions. This approach combines the efficiency of ODEs with the stochastic evolution of SDEs, eliminating the performance trade-off. As a drop-in inference solution for text-to-image models, it improves SOTA performance when combined with Feynman-Kac Steering.

Code

Show HN: Datapizza AI – Lightweight open source framework for GenAI apps

Datapizza AI is a production-focused Python framework for building reliable GenAI applications with less abstraction and more control. It features a vendor-agnostic, API-first design with built-in observability via OpenTelemetry for debugging and monitoring. The framework provides composable modules for creating complex RAG pipelines and multi-agent systems, supporting providers like OpenAI, Gemini, and Anthropic.

Show HN: We packaged an MCP server inside Chromium

BrowserOS is an open-source, privacy-first Chromium fork that runs AI agents natively for web automation and data scraping. It allows users to bring their own API keys or connect to local models via Ollama, ensuring all browsing data remains on-device. A key feature is its ability to function as an MCP server, enabling programmatic control from external CLI tools.
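
For readers unfamiliar with MCP, a client talks to a server using JSON-RPC 2.0 messages such as `tools/list` and `tools/call`. The sketch below only builds those messages. The tool name `navigate` and its `url` argument are hypothetical and not taken from BrowserOS, and a real session would also need a transport and the protocol's initialization handshake first.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a unique id per request


def jsonrpc_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request, the message format MCP builds on."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    )


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list", {})

# Invoke one tool by name. "navigate" and "url" are made-up for this example.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "navigate", "arguments": {"url": "https://example.com"}},
)

print(list_tools)
print(call_tool)
```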

Show HN: I Built an AI Maturity Model for Software Engineers (and No One Cared)

The AI Maturity Model for Software Engineering Teams (AI-MM SET) is a framework for assessing and guiding the adoption of AI in development workflows. It uses a matrix to map progress across five maturity levels, from Exploratory to Transformational, and six core dimensions including AI literacy, SDLC integration, and governance. The model also defines role-specific expectations, providing a structured roadmap for teams to move from ad-hoc experimentation to strategic, transformative use of AI.

Show HN: Agentset – Open-source RAG with vector DB, embeddings, and API built-in

Agentset is an open-source platform for building, evaluating, and shipping production-ready RAG and agentic applications. It provides an end-to-end toolchain including data ingestion, vector indexing, evaluation benchmarks, a chat playground, and production hosting. The model-agnostic platform supports various LLMs and vector DBs and is available as a cloud service or for self-hosting.

Show HN: Open SWE-grep for fast, high-precision code context

OpGrep is an open-source model, inspired by SWE-grep, that predicts the optimal tool (read, grep, glob, summarize) and file path to provide context for a natural language query about a codebase. The project uses a synthetic dataset designed for both supervised fine-tuning and reinforcement learning from trajectory data. It is an early-stage exploration seeking community contributions to improve its toolset, dataset, and model architecture.