Monday — February 23, 2026
AI detects backdoors in binaries, LLMs enable high-precision online deanonymization, and Nobulex introduces cryptographic accountability for AI agents.
Interested in AI engineering? Let's talk
News
Google restricting Google AI Pro/Ultra subscribers for using OpenClaw
Google AI Ultra users are reporting sudden account suspensions potentially triggered by third-party OAuth integrations like OpenClaw for Gemini models. Affected developers are experiencing a support loop between GCP and Google One, with many unable to access the platform or receive official communication. The issue has led to significant frustration and platform migration among paid subscribers.
We hid backdoors in ~40MB binaries and asked AI + Ghidra to find them
The BinaryAudit benchmark evaluates the ability of AI agents to detect backdoors in stripped C and Rust binaries using reverse-engineering tools like Ghidra and Radare2. Claude Opus 4.6 achieved the highest detection rate at 49%, but models also produced a 28% false-positive rate and frequently rationalized malicious code as legitimate functionality. While LLMs demonstrate a growing capability to navigate decompiled code and trace data flows, they still lack the strategic intuition required to handle large-scale binaries or to distinguish subtle anomalies from benign logic.
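As a toy illustration of the kind of static heuristic an auditing agent might combine with decompiler output, the sketch below extracts printable strings from a raw binary (like the `strings` utility) and flags suspicious indicators such as hardcoded credentials or embedded shell paths. This is a deliberately naive simplification for illustration, not BinaryAudit's methodology; the patterns chosen are assumptions.

```python
import re

# Hypothetical indicators a backdoor might leave in a binary.
SUSPICIOUS = [
    re.compile(rb"/bin/sh"),                         # embedded shell invocation
    re.compile(rb"(?i)backdoor"),
    re.compile(rb"(?i)password\s*="),                # hardcoded credential
    re.compile(rb"\d{1,3}(\.\d{1,3}){3}:\d{2,5}"),   # hardcoded IP:port
]

def printable_strings(data: bytes, min_len: int = 6):
    """Extract runs of printable ASCII, like the `strings` utility."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)

def flag_binary(data: bytes):
    """Return the printable strings in `data` that match a suspicious pattern."""
    return [s for s in printable_strings(data)
            if any(p.search(s) for p in SUSPICIOUS)]
```

A real agent, of course, goes far beyond this: it decompiles functions and traces data flows, which is exactly where the benchmark reports models still struggle.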
Local-First Linux MicroVMs for macOS
Shuru is a local-first microVM sandbox for AI agents on macOS, utilizing Apple's Virtualization.framework for near-native performance without Docker dependencies. It provides ephemeral Linux environments with git-like checkpointing, enabling safe code execution, tool use, and reproducible evaluations. The Rust-based CLI supports configurable resource allocation, opt-in networking, and vsock-based port forwarding.
WARN Firehose – Every US layoff notice in one searchable database
WARN Firehose aggregates and normalizes WARN Act layoff data from all 50 US states into a unified, daily-updated database. The platform provides structured data via REST API and bulk exports in formats like Parquet and JSON-LD, optimized for ML pipelines and AI consumption. It includes an MCP server for direct integration with LLMs, facilitating real-time workforce trend analysis within AI assistants.
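A hypothetical consumer of such an API might normalize each state's record into a typed row before feeding it to an ML pipeline. The field names below are assumptions for illustration, not WARN Firehose's actual schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class WarnNotice:
    """One normalized WARN Act layoff notice (hypothetical schema)."""
    state: str
    company: str
    employees_affected: int
    notice_date: date

def parse_notice(raw: dict) -> WarnNotice:
    """Coerce one raw API record into a typed, normalized row."""
    return WarnNotice(
        state=raw["state"].upper(),
        company=raw["company"].strip(),
        employees_affected=int(raw["employees_affected"]),
        notice_date=date.fromisoformat(raw["notice_date"]),
    )
```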
Altman on AI energy: it also takes 20 years of eating food to train a human
Research
The Principles of Deep Learning Theory (2021)
This text presents an effective theory of deep neural networks, establishing that the depth-to-width ratio controls deviations from infinite-width Gaussian distributions and governs model complexity. By utilizing representation group flow (RG flow) and criticality, the authors characterize representation learning, solve gradient stability issues, and define universality classes for architectures. The framework provides a theoretical basis for optimizing hyperparameters and understanding the inductive biases of optimizers and residual connections.
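A quick numerical sanity check of the infinite-width picture: with weights drawn i.i.d. from N(0, 1/fan_in), a wide linear layer's preactivations are Gaussian with variance equal to the mean squared input activation. The sketch below verifies only that variance statement empirically; it does not reproduce the book's RG-flow or finite-width analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 100, 50_000                                  # very wide output layer
x = rng.standard_normal(n_in)                              # fixed input activation
W = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_out, n_in))
z = W @ x                                                  # preactivations

# At infinite width each z_i ~ N(0, |x|^2 / n_in); check the variance empirically.
predicted_var = np.dot(x, x) / n_in
empirical_var = z.var()
```

At finite width, the book's central claim is that corrections to this Gaussian picture are controlled by the depth-to-width ratio, so a deep narrow network deviates far more than this single wide layer does.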
Interactive Tools for Gaussian Splat Selection with AI and Human in the Loop
This research introduces an interactive toolset for 3DGS selection and segmentation, facilitating precise object extraction and scene editing. It employs an AI-driven method to propagate 2D masks to 3DGS and integrates a custom Video Diffusion Model for user-guided local editing. The system enables granular control over in-the-wild captures without requiring additional optimization.
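The general idea of lifting a 2D mask to a 3D selection can be sketched as projecting each Gaussian's center through the camera and keeping those that land on a masked pixel. This is a minimal caricature under a pinhole-camera assumption, not the paper's propagation method, and all names below are illustrative.

```python
import numpy as np

def select_by_mask(centers, K, mask):
    """Select 3D points whose pinhole projection lands on a True mask pixel.

    centers: (N, 3) points in camera coordinates (z > 0)
    K:       (3, 3) camera intrinsics matrix
    mask:    (H, W) boolean 2D segmentation mask
    """
    proj = centers @ K.T                      # homogeneous image coordinates
    uv = proj[:, :2] / proj[:, 2:3]           # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    H, W = mask.shape
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    selected = np.zeros(len(centers), dtype=bool)
    selected[inside] = mask[v[inside], u[inside]]
    return selected
```

A real system must additionally handle occlusion and the extent of each splat, which is where an AI-driven propagation step earns its keep.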
Deception Analysis with Artificial Intelligence: An Interdisciplinary Perspective
The paper addresses the lack of a computational theory for AI-enabled deception in hybrid societies, which threatens societal trust. It proposes DAMAS, a holistic Multi-Agent Systems (MAS) framework for the socio-cognitive modeling and analysis of deception. This interdisciplinary approach aims to provide a formal foundation for explaining deceptive AI behavior before the emergence of fully autonomous deceptive machines.
Duan et al. 2026 algorithm beats Duan et al. 2025 for the SSSP Problem
This paper introduces a deterministic algorithm for single-source shortest paths (SSSP) on directed graphs with non-negative edge weights, achieving a running time of $O(m\sqrt{\log n}+\sqrt{mn\log n\log \log n})$. This improves the previous $O(m\log^{2/3} n)$ state-of-the-art, reducing to $O(m\sqrt{\log n\log \log n})$ for sparse graphs.
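For scale, the classical baseline all of these results improve on is binary-heap Dijkstra at O((m + n) log n). A minimal sketch of that baseline (not the paper's algorithm, which relies on far more intricate machinery):

```python
import heapq

def dijkstra(graph, source):
    """Classical Dijkstra. graph maps node -> list of (neighbor, weight),
    weights non-negative. Returns dict of shortest distances from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                          # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```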
Large-scale online deanonymization with LLMs (including HN users)
LLMs enable high-precision, at-scale deanonymization of pseudonymous users by processing unstructured text across arbitrary platforms. A scalable pipeline utilizing feature extraction, semantic embeddings, and LLM reasoning achieved up to 68% recall at 90% precision on datasets linking Hacker News, Reddit, and LinkedIn profiles. These results demonstrate that LLM-based methods significantly outperform classical baselines, effectively eliminating the "practical obscurity" previously relied upon for online privacy.
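The linking step can be caricatured as nearest-neighbor matching in an embedding space. Below is a toy version that substitutes bag-of-words cosine similarity for the paper's semantic embeddings and LLM reasoning; it illustrates the matching structure only, not the reported precision.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def link(anon_posts, candidate_profiles):
    """Match each pseudonymous author to the most similar known profile."""
    vecs = {name: Counter(text.lower().split())
            for name, text in candidate_profiles.items()}
    matches = {}
    for alias, text in anon_posts.items():
        v = Counter(text.lower().split())
        matches[alias] = max(vecs, key=lambda name: cosine(v, vecs[name]))
    return matches
```

Even this crude version hints at why "practical obscurity" fails: writing style and topic distribution leak identity across platforms.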
Code
Palantir's secret weapon isn't AI – it's Ontology. An open-source deep dive
Palantir’s Ontology strategy shifts data architecture from passive storage to an operational "digital twin" that integrates semantic objects with kinetic actions. This framework provides a governed foundation for AI-driven decision-making by implementing version control and branching for real-world operations. It aims to bridge the gap between raw data engineering and autonomous AI execution.
Aqua: A CLI message tool for AI agents
Aqua is a P2P messaging protocol and CLI designed for secure communication between AI Agents. It features E2EE, identity verification, and durable message storage, utilizing Circuit Relay v2 for robust cross-network connectivity. The system includes a dedicated skill set to facilitate decentralized messaging and coordination for LLM-based agents.
OpenTiger – Autonomous dev orchestration that never stops
openTiger is an autonomous development orchestration framework that automates the software lifecycle through a multi-agent system comprising planners, workers, testers, and judges. It features a recovery-first architecture with explicit state transitions to handle task planning, implementation, and verification while prioritizing backlog processing and continuous convergence. The system supports plugin-based expansion and integrates with external LLM executors like Claude Code and Codex via a Node.js, PostgreSQL, and Redis stack.
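A recovery-first architecture with explicit state transitions can be sketched as a small state machine in which every move is whitelisted, so a crashed or failed task always has a defined path back into the backlog. The states and transitions below are assumptions for illustration, not openTiger's actual schema.

```python
from enum import Enum, auto

class State(Enum):
    PLANNED = auto()
    IN_PROGRESS = auto()
    VERIFYING = auto()
    DONE = auto()
    FAILED = auto()

# Legal transitions; anything else is rejected, so recovery is always explicit.
TRANSITIONS = {
    State.PLANNED: {State.IN_PROGRESS},
    State.IN_PROGRESS: {State.VERIFYING, State.FAILED},
    State.VERIFYING: {State.DONE, State.IN_PROGRESS},   # failed check -> rework
    State.FAILED: {State.PLANNED},                      # recovery: replan
    State.DONE: set(),
}

class Task:
    def __init__(self):
        self.state = State.PLANNED

    def advance(self, to: State):
        if to not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {to}")
        self.state = to
```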
Nobulex – Open-source cryptographic accountability protocol for AI agents
Nobulex is an open cryptographic protocol that establishes a trust layer for AI agents through verifiable behavioral commitments called Covenants. It utilizes the Covenant Constraint Language (CCL) to define and enforce real-time permissions, such as API rate limits and resource access, verified via 11 cryptographic checks. The modular TypeScript ecosystem includes MCP middleware, reputation scoring, and compliance mapping for regulatory frameworks like the EU AI Act.
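An API rate-limit constraint of the kind a covenant could express might compile down to something like a token bucket at the enforcement layer. The sketch below is a generic token-bucket check, not the actual CCL or Nobulex runtime; the class name and interface are assumptions.

```python
import time

class RateLimitCovenant:
    """Enforce 'at most `limit` calls per `window` seconds' via a token bucket."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.capacity = float(limit)
        self.tokens = float(limit)
        self.refill_rate = limit / window     # tokens replenished per second
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Return True and consume a token if the call is within budget."""
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a covenant setting, the interesting part is not the bucket itself but that the agent's compliance with it is cryptographically verifiable after the fact.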
Cheddar-bench – unsupervised benchmark for coding agents
Cheddar Bench is an unsupervised benchmark for evaluating CLI coding agents on bug detection using a "treasure hunt" workflow. Challenger agents inject bugs into repositories and generate a ground-truth manifest, while reviewer agents attempt to identify them blindly. An LLM judge scores the results by matching reviewer findings to the injected bugs based on location and mechanism. Initial testing across 50 multi-language repositories shows Claude Code leading with a 58.05% weighted detection rate, followed by Codex CLI and Gemini CLI.
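The judging step can be sketched as a matching problem: pair each reviewer finding with at most one injected bug by file and line proximity, then compute a weighted detection rate. The real benchmark uses an LLM judge that also matches on mechanism; the function below is a toy location-only stand-in with assumed field names.

```python
def score(manifest, findings, line_tol=5):
    """Match reviewer findings to injected bugs by file and line proximity.

    manifest: list of {"file", "line", "weight"} for each injected bug
    findings: list of {"file", "line"} reported by the reviewer agent
    Returns the weighted detection rate in [0, 1].
    """
    found = 0.0
    total = sum(bug["weight"] for bug in manifest)
    claimed = list(findings)
    for bug in manifest:
        for f in claimed:
            if f["file"] == bug["file"] and abs(f["line"] - bug["line"]) <= line_tol:
                found += bug["weight"]
                claimed.remove(f)        # each finding can match only one bug
                break
    return found / total if total else 0.0
```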