Saturday April 18, 2026

AI companies buy Slack data from failed startups for training, Claude Code mechanizes complex compiler proofs, and Agent Armor secures AI agent actions via a Rust runtime.

Interested in AI engineering? Let's talk

News

Scan your website to see how ready it is for AI agents

Cloudflare’s "Is Your Site Agent-Ready?" tool evaluates website compatibility with AI agents across five technical categories: discoverability, content accessibility via Markdown negotiation, bot access control, protocol discovery (including MCP and Agent Skills), and agentic commerce standards like x402. The scanner provides actionable recommendations for developers to optimize their infrastructure for LLM-driven browsing, interaction, and automated transactions.
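A few of those categories reduce to simple checks over a site's responses. The sketch below is an invented illustration (not Cloudflare's actual scanner): a pure function over already-fetched headers and well-known paths, so the names and scoring are assumptions.

```python
# Illustrative sketch (not Cloudflare's scanner): given response headers
# from a request sent with `Accept: text/markdown`, the set of
# /.well-known/ paths the site serves, and its robots.txt, report a few
# agent-readiness categories. All names here are hypothetical.

def agent_ready_report(markdown_headers, wellknown_paths, robots_txt):
    """Return a dict of category -> bool for a handful of checks."""
    report = {}
    # Content accessibility: does the site answer `Accept: text/markdown`
    # with a Markdown content type (content negotiation)?
    ctype = markdown_headers.get("Content-Type", "")
    report["markdown_negotiation"] = ctype.startswith("text/markdown")
    # Protocol discovery: an MCP endpoint advertised under /.well-known/.
    report["mcp_discovery"] = "/.well-known/mcp" in wellknown_paths
    # Bot access control: robots.txt exists and is non-empty.
    report["bot_access_policy"] = bool(robots_txt.strip())
    return report

if __name__ == "__main__":
    demo = agent_ready_report(
        {"Content-Type": "text/markdown; charset=utf-8"},
        {"/.well-known/mcp"},
        "User-agent: *\nAllow: /",
    )
    print(demo)
```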

AI companies are buying the Slack data of failed startups

AI labs are purchasing internal communication data (Slack messages, Jira tickets, and email threads) from defunct startups to populate "reinforcement learning gyms." This data is used to create high-fidelity simulated work environments for training and benchmarking AI agents.

Maine Said No to New Data Centers. Other States Are Racing to Follow

Maine has enacted the nation's first state-level moratorium on hyperscale data centers, halting approvals for facilities exceeding 20MW for 18 months. This legislative pushback, driven by concerns over grid stability and rising energy costs, targets the massive infrastructure required for training and deploying AI models. At least 12 other states are considering similar measures as the projected 165% increase in AI-driven power demand by 2030 faces growing public and political scrutiny.

Shuttered startups are selling old Slack chats and emails to AI companies

Defunct startups are monetizing their internal communication archives—including Slack logs, emails, and Jira tickets—by selling them to AI labs for LLM training. Facilitated by services like SimpleClosure, these deals provide high-quality workplace data but raise critical concerns regarding PII leakage and the efficacy of anonymization. This emerging market highlights the increasing demand for proprietary datasets to refine enterprise-grade AI models.

Atlassian is changing how we use customer data on August 17, 2026

Atlassian is updating its data practices on August 17, 2026, to utilize de-identified and aggregated customer metadata and in-app data to improve AI-driven features like Rovo and enterprise search. New administrative controls will allow organizations to manage data contribution, though metadata opt-out is exclusive to Enterprise plans. These changes leverage cross-customer usage patterns and common content structures to refine LLM performance while implementing safeguards to prevent re-identification.

Research

From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms

This study explores the "AI-as-Amplifier Paradox," where AI-driven productivity gains mask long-term "intuition rust" and skill atrophy in high-stakes professional environments. Based on longitudinal research with cancer specialists, the authors propose a sociotechnical immunity framework to preserve human expertise and agency. The framework introduces mechanisms for workers in domains like healthcare and software engineering to detect and recover from AI-induced erosion while maintaining institutional performance.

Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs

The paper evaluates the information-preserving capacity of structured representations by fine-tuning a lightweight LLM with a novel structural loss function to generate hierarchical JSON from scientific text. Semantic and lexical similarity analysis of sentences reconstructed from these JSONs demonstrates that hierarchical formats effectively retain the meaning of complex scientific discourse.
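To make the idea concrete, here is a toy version of the round trip the paper measures: a sentence encoded as nested JSON, naively linearized back into text, and scored by lexical overlap. The schema and reconstruction rule are invented for illustration; the paper's actual format and loss differ.

```python
# Hypothetical sketch of the idea: represent a sentence as nested JSON
# (subject / predicate / object with modifiers), reconstruct a sentence
# from it, and score lexical overlap as a crude preservation measure.
sentence = "The fine-tuned model preserves the meaning of complex sentences."

tree = {
    "subject": {"head": "model", "modifiers": ["The", "fine-tuned"]},
    "predicate": {"head": "preserves"},
    "object": {"head": "meaning", "modifiers": ["the"],
               "complement": {"head": "sentences",
                              "modifiers": ["of", "complex"]}},
}

def linearize(node):
    """Naive reconstruction: modifiers, then head, then complement."""
    parts = node.get("modifiers", []) + [node["head"]]
    if "complement" in node:
        parts += linearize(node["complement"])
    return parts

def reconstruct(t):
    words = (linearize(t["subject"]) + linearize(t["predicate"])
             + linearize(t["object"]))
    return " ".join(words) + "."

rebuilt = reconstruct(tree)
orig = set(sentence.lower().rstrip(".").split())
new = set(rebuilt.lower().rstrip(".").split())
overlap = len(orig & new) / len(orig)
print(rebuilt, round(overlap, 2))
```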

Machine Generated and Checked Proofs for a Verified Compiler (Experience Report)

Researchers used Claude Code (Opus 4.6) to mechanize a 7,800-line Rocq correctness proof for CertiCoq’s ANF transformation. By adapting a human-written CPS proof template, the agentic LLM completed the task in 96 hours with human guidance but no manual proof writing. This demonstrates the potential for LLMs to significantly accelerate formal verification and semantic preservation proofs in compiler construction.

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernels

Event Tensor is a compiler abstraction for dynamic megakernels that addresses kernel launch overheads and synchronization bottlenecks in LLM inference. By encoding dependencies between tiled tasks, it enables support for shape and data-dependent dynamism where traditional fusion techniques fail. The Event Tensor Compiler (ETC) leverages this abstraction to generate persistent kernels that achieve state-of-the-art serving latency and reduced system warmup overhead.
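The core dependency mechanism can be pictured with a toy scheduler: each tiled task records which tiles it waits on, and a tile fires as soon as its inputs complete, with no per-task kernel launch. This is an invented model for intuition only, not the ETC implementation.

```python
# Toy model of event-driven tile execution: tasks fire when all their
# dependencies have signaled completion. Invented for illustration.
from collections import deque

def run_tiles(deps):
    """deps: {tile: set of tiles it depends on}. Returns execution order."""
    remaining = {t: set(d) for t, d in deps.items()}
    dependents = {t: [] for t in deps}
    for t, d in deps.items():
        for u in d:
            dependents[u].append(t)
    ready = deque(t for t, d in remaining.items() if not d)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in dependents[t]:          # signal the event "t is done"
            remaining[v].discard(t)
            if not remaining[v]:
                ready.append(v)
    return order

# A 2-stage tiled pipeline: tiles b0/b1 wait on tiles of stage a.
order = run_tiles({"a0": set(), "a1": set(), "b0": {"a0"}, "b1": {"a0", "a1"}})
print(order)
```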

ReSyn: A Generalized Recursive Regular Expression Synthesis Framework

ReSyn is a synthesizer-agnostic divide-and-conquer framework designed to handle the structural complexity of real-world regex synthesis in PBE systems. When combined with Set2Regex, a parameter-efficient synthesizer that exploits permutation invariance, the framework achieves SOTA results on complex benchmarks by decomposing synthesis tasks into manageable sub-problems.
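A drastically simplified version of the divide-and-conquer idea: split each positive example on a shared delimiter, synthesize a sub-regex per field with a tiny stand-in synthesizer, and join the parts. The field synthesizer below is a trivial enumerator invented for illustration, not ReSyn's algorithm or Set2Regex.

```python
# Toy divide-and-conquer regex synthesis: decompose examples into
# fields, solve each field independently, recombine. Illustrative only.
import re

def synth_field(values):
    """Trivial sub-synthesizer: pick the narrowest class matching all."""
    for cls in (r"\d+", r"[a-z]+", r"\w+", r".+"):
        if all(re.fullmatch(cls, v) for v in values):
            return cls
    return r".+"

def synth(examples, delim="-"):
    columns = list(zip(*(e.split(delim) for e in examples)))
    parts = [synth_field(col) for col in columns]
    return re.escape(delim).join(parts)

pattern = synth(["ab-123", "cd-9"])
print(pattern)                      # one sub-regex per field, joined
assert all(re.fullmatch(pattern, e) for e in ["ab-123", "cd-9"])
```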

Code

Agent Armor, a Rust runtime for enforcing policies on AI agent actions

Agent Armor is a zero-trust governance runtime written in Rust designed to secure AI agent tool calls through an 8-layer deterministic pipeline. It intercepts agent actions—such as shell, database, and API access—to allow, block, or flag them for human review based on risk scoring and policy evaluation. The system includes MCP-aware inspection, PII and secret response scanning, and comprehensive audit logging with support for SQLite and PostgreSQL backends.
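The allow/flag/block decision flow can be sketched in a few lines, shown here in Python for brevity even though Agent Armor itself is Rust. The rules, risk weights, and thresholds below are invented for illustration and are not Agent Armor's actual policy model.

```python
# Sketch of a risk-scored tool-call gate: score an intercepted agent
# action, then allow it, flag it for human review, or block it.
# All weights and patterns here are hypothetical.
def evaluate(action):
    """action: {'tool': str, 'arg': str}. Returns 'allow'|'flag'|'block'."""
    risk = 0
    if action["tool"] == "shell":
        risk += 40
        if "rm -rf" in action["arg"]:        # destructive command pattern
            risk += 60
    elif action["tool"] in ("database", "api"):
        risk += 20
    if "password" in action["arg"].lower():  # secret-like content
        risk += 30
    if risk >= 80:
        return "block"
    if risk >= 40:
        return "flag"                        # route to human review
    return "allow"

print(evaluate({"tool": "api", "arg": "GET /users"}))      # low risk
print(evaluate({"tool": "shell", "arg": "ls /tmp"}))       # review
print(evaluate({"tool": "shell", "arg": "rm -rf /data"}))  # denied
```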

Codg, an easy AI agent system for code and work: automatic, asynchronous, concurrent, and efficient

Codg is an AI agent system designed for code and work automation, featuring asynchronous, concurrent, and high-performance operations. It supports multiple LLM providers via API or OAuth, alongside local models (e.g., GGUF via openai-compat/claude-compat). The system offers an interactive TUI across various OS and terminals, including web and desktop (BETA), with features like diff viewing. Codg is highly configurable via codg.toml for providers, agents, skills, local LLMs (llama.cpp), and permissions, and provides CLI commands for managing plugins, messaging integrations, and sandbox execution.

LLM wiki daemon with per-wiki filesystem isolation

Memex is an LLM runtime that replaces ephemeral RAG workflows with a persistent, compounding personal wiki synthesized from raw sources. It uses Claude to incrementally build and maintain a structured collection of interlinked markdown files, automating the bookkeeping of cross-references, entity updates, and knowledge synthesis. The system is CLI-first, avoids vector databases by leveraging standard file tools, and ensures security through Linux mount namespace isolation.
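The cross-reference bookkeeping such a daemon automates looks roughly like this: scan each markdown page for wikilinks and maintain a backlink index. The `[[target]]` link syntax and flat page layout are assumptions for illustration, not Memex's actual on-disk format.

```python
# Illustrative backlink bookkeeping for a markdown wiki: find [[links]]
# in each page and invert them into a backlink index. Hypothetical layout.
import re

def backlinks(pages):
    """pages: {name: markdown text}. Returns {name: sorted linker names}."""
    index = {name: [] for name in pages}
    for name, text in pages.items():
        for target in re.findall(r"\[\[([^\]]+)\]\]", text):
            if target in index and name not in index[target]:
                index[target].append(name)
    return {k: sorted(v) for k, v in index.items()}

pages = {
    "alice": "Works with [[bob]] on [[memex]].",
    "bob": "See [[memex]].",
    "memex": "A personal wiki.",
}
print(backlinks(pages))
```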

Open Source AI-powered screen recording

Coherence Studio is an open-source screen recording and editing suite that automates post-production using AI. It integrates Whisper for auto-captions and features AI-driven smart trimming, auto-narration, and cursor-tracking zoom. The tool supports native capture via ScreenCaptureKit and WGC, offering one-click workflows for speed ramps, background customization, and professional exports.

How context engineering works, a runnable reference

Context engineering treats context as a version-controlled engineering artifact rather than manual prompts to ensure LLM outputs align with organization-specific standards. It extends the RAG pattern by adding output and enforcement layers, creating a five-component pipeline: corpus, retrieval, injection, output, and enforcement. This reference implementation on Amazon Bedrock demonstrates how to move beyond generic AI generations toward governed, reviewable artifacts based on local ADRs and codebases.
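The five-component pipeline can be sketched end to end with stubs. Everything below is an invented minimal model: the model call is a stand-in (a real version would invoke Bedrock), and the corpus entries, helper names, and enforcement rule are assumptions.

```python
# Minimal sketch of the pipeline named above:
# corpus -> retrieval -> injection -> output -> enforcement.
def retrieve(corpus, query):
    """Retrieval: pick corpus entries sharing a keyword with the query."""
    words = set(query.lower().split())
    return [doc for doc in corpus if words & set(doc.lower().split())]

def inject(query, context):
    """Injection: prepend retrieved standards to the prompt."""
    return "Standards:\n" + "\n".join(context) + "\nTask: " + query

def generate(prompt):
    """Output: stand-in for the model call (e.g. a Bedrock invoke)."""
    return "def handler(): pass  # follows ADR-12 naming"

def enforce(output, required):
    """Enforcement: reject outputs that ignore a required marker."""
    return output if required in output else None

corpus = ["ADR-12 handlers use snake_case naming", "Style guide for indent"]
ctx = retrieve(corpus, "write a request handler per adr-12")
result = enforce(generate(inject("write a handler", ctx)), "ADR-12")
print(result is not None, len(ctx))
```

The enforcement layer is the part that distinguishes this from plain RAG: a generation that ignores the injected standards is rejected rather than returned.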