Tuesday — June 9, 2026

Apple Intelligence debuts a revamped Siri with cross-app orchestration, FP8 emulates FP64 accuracy for HPC, and SoulsOnly.tff provides a font that is human-readable but breaks AI scrapers.


News

Siri AI

Apple Intelligence introduces a suite of generative AI features integrated across iOS, macOS, and iPadOS, featuring a revamped Siri AI capable of personal context awareness and cross-app orchestration. The architecture leverages on-device processing and Private Cloud Compute to execute complex LLM tasks while maintaining verifiable privacy. Key technical capabilities include multimodal Visual Intelligence, system-wide Writing Tools for text synthesis, and developer access via the Foundation Models framework and App Intents.

Apple reveals new AI architecture built around Google Gemini models

Apple has overhauled Apple Intelligence with a new architecture featuring foundation models co-developed with Google using Gemini technology. The system employs a central orchestrator to coordinate multimodal reasoning and context-aware tasks across on-device hardware and Private Cloud Compute. This update enables advanced image generation, enhanced NLU, and verifiable privacy protocols that, according to Apple, keep user data inaccessible to Apple and third parties.

AI is slowing down

The article argues that the AI industry is in a financial bubble: it would need an estimated $2 trillion in annual revenue by 2030 to justify current data center capex and the compute commitments made by OpenAI and Anthropic. Enterprise adoption is slowing as the shift to token-based billing highlights a lack of clear ROI, leading many companies to cap their spend on LLM services. In this account, the trajectory is unsustainable because it relies on a "circular economy" of hyperscaler debt and growth projections that far outpace current market demand for generative AI.

Apple Core AI Framework

Core AI is a framework for deploying and running modern AI architectures on Apple silicon, leveraging the CPU, GPU, and Neural Engine for optimized on-device inference. It features a Swift API for model management and includes tools for PyTorch model conversion to the .aimodel format, AOT compilation, and tensor-level debugging. While Core ML remains the standard for tabular and decision tree models, Core AI provides deeper control over model specialization and hardware-accelerated performance.

FrontierCode

FrontierCode is a new benchmark from Cognition designed to evaluate LLM code generation based on production-level "mergeability" and quality rather than simple functional correctness. It uses novel evaluation methods (reverse-classical tests, code scope enforcement, and adaptive LLM-based grading) to achieve an 81% lower false positive rate than SWE-Bench Pro. Initial results indicate a significant performance gap in high-difficulty tasks, with Claude Opus 4.8 leading the "Diamond" subset at 13.4%, followed by GPT-5.5 and Gemini 3.1 Pro.

Research

Configuring Agentic AI Coding Tools: An Exploratory Study

Researchers analyzed configuration mechanisms for agentic AI coding tools across 2,853 GitHub repositories, identifying eight patterns ranging from static context to executable integrations. The study highlights AGENTS.md as an emerging interoperable standard for Context Files, which currently dominate the landscape over more advanced features like Skills and Subagents. Findings indicate that while adoption of complex configurations remains low, Claude Code users employ the broadest range of these mechanisms.

Do AI tutors empower or enslave learners? [pdf]

This paper examines the risks of cognitive atrophy and loss of agency resulting from unchecked AI integration in education. It advocates for intentional, transparent frameworks to ensure AI serves as a supportive tool rather than a cognitive shortcut, addressing ethical, privacy, and pedagogical concerns. The goal is to maintain critical thinking and academic integrity through informed AI usage.

FP8 Is All You Need (Part 1): Debunking Hardware FP64 as the HPC Holy Grail

The paper argues that native FP64 silicon is no longer essential for HPC, as NVIDIA’s B300 and Rubin architectures deprioritize it in favor of high-throughput FP8. By leveraging the Ozaki Scheme II and register-level fusion, FP8 tensor cores can emulate FP64 accuracy at memory-roof speeds, reaching ~500 TFLOPS on B300. According to the authors, their Tensor-Memory Equilibrium (TME) model shows this FP8-based reconstruction outperforming native FP64 by over an order of magnitude, which they argue makes AI-optimized hardware sufficient for all production HPC workloads.
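To make the idea of rebuilding high-precision results from narrow operands more concrete, here is a minimal Python sketch. It is not the paper's Ozaki Scheme II or its register-level fusion; it uses a simpler slice-based splitting, and the function names, the 3-bit slice width, and the slice count are all assumptions chosen for illustration. Each slice holds integers no larger than 8 in magnitude, which is the kind of value a narrow format can hold exactly, so every slice-by-slice product is an exact integer product.

```python
def split_into_slices(matrix, num_slices, bits):
    """Split a matrix with entries in (-1, 1) into small-integer slices.

    Slice s holds integers q so that
    matrix ~= sum over s of q_s * 2**(-bits * (s + 1)).
    """
    slices = []
    remainder = [row[:] for row in matrix]
    for s in range(num_slices):
        scale = 2.0 ** (bits * (s + 1))
        q = [[round(x * scale) for x in row] for row in remainder]
        # Subtract the part captured by this slice; the rest goes to the next one.
        remainder = [[x - qi / scale for x, qi in zip(row, qrow)]
                     for row, qrow in zip(remainder, q)]
        slices.append(q)
    return slices


def int_matmul(a, b):
    """Exact matrix product of two integer matrices (lists of lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def emulated_matmul(a, b, num_slices=16, bits=3):
    """Approximate a float64 product from many narrow integer products."""
    a_slices = split_into_slices(a, num_slices, bits)
    b_slices = split_into_slices(b, num_slices, bits)
    rows, cols = len(a), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i, qa in enumerate(a_slices):
        for j, qb in enumerate(b_slices):
            partial = int_matmul(qa, qb)             # exact integer arithmetic
            weight = 2.0 ** (-bits * (i + j + 2))    # undo the two slice scalings
            for r in range(rows):
                for c in range(cols):
                    result[r][c] += partial[r][c] * weight
    return result


a = [[0.1, -0.7], [0.33, 0.9]]
b = [[0.25, 0.6], [-0.45, 0.05]]
approx = emulated_matmul(a, b)
```

With 16 slices of 3 bits, each input is captured to roughly 48 bits, so the result should agree closely with an ordinary double-precision product. The cost is 256 narrow products per multiplication, which is why throughput on the narrow path matters so much to the paper's argument.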

Semantics for 2D Rasterization

μSkia is a formal semantics for the Skia graphics library mechanized in Lean, designed to optimize inefficient rasterization instruction sequences. By identifying and verifying replacements for sub-optimal patterns in Google Chrome, the authors developed an optimizer that yields an 18.7% speedup on GPU backends with negligible overhead. The framework ensures end-to-end correctness through translation validation of optimization traces within the Lean environment.

Controllable Generative Modeling in Minecraft by Training on Billions of Cubes

Dream-Cubed introduces a large-scale dataset of Minecraft worlds at voxel resolution, used to train models for efficient, semantically grounded generation of interactive 3D environments. This work presents the first large-scale study of 3D diffusion models for voxel generation, analyzing discrete and continuous formulations, data compositions, and architectural designs. The models operate directly in block space, supporting interactive workflows like inpainting and outpainting, and are evaluated using an adapted FID metric and human preference studies.

Code

SoulsOnly.tff – A font for humans not AI and keyboard firmware to type in it

Souls Only is a project designed to create human-readable but machine-illegible content through font and audio ciphers. The font system decouples the stored character stream from the rendered glyphs using GSUB ligatures and a variable font axis (REVL) to reconstruct text from randomized ASCII noise, effectively breaking scrapers and OCR. The audio sibling utilizes phase-scrambling and targeted adversarial perturbations to mislead STT models like Whisper-tiny while remaining clear to human ears.
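The decoupling of stored characters from rendered glyphs can be illustrated with a toy model. The sketch below is not the project's font logic: it stands in for GSUB ligature substitution with a plain lookup table, ignores the REVL axis entirely, and every name in it (`build_ligature_table`, `encode`, `render`) is hypothetical. It only shows why a scraper reading the stored stream sees noise while a renderer applying the table shows the intended text.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "


def build_ligature_table(alphabet=ALPHABET, code_len=3, seed=0):
    """Map random fixed-length letter codes to the glyphs they should render as."""
    rng = random.Random(seed)
    codes = set()
    while len(codes) < len(alphabet):
        codes.add("".join(rng.choice(string.ascii_lowercase) for _ in range(code_len)))
    ordered = sorted(codes)      # fix an order so the table is reproducible
    rng.shuffle(ordered)
    return dict(zip(ordered, alphabet))


def encode(text, table):
    """Produce the stored character stream (what a scraper would read)."""
    inverse = {glyph: code for code, glyph in table.items()}
    return "".join(inverse[ch] for ch in text)


def render(stored, table, code_len=3):
    """Apply the substitution table (what a reader would see on screen)."""
    chunks = [stored[i:i + code_len] for i in range(0, len(stored), code_len)]
    return "".join(table[chunk] for chunk in chunks)


table = build_ligature_table()
stored = encode("hello world", table)
visible = render(stored, table)
```

Here `stored` is a 33-character run of unrelated letters, while `visible` is the original phrase. A fixed table like this one is trivially reversible by anyone who extracts it, so the sketch says nothing about how robust the real font is.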

AI Pair Programmer for Emacs

CodeTutor is an Emacs package that integrates local LLM backends like Codex and Pi to provide a read-only pair-programming tutor experience. It leverages tree-sitter, project indexing, and unified diffs on save to offer contextual feedback and architectural guidance without directly modifying source files. The tool manages context window constraints through configurable byte limits and maintains persistent project memory via a dedicated architecture file.
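The combination of diffs on save and a byte limit can be sketched in a few lines. This is an illustration in Python using the standard `difflib` module, not CodeTutor's own Emacs Lisp; the function name `diff_on_save`, the `max_bytes` parameter, and the truncation marker are assumptions.

```python
import difflib


def diff_on_save(old_text, new_text, path="buffer", max_bytes=2000):
    """Build a unified diff between two versions and cap its size in bytes."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    diff = "".join(diff_lines)
    encoded = diff.encode("utf-8")
    if len(encoded) <= max_bytes:
        return diff
    # Cut at the byte limit; drop any multi-byte character split by the cut.
    truncated = encoded[:max_bytes].decode("utf-8", errors="ignore")
    return truncated + "\n[diff truncated]\n"


before = "def area(w, h):\n    return w + h\n"
after = "def area(w, h):\n    return w * h\n"
feedback_context = diff_on_save(before, after, path="geometry.py")
```

Sending only the diff, rather than the whole file, keeps the prompt small, and the byte cap gives a hard upper bound on what a single save can contribute to the context window.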

Caddy Defender Plugin: return garbage responses for AI crawlers

Caddy Defender is a middleware plugin for the Caddy web server designed to filter or manipulate traffic from AI scrapers and services using embedded IP ranges for providers like OpenAI, DeepSeek, and Mistral. It offers specialized responder backends including "garbage" data injection to pollute AI training sets and "tarpitting" to stall bots via slow data streaming. The plugin utilizes an efficient routing table implementation for high-performance IP matching and is configurable via standard Caddyfile syntax.
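The core decision the plugin makes, matching a client IP against known ranges and swapping the response, can be sketched as follows. This is a Python illustration, not the plugin's Go code or its Caddyfile syntax. The CIDR ranges are reserved documentation addresses standing in for real provider ranges, the function names are hypothetical, and the linear scan is a simplification of the routing-table lookup the plugin is described as using.

```python
import ipaddress
import random
import string

# Placeholder ranges (reserved for documentation), not real crawler ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]


def is_blocked(ip):
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)


def garbage_body(length=64, seed=None):
    """Generate meaningless text to serve in place of the real page."""
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_letters + " ") for _ in range(length))


def handle(ip, real_body):
    """Serve garbage to listed ranges and the real body to everyone else."""
    if is_blocked(ip):
        return garbage_body()
    return real_body


crawler_response = handle("198.51.100.7", "<h1>Welcome</h1>")
visitor_response = handle("203.0.113.9", "<h1>Welcome</h1>")
```

A tarpit responder would follow the same branch but stream its output slowly instead of returning it at once.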

Avibe – your AI agent lives on your machine, reachable from your phone

Avibe is a local-first Agent OS that enables remote access and orchestration of CLI-based agents like Claude Code, Codex, and OpenCode. It provides a secure control plane via a browser workbench or chat integrations (Slack, Discord, Telegram) while keeping code, keys, and execution on the local machine. Key features include an "Agent Harness" for background task scheduling and monitoring, cross-agent skill management, and secure tunneling for remote steering without exposing public inbound ports.

Deep Memory – Vocabulary-driven graph memory for AI agents

@utaba/deep-memory is a vocabulary-driven graph memory library that provides AI agents with structured, persistent knowledge through schema governance. It eliminates cold-start inconsistencies and reduces token overhead by exposing a compact schema of entity types and relationship constraints for agents to read, write, and traverse. The ecosystem includes an MCP server for tool-based graph interaction, an AI-driven document indexing pipeline, and pluggable storage providers for SQL Server, Neo4j, and CosmosDB.
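Schema governance of this kind can be illustrated with a small in-memory graph that rejects anything outside its declared vocabulary. The sketch below is Python and does not use the library's actual API; the `GraphMemory` class, its method names, and the example entity and relationship types are all hypothetical.

```python
class GraphMemory:
    """Toy graph store that only accepts declared entity and relationship types."""

    def __init__(self, entity_types, relationships):
        self.entity_types = set(entity_types)
        # relationship name -> (required source type, required target type)
        self.relationships = dict(relationships)
        self.nodes = {}
        self.edges = []

    def schema(self):
        """Compact description an agent could read before writing anything."""
        rels = [f"{src} -{name}-> {dst}"
                for name, (src, dst) in sorted(self.relationships.items())]
        return {"entities": sorted(self.entity_types), "relationships": rels}

    def add_entity(self, entity_id, entity_type):
        if entity_type not in self.entity_types:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.nodes[entity_id] = entity_type

    def relate(self, source, relation, target):
        if relation not in self.relationships:
            raise ValueError(f"unknown relationship: {relation}")
        src_type, dst_type = self.relationships[relation]
        if self.nodes.get(source) != src_type or self.nodes.get(target) != dst_type:
            raise ValueError(f"{relation} must link {src_type} to {dst_type}")
        self.edges.append((source, relation, target))

    def neighbors(self, source, relation):
        """Traverse one hop along a named relationship."""
        return [t for s, r, t in self.edges if s == source and r == relation]


memory = GraphMemory(
    entity_types=["Person", "Company"],
    relationships={"works_at": ("Person", "Company")},
)
memory.add_entity("alice", "Person")
memory.add_entity("acme", "Company")
memory.relate("alice", "works_at", "acme")
```

Because every write is checked against the same vocabulary, two agents (or the same agent in two sessions) cannot record the same fact under differently named types, which is the consistency benefit the summary describes.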