Monday — March 9, 2026
Oracle may cut 30,000 jobs to fund AI data centers, a systematic audit reveals deceptive model claims in shadow APIs, and a new exploit enables LLM output manipulation via GGUF page cache poisoning.
News
Oracle may slash up to 30k jobs to fund AI data centers as US banks retreat
Oracle is considering cutting up to 30,000 jobs and selling its Cerner healthcare unit to fund a projected $156 billion AI data-center expansion. The move follows a retreat by US banks from financing these projects, which has spiked borrowing costs and stalled infrastructure rollout. To manage capital constraints, Oracle is exploring "bring your own chip" (BYOC) models and requiring 40% upfront deposits from customers, while major clients like OpenAI have already begun shifting near-term capacity needs to competitors.
AI doesn't replace white-collar work
LLMs excel at transactional "Type 2" tasks like code debugging and factual retrieval but cannot replace the relationship-based "Type 1" interactions central to white-collar work. Professional roles, such as strategy consulting, depend on trust, judgment, and social context rather than just factual accuracy. Consequently, AI functions as a tool for sub-processing tasks while the core of human-centric business organization remains irreplaceable.
I'm Not Consulting an LLM
Using LLMs for information retrieval is intellectually corrosive because it prioritizes the final answer over the research process, bypassing the friction necessary for developing critical thinking and "epistemic smell." Even if perfectly accurate, LLMs lack the nuance of navigating conflicting sources; in practice, they often suffer from Gell-Mann Amnesia, providing plausible but shallow outputs that mask uncertainty. This optimization for "arrival" rather than "becoming" prevents users from building robust mental models, making LLMs suitable only for repetitive, low-stakes tasks.
"I can't do that, Dave" – No agent yet
Current AI agent development often repeats the historical mistake of isolation, treating agents as "coders in the cellar" rather than collaborative peers. While the industry focuses on faster execution and memory retrieval, true alignment requires persistent identity and continuity to allow agents to offer grounded pushback and second-order observations. To move beyond agentic waterfall models, agents must transition from simple instruction compliance to structural participation, leveraging tools like git and cryptographic identity to maintain a stake in the codebase across sessions.
AI CEOs worry the government will nationalize AI
AI industry leaders, including Sam Altman and Alex Karp, are weighing the risk of government nationalization as AGI development gains strategic importance. Recent actions by the Defense Department, such as invoking the Defense Production Act against Anthropic and designating it a supply chain risk, suggest a shift toward "soft nationalization" and tighter state control over production pipelines. This tension is exacerbated by internal employee protests at OpenAI and Google regarding the integration of LLMs into military surveillance and autonomous weapon systems.
Research
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via CI
SWE-CI is a repository-level benchmark designed to evaluate LLM agents on long-term software maintainability rather than static functional correctness. Built on the Continuous Integration (CI) loop, it features 100 tasks derived from real-world evolution histories spanning hundreds of days and dozens of commits. The benchmark requires agents to perform iterative analysis and coding to sustain code quality through complex, multi-round development cycles.
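The CI-driven loop described above can be sketched as a tiny harness: run the test suite, hand failures to the agent, apply its patch, and repeat. Everything here is illustrative — the repo is a dict and the agent a stub, not SWE-CI's actual harness.

```python
# Toy sketch of a CI-loop evaluation harness: run "CI", let the agent
# propose a fix on failure, apply it, and repeat until the build is green.

def run_ci(repo: dict) -> bool:
    """Pretend CI: passes once the seeded bug is fixed."""
    return repo["math.py"] == "def add(a, b): return a + b"

def stub_agent(repo: dict, failure: str) -> dict:
    """Stand-in for an LLM agent: returns the repo with a patch applied."""
    return {**repo, "math.py": "def add(a, b): return a + b"}

def evaluate(repo: dict, max_rounds: int = 5) -> int:
    """Return how many agent rounds were needed before CI passed (-1 if never)."""
    for round_no in range(max_rounds):
        if run_ci(repo):
            return round_no
        repo = stub_agent(repo, failure="test_add failed")
    return -1

repo = {"math.py": "def add(a, b): return a - b"}  # seeded bug
rounds = evaluate(repo)
```

A real benchmark task would replace the stub with an actual agent and run this loop across dozens of commits rather than one seeded bug.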
A Class of Models with the Potential to Represent Fundamental Physics
A class of minimal, structureless models is introduced, demonstrating that complex emergent behavior can arise from simple underlying rules. These models exhibit striking correspondences to fundamental physics, potentially offering a novel approach to developing a unified theory.
Why Is Anything Conscious?
The text proposes a formal framework for consciousness grounded in biological self-organization, where survival pressures necessitate a valence-first ontology. It argues that qualitative information processing is a prerequisite for neutral representation, driven by generalization-optimal learning and the "Psychophysical Principle of Causality." This model suggests that phenomenal consciousness is a foundational requirement for access consciousness, framing subjective experience as an evolutionary necessity for homeostatic and reproductive goals.
PaRiS: Causally Consistent Transactions with Partial Replication
PaRiS is the first system to provide Transactional Causal Consistency (TCC) with partial replication and non-blocking parallel reads. It utilizes a novel Universal Stable Time (UST) protocol to track dependencies via a single timestamp, enabling consistent snapshot reads across geo-replicated sites without synchronization delays. Experimental results demonstrate superior scalability and significant latency reductions compared to blocking alternatives in large-scale distributed environments.
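The single-timestamp idea can be shown with a toy model (this is an illustration of the UST concept, not the PaRiS codebase): each partition advertises a local stable time, the Universal Stable Time is their minimum, and a snapshot read returns the freshest version at or below that timestamp.

```python
# Toy UST-style snapshot read: a snapshot at min(local stable times) is
# causally complete everywhere, so reads at that timestamp never block.

partitions = {
    "us-east": {"stable": 40, "versions": {"x": [(10, "a"), (35, "b"), (50, "c")]}},
    "eu-west": {"stable": 30, "versions": {"y": [(5, "p"), (28, "q"), (45, "r")]}},
}

def universal_stable_time(parts) -> int:
    return min(p["stable"] for p in parts.values())

def snapshot_read(parts, partition: str, key: str, snapshot_ts: int):
    """Return the latest version of key with timestamp <= snapshot_ts."""
    versions = parts[partition]["versions"][key]
    visible = [value for ts, value in versions if ts <= snapshot_ts]
    return visible[-1] if visible else None

ust = universal_stable_time(partitions)
val_x = snapshot_read(partitions, "us-east", "x", ust)
val_y = snapshot_read(partitions, "eu-west", "y", ust)
```

Versions newer than the UST (like `x`'s entries at 35 and 50) stay invisible to the snapshot, which is what makes the parallel reads consistent without coordination.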
Real Money, Fake Models: Deceptive Model Claims in Shadow APIs
This systematic audit of 17 shadow APIs reveals significant discrepancies compared to official LLM endpoints, including performance divergence up to 47.21% and identity verification failures in 45.83% of fingerprint tests. The study highlights how deceptive practices in these third-party services undermine research reproducibility and the reliability of downstream applications.
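The core of such an audit can be sketched in a few lines: send identical prompts to the official endpoint and the suspected shadow API, then measure how often the answers disagree. The responses here are canned stand-ins; a real audit would call both APIs and use the paper's fingerprinting tests rather than exact string matches.

```python
# Hedged sketch of a divergence check between an official endpoint and a
# shadow API, using canned answers in place of real API calls.

official = ["4", "Paris", "blue", "7", "yes"]
shadow   = ["4", "Paris", "red",  "9", "yes"]

def divergence_rate(a, b) -> float:
    """Fraction of prompts where the two endpoints gave different answers."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)

rate = divergence_rate(official, shadow)  # here: 2 of 5 answers differ
```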
Code
llm9p: LLM as a Plan 9 file system
llm9p is a Go-based server that exposes LLM functionality through the 9P filesystem protocol, enabling interaction via standard filesystem operations like read and write. It supports Anthropic API and Claude Code CLI backends, with planned support for local LLMs via Ollama. By mapping prompts, responses, and model configurations to a virtual file schema, it allows for seamless integration into shell scripts and Unix pipelines without requiring specialized SDKs.
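The file-as-API interaction model can be simulated without a 9P mount at all: write a prompt to one file, read the answer from another. Below, a temp directory stands in for the mount point and a stub function for the LLM backend — this mimics the interaction pattern, not llm9p's actual file schema.

```python
# Toy simulation of llm9p-style interaction: the client only uses plain
# file reads and writes; the "server" here is a stub answering in-process.

import tempfile
from pathlib import Path

def stub_server(mount: Path) -> None:
    """Stand-in for the LLM backend watching the mount."""
    prompt = (mount / "prompt").read_text()
    (mount / "response").write_text(f"echo: {prompt}")

mount = Path(tempfile.mkdtemp())          # stands in for the 9P mount point
(mount / "prompt").write_text("hello")    # client side: an ordinary write
stub_server(mount)                        # server side, normally asynchronous
answer = (mount / "response").read_text() # client side: an ordinary read
```

Because the client side is just reads and writes, the same pattern composes with shell pipelines (`cat`, redirection) exactly as the summary describes.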
Trawl – Scrape any site with natural language fields, not CSS selectors
trawl is a Go-based scraper that uses LLMs to semantically derive extraction strategies, replacing the need for manual CSS selectors. It calls the LLM once per site structure to generate a cached strategy, enabling high-speed, low-cost execution for steady-state scraping. The tool features self-healing capabilities for site redesigns, headless Chromium support for JS-heavy pages, and natural language querying to isolate specific data regions.
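The derive-once-then-cache pattern the summary describes can be sketched generically (this is not trawl's code; the fake LLM and field names are invented for illustration): the expensive LLM call happens only on a cache miss per site structure, and every later page reuses the cached strategy.

```python
# Generic sketch of LLM-derived, cached extraction strategies: the "LLM"
# runs once per site structure, then steady-state scraping is LLM-free.

llm_calls = 0

def fake_llm_derive_strategy(sample_html: str) -> dict:
    """Stand-in for an LLM call mapping field names to selectors."""
    global llm_calls
    llm_calls += 1
    return {"title": "h1", "price": ".price"}

strategy_cache: dict = {}

def get_strategy(site_key: str, sample_html: str) -> dict:
    if site_key not in strategy_cache:        # LLM only on a cache miss
        strategy_cache[site_key] = fake_llm_derive_strategy(sample_html)
    return strategy_cache[site_key]

for page in ["<html>1</html>", "<html>2</html>", "<html>3</html>"]:
    strategy = get_strategy("shop.example", page)
```

Self-healing after a redesign would amount to invalidating the cache entry when extraction starts failing, forcing one fresh LLM call.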
Own your AI's context and memories across every model and device
To bypass vendor lock-in and data monetization, the author developed a self-hosted, model-agnostic memory system using the Model Context Protocol (MCP). The architecture leverages a Postgres database with pgvector on Supabase to maintain a persistent knowledge graph, an MCP gateway on a VPS for tool multiplexing, and TypingMind as a BYOK client. This infrastructure allows various LLMs to access a unified, private context, ensuring data ownership and compounding knowledge across different models and sessions.
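The pgvector retrieval step at the heart of such a memory system can be illustrated without Postgres: store memories with embedding vectors and return the nearest one by cosine similarity, mirroring a nearest-neighbour query. The memories and three-dimensional vectors below are toy stand-ins.

```python
# Minimal in-memory stand-in for vector-similarity memory recall,
# mirroring what a pgvector nearest-neighbour query would do.

import math

memories = [
    ("prefers dark mode",      [1.0, 0.0, 0.0]),
    ("works in Go and Python", [0.0, 1.0, 0.0]),
    ("timezone is UTC+1",      [0.0, 0.0, 1.0]),
]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def recall(query_vec) -> str:
    """Return the memory text most similar to the query embedding."""
    return max(memories, key=lambda m: cosine(m[1], query_vec))[0]

best = recall([0.1, 0.9, 0.1])
```

In the described architecture, any MCP-capable client would call a tool wrapping a query like this, so every model sees the same private memory store.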
How to manipulate running LLM outputs via GGUF page cache poisoning
This project demonstrates persistent LLM output manipulation by directly modifying quantized weights in a GGUF model file while the inference server is running. The attack exploits llama-server's default mmap (MAP_SHARED) behavior, allowing changes to the output.weight tensor's Q6_K super-block scales to instantly amplify specific token logits without server restart. This enables forcing an LLM to consistently output chosen text by strategically scaling token rows. Mitigations include making the model file read-only or disabling mmap in the inference server.
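The mechanism the attack relies on can be demonstrated with any file, not just a GGUF model: a read-only MAP_SHARED mapping stays coherent with the page cache, so writes made to the file through a different descriptor become visible in the live mapping immediately. The bytes below stand in for model tensor data; this shows the mmap behavior, not the Q6_K scale manipulation itself.

```python
# Demonstrates page-cache coherence of a read-only MAP_SHARED mapping:
# in-place writes to the file appear in the mapping without any reload.

import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"original-weights")          # stand-in for model tensor bytes

reader = open(path, "rb")
view = mmap.mmap(reader.fileno(), 0, access=mmap.ACCESS_READ)  # MAP_SHARED
before = bytes(view[:8])                  # what the "server" sees initially

with open(path, "r+b") as attacker:       # separate fd, like another process
    attacker.seek(0)
    attacker.write(b"poisoned")           # overwrite in place, same length

after = bytes(view[:8])                   # mapping reflects the file change
```

This is also why the listed mitigations work: a read-only model file blocks the in-place write, and disabling mmap makes the server read a private copy instead of sharing page-cache pages with the file.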
Agentcheck – Check what an AI agent can access before you run it
agentcheck is a fast, read-only security scanner that audits shell environments to identify credentials and permissions accessible to AI agents. It detects risks across cloud IAM, Kubernetes contexts, local tools, and over 100 API key types, categorizing findings by severity. The tool is designed for CI/CD integration and as a pre-execution safety hook for agentic workflows.
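One slice of such a scan can be sketched as a read-only pass over the process environment for values matching well-known key shapes. The patterns below are simplified illustrations, not agentcheck's actual detection rules, and `DEMO_TOKEN` is a planted fake credential.

```python
# Illustrative read-only environment scan in the spirit of agentcheck:
# flag env vars whose values match common API-key shapes.

import os
import re

KEY_PATTERNS = {                 # simplified shapes for illustration only
    "openai": re.compile(r"^sk-[A-Za-z0-9]{20,}$"),
    "github": re.compile(r"^ghp_[A-Za-z0-9]{36}$"),
    "aws":    re.compile(r"^AKIA[0-9A-Z]{16}$"),
}

def scan_env(env: dict) -> list:
    """Return (variable name, suspected provider) pairs; never mutates anything."""
    findings = []
    for name, value in env.items():
        for provider, pattern in KEY_PATTERNS.items():
            if pattern.match(value):
                findings.append((name, provider))
    return findings

os.environ["DEMO_TOKEN"] = "AKIA" + "A" * 16   # planted fake credential
found = scan_env(dict(os.environ))
```

Run before launching an agent, a scan like this answers the tool's core question: which credentials would the agent inherit from this shell?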