Monday — March 2, 2026
microgpt distills a GPT into 200 lines of Python, research reveals GPT detectors are biased against non-native writers, and Nobulex launches a cryptographic protocol to hold AI agents accountable.
Interested in AI engineering? Let's talk
News
microgpt
microgpt is a 200-line, dependency-free Python script that implements the complete training and inference pipeline for a GPT. It features a custom scalar-level autograd engine, a character-level tokenizer, and a simplified GPT-2 architecture incorporating multi-head attention, MLP blocks, and RMSNorm. The project distills the algorithmic essence of LLMs by implementing the Adam optimizer and cross-entropy loss from scratch, providing a minimal reference for how these models function without the complexity of modern tensor libraries.
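The heart of such a project is the scalar-level autograd engine. As an illustration of the idea (not microgpt's actual code; the `Value` class name and operator set are assumed), a minimal version records each operation and replays the chain rule in reverse:

```python
class Value:
    """A scalar that remembers the ops applied to it, for backprop."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._children = _children
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad        # d(a+b)/da = 1
            other.grad += out.grad       # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# d(x*y + x)/dx = y + 1 = 5
x, y = Value(3.0), Value(4.0)
z = x * y + x
z.backward()
print(x.grad)  # 5.0
```

Everything else in the pipeline (attention, Adam, cross-entropy) is ultimately built on scalars like these, which is what makes a dependency-free implementation feasible at this size.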
I built a demo of what AI chat will look like when it's “free” and ad-supported
This satirical demo explores monetization strategies for LLMs to offset high compute costs, showcasing ad patterns like sponsored responses, contextual text ads, and freemium gating. It serves as an educational tool for developers and PMs to evaluate the UX, privacy trade-offs, and revenue models (CPM/CPC) of ad-supported AI. The platform integrates a live LLM with scripted ad units to simulate the impact of advertising on response quality and user experience.
AI Made Writing Code Easier. It Made Being an Engineer Harder
AI has accelerated code production but increased engineering complexity by raising baseline expectations and inducing "workload creep." Engineers now face a "supervision paradox" where reviewing context-free AI output is more cognitively demanding than manual implementation, leading to higher burnout. This shift expands the engineering scope into product and architecture while simultaneously eroding the traditional training grounds for junior developers.
10-202: Introduction to Modern AI (CMU)
Zico Kolter’s "Modern AI" course at CMU, also offered as a free online version, provides a technical deep dive into building LLMs from scratch. The curriculum spans supervised learning, Transformer architectures, and post-training techniques like SFT, RL, and alignment. Through PyTorch-based assignments, students implement everything from basic neural networks to full LLM inference and reasoning models.
AI is making junior devs useless
The proliferation of LLMs risks creating "shallow competence" in junior developers by bypassing the critical struggle and failure pattern recognition required for deep architectural intuition. To build senior-level expertise, developers should manually debug problems before using AI, study system post-mortems, and prioritize understanding the trade-offs of every line of code. Rather than using LLMs for immediate answers, engineers should prompt for reasoning and treat the technology as a tutor to ensure they can critically evaluate and defend their technical decisions.
Research
GPT detectors are biased against non-native English writers (2023)
Research reveals that GPT detectors consistently misclassify non-native English writing as AI-generated while accurately identifying native samples, suggesting a systemic bias against text with limited linguistic variability, a hallmark of non-native writing. Furthermore, simple prompting strategies can effectively bypass these detectors, highlighting significant robustness issues and ethical concerns regarding their use in evaluative or educational settings.
User Privacy: An Analysis of Frontier LLM Privacy Policies (2025)
An analysis of six frontier AI developers reveals that user chat data, including sensitive PII and uploaded files, is utilized for model training by default. The study identifies critical privacy risks such as indefinite data retention, the inclusion of children's data, and cross-product data harvesting. These findings highlight a lack of transparency in LLM training pipelines and emphasize the need for improved consent mechanisms and regulatory oversight.
Latent-Space Communication in Heterogeneous Multi-Agent Systems
Vision Wormhole is a framework for model-agnostic, text-free communication in multi-agent systems (MAS) that repurposes the visual pathways of VLMs as a universal interface. The system utilizes a Universal Visual Codec and a hub-and-spoke topology to map heterogeneous reasoning traces into a shared latent space, reducing alignment complexity to O(N). This approach minimizes runtime overhead and information loss compared to discrete text communication while maintaining reasoning fidelity across diverse model families.
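The O(N) claim follows from the topology: each model gets one adapter pair into the shared space, instead of a dedicated translator per (sender, receiver) pair. A toy sketch with untrained linear adapters (illustrative only; the real system routes through VLM visual pathways, and all names here are assumed):

```python
import random

SHARED_DIM = 4  # the hub: one common latent space for all agents

def random_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

class Agent:
    """A model with its own hidden size, plus encoder/decoder adapters
    to the shared latent space -- 2 adapters per model, so O(N) total."""
    def __init__(self, hidden_dim):
        self.to_shared = random_matrix(SHARED_DIM, hidden_dim)    # encoder
        self.from_shared = random_matrix(hidden_dim, SHARED_DIM)  # decoder

    def send(self, hidden_state):
        return matvec(self.to_shared, hidden_state)

    def receive(self, shared_vec):
        return matvec(self.from_shared, shared_vec)

# Heterogeneous agents with different hidden sizes, one shared interface.
a, b = Agent(hidden_dim=6), Agent(hidden_dim=3)
msg = a.send([0.1] * 6)       # a's reasoning trace -> shared latent
decoded = b.receive(msg)      # b maps it into its own hidden space
print(len(msg), len(decoded)) # 4 3
```

Pairwise text-free bridging would instead require N·(N−1) translators, which is the alignment cost the hub avoids.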
Toward Guarantees for Clinical Reasoning in Vision Language Models
This neurosymbolic framework audits VLM-generated radiology reports by autoformalizing free-text findings into propositional logic for verification via the Z3 SMT solver. By checking diagnostic claims against a clinical knowledge base for mathematical entailment, the system identifies reasoning failures missed by standard metrics. This approach provides a post-hoc guarantee that eliminates unsupported hallucinations and improves the diagnostic precision of generative clinical assistants.
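The paper delegates entailment checking to Z3; the underlying logic can be illustrated with a brute-force propositional version (toy clinical rules and atom names are my own, not the paper's knowledge base): a knowledge base entails a claim iff no truth assignment satisfies the KB while falsifying the claim.

```python
from itertools import product

def entails(kb, claim, atoms):
    """KB entails claim iff no assignment makes KB true and claim false."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if kb(env) and not claim(env):
            return False  # counterexample found
    return True

# Toy rule: consolidation implies an airspace opacity.
atoms = ["consolidation", "opacity"]
kb = lambda e: (not e["consolidation"]) or e["opacity"]

# A report asserting consolidation WITHOUT opacity contradicts the KB,
# so the KB entails its negation -- the audit flags the original claim.
inconsistent = lambda e: e["consolidation"] and not e["opacity"]
print(entails(kb, lambda e: not inconsistent(e), atoms))  # True

# The KB does NOT entail "opacity" unconditionally.
print(entails(kb, lambda e: e["opacity"], atoms))  # False
```

An SMT solver performs the same check without enumerating assignments, which is what makes it practical for knowledge bases far beyond two atoms.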
Von Neumann on Consciousness in Quantum Mechanics
This text reevaluates von Neumann’s interpretation of quantum measurement, challenging the common misperception that he viewed human consciousness as a causal factor in wave function collapse. It advocates for a more balanced, rigorous understanding of his universal formulation of quantum mechanics and the role of the observer.
Code
If AI writes code, should the session be part of the commit?
git-memento is a Git extension that records AI coding sessions and attaches them to commits as markdown via git notes. It supports multiple LLM providers like Codex and Claude, ensuring session traces persist through amends, rebases, and remote synchronization. The tool includes auditing features and GitHub Actions for CI gating to verify session metadata and maintain transparency in AI-assisted development.
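The underlying Git mechanism, notes under a dedicated ref, can be demonstrated directly (a sketch of the plumbing git-memento builds on, not its actual commands; the `sessions` ref name and note format are assumed):

```python
import pathlib
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command and return its stdout."""
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

# Set up a throwaway repo with one commit.
repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
git("config", "user.email", "dev@example.com", cwd=repo)
git("config", "user.name", "dev", cwd=repo)
pathlib.Path(repo, "app.py").write_text("print('hi')\n")
git("add", "app.py", cwd=repo)
git("commit", "-q", "-m", "feat: add app", cwd=repo)

# Attach a markdown session trace to the commit via git notes.
session_md = "## AI session\n- provider: claude\n- prompt: add hello script\n"
git("notes", "--ref=sessions", "add", "-m", session_md, cwd=repo)

print(git("notes", "--ref=sessions", "show", "HEAD", cwd=repo))
```

Plain notes do not automatically follow a commit through amends and rebases (that requires `notes.rewriteRef` configuration), which is one of the gaps a dedicated tool has to close.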
Right-sizes LLMs to your system's RAM, CPU, and GPU
llmfit is a terminal-based utility that benchmarks and matches LLMs to local hardware by detecting system RAM, CPU, and GPU specifications. It utilizes dynamic quantization selection and MoE-aware memory estimation to score models across quality, speed, and fit dimensions. The tool supports multi-GPU setups, integrates with Ollama, llama.cpp, and MLX runtimes, and includes a "Plan mode" to forecast hardware requirements for specific model configurations and context lengths.
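The core fit calculation can be approximated from first principles (an illustrative back-of-the-envelope formula, not llmfit's actual scoring logic): weight memory scales with parameter count times bits per weight, plus KV cache and runtime overhead.

```python
def model_memory_gib(n_params_b, bits_per_weight, kv_cache_gib=0.0,
                     overhead=1.1):
    """Rough memory estimate: weights at the chosen quantization,
    plus KV cache, plus a fudge factor for activations/runtime."""
    weight_gib = n_params_b * 1e9 * bits_per_weight / 8 / 2**30
    return (weight_gib + kv_cache_gib) * overhead

def fits(n_params_b, bits_per_weight, available_gib, kv_cache_gib=0.0):
    return model_memory_gib(n_params_b, bits_per_weight,
                            kv_cache_gib) <= available_gib

# An 8B model on a 16 GiB machine: comfortable at 4-bit, too big at 16-bit.
print(fits(8, 4, 16))    # True  (~4.1 GiB)
print(fits(8, 16, 16))   # False (~16.4 GiB)
```

A real tool layers quantization-quality trade-offs, MoE active-parameter accounting, and context-length-dependent KV cache growth on top of this basic arithmetic.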
I built a zero-browser, pure-JS typesetting engine for bit-perfect PDFs
VMPrint is a pure-JS, zero-dependency typesetting engine providing deterministic, bit-perfect PDF output across any runtime, from edge to server. It replaces heavy headless browser solutions by generating documents from a versioned JSON instruction stream, guaranteeing identical layout via real font metrics and advanced multilingual typography. Its architecture separates layout (producing a serializable Page[] JSON output) from rendering, enabling reproducible debugging and high-performance document generation (88 KiB core) suitable for dynamic content, including LLM outputs, in resource-constrained environments.
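The layout/render split is the interesting architectural move: layout consumes an instruction stream plus fixed font metrics and emits plain serializable pages, so identical input yields byte-identical output. A minimal sketch of that separation (in Python for illustration; field names and the pretend monospace metrics are assumed, not VMPrint's schema):

```python
import json

CHAR_WIDTH_PT = 6.0                     # fixed metrics => determinism
LINE_HEIGHT_PT = 12.0
PAGE_WIDTH_PT, PAGE_HEIGHT_PT = 120.0, 36.0

def layout(instructions):
    """Turn an instruction stream into serializable pages (no rendering)."""
    pages, lines, y = [], [], 0.0
    max_chars = int(PAGE_WIDTH_PT // CHAR_WIDTH_PT)
    for op in instructions:
        if op["op"] == "text":
            text = op["value"]
            for i in range(0, len(text), max_chars):
                if y + LINE_HEIGHT_PT > PAGE_HEIGHT_PT:   # page break
                    pages.append({"lines": lines})
                    lines, y = [], 0.0
                lines.append({"y": y, "run": text[i:i + max_chars]})
                y += LINE_HEIGHT_PT
    pages.append({"lines": lines})
    return pages  # plain data: identical input -> identical JSON

stream = [{"op": "text", "value": "deterministic layout " * 4}]
pages = layout(stream)
print(len(pages), json.dumps(pages[0]["lines"][0]))
```

Because the layout output is pure data, a renderer (PDF, canvas, or a test harness) can consume it independently, which is what makes bit-perfect reproduction and replayable debugging possible.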
Logira – eBPF runtime auditing for AI agent runs
logira is an eBPF-based Linux CLI for OS-level runtime auditing, primarily for AI agent runs and other automation. It records exec, file, and net events, attributing them to specific runs via cgroup v2. This provides a trustworthy, observe-only execution trail, stored locally, to audit agent actions and detect risky behaviors like credential access or destructive commands, independent of the agent's own narrative.
I'm 15. I mass published 134K lines to hold AI agents accountable
Nobulex is an open protocol that establishes an accountability layer for autonomous AI agents through cryptographic behavioral commitments called covenants. It utilizes a Cedar-inspired DSL and hash-chained action logs to enable deterministic, trustless verification of agent behavior against stated policies. Enforcement is achieved via a two-tier model using TEEs for execution-level blocking and on-chain staking/slashing for economic accountability.
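The hash-chained log is the piece that makes the trail tamper-evident: each entry's hash commits to the entire prior chain, so altering any past action invalidates every later link. A minimal sketch of the mechanism (field names and serialization are assumptions, not Nobulex's wire format):

```python
import hashlib
import json

def append_action(log, action):
    """Append an action whose hash commits to all prior entries."""
    prev = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps({"action": action, "prev": prev}, sort_keys=True)
    log.append({"action": action, "prev": prev,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return log

def verify_chain(log):
    """Recompute every link; any tampered entry breaks all later hashes."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps({"action": entry["action"], "prev": prev},
                             sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_action(log, {"tool": "http_get", "url": "https://example.com"})
append_action(log, {"tool": "file_write", "path": "report.md"})
print(verify_chain(log))                   # True
log[0]["action"]["tool"] = "shell_exec"    # tamper with history
print(verify_chain(log))                   # False
```

The chain by itself only proves integrity after the fact; blocking violations at execution time and attaching economic penalties is what the TEE and staking/slashing tiers add on top.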