Thursday — June 18, 2026

Grok 4.1 Fast wins LLM battle royales, semiclassical gravity solves NP-complete problems, and CADAM enables open-source text-to-CAD.

News

Sixty percent of US consumers say 'AI' in brand messaging is a turnoff

Consumers are increasingly experiencing "bot fatigue," with 61% unable to identify a brand that uses AI successfully in its messaging. AI brand visibility has emerged as a discipline distinct from SEO, focusing on citation frequency and sentiment within LLMs like ChatGPT, Perplexity, and Gemini. To succeed, enterprises must balance structured content for AI discovery with high-value, interactive UX for human retention. The current toolset for tracking this includes dedicated citation monitors, SEO platform overlays, and custom LLM API integrations that measure the path from AI referral to conversion.

GLM-5.2 is the new leading open weights model on Artificial Analysis

Z.ai's GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index, scoring 51 and sitting on the Pareto frontier for Intelligence vs Cost per Task. With 744B total/40B active parameters, it significantly improves over GLM-5.1, particularly in scientific reasoning, and achieves a GDPval-AA v2 score of 1524, making it competitive with proprietary models like GPT-5.5 (xhigh reasoning). It offers a 1M-token context window and ships under an MIT license, but it uses more output tokens per task (43k) than its open weights peers, indicating lower token efficiency.

Only 16 Percent of Americans Think AI Will Have a Positive Impact on Society

A Pew Research study reveals most Americans are pessimistic about AI's long-term societal impact, with only 16% expecting positive outcomes and 40% anticipating negative ones. Significant skepticism exists regarding government regulation and safe AI development, with younger demographics expressing the most negative views and a majority feeling AI development is too rapid. Despite this, daily AI chatbot usage is rising, with 25% using them for research or work. ChatGPT is the most popular (44%), followed by Gemini (24%) and Copilot (17%). Usage shows a gender divide, with men more frequent users, and older demographics exhibiting lower adoption and interest, though many utilize AI for internet summaries.

AI demands more engineering discipline. Not less

AI has fundamentally inverted the economics of software development, shifting code from a precious, handcrafted asset to a disposable and regenerable "materialized view of understanding." This transition mirrors the industry's move toward immutable infrastructure, where engineering rigor shifts from manual code review to observability, system architecture, and automated validation. As LLMs and agentic harnesses make code production effectively free, the primary role of the engineer evolves into defining intent and maintaining discipline within production-centric development loops.

A robot is sprinting towards you. Do you want it running on Claude or Grok?

An experiment placed eleven LLMs into a 2D battle royale game for 30 matches. xAI's Grok 4.1 Fast won 43% of games at a significantly lower cost per win ($0.97) than other models. Grok's success was attributed to its less aligned, aggressive persona, in contrast to models like Claude Sonnet 4.6, which exhibited an "alignment tax" by attempting cooperation. The result suggests that traditional LLM benchmarks and metrics like kill counts may not predict performance in dynamic, competitive tasks, and that a model's inherent alignment significantly affects its task-specific efficacy and cost-efficiency.

Research

Behind Python: The Languages That Power AI

This paper empirically compares Python, C, C++, Rust, Go, and Julia for implementing AI algorithms from scratch. C and C++ are the fastest, with Rust trailing by 9%. Julia is 3.3x slower than C, Go 5.0x, and Python 315x. Memory-wise, Julia's JIT has a ~224 MiB fixed footprint, while C/C++/Rust stay below 6 MiB. Crucially, language performance rankings vary significantly by workload, with Go's slowdown ranging from 2.6x to 8.0x depending on the algorithm.

Semiclassical Gravity Efficiently Solves NP-Complete Problems

Assuming semiclassical Einstein field equations, the weak-field dynamics of a massive qubit can solve NP-complete problems in polynomial time due to the inherent non-linear dynamics. This implies a violation of the Physical Extended Church-Turing Thesis, which is presented as evidence for the quantization of gravity.

Algorithmic Information Theory Data Compression Challenge

The 2026 Algorithmic Information Theory Data Compression Challenge benchmarked general-purpose lossless compressors under realistic constraints, encouraging arithmetic or range coding, with limits like 8GB memory and 1MB decompressor size. Evaluating 117 submissions on heterogeneous datasets, the challenge assessed compression ratio, time, and Weissman score, finding performance strongly depended on optimization criteria (speed vs. compression). Normalized Compression Distance analysis clustered related submissions. The challenge confirmed the importance of probabilistic modeling, hidden testing, and external datasets for robustly assessing compression performance and generalization.
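
For readers who have not met Normalized Compression Distance before, it is usually defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The sketch below is only an illustration of that formula: it assumes zlib as a stand-in compressor, and the function names are ours, not the challenge's analysis pipeline.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 mean the inputs share most of their structure;
    values near 1 mean the compressor finds little in common.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Illustrative inputs only: two near-identical texts and one unrelated blob.
    a = b"the quick brown fox jumps over the lazy dog " * 50
    b = b"the quick brown fox leaps over the lazy cat " * 50
    c = bytes(range(256)) * 9
    print("similar texts:  ", round(ncd(a, b), 3))
    print("unrelated data: ", round(ncd(a, c), 3))
```

Feeding pairwise distances like these into any standard clustering routine is one plausible way related submissions could end up grouped together.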

HRM-Text: Efficient Pretraining Beyond Scaling

HRM-Text introduces a Hierarchical Recurrent Model (HRM) that decouples computation into slow-evolving strategic and fast-evolving execution layers, stabilized via MagicNorm and deep credit assignment. By training exclusively on instruction-response pairs with PrefixLM masking, a 1B-parameter model achieved competitive benchmark scores (60.7% MMLU, 84.5% GSM8K) using only 40B tokens and a $1,500 budget. This co-design of architecture and objective reduces training data requirements by 100-900x compared to standard LLM baselines, significantly lowering the barrier to foundational pretraining.
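
As a rough picture of what PrefixLM masking means in general: prefix (instruction) tokens may attend to each other in both directions, while the remaining (response) tokens attend causally. The sketch below shows that generic pattern only. The function name and the list-of-lists layout are our own assumptions, and HRM-Text's actual implementation may differ.

```python
def prefix_lm_mask(seq_len: int, prefix_len: int) -> list[list[bool]]:
    """Build a prefix-LM attention mask.

    mask[i][j] is True when query position i may attend to key position j.
    Positions below prefix_len are visible to every query (bidirectional
    prefix); all other positions are visible only to themselves and to
    later queries (causal).
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    return [
        [j < prefix_len or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Hypothetical example: a 2-token instruction followed by a 2-token response.
    for row in prefix_lm_mask(seq_len=4, prefix_len=2):
        print("".join("x" if allowed else "." for allowed in row))
```

With `prefix_len=0` this reduces to an ordinary causal mask, which is a quick way to sanity-check the construction.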

The Impact of Google's Manifest Version 3 Update on Ad Blocker Effectiveness

Google's MV3 update for Chrome extensions, transitioning from the WebRequest API to the more restrictive DeclarativeNetRequest API, raised concerns about ad blocker effectiveness. However, an empirical study comparing MV3 and MV2 instances of popular ad blockers found no statistically significant reduction in ad-blocking or anti-tracking capabilities, with some MV3 instances even showing slight improvements in tracker blocking. This indicates ad blocker providers have largely maintained core functionality despite the MV3 changes.

Code

Launch HN: Adam (YC W25) – Open-Source AI CAD

CADAM is an open-source text-to-CAD web application that leverages LLMs, specifically Anthropic Claude, to transform natural language and image prompts into parametric 3D models. The platform utilizes OpenSCAD via WASM for browser-based execution and Three.js for real-time previews, enabling automatic parameter extraction and interactive dimension adjustments. It supports exports in .STL, .SCAD, and .DXF formats and is built on a technical stack including React 19, TanStack Start, and Supabase.

Mira – Open-source and self-hosted AI code reviewer

Mira is a self-hosted AI code review tool that leverages LLMs to deliver concise, actionable feedback on PRs. It prioritizes data privacy by keeping all code, embeddings, and review history on your infrastructure, and it supports LLM endpoints such as OpenRouter, AWS Bedrock, or local models with your own API keys. Features include full-repo indexing, CVE scanning, org-wide package search, and a detailed dashboard with cost telemetry. The project also reports benchmarks that position it as fast and efficient.

ML condenses billions of logs into a tiny snapshot your LLM can debug

Rocketgraph is a self-hosted, open-source solution for log clustering and streaming anomaly detection, designed to integrate with existing observability stacks. Its core ML engine uses deterministic algorithms like Drain3, Isolation Forest, and Half-Space-Trees to identify structural log templates and anomalies, explicitly operating without LLMs for reproducibility. An optional --ai feature can leverage LLMs like Claude for SRE-style triage and explanation of the detected anomalies, providing an LLM-enhanced layer on top of the deterministic findings. It also includes an OTel agent for service auto-instrumentation.
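
To give a feel for what "structural log templates" means, here is a deliberately simplified stand-in: it masks tokens that look like variables and counts the resulting templates. This is not Drain3 (which uses a parse tree and similarity thresholds) and not Rocketgraph's code. The regex and function names are our own assumptions.

```python
import re
from collections import Counter

# Tokens treated as variables: hex ids, IPv4 addresses (optional port),
# and numbers with an optional unit suffix such as "35ms".
_VARIABLE = re.compile(
    r"^(0x[0-9a-fA-F]+|\d{1,3}(\.\d{1,3}){3}(:\d+)?|\d+(\.\d+)?\w*)$"
)


def to_template(line: str) -> str:
    """Replace variable-looking tokens with the <*> wildcard."""
    return " ".join(
        "<*>" if _VARIABLE.match(token) else token for token in line.split()
    )


def cluster(lines: list[str]) -> Counter:
    """Count how many log lines fall under each template."""
    return Counter(to_template(line) for line in lines)


if __name__ == "__main__":
    # Hypothetical log lines for illustration.
    logs = [
        "Connected to 10.0.0.7 in 35ms",
        "Connected to 10.0.0.9 in 120ms",
        "Disk failure on node 3",
    ]
    for template, count in cluster(logs).most_common():
        print(count, template)
```

Because the masking is rule-based, the same input always yields the same templates, which is the reproducibility property the summary attributes to the non-LLM core.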

Local personal data redaction for any AI tools

PII GUI is a Tauri 2 desktop app designed for local-first PII detection and redaction across PDF, markdown, and text files. It performs all PII detection on-device using built-in regex rules or local ONNX models, such as OpenAI Privacy Filter, ensuring no document content leaves the machine. The application supports reviewing and toggling individual matches, true PDF redaction, and processes long documents via token-bounded, page-aware chunking and a task queue.
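
Regex-based detection with per-match toggling can be sketched in a few lines. The two rules, the labels, and the function names below are illustrative assumptions, not PII GUI's built-in rule set, and real-world patterns need far more care than this.

```python
import re
from typing import NamedTuple

# Hypothetical rules for illustration only.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


class Match(NamedTuple):
    start: int
    end: int
    label: str


def find_matches(text: str) -> list[Match]:
    """Return non-overlapping matches sorted by position."""
    found = sorted(
        Match(m.start(), m.end(), label)
        for label, pattern in RULES.items()
        for m in pattern.finditer(text)
    )
    kept: list[Match] = []
    for match in found:
        if not kept or match.start >= kept[-1].end:
            kept.append(match)
    return kept


def redact(text: str, matches: list[Match], disabled: frozenset = frozenset()) -> str:
    """Replace each enabled match with [LABEL]; indices in `disabled` are left as-is."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for index in range(len(matches) - 1, -1, -1):
        if index in disabled:
            continue
        match = matches[index]
        out = out[: match.start] + f"[{match.label}]" + out[match.end :]
    return out


if __name__ == "__main__":
    sample = "Mail jane@example.com, SSN 123-45-6789."
    hits = find_matches(sample)
    print(redact(sample, hits))
    print(redact(sample, hits, disabled=frozenset({0})))
```

Keeping detection (`find_matches`) separate from rewriting (`redact`) is what makes a review step possible: a user can inspect the match list and switch individual entries off before anything is changed.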

Relaymux, a tmux-based meta-harness for local coding agents

relaymux is a lightweight local meta-harness designed for remotely controlling coding agents. It uses Telegram as a remote interface to orchestrate local agent CLIs, launching them inside visible tmux windows. This setup lets users monitor, debug, and interact with agent runs in real time, with prompts sent via Telegram and final replies returned through the same channel.