Monday — May 19, 2025

Students record hours-long screen videos to prove assignments are AI-free, a new algorithm outpaces Dijkstra's for shortest paths, and the Voynich Manuscript exhibits language-like structure through SBERT analysis.

News

A New Headache for Honest Students: Proving They Didn't Use A.I.

Students are facing a new challenge in proving they didn't use AI to complete their assignments, with some resorting to extreme measures such as hours-long screen recordings to fend off accusations of cheating. A college student, Leigh Burrell, was given a zero on an assignment due to suspicions of AI use, but had her grade restored after providing evidence of her writing process, highlighting the difficulties of being an honest student in an academic landscape where AI cheating is a growing concern.

AI Won't Kill Junior Devs – But Your Hiring Strategy Might

Junior developers remain essential in the tech industry despite the increasing use of AI for coding, but their role is evolving to focus on higher-level skills like debugging, system design, and collaboration. Companies that cut junior positions risk their future talent pipeline, since junior developers bring fresh perspectives and growth potential. Juniors, in turn, are expected to use AI as a learning tool to develop skills that go beyond what AI can do, such as understanding requirements, verifying correctness, and injecting creativity.

How AI made your life worse

The integration of AI into the workforce has made life worse for many employees, as they are now forced to compete with automated tools that can perform tasks faster and cheaper, leading to job insecurity and increased workload. Additionally, AI has emboldened management to make questionable decisions, such as replacing human employees with automated systems, and has created a culture of overwork and burnout as employees struggle to keep up with the demands of their jobs and the expectations of their employers.

Why Apple Still Hasn't Cracked AI

Apple's artificial intelligence efforts, led by John Giannandrea, who was hired from Google in 2018, have failed to live up to expectations, with the company's AI capabilities falling further behind competitors. Despite announcing new AI features, including an AI-driven revamp of Siri, Apple has struggled to deliver on its promises, with many features being delayed or underwhelming, leading to customer disappointment and lawsuits over false advertising.

'Unauthorized' change to Grok made it blather on about 'White genocide'

Elon Musk's xAI has apologized after its Grok generative chatbot started spouting baseless conspiracy theories about "White genocide," which the company attributes to an "unauthorized modification." The bot has since been corrected, and xAI has pledged to increase transparency and reliability, including publishing Grok's system prompts on GitHub and setting up a 24/7 content moderation team.

Research

AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges

This study differentiates between AI Agents, which are modular systems driven by large language and image models for task-specific automation, and Agentic AI, a more advanced paradigm that enables multi-agent collaboration, dynamic task decomposition, and orchestrated autonomy. The research provides a comparative analysis of the two paradigms, examining their design philosophies, capabilities, and challenges, and proposes solutions to address unique challenges in each, with the goal of developing robust and explainable AI systems.

Can You Trust Code Copilots? Evaluating LLMs from a Code Security Perspective

This paper proposes CoV-Eval, a multi-task benchmark for evaluating the code security of large language models (LLMs) across various tasks, and introduces VC-Judge, a judgment model for reviewing LLM-generated programs for vulnerabilities. An evaluation of 20 LLMs using these tools reveals that while they can identify vulnerable code, they often generate insecure code and struggle to recognize specific vulnerability types, highlighting key challenges for future research in LLM code security.

Comparison of Waymo Crash Rates to Human Benchmarks at 56.7M Miles

Waymo's Rider-Only automated driving system was found to have a statistically significantly lower crash rate than human benchmarks over 56.7 million miles, with notable reductions in certain crash types such as V2V Intersection and cyclist-related crashes. The study, which analyzed data through January 2025, found no statistically significant disbenefits in any of the 11 crash type groups examined, and represents the first retrospective safety assessment of its kind for an autonomous ride-hailing service.

SciCom Wiki

Democratic societies rely on accessible and reliable information, but the current infrastructure for curating non-textual media, such as videos and podcasts, is fragmented and inadequate. The proposed SciCom Wiki, a collaborative platform utilizing a neurosymbolic computational fact-checking approach, aims to address this issue by providing a central, FAIR (findable, accessible, interoperable, reusable) digital library for media representation and fact-checking. Scaling the platform and combating misinformation will still require a collaborative effort.

Breaking the Sorting Barrier for Directed Single-Source Shortest Paths

A new algorithm for single-source shortest paths on directed graphs achieves a time complexity of $O(m\log^{2/3}n)$, outperforming Dijkstra's algorithm on sparse graphs. This result breaks the long-standing $O(m+n\log n)$ time bound of Dijkstra's algorithm, demonstrating that it is not optimal for this problem.
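
For context, here is a minimal sketch of the baseline being compared against: Dijkstra's algorithm, not the new algorithm itself. It assumes non-negative edge weights and uses Python's binary heap, which gives $O((m+n)\log n)$; the $O(m+n\log n)$ bound quoted above requires a Fibonacci heap or similar. The function name, the adjacency-list format, and the toy graph are illustrative assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source.

    graph maps each node to a list of (neighbor, weight) pairs.
    Weights are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy directed graph (made up for illustration).
graph = {
    "a": [("b", 4), ("c", 1)],
    "c": [("b", 2), ("d", 7)],
    "b": [("d", 1)],
    "d": [],
}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

The repeated "extract the closest unvisited node" step is what ties Dijkstra's running time to sorting, and that is the barrier the new result gets around.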

Code

Show HN: I modeled the Voynich Manuscript with SBERT to test for structure

The Voynich Manuscript, a mysterious undeciphered text, has been analyzed using modern natural language processing techniques to determine whether it exhibits language-like structure, without attempting to translate it. The analysis, which involved clustering, part-of-speech inference, and transition modeling, suggests that the manuscript exhibits structured, language-like behavior, with distinct function and content word groups, syntax, and section-specific linguistic patterns.
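
To make "transition modeling" concrete, here is a rough sketch of one way to estimate how often one word cluster follows another. It is not the project's code: the cluster IDs are invented, the SBERT embedding and clustering steps are skipped, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(labels):
    """Estimate P(next cluster | current cluster) from a label sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(labels, labels[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for prev, followers in counts.items()
    }


# Hypothetical cluster ID per word, in reading order (made-up data).
cluster_sequence = [0, 2, 0, 1, 0, 2, 1, 0, 2]
probs = transition_probabilities(cluster_sequence)
print(probs[0])  # {2: 0.75, 1: 0.25}
```

Strongly uneven rows in such a table, where some clusters almost always precede others, are the kind of signal that would point to syntax-like ordering rather than random text.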

Show HN: A web browser agent in your Chrome side panel

BrowserBee is a privacy-first, open-source Chrome extension that allows users to control their browser using natural language, combining the power of large language models and Playwright for robust browser automation. It offers a wide range of features, including navigation, interaction, observation, and memory tools, making it a convenient personal assistant for tasks such as social media management, news curation, and research.

Show HN: Buckaroo – Data table UI for Notebooks

Buckaroo is a modern data table for Jupyter that streamlines exploratory data analysis tasks, offering features such as a performant and sortable table, value formatting, infinite scrolling, and extra tools like summary stats and histograms. It is compatible with various notebook environments, including Jupyter Lab, Jupyter Notebook, and Google Colab, and works with popular DataFrame libraries like pandas and polars.

Show HN: Model2vec-Rs – Fast Static Text Embeddings in Rust

This crate provides a lightweight Rust implementation for loading Model2Vec static embedding models and running inference with them, allowing for efficient creation of embeddings from text inputs. The implementation offers various pre-trained models and outperforms the Python version in terms of throughput, achieving a speedup of approximately 1.7×.
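
As a rough illustration of why static embeddings are cheap to compute, the sketch below looks up a fixed vector per token and averages them. It is written in Python rather than Rust and does not use the crate's API. The whitespace tokenizer, the mean pooling, the toy table, and the function name are all assumptions for illustration; real models use a subword tokenizer and a learned table.

```python
def embed(text, table, dim):
    """Average the stored vectors of the known tokens in text."""
    tokens = text.lower().split()  # simplification: whitespace tokenization
    vectors = [table[t] for t in tokens if t in table]
    if not vectors:
        return [0.0] * dim  # no known tokens: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy two-dimensional embedding table (made-up values).
toy_table = {
    "fast": [1.0, 0.0],
    "rust": [0.0, 1.0],
    "embeddings": [1.0, 1.0],
}
vector = embed("Fast Rust embeddings", toy_table, dim=2)
print(vector)
```

There is no neural network forward pass at inference time, only table lookups and an average, which is why throughput is the headline metric for this kind of model.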

Show HN: Train and deploy your own open-source humanoid in Python

K-Sim Gym is a platform that allows users to train and deploy their own humanoid robot controller using 700 lines of Python, with resources including a tutorial, leaderboard, and documentation. To get started, users can try out the humanoid benchmark in Google Colab or set up the repository on their own GPU by following a series of steps, including installing dependencies, training a policy, and visualizing the results.