Wednesday — June 19, 2024

Meta FAIR unveils six new AI tools including AudioSeal, an AI bot runs for mayor in Wyoming, and a new framework Garak probes LLM security.

News

Sharing new research, models, and datasets from Meta FAIR

Meta FAIR has released six new research artifacts, focusing on themes of innovation, creativity, efficiency, and responsibility in AI. Highlights include Meta Chameleon, which uses unified tokenization for mixed-modal inputs; JASCO for temporally controlled text-to-music generation; and AudioSeal, an audio watermarking technique designed for detecting AI-generated speech. The releases aim to foster collaboration and innovation within the AI community.

Will We Run Out of Data? Limits of LLM Scaling Based on Human-Generated Data

AI model scaling relies heavily on compute power and growing datasets. Currently, there are roughly 300 trillion tokens of high-quality human-generated text data, with full utilization projected between 2026 and 2032. Overtraining plays a key role in data efficiency, potentially exhausting data stocks by 2025 if highly applied. Innovations such as synthetic data, multi-modal learning, and improved data efficiency will be needed to sustain progress beyond 2030.

An AI bot is (sort of) running for mayor in Wyoming

Victor Miller, running for mayor of Cheyenne, Wyoming, pledges to let an AI bot, VIC (Virtual Integrated Citizen), govern if he's elected. Built on OpenAI’s ChatGPT-4.0, VIC makes decisions while Miller acts as its representative. However, this approach faces legal scrutiny since AI bots can't run for office, with Wyoming's Secretary of State Chuck Gray expressing concerns about the candidacy's legality. OpenAI has already taken action against VIC due to policy violations, and Miller is ready to switch to Meta’s Llama 3 if needed. He will compete against incumbent Patrick Collins and other candidates.

Research

Refusal in language models is mediated by a single direction

This paper shows how conversational LLMs manage refusal behavior for harmful instructions. By identifying a one-dimensional subspace that controls this in 13 open-source chat models, they found manipulating this subspace could enable or disable refusal responses. They introduced a white-box jailbreak method that effectively disables refusal without compromising other functionalities and analyzed how adversarial suffixes interfere with this subspace.

Transcendence: Generative Models Can Outperform the Experts That Train Them

This paper from Harvard explores "transcendence," where a generative model surpasses the abilities of the experts who generated its training data. The study uses an autoregressive transformer trained on chess game transcripts and finds that the model can sometimes outperform all players in the dataset. It attributes this capability to low-temperature sampling, supported by both theoretical proof and rigorous experiments.

Garak: A Framework for Security Probing Large Language Models

LLMs face significant challenges in terms of scalable evaluation of their responses to adversarial attacks, particularly as their outputs are unpredictable and their updates frequent. What constitutes a security issue varies by context, making universal guardrails impractical. This research proposes a new approach to LLM security by introducing garak (Generative AI Red-teaming and Assessment Kit), a framework for uncovering and identifying vulnerabilities in LLMs. Garak's structured probing of LLMs aims to map out weaknesses, thereby facilitating more informed discussions about vulnerabilities and contributing to better alignment and policy strategies for LLM deployment.

Code

Intel releases OpenVINO 2024.2 with broader LLM and quantization support

OpenVINO™ is an open-source toolkit designed to optimize and deploy deep learning models across a variety of common tasks like computer vision and NLP. It supports models trained with TensorFlow, PyTorch, ONNX, Keras, and PaddlePaddle, enabling efficient deployment on CPUs, GPUs, and AI accelerators. Key features include inference optimization, broad platform compatibility, and automatic performance enhancements like asynchronous execution and tensor fusion. The ecosystem offers tools like OVMS for scalable model serving and NNCF for advanced optimization techniques.

Surfkit, the Kubernetes of AI Agents

Surfkit is a toolkit for building, sharing, and managing AI agents on various devices. It supports tasks like creating, running, tracking, and observing multimodal agents both locally and in the cloud. The platform includes integrations with MLLM, Taskara, Skillpacks, and Threadmem for enhanced functionality in task and thread management, and it allows publishing and community sharing of agents.

YaFSDP: a sharded data parallelism framework, faster for pre-training LLMs

YaFSDP is a Sharded Data Parallelism framework optimized for transformer-based neural network architectures. It delivers up to 20% faster pre-training for LLMs compared to FSDP, especially under high memory pressure. The framework is designed to minimize communication and memory overhead, ensuring better performance across various model sizes and device counts. Its benchmarks show consistent speedups across multiple configurations.