Gemini 3.1 Flash-Lite Cuts Inference Costs While Doubling Speed
Google shipped Gemini 3.1 Flash-Lite in March 2026 as a purpose-built model for high-volume inference workloads. The model delivers throughput roughly double that of Gemini 3.1 Flash while maintaining
MangroveGS Predicts Cancer Spread with 80% Accuracy Using Gene Patterns
A research team from Johns Hopkins University and the National Cancer Institute published results in March 2026 demonstrating MangroveGS, a machine learning framework that predicts cancer metastasis w
Qwen 3.5 9B Outperforms Larger Models on Graduate-Level Reasoning
Alibaba Cloud released Qwen 3.5 in March 2026, and its 9-billion-parameter variant is turning heads. On the GPQA Diamond benchmark, which tests graduate-level science and reasoning, Qwen 3.5 9B scores
AI Framework Accelerates Alloy Discovery Through Expert Knowledge Fusion
Researchers at MIT and the Max Planck Institute for Iron Research published a paper in March 2026 describing an AI framework that accelerates alloy discovery by fusing expert metallurgist knowledge wi
Physical AI Reaches Deployment Stage as Simulation Gap Narrows
Physical AI, the application of artificial intelligence to robots and autonomous machines that operate in the real world, reached a deployment milestone in early 2026. Companies including NVIDIA, Figu
The AI Energy Bottleneck: How Grid-Scale Batteries Fit In
AI Cannot Scale Without Solving the Power Problem First AI runs on electricity, and there is not enough of it. Goldman Sachs projects that AI data center power consumption will rise 175% by 2030. Sigh
What Trainium3’s Neuron Switches Mean for AI Infrastructure
Amazon’s Neuron Switches Are Changing How AI Chips Communicate Individual AI chips are fast. But in a data center, what matters most is how thousands of chips work together. Amazon’s custom Neuron swi
AI Data Collection at Scale: From Delivery Routes to Training Sets
Gig Workers Are Becoming AI’s Data Collection Network Training AI systems that interact with the physical world requires real-world data that cannot be scraped from the internet. DoorDash and Uber hav
Compressing Large AI Models Without Losing Performance
How Quantum-Inspired Compression Shrinks AI Models Large AI models like GPT-4 and Llama require massive computational resources. They run on GPU clusters in data centers, consuming significant power a
How Amazon Builds and Tests AI Chips from Scratch
Inside Amazon’s Chip Lab: Where Trainium Gets Built Amazon’s custom AI chip program started in 2015 when the company acquired Israeli chip designer Annapurna Labs for $350 million. More than 10 years
NVIDIA Nemotron 3 Super Targets Multi-Agent Enterprise Coding
NVIDIA launched Nemotron 3 Super in March 2026, a 253-billion-parameter language model purpose-built for enterprise software engineering. The model is trained on a curated dataset of production-grade
Dynamic Sparse Training Cuts AI Energy Use by Up to 90%
A team of researchers from the University of Edinburgh and Google DeepMind published results in March 2026 showing that dynamic sparse training can reduce AI training energy consumption by up to 90% w
On-Device AI Gets Real with Qualcomm Dragonwing Q-8750
Qualcomm announced the Dragonwing Q-8750 processor in March 2026, delivering 100 TOPS of AI compute in a chip designed for edge devices. The processor can run 7-billion-parameter language models entir