AI Radar
Real-time signals, quick thoughts, and curated picks from the AI frontier.
Karpathy's Autoresearch: AI Agents Running ML Experiments Overnight
Karpathy pushed 630 lines of Python to GitHub and went to sleep. By morning, his AI agent had run 50 experiments and committed the results to git. No human input in between.
The tool is called autoresearch — and it's now the most viral open-source AI project of the month.
The setup is almost too simple. You write research instructions in a Markdown file. An AI agent reads it, modifies a training script, runs a 5-minute experiment on a single GPU, checks if validation loss improved, keeps or reverts, and repeats. 12 experiments per hour. ~100 overnight.
After 2 days and 700 experiments: 20 genuine improvements that stacked, an 11% training speedup on code he thought was already optimized, and a bug in his own attention implementation he'd missed for months. The agent caught it.
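The core of that loop is a greedy keep-or-revert search over code edits. Here's a minimal sketch of the idea — the function names and callbacks are illustrative, not the actual autoresearch internals, and the real tool delegates the "propose an edit" step to an AI agent:

```python
def autoresearch_loop(propose_edit, run_experiment, keep, revert, n=100):
    """Greedy keep-or-revert search: try an edit, measure validation
    loss, and keep the edit only if the loss improved.

    propose_edit   -- agent rewrites the training script (no-op stub here)
    run_experiment -- runs a short training job, returns validation loss
    keep / revert  -- e.g. `git commit` on improvement, `git checkout` otherwise
    """
    best = run_experiment()          # baseline loss before any edits
    for _ in range(n):
        propose_edit()               # agent modifies the training script
        loss = run_experiment()      # ~5-minute run on a single GPU
        if loss < best:
            best = loss
            keep(loss)               # improvement: commit it
        else:
            revert()                 # regression: roll the edit back
    return best
```

Because improvements are only ever kept when validation loss drops, the accepted edits stack — which is how 700 experiments can distill down to 20 genuine wins.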
Claude Code is changing how I build
Been using Claude Code for a week now and it's genuinely changing my workflow. The agentic loop + file editing combo means I can describe architecture-level changes and watch them materialize. Not perfect, but the iteration speed is wild.
Liquid AI's LFM2.5: Full AI Models Running in Your Browser
A 1.2B parameter model just ran chain-of-thought reasoning in my browser tab. No API. No server. No bill.
Liquid AI dropped LFM2.5 and the WebGPU demos are wild. Vision model does real-time webcam captioning, fully client-side. Thinking model runs chain-of-thought reasoning in a browser tab in 0.28 seconds.
Beats Llama 3.2 1B on GPQA, MMLU Pro, and IFEval benchmarks. Static deployment ships as HTML/JS/WASM — host on any CDN with zero inference cost.
This is what edge AI was supposed to look like.
Google Quietly Built the Most Complete Agentic AI Ecosystem
Google quietly built the most complete agentic AI ecosystem in the industry. And nobody's talking about the full picture.
Models + Tools + Frameworks + Protocols. Gemini 2.5 Pro, ADK (Agent Development Kit), A2A protocol. 750M+ Gemini users, 18.3K ADK stars, 150+ A2A partners.
While everyone's focused on individual model benchmarks, Google assembled the full stack for agentic AI: the models, the developer tools, the communication protocols, and the distribution.
RAG evaluation is still an unsolved problem
Every team I talk to is building RAG. Almost none of them have good evaluation. We default to vibes-based testing — "does this answer look right?" — and call it done. The gap between building RAG and knowing if it works is massive.
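Even a tiny labeled eval set beats vibes. One step up is measuring retrieval quality directly with hit rate and MRR — a minimal sketch, assuming you can supply your own retriever and a handful of (query, relevant document) pairs:

```python
def retrieval_metrics(eval_set, retrieve, k=5):
    """Hit rate@k and mean reciprocal rank over a labeled eval set.

    eval_set -- list of (query, relevant_doc_id) pairs
    retrieve -- your retriever: query -> ranked list of doc ids
    """
    hits, rr = 0, 0.0
    for query, relevant_id in eval_set:
        ranked = retrieve(query)[:k]          # top-k retrieved doc ids
        if relevant_id in ranked:
            hits += 1
            rr += 1.0 / (ranked.index(relevant_id) + 1)  # rank is 1-based
    n = len(eval_set)
    return {"hit_rate": hits / n, "mrr": rr / n}
```

It says nothing about answer quality, but it catches the most common failure — the right document never being retrieved at all — with a few dozen labeled examples.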
How Alibaba's Qwen Became #1 in Open-Source AI
Alibaba's Qwen just became the #1 open-source AI family on the planet. 700M+ downloads. 90,000+ derivative models. The top 4 spots on HuggingFace. All in 3 years.
From a small Alibaba experiment to dominating every global leaderboard — this is one of the most underrated stories in AI right now.
Cloudflare Just Launched a Web Crawling API
The company that built its reputation blocking bots just launched a web crawling API.
Cloudflare quietly dropped a /crawl endpoint this week. One API call, and you get clean, structured content from any URL. The irony is beautiful — and the implications for RAG pipelines are massive.
If you're building any kind of retrieval system, this changes the data ingestion game completely.
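The ingestion side might look something like this. To be clear about what's assumed: the post only mentions a /crawl endpoint, so the URL path, auth header, and response shape below are all hypothetical — only the chunking step is generic:

```python
import json
import urllib.request

def crawl(url: str, api_token: str) -> dict:
    """Fetch structured page content from a crawl API.
    Hypothetical: endpoint path and request/response fields are assumed,
    not taken from Cloudflare's actual documentation."""
    req = urllib.request.Request(
        "https://api.cloudflare.com/client/v4/crawl",  # assumed path
        data=json.dumps({"url": url}).encode(),
        headers={"Authorization": f"Bearer {api_token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split crawled text into overlapping chunks ready for embedding."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks
```

Clean crawl output plus a chunker is most of a RAG ingestion pipeline; the rest is embedding the chunks and writing them to a vector store.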
The State of Local LLM Inference
Every local LLM user has done this. Download a model. Wait 20 minutes. Launch it. Watch it crawl at 3 tokens per second — or not load at all.
The gap between cloud inference and local inference is still massive. But tools like llmfit are starting to close it — optimizing models for your specific hardware, quantization level, and memory constraints.
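The core of that "does it fit?" question is simple arithmetic that tools like llmfit presumably automate. A back-of-envelope sketch — the bits-per-weight figures are rough approximations, and it deliberately ignores KV-cache growth with context length:

```python
def pick_quantization(n_params_b: float, vram_gb: float,
                      overhead_gb: float = 1.5):
    """Return the highest-precision quantization whose weights fit in VRAM.

    n_params_b  -- model size in billions of parameters
    vram_gb     -- available GPU memory
    overhead_gb -- rough allowance for runtime buffers and KV cache
    """
    # (name, approximate bits per weight), highest precision first
    options = [("fp16", 16), ("q8", 8), ("q5_k", 5.5), ("q4_k", 4.5)]
    for name, bits in options:
        weight_gb = n_params_b * bits / 8   # billions of params * bytes/weight
        if weight_gb + overhead_gb <= vram_gb:
            return name
    return None  # won't fit even at ~4-bit
```

On an 8 GB card, a 7B model lands around 5-bit; on 24 GB, fp16 fits. The real value of a fitting tool is doing this per-device, per-quant-format, and accounting for context length instead of a flat overhead guess.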
The future of local AI isn't just about smaller models; it's about smarter deployment.