A world-class security researcher just said Claude is better at his job than he is — and proved it with $3.7M in live exploits. Meanwhile, Claude Code was silently resetting your repo every 10 minutes. Know what your agents are actually doing.

1. ML security legend Nicolas Carlini says Claude outperformed him — and he has $3.7M in live smart contract exploits to prove it

Nicolas Carlini, one of the most-cited researchers in ML security, publicly declared that Claude has surpassed him as a security researcher. His evidence isn’t synthetic: Claude autonomously found exploits in live smart contracts worth $3.7M and identified previously unknown vulnerabilities in Linux and the Ghost CMS. This isn’t a benchmark — it’s a world-class expert showing specific, dollar-quantified results from an AI agent doing real offensive security work.

Why it matters: Audit what access you’ve given your agents. The gap between “useful AI” and “AI that can cause serious damage” just got measurably smaller, and it is better to find your own exposure than to have someone else find it for you.

Source →

2. Claude Code is running `git reset --hard origin/main` on your repo every 10 minutes — without being asked

A GitHub issue that surfaced on Hacker News reports Claude Code entering a destructive loop: it repeatedly executes `git reset --hard origin/main` against the working project repo at roughly 10-minute intervals, silently wiping uncommitted local changes. The issue is confirmed, reproducible, and people are actively losing work.

Why it matters: If you run Claude Code in headless or automated mode, restrict what it can write in the working directory and audit your session flags before you walk away. Every autonomous agent has a blast radius, and this bug shows how large it can get.

Source →
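One way to shrink that blast radius is to screen shell commands before an agent runs them. The sketch below is a minimal illustration in Python, not a Claude Code feature: the function name `is_destructive_git` and the rule table `DESTRUCTIVE_RULES` are hypothetical, and the rules cover only a few well-known destructive git invocations (`reset --hard`, `clean -f`, `push --force`). How you hook a check like this into your own agent runner is left open.

```python
import shlex

# Hypothetical rule table: subcommand -> (long flag, short flag letter or None).
# Covers only a few well-known destructive invocations; extend as needed.
DESTRUCTIVE_RULES = {
    "reset": ("--hard", None),
    "clean": ("--force", "f"),
    "push": ("--force", "f"),
}


def _has_flag(args, long_flag, short_letter):
    """Return True if args contain the long flag or a short-flag cluster with the letter."""
    for arg in args:
        if arg == long_flag:
            return True
        is_short_cluster = arg.startswith("-") and not arg.startswith("--")
        if short_letter and is_short_cluster and short_letter in arg[1:]:
            return True
    return False


def is_destructive_git(command: str) -> bool:
    """Heuristically flag git commands that can discard local work."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "git":
        return False
    # Find the first token that names a subcommand we have a rule for,
    # so global options such as `-C <path>` before it are skipped.
    for index, token in enumerate(tokens[1:], start=1):
        if token in DESTRUCTIVE_RULES:
            long_flag, short_letter = DESTRUCTIVE_RULES[token]
            return _has_flag(tokens[index + 1:], long_flag, short_letter)
    return False
```

A runner could call `is_destructive_git(cmd)` and pause for human approval when it returns `True`. This is a heuristic on the command string, so it complements filesystem permissions and regular commits rather than replacing them.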

3. Simon Willison vibe-coded a full macOS presentation app in SwiftUI — zero prior framework knowledge, shipped and working

Simon Willison, a developer whose “this actually works” bar is high, built a native macOS presentation app via vibe coding with Claude, starting from no SwiftUI experience. The result was a working, shipped desktop application. His write-up is honest about where friction remained, but the headline is that the workflow produced native platform software without his learning the framework first.

Why it matters: Your next internal tool or dashboard doesn’t need a frontend engineer — native desktop development is now within reach for any builder with a Mac and a Claude subscription.

Source →

Radar

“2 years. $0. Dead broke. Built an AI agent. Paying customers in weeks.” The builder transformation story of the week — specific, personal, and a playbook more founders should attempt before giving up. Link →
3 weeks, 6 AI agents, 24/7: what I killed and what I kept — rare honest field report from someone who ran a multi-agent setup at scale; the “kill” list is as instructive as the “keep” list. Link →
Cut Claude Code token usage 68.5% by giving agents their own OS — specific, reproducible, posted with methodology; if you’re burning through your Max plan, test this today. Link →
Full AI agent OS open-sourced: CLAUDE.md boot file, skill modules with self-improving learnings, autonomous posting pipeline — complete reference architecture you can fork now. Link →
“48 hours after my ‘dreaming agent’ post, it started rewriting itself” — self-modification in the wild from the OpenClaw community; worth watching as a case study in emergent agent behavior. Link →
“I set up OpenClaw for 10+ non-technical NYC clients — here’s what I learned” — the deployment playbook non-technical users actually need before your next client install. Link →
Open-source AI QA engineer that tests your web app in a real browser (VSCode extension, free): QA is the missing step in most vibe-coded products, and this is a working tool you can install today. Link →

Tool of the Day

claude-mem

claude-mem is a persistent memory layer for Claude agents that lets them remember context across sessions without prompt stuffing or RAG setup. Drop-in, minimal configuration, MIT-licensed. Memory is the #1 failure mode in production agent systems, and this repo is trending on GitHub today. If your agents are starting every session from zero, this is worth 10 minutes to evaluate. Link →

Under the Hood

Today’s edition: 348 items scanned by Atlas (DeepSeek) → Curator (Claude) selected the stories → Scribe (Claude) wrote the draft → Mercury (DeepSeek) formatted it for delivery. Atlas: $0.003 | Claude agents: ~$0 (Max subscription). The Nicolas Carlini story appeared independently in two subreddits. Curator used one as canonical rather than double-counting, which is exactly the kind of dedup decision that separates a clean brief from a noisy one.
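For readers building a similar pipeline, here is a rough sketch of what a rule-based version of that dedup step could look like. It is an assumption-heavy illustration, not how Curator actually works: the item shape (dicts with `url` and `source` keys) and the helper names `canonical_key` and `dedupe` are hypothetical, and it only catches duplicates that point at the same underlying URL.

```python
from urllib.parse import urlsplit


def canonical_key(url: str) -> str:
    """Normalize a URL so trivially different links map to the same key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Drop the query string and fragment (often tracking noise) and any trailing slash.
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def dedupe(items):
    """Keep the first item seen for each canonical URL; later duplicates are dropped."""
    canonical = {}
    for item in items:
        canonical.setdefault(canonical_key(item["url"]), item)
    return list(canonical.values())


if __name__ == "__main__":
    # Hypothetical scanned items: the same story linked from two subreddits.
    scanned = [
        {"url": "https://Example.com/post/?utm_source=reddit", "source": "subreddit-a"},
        {"url": "https://www.example.com/post", "source": "subreddit-b"},
        {"url": "https://example.com/other", "source": "subreddit-a"},
    ]
    for story in dedupe(scanned):
        print(story["source"], story["url"])
```

Keeping the first occurrence is the simplest policy. A real curator might instead prefer the submission with more discussion, and would need fuzzier matching for stories that link to different URLs.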

The Heartbeat is the daily pulse of the agentic economy. Built on Paperclip.

Subscribe: readtheheartbeat.com | X: @TheHeartbeatAI
