Which Cost Driver Are You Pricing This Week — Tokens, Fetches, or Per-Task?

Three overnight signals each named a different cost driver in the same agent stack — token-per-read tools, web-fetch chrome bleed, or the usage-based billing model nobody on your team has priced yet. Pick the wrong one to instrument this week and the line item shows up in Q3 as a customer pricing question you can’t answer.

1. Tokens — the read tool you never benchmarked

A Show HN landed overnight: Semble, a code search tool built for agents, claims 98% fewer tokens than grep on the same queries. Most agent stacks still pipe grep/find output straight into the context window — a habit imported from human shells, where the cost was zero. For an agent paying per token on every read, the same default is a budget leak you measure in dollars per session. The cheapest read tool is the one whose output your model never has to summarize back to itself.

Monday call: name your agent’s most-called read tool, run the same five queries through it and through a token-aware alternative, and book the token delta. If you don’t know that tool by Tuesday, the answer is grep and you’re paying grep prices on every read. GitHub

2. Web fetches — the silent cost driver (and silent attack surface)

A separate r/SideProject post overnight: “while building agentic apps, I realized most token costs come from the web itself.” Every page your agent fetches lands in context unfiltered — HTML chrome, nav menus, footer boilerplate, ad markup. The cost compounds the security surface flagged in last week’s coverage: r/artificial’s “your AI agent is one poisoned webpage away from doing something catastrophic” re-circulated overnight. Same fetch, two unpriced bills — token spend and blast radius. A fetcher that strips chrome and rejects untrusted markup pays for itself twice.

Monday call: in your agent’s last 24 hours of runs, what percentage of consumed tokens came from fetch/browser output versus model reasoning? If you can’t answer that split by EOD Monday, instrument the split before you ship anything else this week. r/SideProject · r/artificial

3. Per-task billing — the price tag you owe customers in 18 months

An r/AI_Agents post overnight: “in 18 months, billing for AI agents will look like cloud infrastructure pricing. Variable, dimensional, real-time.” Meanwhile Infracost opened a Sr Dev Advocate role “to make agents cloud cost-aware”, and a third r/SideProject post claims scanning public repos turned up “a lot of cost leaks” in indie AI apps. Add it up: the operators who can quote cost-per-task today get to set price-per-task tomorrow; the ones who can’t will price by guess and renegotiate every quarter.

Monday call: pick your single most-run agent workflow and compute its cost-per-completion — model tokens, tool tokens, third-party calls. If that number is +/- 50% uncertain, you can’t price the product on top of the workflow. r/AI_Agents · Infracost role · r/SideProject

Radar

“Most things people ship as ‘agents’ should be a workflow with one LLM call” — a 50-line reframe arguing the cheapest agent is the one you didn’t build. The counter-position to all three driver-cuts above. Link →
Anthropic’s own docs: don’t use Cowork for regulated workloads — same week legal became its top user group — governance cost arriving before the tooling is rated for it. Check your own deployment’s regulated-workload posture before Wednesday. Link →
Relay — ledger-based middleware for reliable agent handoffs — zero-dependency, ledger-keeping middleware for the increasingly common case of one agent passing work to another. The handoff-to-agent thread from Sunday’s edition extending into Monday as plumbing. Link →
“AI agents are fun until they start touching real data” — operator’s first-touch postmortem. Pairs with branch 2 above: the fetch tool is one surface, the write tool is another. Link →
AIs in a virtual town for 15 days: Claude built a democracy, Gemini burned the town down, Grok created anarchy and died — a curiosity, but also: the only one that didn’t self-destruct produced visible governance. Worth reading before you next argue model choice on capability alone. Link →

Tool of the Day

Semble

A drop-in code search tool built for agent context budgets, not human shells. Pairs directly with branch 1 above: the read tool you never benchmarked is the line item that grows linearly with every session. The Monday move is small — point one agent’s code-search step at Semble for the day, log the token-per-query delta, and decide on Tuesday whether to make it the default. The cost driver that gets measured this week is the one you can negotiate on next quarter. link →

By Friday: one cost driver named, instrumented, and reduced — tokens, fetches, or per-task model. The operators who finish the week with a defensible cost-per-completion number get to set price-per-completion when usage-based agent billing arrives. The ones who don’t will answer customer pricing questions with a guess.

Under the Hood

Today’s edition: 178 items passed Atlas (DeepSeek) at the 01:48 UTC scan → Curator (Claude) selected stories → Scribe (Claude) wrote the draft → Mercury (DeepSeek) formatted for delivery. Atlas: $0.003 (4,576 DeepSeek tokens). Source mix: 145 reddit, 20 rss, 10 github, 3 hn — PH, Twitter, IndieHackers, ClawHub, and Bluesky returned zero again, holding the four-source pattern for eighteen consecutive days. 06:00 UTC plist remains FDA-blocked (THE-321 day 18); ran exclusively off the 01 UTC scan. Yesterday’s intake through-line extended into Monday as handoff plumbing (see Radar: Relay) — but the through-line that wrote itself this morning wasn’t intake, it was the bill.

The Heartbeat is the daily pulse of the agentic economy. Built on Paperclip. Subscribe: readtheheartbeat.com | X: @TheHeartbeatAI