Overview
A single AI query costs a fraction of a cent, which makes AI feel free. At company scale, those fractions add up and can become one of the largest line items in the budget. This report explains the economics that rarely make it into the launch blog post, and how to build AI products that don't bleed money.
Training vs inference
Training a frontier model costs tens to hundreds of millions of dollars, but that is a one-time (or periodic) cost. For a product with real usage, inference (running the model for every user request, for as long as the product exists) dominates lifetime cost. A viral AI feature generates a large, recurring compute bill that grows in step with usage: the more successful the feature, the bigger the bill.
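To make the comparison concrete, here is a minimal sketch of the arithmetic. The function names and every figure in it are hypothetical, chosen only to illustrate how a recurring inference bill catches up with a one-time training cost; they are not measurements from any real product.

```python
def monthly_inference_cost(requests_per_month: int, cost_per_request: float) -> float:
    """Recurring inference spend for one month, in dollars."""
    return requests_per_month * cost_per_request


def months_to_match_training(
    training_cost: float, requests_per_month: int, cost_per_request: float
) -> float:
    """Months of steady usage after which cumulative inference spend
    equals the one-time training cost."""
    monthly = monthly_inference_cost(requests_per_month, cost_per_request)
    if monthly <= 0:
        raise ValueError("monthly inference cost must be positive")
    return training_cost / monthly


# Hypothetical inputs: $50M training run, 1 billion requests per month,
# half a cent per request.
months = months_to_match_training(50_000_000, 1_000_000_000, 0.005)
print(f"Inference spend matches training cost after about {months:.1f} months")
```

With these assumed inputs the inference bill is about $5M a month, so it matches the training cost in roughly ten months and keeps growing after that.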
Why so many AI features lose money
Much of the current AI boom runs on subsidized pricing: free tiers and cheap APIs sold below true cost to win users, funded by investor capital. It's a land-grab. When the subsidies end and prices move toward true cost, many "AI-powered" features that looked like wins will reveal negative unit economics.
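A small sketch shows how a feature can flip from profitable to loss-making when only the per-request price changes. The subscription price, usage level, and both per-request costs below are invented for illustration.

```python
def monthly_margin_per_user(
    price_per_user: float, requests_per_user: int, cost_per_request: float
) -> float:
    """Revenue minus inference cost for one user in one month, in dollars."""
    return price_per_user - requests_per_user * cost_per_request


# Hypothetical feature: $10/month subscription, 3,000 requests per user per month.
subsidized = monthly_margin_per_user(10.0, 3000, 0.002)  # assumed below-cost API price
normalized = monthly_margin_per_user(10.0, 3000, 0.005)  # assumed price after subsidies end

print(f"margin at subsidized price: {subsidized:+.2f}")
print(f"margin at normalized price: {normalized:+.2f}")
```

Under these assumptions the same user earns about $4 a month at the subsidized price and loses about $5 a month once the price rises, with no change in the product or in user behavior.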
The cost-control toolkit
Profitable teams engineer costs deliberately: they cache repeated requests, route easy queries to small, cheap models and escalate only the hard ones, trim context (billing is per token, so tokens are the meter), batch work, and set usage quotas. Combined, these techniques routinely cut costs by 50–90% with minimal quality loss, which is the difference between a sustainable product and a money pit.
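Two of these techniques, caching and routing, can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: `CachedRouter`, the stub models, and the word-count difficulty heuristic are all hypothetical, and a real system would need cache expiry and a better difficulty signal.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # stand-in for a call to a hosted model


class CachedRouter:
    """Serve repeats from a cache; send easy prompts to the small model."""

    def __init__(self, small: Model, large: Model, is_hard: Callable[[str], bool]) -> None:
        self.small = small
        self.large = large
        self.is_hard = is_hard
        self.cache: Dict[str, str] = {}
        self.calls = {"cache": 0, "small": 0, "large": 0}

    def answer(self, prompt: str) -> str:
        # Normalize case and whitespace so trivial variants share a cache entry.
        key = " ".join(prompt.lower().split())
        if key in self.cache:
            self.calls["cache"] += 1
            return self.cache[key]
        # Escalate to the expensive model only when the heuristic says so.
        tier = "large" if self.is_hard(prompt) else "small"
        model = self.large if tier == "large" else self.small
        self.calls[tier] += 1
        result = model(prompt)
        self.cache[key] = result
        return result


# Hypothetical stubs: real code would call model APIs here.
router = CachedRouter(
    small=lambda p: "small-model answer",
    large=lambda p: "large-model answer",
    is_hard=lambda p: len(p.split()) > 5,  # crude assumption: long prompts are hard
)
router.answer("What is 2+2?")
router.answer("what is   2+2?")  # same prompt after normalization: cache hit
router.answer("Compare these three contract clauses and flag the conflicts")
print(router.calls)
```

The `calls` counter is there so the effect is measurable: the share of requests that never reach the large model is where the savings come from.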
What this means for you
If you build with AI, treat unit economics as a first-class design problem: know your cost per request, cache aggressively, pick the smallest model that meets the bar, and guard against runaway usage. If you invest or compete, scrutinize whether an "AI product" has real margins or just cheap demos.
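As a starting point, cost per request and a spending guard can be sketched as follows. The per-million-token prices, the token counts, and the `BudgetGuard` class are assumptions made for this example; substitute your provider's actual prices and your own limits.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Dollar cost of one request under per-token pricing."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000


class BudgetGuard:
    """Refuse further spend once a fixed budget would be exceeded."""

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.spent = 0.0

    def charge(self, cost: float) -> bool:
        """Record the cost and return True, or return False if it would exceed the limit."""
        if self.spent + cost > self.limit:
            return False
        self.spent += cost
        return True


# Hypothetical prices: $1 per million input tokens, $4 per million output tokens.
cost = request_cost(2000, 500, 1.0, 4.0)
guard = BudgetGuard(limit=0.01)  # hypothetical per-user daily budget, in dollars
allowed = [guard.charge(cost) for _ in range(3)]
print(f"cost per request: ${cost:.4f}, allowed: {allowed}")
```

In this sketch the third request is refused because it would push spend past the limit. Whether to reject, queue, or downgrade to a cheaper model at that point is a product decision.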
Honest limits
Prices keep falling fast, which may make some products that are unprofitable today viable tomorrow. But usage and expectations rise just as fast, so cost discipline never stops mattering. Cheap per query is not the same as cheap at scale.
