I’ve been thinking a lot lately about the economics of API tokens, OpenAI’s specifically.
I’m not a hardware engineer or a venture capitalist. Nevertheless, I’ve been trying to reconcile two very different things I’ve been reading. When you put them together, the picture for startups looks complicated.
Basically, we are building businesses on top of a price that is, in effect, a fiction.
The two bills
I’ve been digging into reports from SemiAnalysis (the deep technical guys) and Sequoia Capital (the big money guys), and they seem to be looking at two different realities.
- The Electricity Bill
If you believe the technical analysis, the actual cost to generate a token is dropping fast. OpenAI is getting better at this. They use “Mixture of Experts” models and quantization, which basically means the hardware has to work less to give you an answer. If you just look at the electricity and the chips, the “fair” price should probably be lower than it is today. Inference is becoming a commodity.
- The Credit Card Bill
But then you read Sequoia’s “The $600B Question“. The industry has spent an absolute fortune on NVIDIA chips. To make that money back, companies like OpenAI and Google can’t just cover their electric bill. They need to generate revenue on the scale of the entire internet just to break even on the capital expenditure.
So, we have a paradox: The marginal cost is dropping, but the sunk cost is massive.
The AWS Déjà Vu
This whole situation feels incredibly familiar. It reminds me of the early days of the cloud.
Remember when AWS first popped up? It was aggressively cheap. It felt like a no-brainer to ditch your own servers and move to the cloud. Jeff Bezos famously said, “Your margin is my opportunity”.
They got everyone hooked. But once the whole internet was running on AWS, the game changed. They didn’t necessarily jack up the base price, but they created “sticky” services: databases and proprietary tools that you couldn’t easily leave. Now, cloud bills are massive, and companies are realizing they are locked in.
I can’t help but feel OpenAI is in that early “get them hooked” phase.
They are likely subsidizing the current price, keeping it low enough to kill off the idea of you hosting your own Llama model, but high enough to cover their basic running costs.
The “Wrapper” Trap
This is where I get worried for the “wrappers”, the ones basically reselling GPT with a nice UI. If you read Sequoia’s take on Generative AI’s Act Two, you start to see why this is a problem.
If you are building a business based on the current price of tokens, you are betting on a variable that is completely out of your control.
Scenario A: Prices crash. If the open-source models (like Llama) get good enough and cheap enough, intelligence becomes like electricity. It’s everywhere and it’s cheap. If your only value is “I give you access to AI,” your margin evaporates.
Scenario B: OpenAI gets hungry. If Sequoia is right and these companies need to make billions to justify their existence, they won’t do it by selling cheap tokens forever. They will start moving up the stack. They’ll build the features you are building.
So, what’s the play?
Again, just thinking out loud here, but it seems like the era of “API Arbitrage” (buying tokens for $1 and selling the output for $1.20) is a dangerous place to be.
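To make the arbitrage math concrete, here’s a rough sketch of how thin that margin actually is, and how quickly it flips negative if the provider reprices. The numbers are the illustrative ones from above, not anyone’s real pricing:

```python
def gross_margin(cost_per_1k_tokens: float, price_per_1k_tokens: float) -> float:
    """Gross margin as a fraction of revenue, for a pure token reseller."""
    return (price_per_1k_tokens - cost_per_1k_tokens) / price_per_1k_tokens

# Today: buy tokens at $1.00, resell the output at $1.20.
today = gross_margin(1.00, 1.20)  # roughly a 17% margin

# If the provider raises its price ~25% and you can't reprice overnight,
# the same business is now losing money on every call.
squeezed = gross_margin(1.25, 1.20)
```

A 17% gross margin leaves no room for support, compliance, or churn, and it disappears entirely on a single upstream price change you don’t control.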
If I were building something today, I’d try to make sure the “AI” part is just a swappable component. The real value has to be in the workflow, the data, or the “system of record”. If you own the customer’s data, you can swap out OpenAI for Anthropic or a local model whenever the pricing winds change. This is exactly what we do at Exelor.
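A minimal sketch of what I mean by “swappable component”: the business logic only ever talks to a generic provider interface, so switching vendors is a one-line change. The provider functions here are hypothetical stand-ins, not the real OpenAI or Anthropic SDK calls, which each have their own client objects and signatures:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Completion:
    text: str
    cost_usd: float  # what this call cost us, so margins stay visible

# A provider is just a function from prompt to Completion.
Provider = Callable[[str], Completion]

def hosted_model(prompt: str) -> Completion:
    # Stand-in for a call to a hosted API (e.g. OpenAI, Anthropic).
    return Completion(text=f"[hosted] {prompt}", cost_usd=0.0012)

def local_model(prompt: str) -> Completion:
    # Stand-in for a self-hosted open-source model (e.g. Llama).
    return Completion(text=f"[local] {prompt}", cost_usd=0.0001)

def answer(prompt: str, provider: Provider) -> Completion:
    # The workflow, data handling, and system of record live here,
    # independent of which model is plugged in underneath.
    return provider(prompt)

# Swapping vendors when the pricing winds change is one argument:
a = answer("summarize this ticket", hosted_model)
b = answer("summarize this ticket", local_model)
```

The point isn’t the code itself but the dependency direction: the model is a detail behind an interface you own, not the foundation your margins rest on.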
But if your whole business model relies on OpenAI keeping their prices exactly where they are? That feels like a risk I wouldn’t want to take.