Pricing APIs

February 2024

This post applies to APIs sold as a service (Stripe, Twilio, Stedi, etc.). It doesn’t necessarily apply to businesses that sell on-premise software, even if their main product is an API.

Our intuitions on pricing are formed from day-to-day consumer goods and services (e.g. restaurants, TVs, Netflix). If you translate that intuition into APIs, you will surely regret it. For example, somebody once told me that Box, the enterprise filesystem company, initially priced their API by porting over their SaaS pricing. Within a few months, they found themselves losing money and moved to usage-based pricing with some bucketing to make costs predictable. I haven’t been able to verify the story because most of the pricing pages from that time are 404s.

On the consumption habits of programs

By definition, APIs are consumed by programs. Programs have very different consumption patterns than people. Let’s start with people, and then see how programs are different.

Take shopping in an e-commerce store. How much more does the most active shopper buy than the median shopper? Imagining some numbers:

Median shopper: 1 order a week
Very active shopper: 10 orders a week

In this example, we have 1 order of magnitude from median to most active. In other cases (maybe Twitter), there may be 2 orders of magnitude between the median and the most active (1 tweet a day, 100 tweets a day).

Now instead of e-commerce shoppers, take payments APIs. How many payments can the most active business make compared to the median business? Shopify is one of Stripe’s marquee merchants and did ~$200B in 2021. The median business in the US does less than $1M per year. There are more than 5 orders of magnitude between Shopify's activity and the median US business.

When it comes to APIs, some of your users can be 5+ orders of magnitude more active than the median, following the same power law that most businesses follow.

When your customers follow a power law, overall usage will be dominated by a few top users. And if your costs are proportional to usage, you have to localize those costs to those top users that can pay them. In other words, if you don’t have usage-based pricing, you are subsidizing your largest users.
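To make the subsidy concrete, here is a minimal sketch with made-up numbers. The prices, the customer list, and the `margins` helper are all hypothetical. The only thing taken from the argument above is that the largest customer is 5 orders of magnitude more active than the median.

```python
# All numbers are invented for illustration. Money is in micro-dollars
# (1 dollar = 1_000_000) so the arithmetic stays in exact integers.
COST_PER_CALL = 100          # what one API call costs you: $0.0001
FLAT_PRICE = 100_000_000     # flat plan: $100 per month, any usage
PRICE_PER_CALL = 150         # usage-based plan: $0.00015 per call


def margins(usages, price_fn):
    """Per-customer margin: what each customer pays minus what they cost to serve."""
    return [price_fn(calls) - calls * COST_PER_CALL for calls in usages]


# Monthly calls per customer. The median is 1_000 calls and the largest
# customer is 5 orders of magnitude above that.
usages = [1_000, 1_000, 1_000, 10_000, 100_000_000]

flat_margins = margins(usages, lambda calls: FLAT_PRICE)
usage_margins = margins(usages, lambda calls: calls * PRICE_PER_CALL)

print("flat plan, total margin in dollars:", sum(flat_margins) / 1_000_000)
print("usage-based, total margin in dollars:", sum(usage_margins) / 1_000_000)
```

With these numbers, every small customer is profitable on the flat plan, but the single largest one loses more than all the others bring in. On the usage-based plan every customer has a positive margin, no matter how large they get.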

But usage-based pricing is not as straightforward as it seems!

Use AWS's cost-following strategy

When AWS got started, they didn't know this either. Consider this quote from Working Backwards (p 255):

"How much does S3 cost?" ...a tiered monthly subscription service based on average storage use...
We really did not know how developers would use S3 when it launched... Since we didn't know how developers would use S3, was there a way to structure our pricing so that no matter how it was used, we could ensure that it would be affordable to our customers and to Amazon.

That is when they moved towards "cost-following": their pricing model is driven primarily¹ by their costs, which are passed to the customer:

... we'd be sacrificing the simplicity of subscription pricing, but both would benefit ... whatever the developer did with S3, they would use it in a way that would meet their requirements and they would strive to minimize their cost and, therefore, our costs too. There would be no gaming of the system, and we wouldn't have to estimate how the mythical average customer would use S3 to set our prices.

Zach Kanter, CEO of Stedi, is the person I know that has thought the most about this and concluded cost-following is the way to go. He also helped a ton with this essay and provided half the examples.

Any compute is a form of usage

The previous section is fairly intuitive. The subtlety comes in what “usage” means.

From an interview with Werner Vogels, Amazon's CTO:

When we launched S3, we were charging only for data transfer and data storage. It turned out that we had quite a few customers who were storing millions and millions of thumbnails of products they were selling on eBay. There was not much storage because these thumbnails were really small, and there wasn't much data transfer either, but there were enormous numbers of requests. It made us learn, for example, when you design interfaces, and definitely those you charge for, you want to charge for what is driving your own cost.

You can imagine how S3 started by pricing only GBs on disk and then realized that compute, bandwidth, IO operations, and more had their own costs. If they didn’t price those in, clients would deviate enough from the expected usage to make them lose money.
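Here is a rough sketch of the thumbnail situation. The rates below are invented, not S3's real prices or costs, and `Usage` and `total` are hypothetical names. The point is only that a price list missing one cost dimension can be far below cost for a customer whose usage is concentrated in that dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    stored_gb: int
    transferred_gb: int
    requests: int


def total(usage, rates):
    """Sum rate * quantity over every dimension listed in `rates`."""
    return sum(rate * getattr(usage, dim) for dim, rate in rates.items())


# Invented rates in micro-dollars (1 dollar = 1_000_000).
COST = {"stored_gb": 100_000, "transferred_gb": 100_000, "requests": 4}
PRICE_V1 = {"stored_gb": 150_000, "transferred_gb": 200_000}  # no request fee
PRICE_V2 = {**PRICE_V1, "requests": 10}                       # request fee added

# A thumbnail-heavy customer: little storage, little transfer, many requests.
thumbnails = Usage(stored_gb=10, transferred_gb=5, requests=500_000_000)

for name, rates in [("cost", COST), ("price v1", PRICE_V1), ("price v2", PRICE_V2)]:
    print(name, total(thumbnails, rates) / 1_000_000, "dollars")
```

Under the first price list this customer pays a few dollars while costing thousands. Adding the request dimension puts the price back above cost.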

Storing data is a form of usage

Every API company is a storage company somewhere. Consider the following scenario:

Stripe charges a fee per incoming payment.
There are costs associated with processing those incoming charges, hence the usage-based pricing. At Stripe's scale, the EC2 instances that serve API requests are not fixed costs. If Stripe doubles its daily processing volume, it has to double the size of its EC2 fleet.
Stripe then keeps those payments in a database forever.

Stripe charges you once for a transaction but then has to pay to keep that payment in its database forever. At some point in the future, Stripe will start to lose money on that transaction and will continue to lose money until it deletes it from the database. In other words, the costs are proportional to an ever-growing stock while the revenue is proportional to its flow.

Having those objects available so that the user can fetch them is a form of usage, even if the user doesn’t perceive it that way.

The storage costs are usually small enough compared to the margin per transaction that this is fine for several years but eventually this dynamic will catch up with you². To prevent this problem, you need to incorporate a retention policy that eventually forgets about those objects that are still yielding costs but not generating revenue anymore.

Twilio, for example, has a 13-month retention policy for its highest volume objects, messages and media.
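The stock-versus-flow problem can be sketched with a few lines of arithmetic. The margin and storage cost below are invented, and `months_until_loss` is a hypothetical helper. It assumes a fixed storage cost per payment per month, which is a simplification.

```python
def months_until_loss(margin_per_payment, storage_cost_per_month):
    """First month in which cumulative storage cost exceeds the one-time margin."""
    return margin_per_payment // storage_cost_per_month + 1


# Invented numbers in micro-dollars (1 dollar = 1_000_000).
MARGIN_PER_PAYMENT = 300_000     # $0.30 earned once, when the payment is made
STORAGE_COST_PER_MONTH = 2_500   # $0.0025 paid every month the payment is kept

loss_month = months_until_loss(MARGIN_PER_PAYMENT, STORAGE_COST_PER_MONTH)
print("payment becomes a net loss in month", loss_month)

# The longest retention policy that never loses money on a payment
# is one month shorter than that.
max_retention_months = loss_month - 1
print("longest break-even retention:", max_retention_months, "months")
```

With these numbers the payment stays profitable for about ten years, which matches the intuition that this is fine for a long time and then quietly stops being fine.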

If your growth is literally exponential then you don't need to worry about this. The exponentially growing revenue outpaces the accumulating database costs³. This works as long as the growth continues at the same pace. Once it stops, the database costs catch up to the revenue really fast.
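The claim above can be worked out under simple assumptions. The symbols are mine, not from any source: revenue grows at a constant rate g from a starting level R_0, and storage costs c per unit of time for each unit of stored volume.

```latex
\begin{align*}
R(t) &= R_0 e^{gt} && \text{revenue per unit of time (flow)} \\
S(t) &= \int_0^t R(s)\,ds = \frac{R_0}{g}\left(e^{gt} - 1\right) && \text{everything stored so far (stock)} \\
C(t) &= c\,S(t) && \text{storage cost per unit of time} \\
\frac{C(t)}{R(t)} &= \frac{c}{g}\left(1 - e^{-gt}\right) < \frac{c}{g} && \text{bounded while growth continues}
\end{align*}

% If growth stops at time T, revenue stays flat but the stock keeps growing:
\begin{align*}
R(t) &= R(T), \qquad S(t) = S(T) + R(T)\,(t - T) && \text{for } t \ge T \\
\frac{C(t)}{R(t)} &= \frac{C(T)}{R(T)} + c\,(t - T) && \text{grows without bound}
\end{align*}
```

While growth continues, storage cost stays below a fixed fraction c/g of revenue. Once growth stops, that fraction climbs every period until it eats the whole margin.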

Consider your variable and capital costs

If all your API costs are variable, then the calculation of how to pass them to your customer is somewhat straightforward⁴. But you are also likely to have capital costs: fixed investments that need to be amortized over "all users" in some way. For example, if you buy fixed compute capacity in advance (most people do), you need to utilize it to the max because your revenue depends on it.

This mostly applies if capital is your main cost and your margins are relatively low (i.e. lower than 50%). Data companies like Datadog or Amplitude are probably in this category but I don't really know, having never worked in one of them.

Here is a blogpost from 2007 documenting how AWS had to price S3 to teach its customers to utilize their servers optimally, much like hotels teach customers to use the rooms optimally:

In essence, you want to reward those customers whose usage patterns allow you to use your installed capacity efficiently (by cutting their prices) while penalizing those customers whose usage patterns undermine your ability to use your installed capacity efficiently (by raising their prices).
If you do this effectively, you get the best possible return on every dollar of capital you invest in infrastructure and, as you grow, you get more profitable
In this light, Amazon’s original flat-rate pricing for its utility services, while having the advantage of simplicity, becomes unsustainable.
Amazon has announced⁵ that it will abandon its flat-rate pricing schedule for S3 on June 1 and introduce a more complex pricing schedule with tiered fees for bandwidth usage and a new fee for the number of requests made on the system.

Check your retention policy against hardware limits

For very high volume objects, it is likely that other things will break before cost does and force you to implement a retention policy. For example, on AWS, RDS for PostgreSQL goes up to 64 TiB of storage per instance. You may hit that limit before cost is a problem and be forced to drop objects. When designing a retention policy, take those limits into account since they may be the actual constraint to solve for.
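A back-of-the-envelope check for this kind of limit might look like the sketch below. The object size and daily volume are invented, and `days_until_full` is a hypothetical helper that ignores indexes growing at a different rate, replication, and free-space headroom.

```python
TIB = 2**40  # bytes in one tebibyte


def days_until_full(limit_bytes, bytes_per_object, objects_per_day):
    """Whole days of writes that fit under the limit if nothing is ever deleted."""
    return limit_bytes // (bytes_per_object * objects_per_day)


# Invented workload: 50 million new objects a day at about 2 KiB each.
days = days_until_full(
    limit_bytes=64 * TIB,
    bytes_per_object=2_048,
    objects_per_day=50_000_000,
)
print("days until the 64 TiB limit:", days)
```

If the answer is shorter than the retention period you were planning to promise, the storage limit is the real constraint, not the cost.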

Making retention policies less painful for your users

Apply the retention policy to high volume users

The users that bring in all the volume are also those that care most about cost and then negotiate your prices down. Those users are also likely the most sophisticated and able to deal with a retention policy⁶. So, consider applying the retention policy only to the biggest users that account for 80% of costs while keeping the API convenient for those that are getting started.

Make your objects immutable

If you make the high volume objects immutable:

Your users are going to be less interested in the objects after they become immutable.
Your users can cache the objects themselves if they are interested without any risk of their copy diverging from yours.

Only discard high volume objects

This doesn't mean that every object in your API needs a retention policy. Certain "configuration" or "settings" objects that are low cardinality and have very long lives are likely cheap enough for you to keep them ~forever without losing money.

Just focus on the high volume ones like messages, payments, transactions, and events.

Be liberal with the retention policy

Not many businesses are interested in payments from 5 years ago, let alone 10. If the storage costs allow for it, just set your policy at 10 years and most customers will happily accept the retention policy at face value.

If the policy is liberal enough, you don’t even have to implement it yet!

Have your retention policy ready as soon as possible

You want to tell the users that will want to archive their objects about the retention policy right away. If they know about it while integrating, they can plan the archival in that first integration which is when they are most motivated to learn about your API. It is much more inconvenient for them to have to return to your API later when they learn about the retention policy.

Have an “archival” API

You can also provide slower and much cheaper APIs for retrieval from the archives (e.g. S3 Glacier). These APIs might still have the same structural cost problem as a highly available database but the constant might be different enough that you can provide this surface for 30 years without losing money and give your customers peace of mind.

Twilio, for example, has a Bulk Export API which is not subject to their retention policy. I am not sure how it is implemented but I wouldn't be surprised if simply moving from an online database to S3 was enough for them to kick the retention can down the road another 10 years.

Beware of auxiliary objects

Not every object in an API makes money. For example, in the Stripe API you can create Customers for free. You presumably do this so that you can then Charge them, where Stripe makes money. But if Customer had enough features, somebody could easily abuse this by using Customer as their database and never paying for it.

I am not suggesting charging for every object but monitoring their relative usage and making clear in the terms of service that there are limits to the auxiliary objects.

Beware of data pipelines that recompute every day

The simple cost model described above (cost = server + database) does not take into account data pipelines like Hadoop, which many companies use to compute more complicated metrics. Those are usually very expensive and, at worst, their costs can grow faster than database costs. Consider a data pipeline to count NonRefundedPaymentsPerCustomer.

To compute it for the first time, the pipeline has to read all payments, keeping a count for each customer, skipping payments that were refunded. After that, it can incrementally incorporate one payment at a time and keep the metric fresh.

But what if your payments keep an is_refunded flag, which can change at any point in time? Then, even though you already counted a particular payment, you can’t be sure that it should still be counted: it could’ve been refunded right after you saw it. To keep the metric accurate, you have to revisit every payment every time you want to compute the metric (e.g. daily).

You can see how, every day that passes, you are paying an O(payments) cost. When tracking state in mutable objects, it is easy to find yourself in that situation. This is usually fixed with immutable objects and events (e.g. PaymentSucceeded and PaymentRefunded).
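The difference between the two approaches can be sketched as follows. The dictionary shapes and function names are hypothetical; only the is_refunded flag and the PaymentSucceeded and PaymentRefunded event names come from the text above.

```python
from collections import Counter


def recompute_from_mutable(payments):
    """Full scan: every payment is revisited because is_refunded may have flipped."""
    counts = Counter()
    for payment in payments:
        if not payment["is_refunded"]:
            counts[payment["customer"]] += 1
    return counts


def apply_event(counts, event):
    """Incremental update: each immutable event is processed exactly once."""
    if event["type"] == "PaymentSucceeded":
        counts[event["customer"]] += 1
    elif event["type"] == "PaymentRefunded":
        counts[event["customer"]] -= 1


# Mutable model: the current state of three payments.
payments = [
    {"id": "p1", "customer": "alice", "is_refunded": False},
    {"id": "p2", "customer": "alice", "is_refunded": True},
    {"id": "p3", "customer": "bob", "is_refunded": True},
]

# Event model: the same history as an append-only log.
events = [
    {"type": "PaymentSucceeded", "customer": "alice"},
    {"type": "PaymentSucceeded", "customer": "alice"},
    {"type": "PaymentSucceeded", "customer": "bob"},
    {"type": "PaymentRefunded", "customer": "alice"},
    {"type": "PaymentRefunded", "customer": "bob"},
]

full_scan = recompute_from_mutable(payments)

incremental = Counter()
for event in events:
    apply_event(incremental, event)

print(full_scan["alice"], incremental["alice"])
print(full_scan["bob"], incremental["bob"])
```

Both approaches arrive at the same counts. The full scan costs O(payments) every time it runs, while the event version only pays for the events that arrived since the last run.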

Thank you to Zach Kanter for thinking about this problem, providing feedback, and half the examples. Thank you to Grant Slatton for reviewing.

Footnotes

¹ This is different from cost-plus where you simply sum up the costs and add a markup. The price is informed by the costs, not determined by them.
² This is not unlike a Ponzi scheme where you can use the incoming investments to pay off the old ones, growing a larger and larger liability. There was some post that made this same point for web services. If you find it, please email it to me at sbensu@gmail.com
³ This is because the integral of an exponential is an exponential growing at the same rate.
⁴ This is one of the reasons people like serverless architectures. For example, Stedi, an API company, runs entirely on serverless patterns.
⁵ This is the same pricing change Werner Vogels noted above.
⁶ Remember that your users probably have a similar dynamic going on themselves. If an e-commerce site makes money per order, they may also have a problem if they store all their orders forever and they have billions of orders.
