Breaking changes in JSON APIs

March 2023

For an overall product development strategy, see How to avoid breaking APIs.

When writing graphical user interfaces, the end user is a person. If the user interface changes overnight, the user might be surprised but they can adapt and react accordingly. Most software companies are used to changing their products often in various ways. API products don’t have this luxury.

APIs are used by programs which can’t adapt to change. So, whenever you change an API, the change should be backwards compatible so that the new API version can be used by existing integrations. As you make changes, you constantly need to keep track of this distinction:

Backwards-compatible changes

  • Changes you can roll out to existing integrations without causing problems for them.
  • For example, adding new endpoints or accepting new parameters for existing endpoints.

Breaking changes

  • These are changes that might break existing integrations so you can’t simply roll them out.
  • To roll them out, you need to either ask all your existing integrations to migrate or you need to create a new version of the API so that new users get the change and old users don’t.
  • For example, if you remove a property from an endpoint’s response, integrations that use that property will start throwing exceptions.

Here is a list of the different breaking changes one can make to an API, with some advice on how to deal with each. It is mostly based on what I learned working with the Stripe API, so the examples are very Stripe-centric, and it applies to JSON REST APIs [1]. If you have complementary examples from other APIs, please email them to me.

Backwards-compatible changes

These are the changes that Stripe considers backwards-compatible:

  • Adding new API resources.
  • Adding new optional request parameters to existing API methods.
  • Adding new properties to existing API responses.
  • Changing the order of properties in existing API responses.
  • Changing the length or format of opaque strings, such as object IDs, error messages, and other human-readable strings.
    • This includes adding or removing fixed prefixes (such as ch_ on charge IDs).
    • You can safely assume object IDs we generate will never exceed 255 characters, but you should be able to handle IDs of up to that length. If for example you’re using MySQL, you should store IDs in a VARCHAR(255) COLLATE utf8_bin column (the COLLATE configuration ensures case-sensitivity in lookups).
  • Adding new event types.
    • Your webhook listener should gracefully handle unfamiliar event types.

Be careful if your proposed change is not in this list: there is probably a good reason for it.

This list makes a set of assumptions about how clients integrate with the Stripe API. For example, “Adding new properties to existing API responses” could be either backwards compatible or a breaking change depending on how the client is integrated. Here are two Java examples, one that breaks and one that stays backwards compatible:

import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper objectMapper = new ObjectMapper();

// We are trying to deserialize this response from the Stripe API
String json = "{\"id\": \"pi_123\"}";

// The naive way to use Jackson is to define a class:
public class PaymentIntent {
  public String id;
}
// and then read the string:
PaymentIntent pi = objectMapper.readValue(json, PaymentIntent.class);

// But then the API response comes with a new property, amount
json = "{\"id\": \"pi_123\", \"amount\": 1000}";

// The code above starts throwing UnrecognizedPropertyException
// because amount is not in PaymentIntent
pi = objectMapper.readValue(json, PaymentIntent.class);

// By adding this annotation, new properties are ignored
@JsonIgnoreProperties(ignoreUnknown = true)
public class PaymentIntent {
  public String id;
}

// and this now works
pi = objectMapper.readValue(json, PaymentIntent.class);

For every one of those backwards compatible changes, there is some way to integrate that would make them breaking.

Some changes are backwards-compatible for certain ecosystems but not others. For example, it is possible for “Make a required parameter optional” to be backwards compatible:

  • Dynamically typed languages (e.g. JS, Python, Ruby, PHP) can simply accept null or nil for the optional parameter. As long as the SDK doesn't have runtime asserts to check for the presence of that parameter, making the parameter optional is something that you can roll out overnight and the existing SDKs can accommodate.
  • But for statically typed languages (like Java, Go, .NET, Haskell, or OCaml) you have to be more careful. SDKs in those languages usually check that all the right values are passed or use Maybe or Optional for optional parameters. So, making a required parameter optional would change the type signature of the SDK (e.g. String → Optional<String>), triggering compiler errors for anybody that upgrades the SDK [2]. To avoid this problem, Stripe uses the builder pattern in its typed SDKs (Java, Go, .NET).
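
A minimal TypeScript sketch of why a builder helps. The names here (ChargeParams, ChargeParamsBuilder, and the fields) are hypothetical and are not the real Stripe SDK surface. Each parameter gets its own setter, and whether a parameter is required is enforced as a runtime check in build() rather than in a function signature:

```typescript
// Hypothetical request parameters; not the real Stripe SDK types.
interface ChargeParams {
  amount: number;
  currency: string;
  description?: string;
}

class ChargeParamsBuilder {
  private amount?: number;
  private currency?: string;
  private description?: string;

  setAmount(amount: number): this {
    this.amount = amount;
    return this;
  }

  setCurrency(currency: string): this {
    this.currency = currency;
    return this;
  }

  setDescription(description: string): this {
    this.description = description;
    return this;
  }

  // Required-ness lives here, as a runtime check. If the API later makes
  // `currency` optional, only this method changes; every setter keeps its
  // signature, so existing callers still compile after upgrading.
  build(): ChargeParams {
    if (this.amount === undefined) {
      throw new Error("amount is required");
    }
    if (this.currency === undefined) {
      throw new Error("currency is required");
    }
    const built: ChargeParams = { amount: this.amount, currency: this.currency };
    if (this.description !== undefined) {
      built.description = this.description;
    }
    return built;
  }
}

const chargeParams = new ChargeParamsBuilder()
  .setAmount(1000)
  .setCurrency("usd")
  .build();
```

The trade-off is that a missing required parameter is caught at runtime instead of at compile time.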

To learn how Stripe rolls out backwards-compatible changes without having to maintain separate code paths for each version, read this great post.

Breaking changes

A good way to learn from others’ mistakes is to read Stripe’s upgrades. Stripe only upgrades the API version when it makes a breaking change somewhere. All backwards-compatible changes are immediately deployed to existing users so they don’t need their own version.

The remaining sections include examples of breaking changes that Stripe had to make, like this one:

2011-08-01: Updates the list format. New list objects have a data property that represents an array of objects (by default, 10) and a count property that represents the total count.

From that change, we can learn that list endpoints shouldn’t return a list at the top level. They should be wrapped by a map that can hold other data:

// This shouldn't be the top-level response
[{id: "ex_123"}]

// Do this instead:
{
  data: [{id: "ex_123"}],
  // if you ever need to, you can put data here:
  metadata: "This turned out to be important"
}
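
On the client side, the same idea can be sketched in TypeScript. Only data comes from the example above; has_more and the Example type are illustrative assumptions. The point is that code reading data keeps working when the envelope gains fields:

```typescript
// A generic envelope for list endpoints. Only `data` comes from the text;
// `has_more` is an illustrative extra field.
interface ListResponse<T> {
  data: T[];
  has_more?: boolean;
}

interface Example {
  id: string;
}

// Clients read `data` and ignore envelope fields they do not know about, so
// the server can add fields later without changing the shape of `data`.
function extractIds(response: ListResponse<Example>): string[] {
  return response.data.map((item) => item.id);
}

const listBefore: ListResponse<Example> = JSON.parse(
  '{"data": [{"id": "ex_123"}]}'
);
const listAfter: ListResponse<Example> = JSON.parse(
  '{"data": [{"id": "ex_123"}], "has_more": false}'
);
```

A top-level array leaves no place to put has_more or any other later addition.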

When naming, be painfully concrete

The first Stripe APIs were all about payments and many of them included balance as a parameter. This was always some payment balance. But over time, as Stripe offered loans, bank accounts, and even cards, balance became a more and more ambiguous word. Are you talking about the loan’s balance or the bank account’s balance?

Users that come for the newer features like loans will interpret the original balance as a loan balance. So, name it payment_balance from the get-go.

All of this applies to name, account, date, and other similarly vague words.

Beware of type and status

What is the type of a payment? While the team is trying to distinguish between subscription payments and one-time payments, they may be tempted to add type: subscription | one_time. But years later, a new type might emerge: was this payment made online or in-person?

type implies a form of categorizing, a complete ontology. status does the same but for state machines. But different perspectives require different ontologies and type and status make that first ontology privileged over future ones. What “aspect” of the object are you categorizing?

Instead of status, can it be fulfillment_status? Instead of type, can it be timing_type? In my opinion, ugly is better than ambiguous.

2018-08-23: The amount field in the tiers configuration for plans was renamed to unit_amount.

2014-06-13: Renames the type property on the Card object to brand.

2014-05-19: Replaces the account property on the Transfer object with bank_account. The bank_account property is only included when the transfer is made to a bank account.

2013-12-03: Replaces the user and user_email properties on the Application Fee object with an expandable account property.

2019-03-04: The date property has been renamed to created.

Let’s say you are returning a balance that can grow stale. And, just in case, you add balance_as_of: "2023-01-01T00:00:00". Over the years, you keep adding balance_xyz fields, one at a time. Five years later, half of the fields at the top level are balance related and it is hard for developers to find the fields the resource is nominally about. It would’ve been better if balance were a sub-resource, and the fields were balance.as_of, balance.amount, etc.
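
The two shapes can be compared side by side in a small TypeScript sketch. The type names, the helper, and the acct_123 id are hypothetical; only the balance, balance_as_of, balance.amount, and balance.as_of field names come from the paragraph above:

```typescript
// Flat: every new balance-related field lands at the top level.
interface AccountFlat {
  id: string;
  balance: number;
  balance_as_of: string;
}

// Nested: balance is a sub-resource with room to grow.
interface Balance {
  amount: number;
  as_of: string;
}

interface AccountNested {
  id: string;
  balance: Balance;
}

// Hypothetical helper showing that the two shapes carry the same information.
function nestBalance(flat: AccountFlat): AccountNested {
  return {
    id: flat.id,
    balance: { amount: flat.balance, as_of: flat.balance_as_of },
  };
}

const nestedAccount = nestBalance({
  id: "acct_123",
  balance: 1000,
  balance_as_of: "2023-01-01T00:00:00",
});
```

With the nested shape, later additions go inside Balance and the top level of the resource stays about the resource itself.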

2015-10-01: Replaces the bank_accounts property on the Account object with external_accounts. Replaces the bank_account value in the fields_needed property with external_account.

2016-02-19: Renames the name property on the Bank Account object to account_holder_name.

Enums over booleans

Input parameters

Even if it feels very black-or-white, use an enum. For example, is_test: false is better as environment: test | live. If later you have staging or even a custom environment, you can still use that enum. Having is_test be a boolean invariably corners you later.

For input parameters, it is backwards compatible to accept a new case in the enum. But it is not backwards compatible to change the field from bool to enum. So get ahead of the problem and use an enum from the start.

2014-09-08: Replaces the disabled, validated, and verified properties on the Bank Account object with a status enum property.

2017-05-25: Replaces the managed Boolean property on Account objects with type, whose possible values are: standard, express, and custom. A type value is required when creating accounts. The standard type replaces managed: false, and the custom type replaces managed: true.

Return fields

Most of the previous section applies to return data as well. It will be easier for you to add an enum case later than to have to reconsider a boolean field. Is that backwards compatible? This depends on the integration pattern.

Many programming languages check enums exhaustively:

type Data = {
  environment: 'test' | 'live' // Can you add 'staging' here?
}

switch (data.environment) {
  case "test": {
    // ...
    break;
  }
  case "live": {
    // ...
    break;
  }
  // what goes here?
}

If the last clause is default and the code does something reasonable for unforeseen cases, then adding cases like staging to an enum is backwards compatible. If not, new enum cases will break the integration in one of two ways:

  1. When the new enum is returned to that integration, the code will do something completely unexpected. To make matters worse, JavaScript will throw no exceptions in this case.
  2. Even if the new enum case was never returned to this particular integration, when the developer updates the SDK bindings with the new types, their compiler will check for exhaustiveness and raise a compile-time error. This is vastly better than (1), but it is still annoying to force the developer to check for something that they don’t care about.

At Stripe there is a rule that it’s only backwards compatible to return a new enum value if:

  • the user opts into it with their integration, like a new payment method type
  • the enum field is clearly not static, like a list of banks or currencies

Keep this in mind when writing documentation and explaining to users how to integrate. If you don’t insist on default clauses, you won’t be able to easily extend enums.
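
A minimal TypeScript sketch of an integration that survives a new enum case. The function name and the returned strings are hypothetical. The field is deliberately typed as string rather than as a closed union, so a value added later still reaches the default clause:

```typescript
// Hypothetical handler. Typing the field as `string` (not 'test' | 'live')
// means a value added later, such as "staging", still type-checks here.
function describeEnvironment(environment: string): string {
  switch (environment) {
    case "test":
      return "sandbox data";
    case "live":
      return "real money";
    default:
      // Do something reasonable for values this code has never seen.
      return "unknown environment: " + environment;
  }
}

const knownCase = describeEnvironment("live");
const futureCase = describeEnvironment("staging");
```

This gives up compile-time exhaustiveness checking in exchange for tolerating values the integration was not written for.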

2014-09-08: Replaces the disabled, validated, and verified properties on the Bank Account object with a status enum property.

2017-05-25: Replaces the managed Boolean property on Account objects with type, whose possible values are: standard, express, and custom. A type value is required when creating accounts. The standard type replaces managed: false, and the custom type replaces managed: true.

Return as little data as possible

Stripe originally added count on the list endpoint because Mongo returned it from the list queries. It turns out that Mongo struggles to produce that count as collections grow, and it creates all sorts of performance problems. It is not clear that Stripe users need count in the first place. For that and other reasons, list endpoints are some of the worst endpoints to maintain.

Once count is sent to the user, Stripe doesn’t really know which users depend on it [3]. It might be the case that very few users depend on count but Stripe has to assume that it is all of them. This is not the case for input parameters where it is easy to track what is being sent to Stripe.

This example shows one reason to avoid adding data to your responses unless absolutely needed. But when do you add it?

  • When important use-cases can’t be completed without it.
  • When enough users ask for it and, after considering the cost, you decide it is worth it. This sounds obvious but this whole section is about drilling into you that there are costs of returning data to users.

2014-03-28: Removes the count property from list responses.

2014-08-04: Removes the other_transfers, summary, and transactions properties from automatic transfer responses in favor of the balance history endpoint (/v1/balance/history). These properties were very expensive to calculate on every transfer response, so they were moved to Balance Transactions where the user had to specifically ask for them.

Input data size and length

You will likely want to store many of the strings and arrays that users send you. Make sure to validate those strings and arrays with a max length and to advertise that max length. Otherwise, users will send you really long strings and very long arrays and leave you with a long database bill.
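
A minimal TypeScript sketch of such a check. The input shape and function name are hypothetical, and the limits (350 characters, 250 items) are borrowed from the changelog entries in this section purely for illustration:

```typescript
// Illustrative limits; choose your own and publish them in the API docs.
const MAX_DESCRIPTION_LENGTH = 350;
const MAX_ITEMS = 250;

// Hypothetical request body for an invoice-like resource.
interface InvoiceInput {
  description: string;
  items: string[];
}

// Returns a list of human-readable problems; an empty list means the input
// is within the advertised limits.
function validateInvoiceInput(input: InvoiceInput): string[] {
  const errors: string[] = [];
  if (input.description.length > MAX_DESCRIPTION_LENGTH) {
    errors.push(
      "description must be at most " + MAX_DESCRIPTION_LENGTH + " characters"
    );
  }
  if (input.items.length > MAX_ITEMS) {
    errors.push("items must contain at most " + MAX_ITEMS + " entries");
  }
  return errors;
}
```

Putting the limit in the error message is one way to advertise it to callers who did not read the documentation.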

2016-02-22: Returns an error on attempts to add more than 250 invoice items to an invoice.

2018-10-31: The description field on customer endpoints has a maximum character length limit of 350 now. The name field on product endpoints has a maximum character length limit of 250 now. The description field on invoice line items has a maximum character length limit of 500 now.

Return data size

Developers might store parts of your responses in their databases. For example, they will want to store ids you return (e.g. Stripe uses this format ch_123) alongside their related entities. To do that, they will add columns to tables and in the case of SQL databases, they’ll need to bound the column size with varchar(32) or similar. If you ever make the string longer than 32 chars, they will get an error when storing the id in their database.

Over the years, Stripe has had to lengthen those ids, either because it needed a larger keyspace or because it wanted to include more information. To avoid this problem, tell users that they need 64 characters to store your ids (even if the ids have 16 right now) or make the ids sufficiently long from the get-go.

Validations

If you previously accepted untrimmed strings (e.g. " example ") and you start to reject them, that will break certain integrations. To avoid problems down the line, be very strict and add all the validations that you can think of. You can always drop them later if they prove to be a problem for important use-cases.

2016-02-29: Adds postal code validation for legal entity addresses when creating and updating accounts

2011-09-15: Updates the card validation behavior when creating tokens.

2015-09-03: Returns an error if a request reuses an idempotency token with different parameters than the original request. Previously, errors were only returned for reusing the same idempotency token across different API endpoints.

Permissions and security

The security model is part of the API. If certain clients are able to access sensitive data now, they should be able to access such data in the future. If your API has different access levels, be especially conservative with the least privileged access levels.

For example, Stripe has publishable keys for end-user clients like mobile phones and webpages and secret keys for servers. Making a field accessible to publishable keys requires more thinking than making it accessible to secret keys.

Security-related breaking changes are particularly serious: you'll need to make one because you discovered that it is not a good idea to reveal certain data to certain clients, and you'll be in a hurry to make it.

2014-10-07: Prevents publishable keys from retrieving Token objects. When a card or bank account token is created with a publishable key, the fingerprint property is not included in the response.

Loud errors

Imagine that the user sends a bunch of unrecognized parameters to your API. What should you do? You might be tempted to simply ignore the parameters. Don’t!

Consider the parameter send_email: {email_address: "sbensu@gmail.com"}. If they pass that, the developer expects an email to be sent. This is a silent side-effect in that the developer doesn't necessarily learn if the email was sent from the response. What should happen if they typo send_emial? If you ignore bad parameters, send_emial would trigger no errors. The developer would then expect emails to be sent but they wouldn’t. This is bad.

Extend this thinking to other types of errors or validations beyond misnamed parameters. And do it early: throwing errors for new conditions later will be a breaking change.
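
A minimal TypeScript sketch of rejecting unrecognized parameters. The allow-list and function name are hypothetical; a real server would turn a non-empty result into an error response instead of processing the request:

```typescript
// Hypothetical allow-list for one endpoint.
const ALLOWED_PARAMS = ["amount", "currency", "send_email"];

// Returns the names of parameters the endpoint does not recognize.
function findUnknownParams(body: { [key: string]: unknown }): string[] {
  return Object.keys(body).filter(
    (key) => ALLOWED_PARAMS.indexOf(key) === -1
  );
}

// The typo from the text is caught instead of being silently ignored.
const rejectedParams = findUnknownParams({ amount: 1000, send_emial: {} });
```

Naming the offending parameter in the error tells the developer immediately that send_emial was the problem.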

2012-02-23: Shows all response fields, even those with null values. Previously, the API hid fields with null values.

2011-06-21: Raises exceptions on unrecognized parameters passed to the API instead of silently allowing and ignoring them.

2013-02-11: Updates the pay invoice call to return an error when the charge is not successful. Previously, the API would return a 200 status and set the invoice’s paid property to false.

2015-10-16: Returns an error if a tax_percent is provided without a plan during a customer update or creation.

Behavior changes

As Hyrum’s Law suggests, if your API has enough integrations, some of them will depend on the behavior that you want to change, even bugs. Consider this change from Stripe’s Upgrades:

2015-03-24: Updates coupons so they no longer apply to negative invoice items by default. Previously, coupons applied to all non-proration invoice items. To allow a coupon to apply to a negative invoice item, pass discountable=true when creating or updating the invoice item.

There was a clear bug in how coupons were applied to certain invoices. But fixing it would break people who relied on this math and had created coupons with a reverse-engineered amount to work around the bug. To avoid breaking those integrations, Stripe created a new version.

This doesn’t mean that you should never fix bugs. Use your common sense to try to bound how many integrations could’ve plausibly depended on the bug's behavior and if the number is low enough, fix it without a new version. Otherwise, you’ll have to broadcast the change and cut a new version.

A similar example is the smallest currency unit. Dollars have cents and the Stripe API denominates all USD amounts in cents: {currency: "usd", amount: 100} is $1. Integrations need to divide by 100 to render it to their users. But other currencies like the Japanese yen don’t have cents, only yen: {currency: "jpy", amount: 100} is ¥100, no division required.

Over time, currencies evolve and some lose their cents [4] (e.g. because of inflation). Stripe can’t just stop passing the cents since everyone has their own math for it [5]. So, currency updates are considered breaking changes.
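
A minimal TypeScript sketch of the client-side math described above. The table and function name are hypothetical, and only the two currencies from the text are listed. Every integration carries some version of this table, which is why the server cannot quietly change what amount means for a currency:

```typescript
// Minor units per major unit. Only the two currencies from the text are
// listed; a real integration needs the full table from its payment processor.
const MINOR_UNITS: { [currency: string]: number } = {
  usd: 100,
  jpy: 1,
};

// Converts an API amount (smallest currency unit) to a major-unit number.
function toMajorUnits(currency: string, amount: number): number {
  if (!Object.prototype.hasOwnProperty.call(MINOR_UNITS, currency)) {
    throw new Error("unsupported currency: " + currency);
  }
  return amount / MINOR_UNITS[currency];
}

const oneDollar = toMajorUnits("usd", 100); // 1
const hundredYen = toMajorUnits("jpy", 100); // 100
```

If a currency's divisor changed on the server, integrations with the old table would display amounts that are off by a factor of 100.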

2013-12-03: Updates the refunding of application fees to be proportional to the amount of the charge refunded (when setting refund_application_fee=true). Previously, the entire application fee was refunded even when only part of the charge was.

And if the new behavior is not a bug but rather a change in the API’s side effects, you almost surely want to cut a new version.

2013-10-29: Changes coupon behavior so that applying an amount-off coupon to an invoice does not increase the Customer account balance if the discount is greater than the invoice amount. Coupons are ignored—and not counted as redeemed—when applied to zero-cost invoices. This change does not apply to coupons created on earlier API version.

2014-01-31: Ignores trial dates on canceled subscriptions when automatically computing trial end dates for new subscriptions.

Implicit semantics: fingerprints and ids

Certain identifiers or keys sometimes have implicit meanings:

  • Is this id unique across different accounts? Or only within an account?
  • Are the ids sequential? Can you compare ids with < to see which one was generated first? If the user reads 123 and 124, they might conclude your ids are sequential.
  • Are the ids hierarchical? "invoice_123_line_item_345" leads the user to think that they can extract the invoice_id from that id.

When creating such ids or tokens, be careful about their implied semantics and compare those to how they look. You might be generating the ids sequentially, but that might not be a promise you want to make. If so, either change how the identifiers look or be very explicit about how the identifiers can be used.

2018-01-23: When being viewed by a platform, cards and bank accounts created on behalf of connected accounts will have a fingerprint that is universal across all connected accounts. For accounts that are not connect platforms, there will be no change.

2018-05-21: The id field of invoice line items of type: subscription no longer can be interpreted as a subscription ID, but instead is a unique invoice line item ID. It can be used for pagination.

2019-12-03: You can no longer use the prefix of the id to determine the source of the line item. Instead use the type field for this purpose.

Default values

The PaymentIntent API doesn’t require the developer to pass payment_method_types. The API interprets a missing value as the default value:

POST /v1/payment_intents
{
  amount: 1000,
  currency: "usd"
}

is automatically interpreted as:

POST /v1/payment_intents
{
  amount: 1000,
  currency: "usd",
  payment_method_types: ["card"] // <-- default value
}

If you change the default values to be something else (e.g. payment_method_types: ["card", "link"]), you are changing what the API does without an integration change.

In most cases, changing default values is not backwards compatible. So, unless you are sure that a default is the best idea, don’t offer default values and ask the developer to pass the exact value that they need.
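
If a default does have to change, one option is to key it on the API version the integration is pinned to, so existing integrations keep the old behavior. This is a sketch under assumptions, not a description of how Stripe implements it: the function name and the cut-over date are hypothetical, and versions are assumed to be ISO dates, which compare correctly as strings:

```typescript
interface PaymentIntentParams {
  amount: number;
  currency: string;
  payment_method_types?: string[];
}

// Hypothetical cut-over version; ISO dates compare correctly as strings.
const NEW_DEFAULT_SINCE = "2024-01-01";

function resolvePaymentMethodTypes(
  params: PaymentIntentParams,
  apiVersion: string
): string[] {
  // An explicit value always wins over any default.
  if (params.payment_method_types !== undefined) {
    return params.payment_method_types;
  }
  // Integrations pinned to an older version keep the old default.
  return apiVersion >= NEW_DEFAULT_SINCE ? ["card", "link"] : ["card"];
}

const pinnedOld = resolvePaymentMethodTypes(
  { amount: 1000, currency: "usd" },
  "2023-03-01"
);
```

An integration that always passes payment_method_types explicitly is unaffected either way.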

Thanks to Michelle Bu for reading drafts and providing valuable feedback and Adam D’Angelo for prompting the essay. Special thanks to Remi Jannel, Stripe API custodian, for providing feedback, commentary, and many of the examples included.


Footnotes

  1. Some of it also applies to gRPC Protos and GraphQL but not to the same extent.
  2. This talk by Rich Hickey expands on this problem.
  3. This is one of the advantages of GraphQL for API maintainers: the API users have to ask for the specific properties that they need in their return values. This makes it possible to determine which properties are popular and which ones are unused.
  4. The Icelandic króna is dropping its cents (aurar) as I write this essay. See Withdrawal of coin denominated in aurar.
  5. To make problems worse for users, different payment processors treat these currencies differently.