our transition to OpenAPI from JSON Hyper-Schema, we caught up internally and started discussing what else OpenAPI could unlock for us. It was at that point that we set the lofty goal of using OpenAPI for more than just our documentation. It was time to use OpenAPI to generate our SDKs.
Before we dove headfirst into generating SDKs, we established some guiding principles. These would be non-negotiable and would determine where we spent our effort.
You should not be able to tell what underlying language generated the SDK
This was important to us because companies too often build SDKs with automation that leaks through. Not only do you end up with SDKs flavored by the generator's own language, but the SDKs also lack the nuances and patterns that users familiar with the target language would expect.
For example, a Rubyist may use the following if expression:
do_something if bar?
Most generators do not have this context and would instead default to the standard form, where if expressions are spread over multiple lines:
if bar?
  do_something
end
Despite being a simple, non-material example, it demonstrates a nuance that a machine cannot decipher on its own. This is terrible for developers: you're no longer thinking only about how to solve the original task at hand, but also tailoring your code to match how the generator built the SDK, potentially losing out on the language features you would normally use. The problem is significantly worse if you're generating from a strongly typed language into a language without types, since the generator will structure and build the code as if types are expected, yet they are never used.
When a new feature is added to a product, it's great that API support lands first. However, if that new feature or product never makes it into the language SDK you use to drive your API calls, it's as good as non-existent. Similarly, not every use case calls for infrastructure-as-code tools like Terraform, so we needed a better way of meeting our customers, with uniformity, wherever they choose to integrate with our services.
By extension, we want uniformity in the way the namespaces and methods are constructed. Ignoring the language-specific parts, if you’re using one of our SDKs and you are looking for the ability to list all DNS records, you should be able to trust that the method will be in the dns namespace and that to find all records, you can call a list method regardless of which one you are using. Example:
// Go
client.DNS.Records.List(...)
// TypeScript
client.dns.records.list(...)
// Python
client.dns.records.list(...)
This leads to less time digging through documentation to find what invocation you need and more time using the tools you’re already familiar with.
Fast feedback loops, clear conventions
Cloudflare has a lot of APIs; everything is backed by an API somewhere. However, not all Cloudflare APIs are designed with the same conventions in mind. Those APIs that are on the critical path and regularly experience traffic surges or malformed input are naturally more hardened and resilient than those that are infrequently used. This creates a divergence in endpoint quality, which shouldn't be the case.
Where we have learned a lesson or improved a system through a best practice, we should make it easy for others to become aware of that pattern and opt into it with little friction, at the earliest possible time, ideally as they are proposing the change in CI. That is why, when we built the OpenAPI pipeline for API schemas, we built in mechanisms for applying linting rules, using the Redocly CLI, that will either warn the engineer or block them entirely, depending on the severity of the violation.
For example, we want to encourage usage of fine-grained API tokens, so we should present those authentication schemes first and ensure they are supported for new endpoints. To enforce this, we can write a Redocly plugin:
module.exports = {
  id: 'local',
  assertions: {
    apiTokenAuthSupported: (value, options, location) => {
      for (const option of value) {
        if (option?.hasOwnProperty('api_token')) {
          return [];
        }
      }
      return [{message: 'API Token should be defined as an auth method', location}];
    },
    apiTokenAuthDefinedFirst: (value, options, location) => {
      if (!value.at(0)?.hasOwnProperty('api_token')) {
        return [{message: 'API Tokens should be the first listed Security Option', location}];
      }
      return [];
    },
  },
};
And the rule configuration:
rule/security-options-defined:
  severity: error
  subject:
    type: Operation
    property: security
  where:
    - subject:
        type: Operation
        property: security
      assertions:
        defined: true
  assertions:
    local/apiTokenAuthSupported: {}
    local/apiTokenAuthDefinedFirst: {}
In this example, should a team forget to define the API token authentication scheme, or to list it first, the CI run will fail. Teams are given a helpful failure message with a link to the conventions, so they can understand why the change is recommended.
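These assertion functions are plain JavaScript, so they can be sanity-checked outside CI by calling them directly with sample security arrays. The sketch below repeats the assertions so it is self-contained; the `security` shapes and the `loc` value are assumptions for illustration, not real schema fragments:

```javascript
// Minimal sketch: exercising the Redocly assertion functions directly.
const plugin = {
  id: 'local',
  assertions: {
    apiTokenAuthSupported: (value, options, location) => {
      for (const option of value) {
        if (option?.hasOwnProperty('api_token')) {
          return [];
        }
      }
      return [{message: 'API Token should be defined as an auth method', location}];
    },
    apiTokenAuthDefinedFirst: (value, options, location) => {
      if (!value.at(0)?.hasOwnProperty('api_token')) {
        return [{message: 'API Tokens should be the first listed Security Option', location}];
      }
      return [];
    },
  },
};

const loc = {pointer: '#/paths/~1zones/get/security'}; // hypothetical location
const good = [{api_token: []}, {api_key: []}];         // token supported and listed first
const bad = [{api_key: []}];                           // token missing entirely

console.log(plugin.assertions.apiTokenAuthSupported(good, {}, loc).length);   // 0: no violations
console.log(plugin.assertions.apiTokenAuthDefinedFirst(bad, {}, loc).length); // 1: violation reported
```

An empty array means the assertion passed; a non-empty array carries the violation message that surfaces in the CI failure.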
These lints can be used for style conventions, too. In our documentation, we like descriptions to start with a capital letter and end in a period. Again, we can add a lint to enforce this requirement.
module.exports = {
  id: 'local',
  assertions: {
    descriptionIsFormatted: (value, options, location) => {
      if (/^[A-Z].*\.$/.test(value)) {
        return [];
      }
      return [{message: 'Descriptions should start with a capital and end in a period.', location}];
    },
  },
};
rule/description-is-formatted:
  severity: error
  subject:
    type: Schema
    property: description
  assertions:
    local/descriptionIsFormatted: {}
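The regular expression doing the work here can be checked in isolation. A minimal sketch, with made-up descriptions for illustration:

```javascript
// Minimal sketch: the description-format check from the plugin, on its own.
const isFormatted = (description) => /^[A-Z].*\.$/.test(description);

console.log(isFormatted('Lists DNS records for a zone.')); // true
console.log(isFormatted('lists DNS records for a zone.')); // false: no leading capital
console.log(isFormatted('Lists DNS records for a zone'));  // false: no trailing period
```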
This makes shipping endpoints of consistent quality much easier and spares teams from digging through all the API design or resiliency patterns we may have introduced over the years – possibly even before they joined Cloudflare.
Building the generation machine
Once we had our guiding principles, we analyzed our situation and saw that if we decided to build the solution entirely in house, we would be at least 6–9 months away from a single high-quality SDK, with the potential for additional follow-up work each time we added a new language. This wasn't acceptable and prevented us from meeting the requirement of a low-cost follow-up for additional languages, so we explored the OpenAPI generation landscape.
Due to the size and complexity of our schemas, we weren't able to use most off-the-shelf products. We tried a handful of solutions and workarounds, but we weren't comfortable with any of the options; that is, until we tried Stainless. Founded by one of the engineers who built what many consider to be the best-in-class API experience at Stripe, Stainless is dedicated to generating SDKs. If you've used the OpenAI Python or TypeScript SDKs, you've used an SDK generated by Stainless.
The platform works like this: you bring your OpenAPI schemas and map them to methods in a configuration file. Those inputs are then fed into the generation engine to build your SDKs.
resources:
  zones:
    methods:
      list: get /zones
The configuration above would generate the client.zones.list() method, and its language-specific equivalents, across your SDKs.
This approach means we can do the majority of our changes using the existing API schemas, but if there is an SDK-specific issue, we can modify that behavior on a per-SDK basis using the configuration file.
An added benefit of using the Stainless generation engine is that it gives us a clear line of responsibility when discussing where a change should be made.
Service team: Knows their service best and manages the representation for end users.
API team: Understands and implements best practices for APIs and SDK conventions, builds centralized tooling or components within the platform for all teams, and translates service mappings to align with Stainless.
Stainless: Provides a simple interface to generate SDKs consistently.
The decision to use Stainless has allowed us to move our focus from building the generation engine to instead building high-quality schemas to describe our services. In the span of a few months, we have gone from inconsistent, manually maintained SDKs to automatically shipping three language SDKs with hands-off updates freely flowing from the internal teams. Best of all, it is now a single pull request workflow for the majority of our changes – even if we were to add a new language or integration to the pipeline!
Lessons from our journey, for yours