Understanding APIs: misery to mastery
In my college days, we used to play a game in one of our computer hardware classes where we would compete to see who could overheat the CPU the fastest with the fewest lines of code. In those days, we were using the original Pentium processors and programming in assembly language. Fifteen years later, it's interesting to realize that a similar thing is happening with our own processor: our brain. But instead of exceeding a thermal threshold, we're exceeding a cognitive threshold. We have known for a long time that we each have a limited supply of cognitive resources. Research has shown that we often invest these resources both willingly and unwillingly when confronted with stresses, choices, and challenges.
It's no surprise that understanding APIs can tax our greatest resource: our cognitive reserves.
So what does the progression from novice API-user to master look like? Typically, we'd expect the progression from novice to competent to be linear:
We start as beginners with our first exposure to an API, and with each interaction, we become more competent. Eventually, we may master the API. This is the optimistic progression, but it isn't necessarily realistic. Consider instead that the progression has distinct phases, each corresponding to a different risk of API abandonment.
First phase: high risk
Understanding APIs can be difficult. This first phase is where beginners find themselves, and it is where both the risk of abandonment and the cognitive investment are highest. If they burn through their cognitive reserves, they're more likely to abandon the API or, worse, misuse it.
Second phase: medium risk
During this phase, consumers are starting to become proficient. The risk of API abandonment is lowering because the cognitive demands are reducing with each interaction. API users are becoming comfortable with API conventions and are starting to predict how they will interact with parts of the API they have yet to use.
Third phase: low risk
This is the point where consumers are advanced and know the API in nearly its entirety. The risk of API abandonment due to cognitive depletion is lowest because consumers invest little of their cognitive reserves to achieve their goals. They're now relying on muscle memory. In addition, consumers who have reached the relative comfort of the third phase are unlikely to abandon the API, only to risk finding themselves back in the first phase of a competing API.
Another way to look at it is, if our API requires more cognitive processing than it should, then we’re going to frustrate and lose potential users.
What if we could change the rate at which a user moves from novice to competent? If we can speed up that progression, we'll reduce the time novice users spend in the riskiest phase of API adoption. Novice users will still incur the cognitive costs of learning something new, but they will need a smaller investment to reach a level of competency.
Cognitive reserves need to be respected.
Recognizing that we have a limited supply of this valuable resource, and considering how much we must invest to become proficient with a new API, how can we design our APIs to ease the burden on our fellow developers? To answer this question, we must look at several aspects of the interaction between an engineer and an API.
In a 2013 study entitled "An Empirical Study of API Usability," researchers suggested we consider the following questions:
- What is the cognitive effort required to understand the semantics of API features based on their names and documentation?
- Does the API's abstraction level cater to usability?
- Does the API's design facilitate reuse and conciseness in client code?
- Can API usage be learned easily and incrementally?
Each of these questions impacts the amount of cognitive processing an API user must exert. We can use these four simple questions to establish a set of guidelines for improving the interaction between our consumers and the API itself. I'll use the following code example to help find answers to these questions.
    var documentRevisionDataState:DocumentRevisionDataState = new DocumentRevisionDataState();

    // Document VO
    var documentVO:DocumentVO = new DocumentVO();
    documentVO.type = documentType;
    documentVO.documentid = id ? id : GUIDUtil.createGUID();
    documentVO.ref_permission = "";

    var documentRevisionVO:DocumentRevisionVO = DocumentRevisionFactory.createDocumentRevisionVO(documentVO);
    var document:Document = DocumentRevisionFactory.createFromRevision(DocumentRevisionFactory.defaultDocModelConfig, documentRevisionVO, documentRevisionDataState);
    return document;
I'll further assume that our only task is to create an object of type Document, given a documentType and id.
- What is the cognitive effort required to understand the semantics of API features based on their names and documentation? How well we communicate API semantics is vital because it establishes the language our API uses to converse with consumers. It's how our API speaks to them. We should take care to ensure we're speaking in a language they will relate to and understand. Let's ask ourselves some questions about this code:
  - Do you find that the API types map to the domain concepts in the way you expected? I don't believe they do, because we need to create objects that don't hold any contextual value for us purely to satisfy a requirement of the API. What exactly is a DocumentRevisionDataState, and what does that have to do with a document?
  - Do you feel you had to keep track of information not represented by the API to solve the tasks? Definitely. To understand why we're using static convenience functions, I had to dig into other usages of them and work out the reasons they exist.
  - Do you feel you had to learn many classes and dependencies to solve the tasks? Absolutely. It's not clear that one should use a static convenience function when creating DocumentRevisionVOs. The same issue arises when creating the actual Document object.
- Does the API's abstraction level cater to usability? I think we've all come across examples where an API was either not abstract enough (requiring overly verbose API consumer code) or too abstract (requiring one to override one or more API methods). Either extreme requires additional cognitive investment. Continue reviewing the example code to understand how we might be able to avoid improper API abstractions. Let's ask ourselves some questions about this code:
  - Do you find the API abstraction level appropriate to the tasks? Given the amount of boilerplate code needed to construct the dependent objects, I would say no. It seems that this API isn't abstract enough for our purposes.
  - Did you need to adapt the API (inheriting from API classes, overriding default behaviors, providing non-API types) to meet your needs? In this specific code, we do not need to adapt the API to meet our needs. We can, however, see that the creation of helper functions separate from the API itself implies that our API was adapted at one point to meet needs similar to our own.
  - Do you feel you had to understand the underlying implementation to be able to use the API? It's clear from the code listed that one must become familiar with the implementation itself to understand the proper sequence of steps. For example, which fields on the DocumentVO must be populated before passing this object to the createFromRevision function?
- Does the API's design facilitate reuse and conciseness in client code? This point speaks directly to how consumer code will be read. Will the API we're using place irrational requirements on our consumer code to the point of rendering it unreadable? If so, we're wasting the cognitive resources not only of the API consumer, but also of any other developer who attempts to understand this consumer code. Let's ask ourselves some questions about this code:
  - Does the amount of code required for this task seem about right, too much, or too little for you? To me, this seems too verbose, with too much boilerplate. Imagine how much time a new API user would need to discover and understand how to create a Document. How long did it take you to read the code and understand what was going on?
  - How easily can you evaluate your own progress (intermediate results) while solving the task? With this code, I can see how an API user would spend quite a bit of time running the code, hitting a runtime error (RTE), and realizing they need to peek at the implementation. Personally, this would drain my cognitive resources and cause me to become discouraged.
  - Do you feel you had to choose one way out of many to solve a task in the scenario? After reviewing the code and, sadly, needing to look at the implementation, I don't think it's clear that there's even one way to solve the task, never mind multiple ways.
- Can API usage be learned easily and incrementally? The following questions were not part of the original study, but I believe they are valuable in assessing how well an API can be learned. Let's ask ourselves some questions about this code:
  - Does the API have an appropriate level of documentation and examples? In this case, there weren't any examples other than usages. Those are different things in my mind: examples show how you should use the API, while usages show how the API has been used, which is not necessarily the proper way.
  - Does the API allow a new user to learn basic parts of it before progressing to more difficult aspects? While answering this question, I found myself considering:
    - One is likely to start by attempting to construct a Document object via the constructor.
    - One is also likely to discover that they must create three additional objects.
    I eventually concluded that yes, the user will start with the basic parts before progressing to more difficult aspects.
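The answers above keep returning to the same friction: helper objects with no meaning to the caller, static factory calls, and required fields that surface only as runtime errors. As a rough illustration of what a higher abstraction level could look like (not the actual API, and not necessarily the approach part two will take), the TypeScript sketch below hides those steps behind a single hypothetical createDocument function. TypeScript stands in for the ActionScript-style listing. Every class is a hypothetical stub that mirrors only the names in the original listing, ApiDocument stands in for Document, createGUID is a placeholder for GUIDUtil.createGUID, and the non-empty documentType check is an assumed requirement, included only to show a descriptive failure.

```typescript
// Hypothetical stubs: the article does not show these types, so their
// fields and behavior are assumptions made only for illustration.
class DocumentVO {
  type: string = "";
  documentid: string = "";
  ref_permission: string = "";
}

class DocumentRevisionVO {
  documentVO: DocumentVO;
  constructor(documentVO: DocumentVO) {
    this.documentVO = documentVO;
  }
}

class DocumentRevisionDataState {}

// Stand-in for the article's Document type (renamed to avoid clashing
// with the browser's global Document).
class ApiDocument {
  revision: DocumentRevisionVO;
  state: DocumentRevisionDataState;
  constructor(revision: DocumentRevisionVO, state: DocumentRevisionDataState) {
    this.revision = revision;
    this.state = state;
  }
}

// Stub of the static factory used in the listing (assumed behavior).
class DocumentRevisionFactory {
  static defaultDocModelConfig: object = {};

  static createDocumentRevisionVO(documentVO: DocumentVO): DocumentRevisionVO {
    return new DocumentRevisionVO(documentVO);
  }

  static createFromRevision(
    _config: object,
    revisionVO: DocumentRevisionVO,
    dataState: DocumentRevisionDataState
  ): ApiDocument {
    return new ApiDocument(revisionVO, dataState);
  }
}

// Placeholder for GUIDUtil.createGUID: returns a unique label, not a real GUID.
let generatedIdCount = 0;
function createGUID(): string {
  generatedIdCount += 1;
  return `generated-${generatedIdCount}`;
}

// Hypothetical single entry point: the caller supplies only what the task
// needs (a document type and, optionally, an id).
function createDocument(documentType: string, id?: string): ApiDocument {
  // Assumed requirement: fail early with a message that names the problem,
  // so the caller does not have to read the implementation to diagnose it.
  if (documentType.trim() === "") {
    throw new Error("createDocument: documentType must be a non-empty string");
  }

  const documentVO = new DocumentVO();
  documentVO.type = documentType;
  documentVO.documentid = id ? id : createGUID();
  documentVO.ref_permission = "";

  const revisionVO = DocumentRevisionFactory.createDocumentRevisionVO(documentVO);
  return DocumentRevisionFactory.createFromRevision(
    DocumentRevisionFactory.defaultDocModelConfig,
    revisionVO,
    new DocumentRevisionDataState()
  );
}

// Usage: one call, two arguments.
const report = createDocument("report", "doc-42");
console.log(report.revision.documentVO.documentid); // prints "doc-42"
```

With an entry point along these lines, a newcomer's first successful call involves one function and two arguments, while the intermediate objects stay available for the cases that need them.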
The following is purely anecdotal, but it is based on feedback about our previous API. Many engineers struggled with that version, and I believe their experience closely matches this graph:
OK, this is perhaps a bit opinionated and colored by my own experiences, but I contend the point remains valid.
If our APIs are unwieldy and demand irrational amounts of cognitive investment, they’ll waste our most precious resource: our cognitive reserves.
Let's agree to echo this mantra in our code reviews and in the code itself: cognitive reserves need to be respected.
In part two of this series, we're going to investigate ways in which we can bring users from beginner to master more quickly with a particular focus on respecting cognitive demands.
About the Author
Dustin is a Senior Software Engineer at Workiva and a performance junky who is always prototyping something to make high impact improvements.