Introduction to modern authentication patterns for application developers (Part 1)
Background and History (OAuth 2.0)
Almost since the beginning of the Web there has been a need to secure websites so that not everyone on the Internet has access. Typically, a user authenticates with a username and password, and the server then authorizes them to perform certain actions based on their permissions. This distinction between authentication and authorization is important: authentication verifies that we are who we say we are, while authorization decides whether, given that identity, we are allowed to do the thing we are asking to do.
The early tools built to handle this requirement, such as Sun's OpenSSO or IBM Tivoli Access Manager, tended to involve exchanging a login/password for a random opaque token from an authorization server. The client would then request a resource from an application server, passing this token in the request. That resource server, in turn, validates the token against the authorization server and retrieves additional information about the user, called claims, to decide if the request is authorized. This process flow is illustrated by the diagram below (from the documentation of OpenAM, a more recent open-source fork of Sun's OpenSSO):
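The flow above can be sketched in a few lines of Python. This is a minimal illustration, not any real product's API: the class and method names are invented, the "network calls" between servers are plain method calls, and the token store is an in-memory dict standing in for the authorization server's database.

```python
import secrets

class AuthorizationServer:
    """Issues opaque tokens and validates them on request (illustrative sketch)."""

    def __init__(self):
        self._users = {"alice": "s3cret"}   # login -> password (demo data only)
        self._tokens = {}                   # opaque token -> username

    def login(self, username, password):
        # Exchange credentials for a random opaque token.
        if self._users.get(username) != password:
            raise PermissionError("bad credentials")
        token = secrets.token_hex(16)
        self._tokens[token] = username
        return token

    def validate(self, token):
        # Called by the resource server on (nearly) every request.
        if token not in self._tokens:
            raise PermissionError("unknown or revoked token")
        # Return claims about the user along with the validation result.
        return {"sub": self._tokens[token], "role": "user"}

class ResourceServer:
    """Serves resources only after validating the token with the auth server."""

    def __init__(self, auth_server):
        self._auth = auth_server

    def get_resource(self, token):
        claims = self._auth.validate(token)   # a network round-trip in real life
        return f"hello {claims['sub']}"

auth = AuthorizationServer()
token = auth.login("alice", "s3cret")
print(ResourceServer(auth).get_resource(token))  # -> hello alice
```

Note how every `get_resource` call triggers a `validate` call; that per-request round-trip is exactly the scaling bottleneck discussed next.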
This approach has been standardised and matured into the OAuth and then OAuth 2.0 standards. The standard is fairly complex and non-prescriptive, though, leading different tools to implement different, often incompatible, subsets of it.
There are a few major challenges with the availability and scaling of this approach. Nearly every request to a resource server depends on a subsequent validation request to an authorization server succeeding. That authorization server, in turn, depends on a database to keep track of which opaque tokens it has issued, for whom, and which of those tokens are still valid, in order to respond to those validation requests. This led to rather complex and expensive architectures, such as the following, to ensure that the authorization servers are always available to handle all the requests, because if they go down or become very slow it can cause a major user-facing outage of your site/application.
JSON Web Tokens (JWT)
Originally, the token was not intended to have any inherent meaning to the clients using it; it was meant to be a random hexadecimal string that would be presented to the authorization server for validation and for retrieval of any claims needed to process the request. However, the standard didn't say that it couldn't instead be something like JSON with the required data/claims right in it. This led to the development of JSON Web Tokens, or JWTs, as an alternative.
These JWTs not only embed the user details (claims) required by the resource server right in the token, but the authorization server then cryptographically signs the whole thing with its private key (which becomes a sensitive secret in this model: anybody with this key can sign JWTs that the resource servers will trust!). This, together with an embedded expiry time, means that the process flow becomes much simpler, as below:
Note that in this scenario the resource server does not need to talk to the authorization server at all; it just verifies the cryptographic signature on the token and makes sure the expiry date is in the future. And, since the majority of the load on the authorization server in the old model came from validating all those requests, this is a huge reduction in the load on the system. Also, the token, and the state needed to use it, is offloaded to each client/browser (for example, in local storage) rather than kept in a centralised session-state database backing the authorization server.
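To make the mechanics concrete, here is a minimal sketch of minting and verifying a JWT using only the Python standard library. For brevity it uses HS256 (an HMAC with a shared secret) rather than the public/private key pair (e.g. RS256) described above; the key names and claims are illustrative, and a real deployment would use a vetted JWT library rather than hand-rolled code.

```python
import base64, hashlib, hmac, json, time

# The signing key is the sensitive secret: anyone holding it can mint
# tokens the resource servers will trust.
SECRET = b"keep-this-safe"  # illustrative value only

def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_jwt(claims: dict, ttl_seconds: int = 3600) -> str:
    """Build header.payload.signature with an embedded expiry time."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = dict(claims, exp=int(time.time()) + ttl_seconds)
    signing_input = (
        f"{_b64url(json.dumps(header).encode())}."
        f"{_b64url(json.dumps(claims).encode())}"
    )
    sig = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"

def verify_jwt(token: str) -> dict:
    """What the resource server does: check signature and expiry locally."""
    signing_input, _, sig_b64 = token.rpartition(".")
    expected = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected), sig_b64):
        raise PermissionError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    payload = json.loads(
        base64.urlsafe_b64decode(payload_b64 + "=" * (-len(payload_b64) % 4))
    )
    if payload["exp"] < time.time():
        raise PermissionError("token expired")
    return payload

token = mint_jwt({"sub": "alice", "email": "alice@example.com"})
print(verify_jwt(token)["sub"])  # -> alice
```

No call to an authorization server appears anywhere in `verify_jwt`: signature plus expiry is all the resource server needs.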
This approach does have one downside: once a JWT access token is issued under this model it is difficult to revoke before the expiry time embedded within it. This can be mitigated, though, by issuing tokens with short expiry times. And, rather than requiring the user to re-authenticate for each new short-lived access token, many tools also issue a refresh token that can be exchanged for updated access tokens as required. A refresh token can be blacklisted/revoked so that the client cannot obtain any future access tokens from it and must re-authenticate (which it can't if you have disabled its access). As an example, if your access tokens are valid for one hour and your refresh tokens for one day, and you want to revoke a user's access, you would blacklist their refresh token and disable subsequent logins; they would then lose access in at most one hour, when their current access token expires.
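The refresh/revocation scheme just described can be sketched as follows. Everything here is illustrative: the token formats, lifetimes, and function names are invented for the example, the in-memory dict and set stand in for the authorization server's refresh-token store and blacklist, and the access "token" is a plain dict standing in for a signed JWT.

```python
ACCESS_TTL = 3600      # access tokens valid for 1 hour (as in the example above)
REFRESH_TTL = 86400    # refresh tokens valid for 1 day

issued_refresh_tokens = {}    # refresh token -> (username, expiry timestamp)
revoked_refresh_tokens = set()

def issue_tokens(username, now):
    """On login: hand out a short-lived access token plus a refresh token."""
    refresh = f"refresh-{username}-{now}"   # a random opaque value in real life
    issued_refresh_tokens[refresh] = (username, now + REFRESH_TTL)
    access = {"sub": username, "exp": now + ACCESS_TTL}  # stands in for a signed JWT
    return access, refresh

def refresh_access(refresh, now):
    """Exchange a still-valid refresh token for a new access token."""
    if refresh in revoked_refresh_tokens:
        raise PermissionError("refresh token revoked, user must re-authenticate")
    username, expiry = issued_refresh_tokens[refresh]
    if now > expiry:
        raise PermissionError("refresh token expired")
    return {"sub": username, "exp": now + ACCESS_TTL}

now = 0
access, refresh = issue_tokens("alice", now)
access = refresh_access(refresh, now + 3000)   # renewing still works
revoked_refresh_tokens.add(refresh)            # revoke: access gone within 1 hour
try:
    refresh_access(refresh, now + 4000)
except PermissionError as e:
    print(e)  # -> refresh token revoked, user must re-authenticate
```

The key point is that revocation only needs a lookup at refresh time, not on every resource request, so the per-request validation load of the old model does not return.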
OpenID Connect (OIDC)
As we said earlier, the OAuth 2.0 standard is not prescriptive enough to ensure that two different tools will necessarily be configured the same way and be compatible. OpenID Connect is an extension, or overlay, on top of that standard that is more prescriptive, ensuring compatibility between providers and implementations. It also has the goal of facilitating a single login that can be used across several unrelated sites, for example logging into Spotify with your Facebook account.
It not only requires the use of JWTs rather than opaque tokens (strictly, it is the ID token that must be a JWT) but outlines three types of token —
- ID Token — contains claims about the identity of the authenticated user such as name, email, and phone_number.
- Access Token — grants access to authorized resources.
- Refresh Token — contains the information necessary to obtain a new ID or access token.
The standard splits access and identity/claims into two different tokens because, in the example above, the info/claims from my Facebook account might not be relevant to Spotify; what matters is that I have authenticated properly there and now have the relevant access token to present to Spotify.
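Putting the three token types side by side, an OpenID Connect token endpoint response has roughly the following shape. The values below are placeholders, not real tokens, and the extra fields (`token_type`, `expires_in`) are shown as they commonly appear in such responses.

```python
import json

# Illustrative shape of an OpenID Connect token endpoint response;
# every value here is a placeholder, not a real token.
token_response = {
    "id_token": "<JWT with identity claims: name, email, phone_number, ...>",
    "access_token": "<token presented to resource servers to grant access>",
    "refresh_token": "<longer-lived token exchanged for new ID/access tokens>",
    "token_type": "Bearer",
    "expires_in": 3600,   # lifetime of the access token, in seconds
}
print(json.dumps(token_response, indent=2))
```

In the Facebook/Spotify example, Facebook (the identity provider) returns all three; Spotify only ever needs to see the access token.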
In future articles I’ll continue the series with examples of a few of the modern managed services for handling authentication such as AWS’ Cognito and Auth0.