Tim McLean

Software engineering, applied crypto, etc. in Waterloo, Ontario, Canada.

What I don't like about JSON Web Tokens

February 25, 2015

JSON Web Token (JWT) is a standard for creating tokens that assert some number of claims. For example, a server could generate a token that has the claim “logged in as admin” and provide that to a client. The client could then use that token to prove that they are logged in as admin. The tokens are signed by the server’s key, so the server is able to verify that the token is legitimate.

JWTs generally have three parts: a header, a payload, and a signature. The header identifies which algorithm is used to generate the signature, and looks something like this:

header = {"alg":"HS256","typ":"JWT"}

HS256 indicates that this token is signed using HMAC-SHA256.

The payload contains the claims that we wish to make:

payload = {"loggedInAs":"admin","iat":1422779638}

As suggested in the JWT spec, we include a timestamp called iat, short for “issued at”.

The signature is calculated by base64url encoding the header and payload, concatenating them with a period as a separator, and computing an HMAC-SHA256 over the result using the server’s key:

key = 'secretkey'
unsignedToken = encodeBase64(header) + '.' + encodeBase64(payload)
signature = HMAC-SHA256(key, unsignedToken)

To put it all together, we base64url encode the signature, and join together the three parts using periods:

token = encodeBase64(header) + '.' + encodeBase64(payload) + '.' + encodeBase64(signature)

# token is now:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJsb2dnZWRJbkFzIjoiYWRtaW4iLCJpYXQiOjE0MjI3Nzk2Mzh9.gzSraSYS8EXBxLN_oWnFSRgCzcmJmMjLiuyu5CSpyHI
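For readers who want to try this, here is a minimal runnable sketch of the same steps using only the Python standard library. The helper name b64url is my own, and the header and payload are written as exact byte strings (no whitespace) on the assumption that this is how they were serialized above, since the encoding depends on the exact bytes.

```python
import base64
import hashlib
import hmac


def b64url(data: bytes) -> str:
    """Base64url-encode and drop the '=' padding, as JWT segments do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Assumed serialization: compact JSON, keys in the order shown in the text.
header = b'{"alg":"HS256","typ":"JWT"}'
payload = b'{"loggedInAs":"admin","iat":1422779638}'
key = b"secretkey"

# The signing input is the two encoded segments joined by a period.
unsigned_token = b64url(header) + "." + b64url(payload)

# HMAC-SHA256 over the signing input, keyed with the server's secret.
signature = hmac.new(key, unsigned_token.encode("ascii"), hashlib.sha256).digest()

token = unsigned_token + "." + b64url(signature)
print(token)
```

The first two segments are just encodings, so anyone can decode them; only the third segment depends on the key.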

Great. So, what’s wrong with that?

Well, let’s try to verify a token.

First, we need to determine what algorithm was used to generate the signature. No problem, there’s an alg field in the header that tells us just that.

But wait, we haven’t validated this token yet, which means that we haven’t validated the header. This puts us in an awkward position: in order to validate the token, we have to allow attackers to select which method we use to verify the signature.

Spec, meet implementation, meet disaster

So, what are the implications? Well, first of all, an attacker could turn your HMAC-SHA512 verification into an HMAC-SHA256 verification. Of course, that’s pretty boring, since HMAC-SHA256 is still plenty secure. Maybe in some cases, you could confuse an implementation into using an HMAC key as an RSA key…?

But let’s take another look at the spec. Oh, look, a signing algorithm called none! And it’s mandatory to implement! A bit of research shows that some JWT libraries will happily accept a token signed with the none algorithm and otherwise ignore the server’s secret key. End result: anyone can create their own “signed” tokens and claim to be logged in as “admin”.
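To make the problem concrete, here is a sketch of what such a verifier might look like and how a token can be forged against it. The naive_verify function is hypothetical and written for illustration; it is not taken from any particular library.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # Restore the '=' padding that JWT segments omit.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def naive_verify(token: str, secret_key: bytes) -> dict:
    """Hypothetical vulnerable verifier: the token's own header picks the algorithm."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header["alg"] == "none":
        # The flaw: nothing is checked here and secret_key is never used.
        return json.loads(b64url_decode(payload_b64))
    if header["alg"] == "HS256":
        signing_input = (header_b64 + "." + payload_b64).encode("ascii")
        expected = hmac.new(secret_key, signing_input, hashlib.sha256).digest()
        if hmac.compare_digest(expected, b64url_decode(signature_b64)):
            return json.loads(b64url_decode(payload_b64))
    raise ValueError("invalid token")


# The attacker does not know the server's key. They write their own header
# and payload and leave the signature segment empty.
forged = (
    b64url(b'{"alg":"none","typ":"JWT"}')
    + "."
    + b64url(b'{"loggedInAs":"admin","iat":1422779638}')
    + "."
)

claims = naive_verify(forged, b"secretkey")
print(claims["loggedInAs"])
```

The forged token is accepted even though the key was never involved in creating it.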

What can we do about it?

Most of the JWT libraries that I’ve looked at have an API like this:

# sometimes called "decode"
verify(string token, string secretKey)
# returns the payload if the token is valid; otherwise reports an error

I suggest adding an algorithm parameter. The server should already know what algorithm it uses to sign tokens. Many libraries use HMAC-SHA256 as a default, which seems reasonable. In any case, JWT libraries should probably not even look at the alg field in the header, except maybe to check that it says what they expect it to say.
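As a rough sketch of that API, assuming HMAC-only tokens: the caller names the algorithm, the signature is checked with that algorithm regardless of what the header says, and the alg field is only compared against the expected value afterwards. The table of supported algorithms and the error messages are my own choices for illustration.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical table of supported algorithms; "none" is deliberately absent.
_HMAC_DIGESTS = {
    "HS256": hashlib.sha256,
    "HS384": hashlib.sha384,
    "HS512": hashlib.sha512,
}


def _b64url_decode(segment: str) -> bytes:
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def verify(token: str, secret_key: bytes, algorithm: str = "HS256") -> dict:
    """Return the payload if the token is valid for the caller's algorithm."""
    digest = _HMAC_DIGESTS[algorithm]  # KeyError for anything unsupported
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = (header_b64 + "." + payload_b64).encode("ascii")
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc

    # Verify with the algorithm the server chose, not the one in the header.
    expected = hmac.new(secret_key, signing_input, digest).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")

    # Only now look at the header, and only to confirm it matches.
    header = json.loads(_b64url_decode(header_b64))
    if not isinstance(header, dict) or header.get("alg") != algorithm:
        raise ValueError("unexpected alg in header")
    return json.loads(_b64url_decode(payload_b64))
```

A caller would write something like verify(token, b"secretkey", "HS256"). A token whose header says none fails the signature check like any other unsigned token, because the header never gets a say in how verification is done.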

Anyone using a JWT implementation should read the code to make sure that any tokens signed with none are flat-out rejected. To be extra safe, make sure that any tokens you receive are signed using the algorithm you expect. Even better: have a policy of performing security audits on any open source libraries that you use to provide mission-critical functionality.

Where did things go wrong?

I’m honestly unsure why the standard specifies an alg field at all. I suspect the original intention may have been to allow the token to specify which key it was signed with, in case a server wanted to support both HMAC and RSA (for example). Alternatively, this may have been intended to provide cryptographic agility. It seems to me, though, that the “key ID” header field is quite adequate for both of these purposes, and less prone to errors in implementation.

The inclusion of a none algorithm is rather baffling. Presumably, it covers some use case, but I would have left it out – simpler protocols are easier to get right.

Introducing Substructed: a new way of editing code

October 03, 2013

I’d like to introduce you to a side project of mine. Substructed (demo) is a programming editor that takes advantage of the structured nature of code to allow advanced programmers to write and edit code more quickly.

Most editors today (such as Vim and Emacs) provide two dimensions for navigating code: down/up (rows) and left/right (columns). Substructed also provides two dimensions for navigation: forward/backward and in/out. Instead of navigating text, you navigate the syntax tree.

Consider, for example, the following JSON:

[
    "a",
    "b",
    "c",
    [
        "d",
        "e",
        "f"
    ],
    "g"
]

Substructed’s cursor looks essentially like a selection:

[
    "a",
    "b",
    "c",
    [
        "d",
        "e",
        "f"
    ],
    "g"
]

If we navigate forward one movement (which corresponds to the “j” key in Substructed’s command mode), the cursor moves to the next element of the array:

[
    "a",
    "b",
    "c",
    [
        "d",
        "e",
        "f"
    ],
    "g"
]

If we navigate inward one movement (which corresponds to the enter key in Substructed’s command mode), the cursor moves inside the inner array to the first element:

[
    "a",
    "b",
    "c",
    [
        "d",
        "e",
        "f"
    ],
    "g"
]
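The two movements can be modelled in a few lines of Python. This is only an illustrative sketch, not Substructed’s actual implementation: the cursor is represented as a list of indices from the root, the function names are mine, the behaviour at the edges (staying put) is an assumption, and so is the starting position on "c".

```python
import json

doc = json.loads('["a", "b", "c", ["d", "e", "f"], "g"]')


def node_at(tree, path):
    """Follow a list of indices from the root down to a node."""
    node = tree
    for index in path:
        node = node[index]
    return node


def forward(tree, path):
    """Move to the next sibling; stay put on the last one (assumed behaviour)."""
    if not path:
        return path  # the root has no siblings
    parent = node_at(tree, path[:-1])
    if path[-1] + 1 < len(parent):
        return path[:-1] + [path[-1] + 1]
    return path


def backward(tree, path):
    """Move to the previous sibling; stay put on the first one."""
    if path and path[-1] > 0:
        return path[:-1] + [path[-1] - 1]
    return path


def inward(tree, path):
    """Move to the first child of a non-empty array; otherwise stay put."""
    node = node_at(tree, path)
    if isinstance(node, list) and node:
        return path + [0]
    return path


def outward(tree, path):
    """Move to the enclosing array; the root stays where it is."""
    return path[:-1]


cursor = [2]                    # assumed starting position: the element "c"
cursor = forward(doc, cursor)   # now on the inner array
cursor = inward(doc, cursor)    # now on the inner array's first element
print(node_at(doc, cursor))
```

Because the cursor is always a whole node, a movement can never leave it in the middle of a string or between two tokens.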

Today, I’m open sourcing a prototype of Substructed (demo) that can edit JSON. Very soon, I would like to begin implementing support for a “real” language (I’m currently considering Python). I’m making this prototype available now to collect feedback before moving forward.