Having adapted an example OpenID Connect server (built in Rails) into a production system, I needed to add PKCE support so that mobile clients could log in securely. OAuth2 (and therefore OpenID Connect) isn't considered secure on mobile devices by itself, because a rogue application on the device can hijack the authorization code.
Here I'll try to describe the thought process that goes not into the implementation, but into interpreting the spec as a set of tests that allow us to be reasonably confident that mobile clients will be able to connect. (Pleasingly, the first time the mobile integrators tried to connect with PKCE it worked perfectly, on both the error and the success paths.)
My implementation is in Ruby, of course, and my tests are in RSpec with expectations, so some of the language might not be generic, but the concepts should map to other languages and frameworks. I've marked the actual tests I needed with the word TEST: in front of them.
We'll be working from the spec here: https://tools.ietf.org/html/rfc7636
First of all, take a quick read through of it, so that we understand the basic reason for the extension and roughly how it works. Then we'll go through it again, pointing out where the spec requires and suggests tests. If you're reading this you can have the spec open in another tab or window, or on a completely new screen if you're fancy, or on some paper if you're reading this in 1997.
We can skip over the introduction, which is essentially a what-not-to-do: it describes the attack procedure, but if we've implemented correctly that attack won't be possible. So the first section of the spec that we should pay close attention to is:
1.1 Protocol Flow
There's a diagram here showing the correct flow. We can see that there are two request/response actions listed on the diagram, A+B and C+D. Since we're writing tests for the server, we need to make the requests A and C, and apply our expectations to the responses B and D.
- The A->B test is made against the authorization endpoint.
- The A->B request introduces two pieces of information: t(code_verifier) and t_m.
- The C->D test is made against the token endpoint.
- The C->D request introduces another piece of information: code_verifier.
So we're looking at two broad categories of controller test here, to start with.
2. Notational conventions
Again, this is largely ignorable, especially if you're familiar with IETF terminology. If not, pay attention to the keywords in ALL CAPS - the MUSTs, SHOULDs, MUST NOTs, and so on - because they're probably going to form the mass of things we'll have to test.

3. Terminology

Here the spec defines the terms code verifier, code challenge, and code challenge method. These relate to the terms above, and are the ones actually used in the protocol. Obviously if you (as I did) are implementing first and testing later, you already know these terms. If you're being good and doing TDD, they're new to you.

4. Protocol

Here's the meat of the spec - both for implementation and for tests. From here on we want to pay close attention, especially to the MUSTs and MUST NOTs.

4.1 Client Creates a Code Verifier
When we're implementing the server, we don't have to worry about how the client does things. But in the test we're pretending to be a client, so we have to know something about how to make the artifacts the client does. We'll need the verifier in test group C->D, but also A->B (where we'll use it in transformed form).
- It's a 43-128 character random string made up of the characters A-Z / a-z / 0-9 / - / . / _ / ~
- The recommendation, however, is that we generate a 32-octet random sequence (which becomes 43 characters after being base64url-encoded). That's a pretty simple thing to do. (In Ruby I did it with SecureRandom.)
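In Ruby that looks something like this (a minimal sketch; SecureRandom is in the standard library, and its urlsafe_base64 omits padding by default):

```ruby
require "securerandom"

# 32 random octets, base64url-encoded without padding, gives a
# 43-character verifier drawn only from the characters the spec allows.
code_verifier = SecureRandom.urlsafe_base64(32)
code_verifier.length # => 43
```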
4.2 Client Creates the Code Challenge
Here, again, we're looking at client things. But the test and the implementation overlap here to some extent, because the code challenge is a transform of the code verifier, and the server has to be able to perform the same transform so that it can verify. There are two transforms mentioned in the spec:
- 'plain' - the code challenge and code verifier are identical. Implementation is left as an exercise for the reader.
- 'S256' - the challenge is made by taking a SHA256 digest of the verifier, then base64url-encoding it. This is the first MUST we have to pay attention to: the server MUST implement the S256 method, so we're going to have to test it.
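As a sketch of the S256 transform in Ruby (the method name is my own; note that the spec requires base64url encoding without padding, while Ruby's Base64.urlsafe_encode64 pads by default, hence the keyword argument):

```ruby
require "digest"
require "base64"

# S256: base64url-encode (no padding) the SHA256 digest of the
# verifier's ASCII octets.
def s256_challenge(code_verifier)
  Base64.urlsafe_encode64(Digest::SHA256.digest(code_verifier), padding: false)
end
```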
4.3 Client Sends the Code Challenge with the Authorization Request
We've reached the first call, A in the request/response pairs above, and hence the logical beginning of our controller tests. Our test has to call the authorization endpoint with a code_challenge parameter (REQUIRED) and a code_challenge_method parameter (OPTIONAL). To be exhaustive we'd need a call without a code_challenge, then calls with a code_challenge and various code_challenge_method settings: none, 'plain', 'S256', and one invalid value.
4.4 Server returns the code
Here's the first response (B), so here are the expectations we have to think about for our test. There's a MUST mandating that the returned authorization code be associated with the code_challenge and code_challenge_method. In a successful request there's no indication of the stored code_challenge in the returned authorization code (either because it's encrypted and unextractable, or because the server has stored it separately), so this can't really be black-box tested - although if one were being strict, one might test that the returned authorization code doesn't include the code_challenge in the clear. It's better treated as a white-box test: we can check that the parameter passed in via the request actually got into persistent storage attached to the authorization (if we're storing), or that the authorization code in the response does include the parameter (in some suitably secure way).

TEST: call GET /authorizations/new with a code_challenge parameter, and expect that the persistent store of authorizations contains that code_challenge attached to the authorization, and a code_challenge_method of 'plain' attached to the authorization.
TEST: call GET /authorizations/new with a code_challenge parameter and a code_challenge_method of 'plain', and expect that the persistent store of authorizations contains that code_challenge attached to the authorization, and a code_challenge_method of 'plain' attached to the authorization.
TEST: call GET /authorizations/new with a code_challenge parameter and a code_challenge_method of 'S256', and expect that the persistent store of authorizations contains that code_challenge attached to the authorization, and a code_challenge_method of 'S256' attached to the authorization.
We can put expectations into these tests that they also return a 200, and a valid authorization code, but we probably already have those tests (and indeed in my case I did), because they're the test for the basic non-PKCE OAuth2 operation.
Note that this section doesn't say anything about failures, so we don't test for failure here.
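The shape those three tests share can be sketched in plain Ruby, with a hypothetical create_authorization standing in for the controller action and a hash standing in for the persistent store (all names here are mine, not from any real framework):

```ruby
# In-memory stand-ins, purely to illustrate what the tests above assert.
AUTHORIZATIONS = {}

def create_authorization(code_challenge:, code_challenge_method: nil)
  code = "code-#{AUTHORIZATIONS.size + 1}"
  AUTHORIZATIONS[code] = {
    code_challenge: code_challenge,
    # An absent code_challenge_method defaults to 'plain'.
    code_challenge_method: code_challenge_method || "plain"
  }
  code
end

code = create_authorization(code_challenge: "abc123")
AUTHORIZATIONS[code][:code_challenge_method] # => "plain"
```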
4.4.1 Error Response
Okay, here are the failures, which we can test as black boxes and perhaps as white boxes. We can see in the spec that the endpoint should return an error response with the parameter "error" set to "invalid_request" for both of the errors mentioned, and each SHOULD carry an appropriate error_description. There's one test that's required for any implementation:
TEST: call GET /authorizations/new with a code_challenge parameter and a code_challenge_method of 'broken', and expect that the response contains an "error" parameter set to "invalid_request".
...and one that's only necessary in the case that your server requires PKCE. Mine didn't, but you might have a server that requires PKCE all the time or for certain clients (ones known to be mobile). In that case, you'll want another test:
TEST: call GET /authorizations/new with no code_challenge parameter and expect that the response contains an "error" parameter set to "invalid_request".
Note that the SHOULD here is one that's difficult to test. It's not a simple thing to test that the error_description contains an appropriate message, but it's a nice implementation reminder. Check that your implementation returns different messages for these two situations!
4.5. Client Sends the Authorization Code and the Code Verifier to the Token Endpoint
Here's the second call, C in the request/response pairs. The client is REQUIRED to send a code_verifier. We're interpreting that as REQUIRED only if PKCE was invoked in request A (otherwise the extension isn't backwards compatible with non-PKCE clients), so we should have tests in which PKCE wasn't invoked and this call is made without the code_verifier parameter - though that's probably already covered by our existing tests. Note that this requirement can be met even without being explicitly programmed: if a PKCE code_challenge was stored with the authorization code, it will not match a nil code_verifier. So even if our implementation doesn't specifically check that a code_verifier was supplied, the challenge comparison should still fail.
This call assumes that a previous authorization call has been made, so the test setup has to generate the code_verifier, then create an authorization with the relevantly transformed code_challenge attached to it.
4.6. Server Verifies code_verifier before Returning the Tokens
...and here are the expectations for the call. The server can return two things here - either the token as requested, or an "invalid_grant" error.
TEST: call POST /access_tokens with a valid authorization code that has a code challenge attached to it, and with a "code_verifier" parameter that matches the challenge. Expect that the response includes a valid token.
TEST: call POST /access_tokens with a valid authorization code that has a code challenge attached to it, but with a "code_verifier" parameter that doesn't match the challenge. Expect that the response contains an "error" parameter set to "invalid_grant".
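The check behind these two tests can be sketched as follows (a minimal illustration with names of my own choosing; it assumes the challenge and method were stored alongside the authorization code). A nil code_verifier naturally fails to match, which is the behaviour noted under section 4.5:

```ruby
require "digest"
require "base64"

# Section 4.6 check: recompute the challenge from the presented
# code_verifier and compare it with the one stored at authorization time.
def code_verifier_valid?(code_verifier, stored_challenge, method)
  return false if code_verifier.nil?
  case method
  when "plain"
    code_verifier == stored_challenge
  when "S256"
    Base64.urlsafe_encode64(Digest::SHA256.digest(code_verifier), padding: false) == stored_challenge
  else
    false
  end
end
```

A real implementation might prefer a constant-time comparison (for example Rack::Utils.secure_compare) to a plain ==.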
5. Compatibility
Nothing much here except the assertion that we MAY allow clients to connect without PKCE. Since in my case I already had clients doing that, it's nice to know that it's allowed, and it confirms that my existing tests from before the server supported PKCE should still pass without modification.

6. IANA Considerations
Can be ignored as far as testing is concerned.
7. Security Considerations
Most of these refer exclusively to the behaviour and internal security of clients, so they can be completely ignored. One possible exception is 7.2, Protection against eavesdroppers: the essence of this section is that clients shouldn't use, or allow downgrade to, the "plain" method, and that servers MUST implement S256. So again, nothing extra to test, just a confirmation that some of our tests have to include S256 (which they do).

Appendix A
...is all about implementation. We may have read it while implementing (I know I did).
Appendix B
Is very useful, though! It gives us an example code_verifier / code_challenge pair for the S256 method. Having a known good pair like this is a godsend for unit tests of the S256 verification method, and these example values can also be used in the controller tests above.

UNIT TEST: we call the method we use to do S256 transformations with a code verifier parameter of "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk", and expect that the return value is "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM".