
OAuth & Browserless Devices

One of the last few legitimate uses of the Resource Owner Password Credentials (ROPC) grant type is for browserless devices, such as smart TVs and other Internet of Things (IoT) devices. I’ve talked before about why ROPC should not be used in any new application and that it was only designed to quickly tokenize legacy applications (and that’s legacy back in 2012), but let’s take a quick look at why it should not be used for IoT devices either:

  • Impersonation: when you use the ROPC grant type, you are not authenticating users; instead you are impersonating them. Whatever the user can do, in theory, the client app can also do
  • No Single Sign-On (SSO)
  • No Multi-factor authentication (MFA, except for repeatable credentials)
  • No federation with external identity providers
  • Requires custom integration with password managers
  • Dubious security benefits on a public client (client authentication becomes difficult, therefore exposing an API endpoint that attackers can use for phishing and brute forcing)
  • ROPC significantly increases the attack surface of your IoT application, with user credentials passing through the IoT device; devices that are not exactly known for their focus on security.

OAuth & Input Constrained Devices

Whether you have access to a browser or not, if it’s hard to type in your username & password, you are going to upset your users and not exactly encourage users to choose a strong password.

I recently had to do this in front of a group of impatient friends waiting to watch a film on Netflix. I had to send everyone out of the room so that I could slowly type my password in. Worse yet, in another instance, I’ve had to enter a password that was generated by my password manager. That took a while.

So, whether you are using the ROPC grant type or a proper OAuth/OpenID Connect flow, it is clear that these devices have a problem.

The OAuth Device Flow for Browserless and Input Constrained Devices

To address the issue of such devices, the OAuth working group is in the process of finalizing a new specification called “OAuth Device Flow for Browserless and Input Constrained Devices.”

It’s been discussed for a while, and there are already a couple of existing versions out there, namely Google’s implementation (which I’m assuming has something to do with one of the specification’s authors working there). So, let’s take a look at how it works.

At the time of writing, this specification is in its final draft. You can track its publication status here.

Requirements

To securely authenticate the user, the OAuth device flow makes use of a secondary device. Let’s face it, no matter how clever we try to be, if there’s no browser or if the input device isn’t up to the task, then we need to outsource the work.

Luckily, most people have an internet connected device with a browser in their pocket or, heaven forbid, attached to their belt. Depending on the operating system the UX quality is debatable, but it’s certainly better than a TV remote or cycling through the alphabet one character at a time.

This secondary device will never interact directly with the client device that is requesting authorization.

The Flow

The first step in the process is for the client device to ask our authorization server for access. In return, our authorization server responds with: a device code, a user code, and a verification URI.

The device then presents the user code and verification URI to the user, asking them to visit the URI and enter the code.

When the user visits this site, the authorization server needs to authenticate the user (if they haven’t already done so, hooray for SSO). Once they have verified their identity, they enter the user code and give their consent to the client device. It is the user’s responsibility to confirm that the client application shown as requesting authorization matches the device they are actually trying to authorize.

While this is happening, the device is polling the authorization server’s token endpoint, with its device code, asking: “Have they authorized me yet? Have they authorized me yet?”. This goes on until the authorization server says yes or gets annoyed enough to turn the car around. Upon authorization, the authorization server returns the tokens in response to the polling.

Figure: The device flow, showing the client device, authorization server, secondary device, and their interactions.

The Protocol

Device Authorization

To start the flow, the client application makes a POST request to the new device authorization endpoint, identifying itself with its client_id. Scopes can optionally be requested using the scope parameter.
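As a rough illustration of that request, here is a minimal Python sketch that builds (but does not send) the form-encoded POST. The endpoint URL, the client_id value, and the function name are assumptions made up for this example; your authorization server will publish its own endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
DEVICE_AUTHORIZATION_ENDPOINT = "https://example.com/device_authorization"


def build_device_authorization_request(client_id, scopes=None):
    """Build (without sending) the POST that starts the device flow."""
    params = {"client_id": client_id}
    if scopes:
        # Scopes are optional and travel as one space-delimited string.
        params["scope"] = " ".join(scopes)
    return Request(
        DEVICE_AUTHORIZATION_ENDPOINT,
        data=urlencode(params).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# "tv-app" is a made-up client_id.
request = build_device_authorization_request("tv-app", ["openid", "profile"])
```

Passing the resulting object to urllib.request.urlopen would perform the call; it is left out here so the sketch runs without a real server.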

If the client exists, and the scopes it requests are valid, then the authorization server will respond with a JSON document containing a device_code, a user_code, and a verification_uri. It may also contain verification_uri_complete, expires_in, and interval, which are optional.

The first three parameters are the most important here. First, we have the device_code, which is going to be used by our client application.

Then we have the user_code and the verification_uri, which are going to be passed on to our user.
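To make the shape of that response concrete, the sketch below parses a sample JSON body into a small Python dataclass. The field names come from the parameters discussed above; the class name, the function name, and the sample values (in particular the device_code) are invented for illustration.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DeviceAuthorization:
    # The three parameters the flow cannot work without.
    device_code: str
    user_code: str
    verification_uri: str
    # Treated as optional here, as described above.
    verification_uri_complete: Optional[str] = None
    expires_in: Optional[int] = None
    interval: Optional[int] = None


def parse_device_authorization(body: str) -> DeviceAuthorization:
    """Parse the JSON body, ignoring any parameters we do not know about."""
    data = json.loads(body)
    known = {f.name for f in fields(DeviceAuthorization)}
    return DeviceAuthorization(**{k: v for k, v in data.items() if k in known})


# Sample body with made-up values.
sample = json.dumps({
    "device_code": "hypothetical-device-code",
    "user_code": "WDJB-MJHT",
    "verification_uri": "https://example.com/device",
    "interval": 5,
})
authorization = parse_device_authorization(sample)
```

The device keeps device_code to itself and only ever shows the user_code and verification_uri to the user.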

User Interaction

Once our device has been authorized for the device flow, we now need to get the user and their secondary device involved. We do this using the verification_uri and user_code.

The simplest method is to ask the user to visit the verification_uri, authenticate, enter their user_code, and consent to the delegation request. As a result, the complexity of the verification_uri needs to be kept to a minimum, as the user will have to type this into their browser.

Using a browser on another device, visit:   

https://example.com/device                            

And enter the code: 

WDJB-MJHT    
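A prompt like the one above is easy to generate from the device authorization response. This is only a sketch; the function name is hypothetical and a real device would render this with its own UI toolkit rather than plain text.

```python
def render_prompt(verification_uri: str, user_code: str) -> str:
    """Return the on-screen instructions for the user's secondary device."""
    return (
        "Using a browser on another device, visit:\n"
        f"{verification_uri}\n"
        "\n"
        "And enter the code:\n"
        f"{user_code}"
    )


prompt = render_prompt("https://example.com/device", "WDJB-MJHT")
```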

Typing URLs is silly. If a verification_uri_complete is included in the device authorization response, we can try to improve the user experience here. The verification_uri_complete parameter is the verification_uri and user_code combined in a way that the authorization server will be able to read both, for example, with the user_code in a query string on the verification_uri. We can now use this full URI to remove any copying and typing between devices.
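How the two values are combined is up to the authorization server. As one possible approach, the sketch below appends the user_code as a query string parameter using Python's standard library; the parameter name user_code and the function name are assumptions, not something every server will use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_verification_uri_complete(verification_uri, user_code, param="user_code"):
    """Append the user code to the verification URI as a query parameter."""
    parts = urlsplit(verification_uri)
    # Keep any query parameters the verification URI already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, user_code))
    return urlunsplit(parts._replace(query=urlencode(query)))


complete = build_verification_uri_complete("https://example.com/device", "WDJB-MJHT")
# complete == "https://example.com/device?user_code=WDJB-MJHT"
```

Note that the client device does not build this value itself; it simply displays or encodes whatever the authorization server returned.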

An easy win would be to use QR codes. So instead of typing in a URL and a code, the user just has to scan the QR code. It’s recommended that you still display the manual URL and code, since the user may not have access to a QR code reader, or they may want to confirm the code when authorizing the client.

Other alternatives included in the spec are Bluetooth Low Energy, Near Field Communication, and text-to-speech audio. The general rule is that the communication method should be one-way, and only accessible by people in close proximity to the client device. The neat thing about this is that we aren’t restricted to visual only methods of delivery.

Device Token Request

While the user interaction is taking place, the client device is periodically polling the authorization server’s token endpoint.

Each polling request uses the new grant type, urn:ietf:params:oauth:grant-type:device_code, and includes the device_code that was sent to the client device during the initial device authorization request.

If this is a confidential client, this request could include client credentials; however, your client device is most probably going to be considered a public client.
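For a public client, the body of each polling request might be built as in the sketch below. It uses the grant type identifier as registered for this flow, urn:ietf:params:oauth:grant-type:device_code, and sends the client_id in the body because there are no client credentials; the function name and the sample values are made up for illustration.

```python
from urllib.parse import urlencode

DEVICE_CODE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def build_token_request_body(client_id, device_code):
    """Form-encode the parameters for one poll of the token endpoint."""
    return urlencode({
        "grant_type": DEVICE_CODE_GRANT,
        "device_code": device_code,
        # A public client identifies itself but cannot authenticate.
        "client_id": client_id,
    })


# "abc123" and "tv-app" are placeholder values.
body = build_token_request_body("tv-app", "abc123")
```

The body is sent as application/x-www-form-urlencoded in a POST to the token endpoint, just like any other OAuth token request.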

Our token response is going to be either the typical successful token response, once the user has approved the request, or a token error response telling the device to keep waiting or to give up.
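The polling logic can be sketched as a small loop. This is a simplified illustration, not a complete client: send is a caller-supplied function (hypothetical) that performs one token request and returns the decoded JSON as a dict, and the error codes handled here (authorization_pending, slow_down, and anything else treated as fatal, such as access_denied or expired_token) are the ones defined by the specification as I understand it. The defaults of 5 and 300 seconds are assumptions for when the server supplies no interval or expires_in.

```python
import time


def poll_for_tokens(send, interval=5, expires_in=300, sleep=time.sleep):
    """Poll until tokens arrive, the request fails, or the codes expire."""
    waited = 0
    while waited < expires_in:
        response = send()
        error = response.get("error")
        if error is None:
            # No error member: this is the successful token response.
            return response
        if error == "slow_down":
            # The server asked us to back off: add 5 seconds to the interval.
            interval += 5
        elif error != "authorization_pending":
            # For example access_denied or expired_token: stop polling.
            raise RuntimeError(f"device authorization failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before the user authorized")
```

Injecting send and sleep keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.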

What about OpenID Connect?

The device flow can easily be adapted for use with OpenID Connect: simply ask for an identity scope, and return an identity token in the token response! It’s really that simple, and this method is already in use by Google.
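As a quick illustration of what comes back, the sketch below reads the claims out of an identity token, assuming it is a JWT in the usual three-part compact form. It deliberately skips signature and claim validation, so treat it as a debugging aid only; a real client must validate the token with a proper JWT library. The function name is hypothetical.

```python
import base64
import json


def decode_unverified_claims(id_token: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    payload = id_token.split(".")[1]
    # Restore the base64url padding that JWTs strip off.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```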

Security Considerations

Most security considerations revolve around implementation specifics, for example the necessary entropy of the user code given expected secondary device and time to live. However, there is one that I alluded to earlier in the article that I want to discuss in further detail.
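To give a feel for the entropy question, here is a small sketch that generates a user code and computes how many bits of entropy it carries. The 20-letter consonant alphabet and the 8-character length are assumptions chosen to match the style of the WDJB-MJHT example used earlier; your own choice should depend on how codes are entered, how long they live, and how aggressively guesses are rate limited.

```python
import math
import secrets

# Assumed alphabet: upper-case consonants only, easy to read and type.
USER_CODE_ALPHABET = "BCDFGHJKLMNPQRSTVWXZ"


def generate_user_code(length=8, group=4):
    """Generate a random user code, hyphenated into groups for readability."""
    code = "".join(secrets.choice(USER_CODE_ALPHABET) for _ in range(length))
    return "-".join(code[i:i + group] for i in range(0, length, group))


def entropy_bits(alphabet_size, length):
    """Bits of entropy in a uniformly random code."""
    return length * math.log2(alphabet_size)


bits = entropy_bits(len(USER_CODE_ALPHABET), 8)  # roughly 34.6 bits
```

Roughly 34.6 bits is far too little for a long-lived secret, which is why a short time to live and limits on guessing attempts matter so much for user codes.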

Devices that use the OAuth device flow are typically going to be public clients. In other words, they cannot keep a secret. This could be due to the source code being on an end-user device (a mobile phone, a browser, a fridge) and there being no back-end server present (for secure back channel client authentication).

Because the devices are public clients and cannot securely authenticate themselves, we are going to be more vulnerable to phishing attacks, with other unauthorized applications making device authorization requests to our authorization server. Here we must rely on the user being able to differentiate between our authorized applications and imposters.

This is completely acceptable according to OAuth; after all, the user is a key player in the authorization decision. There’s only so far that we can go with securing the federation process when dealing with public clients. Instead, we must focus on minimising the attack surface and arming the user with enough information to make an informed decision before consenting to their data being shared.

See section 5 of the spec for more details.