Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Token exchanges provide no ip forwarding headers, triggering issues at Auth0 under scale #668

Closed
ShawnFumo opened this issue May 18, 2022 · 12 comments
Labels
enhancement New feature or request

Comments

@ShawnFumo
Copy link

ShawnFumo commented May 18, 2022

Our next.js application is using a serverless lambda to proxy API calls, to add the access token to the request (so the browser never has access to the unencrypted access token). This has been working, but as volume increased, we started to get reports of people being logged out before the inactivity timeout. We looked in the logs and saw intermittent "Failed Exchange of Refresh Token for Access Token". After talking to Auth0 support, they noted that there was a lot of different user requests coming from the same IPs, and that may be the cause.

You can see from the attached image that the user's real IP gets to Auth0 for account creation and login. Then that person's exchange from auth code to access token happens from an Amazon IP. Then a different user's refresh token to access token exchange happens, coming from the same IP. Then that same user has another exchange soon after, but from a different IP. I believe lambdas don't have static IP addresses, so that makes sense.

nextjs ips

Support suggested adding the auth0-forwarded-for header and I'm currently trying to see how to hook into this library to inject that header. But it seems like this would be an issue for anyone using the library at a decent scale and that the header should be added by default? This is the code we're using, so you can see it is the default functionality of the library:

import { withApiAuthRequired, getAccessToken } from '@auth0/nextjs-auth0';

export default withApiAuthRequired(async function shows(req, res) {
    let accessToken = ""
    try {
      const {accessToken: at} = await getAccessToken(req, res);
      accessToken = at
    } catch (error) {
      res.status(401).send({error: error})
      return
    }

  // Omitting the code below which adds the token to the graphql request
}

Here's an example of a failing request with most of it redacted, but showing our library and node versions:

{
  "content": {
    "message": "Failed Exchange of Refresh Token for Access Token",
    "attributes": {
      "data": {
        "auth0_client": {
          "version": "1.7.0",
          "name": "nextjs-auth0",
          "env": {
            "node": "v14.19.1"
          }
        },
      },
      "evt": {
        "name": "failed_exchange_refresh_token_for_access_token"
      },
      "http": {
        "useragent": "nextjs-auth0/1.7.0"
      },
      "service": "auth0"
    }
   }
}

We are not using refresh token rotation. The access token expiration was set at 2 minutes and refresh token inactivity timeout set to 12 minutes. We recently tried increasing both values but are still seeing the errors.

It is possible that support is mistaken about the cause, but it seems very plausible. I thought it'd be worth bringing up there.

Thanks!

@adamjmcgrath
Copy link
Contributor

Hi @ShawnFumo

The auth0-forwarded-for header is to prevent triggering the attack protection for the Resource Owner Password Flow, I've not heard of it being required for the Refresh Grant.

I think it's more likely that you're getting intermittent failed exchanges because you have a short inactivity timeout, the default is 30 days and you have 12 minutes. I'm not sure how users use your app, but if they don't make an API request for 12 minutes, they'll get that error.

Let me double check the advice you were given about the auth0-forwarded-for and the Refresh Grant though. Can you tell me where you got that information from and share your tenant domain and clientId?

@adamjmcgrath adamjmcgrath added the question Further information is requested label May 20, 2022
@ShawnFumo
Copy link
Author

@adamjmcgrath Thanks. Before you delve too much more, while I don't know if the header is still a good idea or not, we just realized the bigger issue that is probably driving the errors (it was easier to see exactly what was going on now that we just got our Auth0 logs streaming into DataDog). I don't know if it is specific to nextjs-auth0 or not, but I saw someone else had a very similar issue).

Basically, we get the session cookie during the 302 redirection from auth/callback as normal, but we are never getting Set-Cookie coming back on subsequent ajax calls through Apollo Client. So the session cookie never gets updated. After the initial access token expires, the lambda will see it is expired and request a new token and attach it to the graphql request, and that succeeds, but then the browser is never updated. So basically after a couple of minutes, ALL requests for the user will do a token exchange. And since pages may have multiple requests and someone can be clicking around quickly, there could be 10-15 exchanges using the same refresh token within seconds. Which I'm sure must be trigger something!

It's just strange in that during local dev, the cookie comes back, and on prod I can see the XHR requests are 'same-site' and the cookie does go outward to the lambda successfully, so it shouldn't be a fetch credentials setting issue. I also see that the CloudFront behavior for /api/* has Cookies All w/ legacy caching (seems to be the default) and the lambda is set @edge on Origin Request. My next step was going to try to log the headers in the response inside the lambda to make sure it is going out, but if the issue is code in nextjs-auth0 or something wrong in how I'm calling it, why does it work locally? Very perplexing...

The advice for the header was in this ticket, if you're able to see that.

Thanks!

@adamjmcgrath
Copy link
Contributor

Hi @ShawnFumo - if your lambda is doing any sort of public caching then it's likely that it will be stripping the set-cookie header from the response. This is desirable, because you don't want users hitting the cache and getting a different user's set-cookie header.

You should make sure you don't cache any authenticated response in a public place (the payload nor the headers), like lambda edge caching.

@ShawnFumo
Copy link
Author

ShawnFumo commented May 20, 2022

@adamjmcgrath I don't believe that's the issue. The lambda going through CloudFront is the same one that handles handleAuth() (and under the same behavior path), so if no cookies could come back, the /api/auth/callback wouldn't be able to set it either. The api requests themselves have had Cache-Control: no-cache all along and I even tried setting Cache-Control: no-cache="Set-Cookie" in the response as detailed here, but that didn't change anything.

I found I was able to set cookies myself from the lambda and those get through ok and save into the browser's cookies. After digging around in the code for nextjs-auth0 and auth0-session, I may finally know what is going on. My assumption was that withApiAuthRequired or getAccessToken, since they're going out and getting a new access token, that a new updated session cookie would automatically be returned in the response.

I "think" from what I'm seeing, the fact that the cache was already built from the old cookie in the request means that won't put a new cookie in the response. I can see in callbackHandlerFactory that it specifically does sessionCache.create(req, res, session); before doing it's redirect, which itself will call this.cookieStore.save.

So I think I need to detect if a token exchange happens and then manually create the cache or save the cookieStore? The problem is I can't see how to get access to those existing constructed objects from inside of withApiAuthRequired. Maybe if I get the config and build a new CookieStore from that, get the session from getSession, and save with that?

It feels like I'm a bit in the weeds, but trying! Does my assumption sound right that it won't try to Set-Cookie in the response even if a token exchange happens due to expired access token in request? I guess it'd be a different issue/feature request, but would be nice if there was an option for that to happen automatically for the user.

And if that is all the case, then I'm still not sure how locally it seems like new Set-Cookie responses do happen. Unless there's something else about it being local that makes it seem as if it is doing that when it really isn't?

@ShawnFumo
Copy link
Author

ShawnFumo commented May 20, 2022

I got it working! It isn't ideal: I'm importing stuff from the dist folder directly (I'm assuming getConfig and CookieStore aren't part of the public interface?), and having to reconstruct a new CookieStore when it already exists (because I can't access that one), but it does work. When the access token is still valid, no cookie is returned, and when an exchange happens, a new appSession cookie is put into the response. What's interesting is on my local machine, it shows two copies of the session cookie coming back. Is there something special that injects a cookie just when running local?

import { withApiAuthRequired, getAccessToken, Session } from '@auth0/nextjs-auth0';
import { CookieStore } from '@auth0/nextjs-auth0/dist/auth0-session';
import { getConfig } from '@auth0/nextjs-auth0/dist/config';
import { AfterRefresh } from '@auth0/nextjs-auth0/dist/session/get-access-token';
import { NextApiRequest, NextApiResponse } from 'next';

const afterRefresh: AfterRefresh = (req: NextApiRequest, res: NextApiResponse, session: Session) => {
  // Output a new session cookie when refreshing the access token
  const store = new CookieStore(getConfig().baseConfig)
  store.save(req, res, session);

  return session;
}

export default withApiAuthRequired(async function shows(req, res) {
    try {
      const { accessToken } = await getAccessToken(req, res, {afterRefresh: afterRefresh})

      // use token in graphql request
    }
    catch { /* ... */ }
}

So, I'm not sure what to do now. Is this a bug I should submit? Or is it unexpected on your side that a user of the library would want to update the cookie when the access token expires?

Edit: This quote below seems to imply that you're expecting all requests that look at the session to have an outgoing Set-Cookie, so perhaps it is a bug with how nextjs-auth0 works on lambda@edge or the version of node? Have you seen the library working on AWS, or is it generally only tested with Vercel? I don't think we have anything misconfigured on our side for the library since we're only setting the basic values like AUTH0_SECRET, base_url, issuer_base, client_id, client_secret, scope, and audience. We didn't change anything to do with the session settings.

This SDK offers a rolling session by default, which means that any response that reads the session will have a Set-Cookie header to update the cookie's expiry. Vercel and potentially other hosting providers include the Set-Cookie header in the cached response, so even if you think the response's content can be cached publicly, the responses Set-Cookie header cannot.

Edit2: Just a thought.. is it possible that the on-headers library has an issue? It looks like that library is supposed to add stuff when the header is about to be written, so if that never gets run, it wouldn't output the Set-Cookie right?

@ShawnFumo
Copy link
Author

ShawnFumo commented May 21, 2022

It seems I was actually right about the on-headers thing. On AWS, the callback never gets called:

console.log('***before using onHeaders')
onHeaders(res, () => console.log("***hey, writing the headers"))

res.status(response.status).send(response.body);

Locally it has both logs, but on AWS, only the first goes out. I think it has to do with us using send with a stream (it was easiest to proxy the response body by turning off next's bodyParser for that api endpoint, leaving response.body as a stream. The other person I saw online with a similar issue also turned off bodyParser.

It turns out that the on-header library doesn't link into some event in node. It actually replaces the writeHead function on the response object with a wrapper call. I'm guessing at some layer there is an optimization which doesn't end up calling writeHead. I tried to figure it out but it's hard to follow the exact seams between nextjs, express, and node itself.

For now, the easiest thing to do so the cookie comes out on AWS and doesn't have an error locally was this:

      if (process.env.NODE_ENV === 'development') {
        res.status(response.status)
      } else {
        res.writeHead(response.status)
      }

      res.send(response.body);

It's still not ideal since I read you aren't supposed to really call writeHead directly from an Express app, let alone nextjs (which is kind of what caused the issue to start with, relying on implementation details). Maybe it'd be better instead of using the on-header library to add the callback directly to the response object (like _onHeader or _addSession or something). Then in the definition of withApiAuthRequired, you could replace the response's send method with a proxy that calls the handler if it has been defined, before it calls the original send?

If that sounds reasonable, I could try to do a PR to implement it.

@adamjmcgrath
Copy link
Contributor

@ShawnFumo

We've just worked around a similar issue for withPageAuthRequired (see the new https://auth0.github.io/nextjs-auth0/modules/helpers_get_server_side_props_wrapper.html)

If you have a workaround and no one else is experiencing this issue, then I'm going to suggest we leave it.

If you want to open a PR, happy to look at it, but it should probably follow a similar solution to #664

@ShawnFumo
Copy link
Author

@adamjmcgrath, thanks for the info. I did see at least one other person experiencing the issue online. There might be more that don't even realize it is happening and are hitting Auth0 with more token exchanges than they should (or don't know why session rotation isn't working, etc). While my workaround does work right now, it feels pretty brittle. I still prefer it to what I had before of reaching into the non-public API of the library, but changing how auth works under the covers in future versions could easily break it.

I'll try to do a PR pretty soon, modifying WithPageAuthRequired to be similar to GetServerSidePropsWrapper.

@adamjmcgrath
Copy link
Contributor

Sure, thanks for the update @ShawnFumo

I'll try to do a PR pretty soon, modifying WithPageAuthRequired to be similar to GetServerSidePropsWrapper.

WithPageAuthRequired already uses GetServerSidePropsWrapper. You'll want to make WithApiAuthRequired work like GetServerSidePropsWrapper. e.g.

withApiAuthRequiredFactory(sessionCache):
  return withApiAuthRequired(myRoute):
    return apiRoute(req, res):
      sessionCache.init(req, res, autoSave=false)
      ret = myRoute(req, res)
      sessionCache.save(req, res)
      return ret

See #664 for reference implementation

if you don't get round to it in the next week or so, I'll probably have some time to take a look

@adamjmcgrath adamjmcgrath added enhancement New feature or request and removed question Further information is requested labels May 24, 2022
@ShawnFumo
Copy link
Author

ShawnFumo commented May 24, 2022

@adamjmcgrath Oops, yeah I did mean WithApiAuthRequired. I think I was in a hurry cleaning up the casing in the message and copy-pasted from the wrong spot 😆

That fix is easier than I was thinking it was going to be. I assumed you couldn't call setHeader after send (and so would have to proxy send) but I tried it locally and it worked. I guess Next must be delaying the actual writing of the response until later.

@adamjmcgrath
Copy link
Contributor

That fix is easier than I was thinking it was going to be. I assumed you couldn't call setHeader after send (and so would have to proxy send) but I tried it locally and it worked.

Correct, you're calling setHeader in sessionCache.save (synchronously within apiRoute) before Next writes the response

@adamjmcgrath
Copy link
Contributor

👋 2.0.0-beta.0 has been released - which no longer uses onHeaders (and no longer requires GetServerSidePropsWrapper )

Check out the README for how to install the Beta. And the Migration guide for more info

Please raise a new issue if you notice any problems

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants