FunctionsFetchError: Failed to send a request to the Edge Function - affecting 1% of invocations #263

Open
williamlmao opened this issue Feb 7, 2024 · 20 comments
Labels
bug Something isn't working

Comments

@williamlmao

williamlmao commented Feb 7, 2024

Bug report

  • [Uncertain] I confirm this is a bug with Supabase, not with my own application.
  • [x - threads I found on this same error did not help] I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

We are getting intermittent "Failed to send a request to the Edge Function" errors. This bug is quite hard to reproduce. I have never been able to reproduce it locally, and 98.5% of function invocations go through just fine. For context, we've seen this error 1.3k times out of 72,474 invocations. I've had users report that they are getting this error, but then when they retry it goes through fine the next time.

My big questions are:

  1. Is it normal to see this error intermittently in supabase edge runtime? Is this just a fact of life, is this a bug, or is something wrong on our end? It's very hard to tell from my end.
  2. If it's likely that something is wrong on our end, I'd love any suggestions of how to debug. The request body we send for this function is very simple (just a couple IDs and a text input) so there is not much variability. Having a hard time identifying a pattern especially since the error is not very descriptive.

The stack trace we see in Sentry is this:

                    // 2. client-level headers
                    // 3. default Content-Type header
                    headers: Object.assign(Object.assign(Object.assign({}, _headers), this.headers), headers),
                    body,
                }).catch((fetchError) => {
                    throw new FunctionsFetchError(fetchError);
                });
                const isRelayError = response.headers.get('x-relay-error');
                if (isRelayError && isRelayError === 'true') {
                    throw new FunctionsRelayError(response);
                }

To Reproduce

I can't even reproduce this locally, so I can't provide reproduction steps.

Additional Context

We are running Supabase CLI 1.142.1 and "@supabase/supabase-js": "2.39.1".

Please let me know if there's anything else I can provide that would be helpful.

@williamlmao williamlmao added the bug Something isn't working label Feb 7, 2024
@kiwicopple kiwicopple transferred this issue from supabase/supabase-js Feb 8, 2024
@Mykyta-Chernenko

I have exactly the same issue: around 2% of the requests fail with this error. I've been talking with support for a month, but they haven't managed to resolve the issue.

@sebestindragos

sebestindragos commented Mar 31, 2024

There's another related thread here, and I don't understand why it was closed.

I'm also facing this error and wondering what a possible solution would be. I recently started using backoff on the client side to retry failed calls, but it only helped to some degree. I'm still facing errors after all retries have been consumed.

Later edit: I was able to reproduce this locally by shutting down the Docker container for the edge runtime:
image
Although the error message is a bit different, I've been seeing both of them in production which I think are related:

  • Failed to send a request to the Edge Function
  • Edge Function returned a non-2xx status code

Also seeing errors when fetching DB object (so not via edge functions, but via the normal DB REST API): TypeError: fetch failed - TypeError: fetch failed.

Seriously considering moving away from Supabase. I really love the product, but if it's this unreliable in a production environment then it's not really usable.

@laktek
Contributor

laktek commented Apr 5, 2024

We are trying to figure out the causes of these intermittent failures. These request failures can happen for multiple reasons:

  • Your client's firewall blocking requests
  • Cloudflare CDN (which Supabase uses in production) blocking production traffic
  • Edge Runtime failing to respond due to an internal issue (this is the part we can focus on and try to reduce)
  • Your edge function server implementation having an issue (usually these would be logged in Function logs)

I think it's best to implement some error handling and a retry mechanism in the client calling the edge function, to reduce the chances of calls erroring out (we'll consider building some of this logic into supabase-js itself).

@Mykyta-Chernenko

I retry 3 times with backoffs of 2, 4, and 8 seconds. Most of the time the issue goes away, but around 0.5% of the requests still fail.
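
For reference, the retry schedule described here (up to three retries, after waits of 2, 4, and 8 seconds) could be sketched roughly as follows. `withRetry`, `RetryOptions`, and the injectable `sleep` are hypothetical names used for illustration; none of them is part of supabase-js. As reported in this thread, retries like this reduce the failures but do not eliminate them.

```typescript
// A minimal retry-with-backoff sketch. All names here are hypothetical.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

interface RetryOptions {
  // Wait before each retry; the array length is the maximum number of retries.
  delaysMs?: number[];
  // Decide whether a given error is worth retrying (defaults to always).
  shouldRetry?: (error: unknown) => boolean;
  // Injectable so the schedule can be tested without real waiting.
  sleep?: Sleep;
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const delaysMs = options.delaysMs ?? [2000, 4000, 8000];
  const shouldRetry = options.shouldRetry ?? (() => true);
  const sleep = options.sleep ?? defaultSleep;

  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retries are used up or the error is not retryable.
      if (attempt >= delaysMs.length || !shouldRetry(error)) throw error;
      await sleep(delaysMs[attempt]);
    }
  }
}

// Usage sketch (assumes an existing supabase-js client named `supabase` and a
// placeholder function name). `functions.invoke` resolves with { data, error }
// instead of throwing, so the error is rethrown to trigger a retry:
//
// const data = await withRetry(async () => {
//   const { data, error } = await supabase.functions.invoke("my-function", {
//     body: { id: "123" },
//   });
//   if (error) throw error;
//   return data;
// });
```

Retrying is only safe when the edge function is idempotent, or when the failed request never reached it; that is worth checking before wrapping calls such as payment processing.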

@sebestindragos

@laktek as mentioned, I'm already using backoff retries but still seeing issues. And IMO that's a terrible solution to suggest to users (hey, you should just retry requests), because failures can still happen and products are losing customers because of them.

@sebestindragos

sebestindragos commented Apr 10, 2024

@laktek here is another type of error I was able to catch recently. It's the HTML of a Cloudflare error page.

I copied and pasted it into an HTML file, and it looks like this:
image
image

<html class=\"no-js ie7 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if IE 8]>    <html class=\"no-js ie8 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if gt IE 8]><!--> <html class=\"no-js\" lang=\"en-US\"> <!--<![endif]-->\n<head>\n\n\n<title>vnawaforiamopaudfefi.supabase.co | 520: Web server is returning an unknown error</title>\n<meta charset=\"UTF-8\" />\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n<meta http-equiv=\"X-UA-Compatible\" content=\"IE=Edge\" />\n<meta name=\"robots\" content=\"noindex, nofollow\" />\n<meta name=\"viewport\" content=\"width=device-width,initial-scale=1\" />\n<link rel=\"stylesheet\" id=\"cf_styles-css\" href=\"/cdn-cgi/styles/main.css\" />\n\n\n</head>\n<body>\n<div id=\"cf-wrapper\">\n    <div id=\"cf-error-details\" class=\"p-0\">\n        <header class=\"mx-auto pt-10 lg:pt-6 lg:px-8 w-240 lg:w-full mb-8\">\n            <h1 class=\"inline-block sm:block sm:mb-2 font-light text-60 lg:text-4xl text-black-dark leading-tight mr-2\">\n              <span class=\"inline-block\">Web server is returning an unknown error</span>\n              <span class=\"code-label\">Error code 520</span>\n            </h1>\n            <div>\n               Visit <a href=\"https://www.cloudflare.com/5xx-error-landing?utm_source=errorcode_520&utm_campaign=vnawaforiamopaudfefi.supabase.co\" target=\"_blank\" rel=\"noopener noreferrer\">cloudflare.com</a> for more information.\n            </div>\n            <div class=\"mt-3\">2024-04-10 04:02:07 UTC</div>\n        </header>\n        <div class=\"my-8 bg-gradient-gray\">\n            <div class=\"w-240 lg:w-full mx-auto\">\n                <div class=\"clearfix md:px-8\">\n                  \n<div id=\"cf-browser-status\" class=\" relative w-1/3 md:w-full py-15 md:p-0 md:py-8 md:text-left md:border-solid md:border-0 md:border-b md:border-gray-400 overflow-hidden float-left md:float-none text-center\">\n  <div class=\"relative mb-10 md:m-0\">\n    \n    <span 
class=\"cf-icon-browser block md:hidden h-20 bg-center bg-no-repeat\"></span>\n    <span class=\"cf-icon-ok w-12 h-12 absolute left-1/2 md:left-auto md:right-0 md:top-0 -ml-6 -bottom-4\"></span>\n    \n  </div>\n  <span class=\"md:block w-full truncate\">You</span>\n  <h3 class=\"md:inline-block mt-3 md:mt-0 text-2xl text-gray-600 font-light leading-1.3\">\n    \n    Browser\n    \n  </h3>\n  <span class=\"leading-1.3 text-2xl text-green-success\">Working</span>\n</div>\n\n<div id=\"cf-cloudflare-status\" class=\" relative w-1/3 md:w-full py-15 md:p-0 md:py-8 md:text-left md:border-solid md:border-0 md:border-b md:border-gray-400 overflow-hidden float-left md:float-none text-center\">\n  <div class=\"relative mb-10 md:m-0\">\n    <a href=\"https://www.cloudflare.com/5xx-error-landing?utm_source=errorcode_520&utm_campaign=vnawaforiamopaudfefi.supabase.co\" target=\"_blank\" rel=\"noopener noreferrer\">\n    <span class=\"cf-icon-cloud block md:hidden h-20 bg-center bg-no-repeat\"></span>\n    <span class=\"cf-icon-ok w-12 h-12 absolute left-1/2 md:left-auto md:right-0 md:top-0 -ml-6 -bottom-4\"></span>\n    </a>\n  </div>\n  <span class=\"md:block w-full truncate\">Seattle</span>\n  <h3 class=\"md:inline-block mt-3 md:mt-0 text-2xl text-gray-600 font-light leading-1.3\">\n    <a href=\"https://www.cloudflare.com/5xx-error-landing?utm_source=errorcode_520&utm_campaign=vnawaforiamopaudfefi.supabase.co\" target=\"_blank\" rel=\"noopener noreferrer\">\n    Cloudflare\n    </a>\n  </h3>\n  <span class=\"leading-1.3 text-2xl text-green-success\">Working</span>\n</div>\n\n<div id=\"cf-host-status\" class=\"cf-error-source relative w-1/3 md:w-full py-15 md:p-0 md:py-8 md:text-left md:border-solid md:border-0 md:border-b md:border-gray-400 overflow-hidden float-left md:float-none text-center\">\n  <div class=\"relative mb-10 md:m-0\">\n    \n    <span class=\"cf-icon-server block md:hidden h-20 bg-center bg-no-repeat\"></span>\n    <span class=\"cf-icon-error w-12 h-12 
absolute left-1/2 md:left-auto md:right-0 md:top-0 -ml-6 -bottom-4\"></span>\n    \n  </div>\n  <span class=\"md:block w-full truncate\">vnawaforiamopaudfefi.supabase.co</span>\n  <h3 class=\"md:inline-block mt-3 md:mt-0 text-2xl text-gray-600 font-light leading-1.3\">\n    \n    Host\n    \n  </h3>\n  <span class=\"leading-1.3 text-2xl text-red-error\">Error</span>\n</div>\n\n                </div>\n            </div>\n        </div>\n\n        <div class=\"w-240 lg:w-full mx-auto mb-8 lg:px-8\">\n            <div class=\"clearfix\">\n                <div class=\"w-1/2 md:w-full float-left pr-6 md:pb-10 md:pr-0 leading-relaxed\">\n                    <h2 class=\"text-3xl font-normal leading-1.3 mb-4\">What happened?</h2>\n                    <p>There is an unknown connection issue between Cloudflare and the origin web server. As a result, the web page can not be displayed.</p>\n                </div>\n                <div class=\"w-1/2 md:w-full float-left leading-relaxed\">\n                    <h2 class=\"text-3xl font-normal leading-1.3 mb-4\">What can I do?</h2>\n                          <h3 class=\"text-15 font-semibold mb-2\">If you are a visitor of this website:</h3>\n      <p class=\"mb-6\">Please try again in a few minutes.</p>\n\n      <h3 class=\"text-15 font-semibold mb-2\">If you are the owner of this website:</h3>\n      <p><span>There is an issue between Cloudflare's cache and your origin web server. Cloudflare monitors for these errors and automatically investigates the cause. To help support the investigation, you can pull the corresponding error log from your web server and submit it our support team.  
Please include the Ray ID (which is at the bottom of this error page).</span> <a rel=\"noopener noreferrer\" href=\"https://support.cloudflare.com/hc/en-us/articles/200171936-Error-520\">Additional troubleshooting resources</a>.</p>\n                </div>\n            </div>\n        </div>\n\n        <div class=\"cf-error-footer cf-wrapper w-240 lg:w-full py-10 sm:py-4 sm:px-8 mx-auto text-center sm:text-left border-solid border-0 border-t border-gray-300\">\n  <p class=\"text-13\">\n    <span class=\"cf-footer-item sm:block sm:mb-1\">Cloudflare Ray ID: <strong class=\"font-semibold\">871fd6a14523c74d</strong></span>\n    <span class=\"cf-footer-separator sm:hidden\">&bull;</span>\n    <span id=\"cf-footer-item-ip\" class=\"cf-footer-item hidden sm:block sm:mb-1\">\n      Your IP:\n      <button type=\"button\" id=\"cf-footer-ip-reveal\" class=\"cf-footer-ip-reveal-btn\">Click to reveal</button>\n      <span class=\"hidden\" id=\"cf-footer-ip\">35.167.165.194</span>\n      <span class=\"cf-footer-separator sm:hidden\">&bull;</span>\n    </span>\n    <span class=\"cf-footer-item sm:block sm:mb-1\"><span>Performance &amp; security by</span> <a rel=\"noopener noreferrer\" href=\"https://www.cloudflare.com/5xx-error-landing?utm_source=errorcode_520&utm_campaign=vnawaforiamopaudfefi.supabase.co\" id=\"brand_link\" target=\"_blank\">Cloudflare</a></span>\n    \n  </p>\n  <script>(function(){function d(){var b=a.getElementById(\"cf-footer-item-ip\"),c=a.getElementById(\"cf-footer-ip-reveal\");b&&\"classList\"in b&&(b.classList.remove(\"hidden\"),c.addEventListener(\"click\",function(){c.classList.add(\"hidden\");a.getElementById(\"cf-footer-ip\").classList.remove(\"hidden\")}))}var a=document;document.addEventListener&&a.addEventListener(\"DOMContentLoaded\",d)})();</script>\n</div><!-- /.error-footer -->\n\n\n    </div>\n</div>\n</body>\n</html>

I think it's pretty clear from this that the issue is on Supabase's end. Whatever server you have running the edge functions runtime is crashing.
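
When a response body like the one above shows up in error reports, pulling out the Cloudflare error code and Ray ID makes the failure easier to quote in a support ticket. The sketch below is an illustration only: `parseCloudflareErrorPage` and `CloudflareErrorInfo` are hypothetical names, and the patterns are derived solely from the markup pasted in this comment, so they may not match other Cloudflare page layouts.

```typescript
// Hypothetical helper: extracts the error code and Ray ID from a Cloudflare
// error page like the one pasted above. Patterns are based only on that sample.
interface CloudflareErrorInfo {
  statusCode: number | null;
  rayId: string | null;
}

function parseCloudflareErrorPage(body: string): CloudflareErrorInfo | null {
  // The captured body had escaped quotes (\"), so normalize them first.
  const html = body.replace(/\\"/g, '"');

  // Not a Cloudflare error page (for example, a normal JSON response).
  if (!html.includes('id="cf-error-details"')) return null;

  const codeMatch = html.match(/Error code (\d{3})/);
  const rayMatch = html.match(
    /Cloudflare Ray ID:\s*<strong[^>]*>([0-9a-f]+)<\/strong>/i,
  );

  return {
    statusCode: codeMatch ? Number(codeMatch[1]) : null,
    rayId: rayMatch ? rayMatch[1] : null,
  };
}
```

Applied to the page above, this should yield the 520 code and the Ray ID shown in its footer.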

@williamlmao
Author

@laktek, thanks for your response.

The biggest issue here is that the error codes don't give any information on which one of the options you listed was the reason for the error.

From our perspective

  • Your client's firewall blocking requests - I don't think this is the case most of the time, I have received the error on my own end intermittently. If my firewall was blocking the request I would have expected it to happen more often.
  • Cloudflare CDN (which Supabase uses in production) blocking production traffic : This is something we have absolutely no control over right?
  • Edge Runtime failing to respond due to an internal issue (this is the part we can focus on and try to reduce from happening) - This is something we have no control over
  • Your edge function server implementation having an issue (usually these would be logged in Function logs) - We don't get any function logs. Usually this error happens on the client side and the edge function is never even invoked. Based on what you are saying, this indicates that it is not a problem with our function server right?
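
One partial way to narrow down which of these cases occurred is to branch on the error type the client throws. The stack trace earlier in this thread shows `FunctionsFetchError` being thrown when the underlying `fetch` rejects, and `FunctionsRelayError` when the response carries `x-relay-error: true`. The sketch below assumes that each error's `name` property matches its class name and that the "non-2xx status code" message belongs to `FunctionsHttpError`; both assumptions should be verified against the installed supabase-js version. `classifyFunctionsError`, `FailureKind`, and `FAILURE_HINTS` are hypothetical names.

```typescript
// Hypothetical classifier. Assumes error.name equals the class name in the
// client library (verify against your supabase-js version).
type FailureKind = "fetch" | "relay" | "http" | "unknown";

const FAILURE_HINTS: Record<FailureKind, string> = {
  // fetch() itself rejected: no HTTP response was received at all.
  fetch: "No response received (network, firewall, or connection failure).",
  // A response arrived with the x-relay-error header set to "true".
  relay: "The relay in front of the function reported an error.",
  // A response arrived with a non-2xx status; inspect status and body.
  http: "A non-2xx response was returned; check the status code and body.",
  unknown: "Unrecognized error; log it in full.",
};

function classifyFunctionsError(error: unknown): FailureKind {
  const name = error instanceof Error ? error.name : "";
  switch (name) {
    case "FunctionsFetchError":
      return "fetch";
    case "FunctionsRelayError":
      return "relay";
    case "FunctionsHttpError":
      return "http";
    default:
      return "unknown";
  }
}
```

This does not identify the root cause, but logging the kind next to each failure would at least separate "never got a response" from "got a 5xx response".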

We already have retries and error handling built into our site, but that doesn't fix the problem, and it doesn't seem like there is a path forward on my end to solve the problem.

I hope you can see how incredibly frustrating this is on our end. We really love Supabase and would really love to stay on Supabase edge functions, but I'm starting to think we have to move off unless you can indicate that this is something you can solve within the next month or so.

@williamlmao
Author

One more note: this definitely has something to do with the Cloudflare CDN. Every 502 invocation error we have has cloudflare listed as the server in the response metadata.

{
  "headers": [
    {
      "content_length": "524",
      "content_type": "text/html",
      "date": "Tue, 16 Apr 2024 18:48:07 GMT",
      "server": "cloudflare",
      "vary": null,
      "x_sb_edge_region": null,
      "x_served_by": null
    }
  ],
  "status_code": 502
}
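
A check over log entries shaped like the one above could flag responses that look like CDN-generated error pages. This is a heuristic sketch only. The field names are copied from the metadata shown here, `looksLikeCdnGeneratedError` and `ResponseMetadata` are hypothetical names, and reading a null `x_sb_edge_region` as "the edge runtime did not produce this response" is an assumption that has not been confirmed by Supabase.

```typescript
// Shape modeled on the log metadata shown above (field names from the sample).
interface ResponseMetadata {
  status_code: number;
  headers: Array<Record<string, string | null>>;
}

// Heuristic (assumption, not confirmed behavior): a 5xx HTML response served
// by "cloudflare" with no edge region recorded probably did not come from the
// function's own code.
function looksLikeCdnGeneratedError(meta: ResponseMetadata): boolean {
  const h = meta.headers[0] ?? {};
  const is5xx = meta.status_code >= 500 && meta.status_code <= 599;
  return (
    is5xx &&
    h["server"] === "cloudflare" &&
    h["content_type"] === "text/html" &&
    h["x_sb_edge_region"] == null
  );
}
```

Counting how many failing invocations match this check against how many carry a region header could help show where in the chain the failures originate.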

@hkrutzer

Perhaps some sort of request ID, or OpenTracing header, could be added, to make it easier to find logs corresponding to the failing requests. Similar to e.g. the CF-Ray-Id header from Cloudflare.
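
Until something like this exists on the platform side, a client can attach its own correlation ID to each invocation and record the same ID in its error reports. The sketch below is one possible shape for that. The header name `x-client-request-id` and the helpers `newRequestId` and `withRequestId` are made up for illustration; Supabase does not define or log this header as far as this thread shows.

```typescript
// Hypothetical header name; not defined by Supabase or Cloudflare.
const REQUEST_ID_HEADER = "x-client-request-id";

// Generates a reasonably unique ID without extra dependencies.
// (crypto.randomUUID() is a better choice where it is available.)
function newRequestId(): string {
  const time = Date.now().toString(36);
  const rand = Math.random().toString(36).slice(2, 10);
  return `${time}-${rand}`;
}

// Returns a copy of the headers with a request ID added, keeping an existing
// ID so that retries of the same logical call share one ID.
function withRequestId(headers: Record<string, string> = {}): {
  requestId: string;
  headers: Record<string, string>;
} {
  const requestId = headers[REQUEST_ID_HEADER] ?? newRequestId();
  return { requestId, headers: { ...headers, [REQUEST_ID_HEADER]: requestId } };
}

// Usage sketch (assumes an existing supabase-js client named `supabase`):
//
// const { requestId, headers } = withRequestId();
// const { data, error } = await supabase.functions.invoke("my-function", {
//   headers,
//   body: { id: "123" },
// });
// if (error) console.error("invoke failed", { requestId, error });
```

Inside the function, the same value could be read with `req.headers.get("x-client-request-id")` and logged. If an ID appears in client error reports but never in the function logs, the request most likely failed before reaching the function.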

@evelant

evelant commented May 23, 2024

I had the same issue. 2-4% of all requests just failed for no apparent reason at all. I ended up switching to Bun hosted on fly.io. It was really easy and the experience is way better. It just works, no hassles with broken Deno tooling (monorepos are practically impossible), bugs in Deno, missing features in Deno, random failures, etc.

@laktek
Contributor

laktek commented Jun 11, 2024

Are y'all still experiencing random errors? We've made some stability improvements in the platform, which we believe should help reduce the random 502 errors.

@cspace001

Hi, I'm getting this (similar) error when doing a password reset in my web app.

@laktek
Contributor

laktek commented Jun 12, 2024

@cspace001 password reset using Edge Functions?

@alexbriannaughton

@laktek I still seem to get this error more often than I'd like in my pg_cron --> pg_net --> edge function invocation flow.

@mansueli
Member

mansueli commented Sep 3, 2024

I've improved the example in pg_cron for calling edge functions so you raise the timeout value:

select
  cron.schedule(
    'invoke-function-every-half-minute',
    '30 seconds',
    $$
    select
      net.http_post(
          url:='https://project-ref.supabase.co/functions/v1/function-name',
          headers:=jsonb_build_object('Content-Type','application/json', 'Authorization', 'Bearer ' || 'YOUR_ANON_KEY'),
          body:=jsonb_build_object('time', now() ),
          timeout_milliseconds:=5000
      ) as request_id;
    $$
  );

https://supabase.com/docs/guides/database/extensions/pg_cron#invoke-supabase-edge-function-every-30-seconds

@laktek
Contributor

laktek commented Sep 4, 2024

@alexbriannaughton Can you try increasing the timeouts in pg_net requests, as in the example above from @mansueli?

@alexbriannaughton

alexbriannaughton commented Sep 4, 2024

@laktek I had actually already set the timeout_milliseconds to 4500 to troubleshoot before posting here and opening a support request.

I will note that I haven't had the issue for the last 24 hours, but I'm not sure what I did on my end to make it stop!

Edit: had another one the following day.

@RedChops

We've been dealing with this for about a year now. It's hard to tell the exact percentage of failing requests but it seems like 2% is probably accurate.

We've built in backoff on the client side all over the place but that's more annoying to do on function-to-function calls and is also a terrible end user experience.

These days the errors seem to be mostly 502, 503, and 520. Just today we got this one:

image

in our payment processing code. There are no other logs I can look at. This all seems to be squarely a Supabase issue, and it hasn't really seemed to improve much over time.

@laktek
Contributor

laktek commented Sep 11, 2024

@RedChops, can you open a support ticket (via https://supabase.help) with the name of the project ID and functions that are receiving these errors? I'll investigate internally to find causes for these errors.

@JeongJuhyeon

JeongJuhyeon commented Sep 27, 2024

We've been seeing the same for multiple weeks (we're seeing ~5% 502s), opened a support ticket there, ticket ID 3663152744. Includes the details of both a 200 and a 502 to the same edge function.

We only get 502, no other 5XX codes.
