
Seeking Advice: Queue Worker Implementation on Supabase Edge Functions #464

jumski opened this issue Dec 19, 2024 · 2 comments

jumski commented Dec 19, 2024

Disclaimer: I was sent to Issues from Discussions and Discord, so there is a chance a developer will see this message.

Hello everyone,

I’m currently working on an open-source queue worker built on top of Supabase Edge Functions as part of a larger, Postgres-centric workflow orchestration engine that I’ve been developing full-time for the last two months. I’d greatly appreciate any guidance on leveraging the Edge Runtime to build a robust, community-oriented solution. Insights from both the Supabase team and the community would be invaluable.


Current Approach

  • The worker process starts when the Edge Function boots (outside the Deno request handler).
  • Immediately after boot, its main loop is scheduled using waitUntil.
  • It connects to Postgres and uses pgmq.read_with_poll to continuously read messages from a target queue.
  • When a message arrives, the handler runs, and its promise is also managed via waitUntil (see the sketch after this list).
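
For concreteness, here is a minimal sketch of that boot-time loop. The names are illustrative (the npm postgres driver, a `DB_URL` env var, a queue called `my_queue`, and a `handleMessage` stub are my assumptions for the sketch, not confirmed details):

```ts
// Minimal sketch of the boot-time worker loop (illustrative names).
import postgres from "npm:postgres";

// Supabase Edge Runtime exposes this global; declared here so the
// sketch type-checks on its own.
declare const EdgeRuntime: { waitUntil(p: Promise<unknown>): void };

const sql = postgres(Deno.env.get("DB_URL")!);

async function handleMessage(message: unknown) {
  // ... process the message, then archive it via pgmq.archive ...
}

async function mainLoop() {
  while (true) {
    // Poll for up to 5 seconds, reading up to 10 messages with a
    // 1-second visibility timeout each.
    const messages = await sql`
      select * from pgmq.read_with_poll('my_queue', 1, 10, 5)
    `;
    for (const message of messages) {
      // The handler's promise is also kept alive via waitUntil.
      EdgeRuntime.waitUntil(handleMessage(message));
    }
  }
}

// Scheduled at boot, outside the request handler.
EdgeRuntime.waitUntil(mainLoop());

Deno.serve(() => new Response("worker running"));
```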

Handling CPU/Wall Time Limits
At some point, the worker may hit its CPU or wall-clock time limit, triggering the onbeforeunload event. From what I’ve learned, once onbeforeunload fires, that Edge Function instance no longer accepts new requests. To ensure continuity, I issue an HTTP request to /api/v1/functions/my-worker-fn, effectively spawning a new Edge Function instance that starts a fresh worker.
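
A minimal sketch of that hand-off (the endpoint path is the one above; the env var names are assumptions, and there is no guarantee the fetch completes before teardown):

```ts
declare const EdgeRuntime: { waitUntil(p: Promise<unknown>): void };

globalThis.addEventListener("beforeunload", () => {
  // This instance will stop receiving requests, so ask the platform
  // to boot a fresh one by invoking the function over HTTP.
  EdgeRuntime.waitUntil(
    fetch(`${Deno.env.get("SUPABASE_URL")}/api/v1/functions/my-worker-fn`, {
      headers: {
        Authorization: `Bearer ${Deno.env.get("SUPABASE_ANON_KEY")}`,
      },
    }).catch(() => {
      // Swallow errors: nothing left to do in a dying instance.
    }),
  );
});
```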

Disclaimer:
I’m not currently tracking these workers in a database, nor am I performing any heartbeats. However, I will add this soon to control the number of spawned workers and better manage the overall process.


Questions

  1. Compliance: Is this pattern—specifically spawning a new instance when onbeforeunload fires—aligned with the Terms of Service for Edge Functions?
  2. Instance Limits: Are there any rules or limitations on how frequently new instances can be spawned or how many instances can run concurrently?
  3. Graceful Shutdown: Beyond onbeforeunload, are there other events or APIs I can use to achieve a more graceful shutdown process?
  4. Roadmap & Collaboration: Is Supabase considering a built-in solution for similar use cases? I’d be happy to collaborate rather than reinvent the wheel.

Next Steps & Feedback
I plan to release this worker component soon to gather feedback. While it’s just one building block of the larger orchestration engine I’m developing, ensuring it aligns with best practices and community standards is a top priority.

Thank you for your time and any insights you can share!

jgoux commented Dec 20, 2024

Hey @jumski, thanks for sharing this project, it seems super interesting!

My only concern about relying on Edge Functions to run a long-running process like this is that the process isn't supervised.

If anything goes wrong or your function crashes, how do you ensure that the poller restarts correctly?

jumski commented Dec 20, 2024

Thank you @jgoux for taking the time to read my message! 🙇

I'm planning multiple measures to improve reliability:

  • Idempotency: Handlers must be retry-safe (but they should always be!).
  • Worker Start: Each worker logs its start in the database.
  • Heartbeats: It periodically updates last_heartbeat in the DB.
  • AbortController: Enables clean shutdowns on onbeforeunload (just pass its signal to fetch, etc.).
  • Visibility Timeout: Messages get a short timeout (1 s) so they are retried quickly if a worker dies.
  • Extend Timeout: Handlers can call pgmq.set_vt for longer tasks.
  • Message Archival: Handlers must archive messages after processing.
  • Retry on Failure: Unarchived messages reappear once the timeout expires.
  • Scaling: A cron job starts new workers if last_heartbeat goes stale (see the sketch after this list).
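
A minimal sketch of a few of these measures, assuming a hand-rolled `workers` table with (id, started_at, last_heartbeat) columns and the same `postgres` client and `EdgeRuntime` global as above; the table layout and names are illustrative:

```ts
import postgres from "npm:postgres";

declare const EdgeRuntime: { waitUntil(p: Promise<unknown>): void };

const sql = postgres(Deno.env.get("DB_URL")!);
const workerId = crypto.randomUUID();

// Worker Start: record the boot so spawned workers can be counted.
await sql`
  insert into workers (id, started_at, last_heartbeat)
  values (${workerId}, now(), now())
`;

// Heartbeats: refresh last_heartbeat so a cron job can spot stale
// workers and start replacements.
setInterval(() => {
  EdgeRuntime.waitUntil(
    sql`update workers set last_heartbeat = now() where id = ${workerId}`,
  );
}, 5_000);

// Extend Timeout: a handler busy with message `msgId` pushes its
// pgmq visibility timeout 60 seconds into the future.
async function extendVisibility(msgId: number) {
  await sql`select pgmq.set_vt('my_queue', ${msgId}, 60)`;
}
```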

This won’t be as robust as a long-lived process, but I hope it’s good enough for an entry-level system.

From my early tests with a no-op handler function, this system can process around 1,500 messages before the CPU time limit hits.

Any feedback on boot/terminate processes or overlooked Deno/Edge APIs would be great! Let me know if spawning workers on onbeforeunload raises compliance or resource concerns.

Lastly, handlers will be API-compatible with Graphile Worker, for easy migration in case someone outgrows the edge worker.

Thanks again! 🙏
