Website migration to Vercel #766
Comments
|
I'm absolutely in favor of adopting Vercel to improve the dx and have staging branches etc. However our website is mostly static information: it seems a waste of energy to render things all the time. My understanding of this proposal is that we would deploy:
This does not improve the reliability of our infrastructure because traffic still passes through our single nginx server. Instead, it will increase the load on that server and worsen response latency for our end users. I'm not going to block this on those grounds, but I think the proposed architecture is less reliable and not something we want to have. Happy to help design something better with the tech at play (e.g. completely bypass our nginx and proxy Vercel from Cloudflare). |
Thanks for the feedback, @mcollina! I also feel like using Cloudflare directly could be an interesting solution, maybe through their Workers feature or something similar (as it allows path matching). Anyhow, I just wanted to make clear that I only gave an example of how we could use Vercel just for the website without getting rid of the current infra for downloads/releases. I do believe SSR will come in handy for all the features we want to introduce. It also improves the build, reduces memory usage during build time, and lets us do properly some things that the initial Next.js PR currently handles with less-than-ideal approaches. |
Hey folks! Happy to answer questions about Vercel.
What do you mean by a waste of energy? Is this for Vercel? If things are static, those generated assets are deployed to our Edge Network (CDN). You wouldn't need to use another CDN on top.
This is one of the nice things about Vercel – you can choose static or dynamic, on a per-page basis, without needing to change infrastructure. The Node.js site currently has a lot of pages (I think over 1,000?). Plus you have internationalization, so multiply 1000 times the number of languages. You'll likely want to defer the generation of some of the infrequently used pages outside of the build (to prevent really, really long builds). As mentioned above, I've approved an open-source sponsorship to provide a free Vercel team for the Node.js site 😄 |
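To illustrate the idea of deferring generation of infrequently used pages outside of the build, here is a minimal sketch in plain TypeScript. It is not the Next.js or Vercel API; the class name `DeferredPageCache`, the render function, and the example paths are all hypothetical and only show the general pattern of "prebuild a few pages, render the rest on first request, then serve from cache".

```typescript
// Hypothetical sketch, not the Next.js API: pages listed in `prebuilt` are
// rendered up front (at "build time"); every other page is rendered on its
// first request and served from the cache afterwards.
type Renderer = (path: string) => string;

class DeferredPageCache {
  private cache = new Map<string, string>();
  private render: Renderer;
  public renderCount = 0;

  constructor(render: Renderer, prebuilt: string[] = []) {
    this.render = render;
    for (const path of prebuilt) {
      this.cache.set(path, this.renderOnce(path));
    }
  }

  // Renders a page and counts how often rendering actually happens.
  private renderOnce(path: string): string {
    this.renderCount += 1;
    return this.render(path);
  }

  // Returns the cached page, rendering it first if it was deferred.
  get(path: string): string {
    const hit = this.cache.get(path);
    if (hit !== undefined) return hit;
    const html = this.renderOnce(path);
    this.cache.set(path, html);
    return html;
  }
}

// Example paths are assumptions chosen for illustration.
const pages = new DeferredPageCache(
  (path) => `<html><body>${path}</body></html>`,
  ["/en/", "/en/download/"],
);
```

With this shape, the build only pays for the two prebuilt pages; an old blog post is rendered the first time someone asks for it and never again.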
The content is mostly static, but this proposal is about making some of them dynamic to deal with a few rough edges of the current solution.
Why we need a CDN/NGINX on top
Most of the complexity we need to deal with comes from technical debt. We are currently serving all Node.js binaries from the same infrastructure as our website. As you would expect from a project of this size, very few people can push binaries. On the other hand, we want a contributor-friendly website, and let people publish changes to our website from GitHub. I would personally feel uncomfortable deploying the binary downloads as part of our website. |
@mcollina I don't want to sound like I'm repeating myself (So I apologize if it sounds repetitive), but again, we don't want to (for now, it's not even part of the plans) change where the binaries and downloads and archives are. Just the live website 😅 (Note that we didn't mention at any moment that we should get rid of Cloudflare or NGINX) |
Gotcha, ty for clarifying, and sorry for the confusion 👀 |
@nodejs/build to make sure build team is aware. |
Ah, apologies @mcollina I was misunderstanding how you wanted to serve those binaries. Agreed that putting Cloudflare in front of Vercel here would make sense! |
Teasing apart the downloads from the website itself is a good goal, but as Matteo mentioned this part does not get us many of the benefits:
I guess the challenge is that the downloads and the site are tightly coupled in terms of existing URLs, so we can't use DNS to direct traffic to different places for the downloads and the site content? If that is impossible, then using Cloudflare as the proxy to Vercel and to our existing server for downloads might work (that may be what was suggested above). We might even be able to run both in parallel as a test, where test.nodejs.org splits requests across the two locations while nodejs.org continues to serve both from the existing server? |
Not really, if we do it through NGINX it's pretty easy, but it doesn't reduce the load on our servers. It would be a temporary solution. Afaik, Cloudflare supports path-based proxying tho. It is something that can be done through Cloudflare Workers, as I mentioned before... We could also, of course, make a Cloudflare rule to redirect the downloads/binaries path to get.nodejs.org or something of the sort. |
@mhdawson quick-question: do we use a free Cloudflare license? |
We have a free Business plan |
Sponsored? Cool, I suppose then all usage pricing goes to 0, right? Because of https://developers.cloudflare.com/workers/platform/pricing/ |
It also says that we have free access to Workers (don't know if it's unlimited), but can't we just use Page Rules to proxy based on the URL? |
Page Rules is also handy and can do the same, indeed. |
I think we would need to add page rules to route the website to Vercel and the downloads to nginx. |
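The routing decision discussed here (website to Vercel, downloads to nginx) can be sketched as a small pure function. This is only an illustration of path-based origin selection, not an actual Cloudflare Page Rules or Workers configuration; the two origin hostnames and the list of path prefixes are assumptions made for the example.

```typescript
// Hypothetical origins; the real hostnames are not given in this thread.
const WEBSITE_ORIGIN = "https://website.example";
const DOWNLOADS_ORIGIN = "https://downloads.example";

// Path prefixes assumed (for illustration) to stay on the existing nginx server.
const DOWNLOAD_PREFIXES = ["/dist/", "/download/"];

// Maps an incoming request URL to the URL on the origin that should serve it.
function pickOrigin(requestUrl: string): string {
  const { pathname, search } = new URL(requestUrl);
  const isDownload = DOWNLOAD_PREFIXES.some((prefix) =>
    pathname.startsWith(prefix),
  );
  const origin = isDownload ? DOWNLOADS_ORIGIN : WEBSITE_ORIGIN;
  return origin + pathname + search;
}
```

Whatever mechanism ends up carrying the rule (Page Rules, a Worker, or something else), the logic stays a prefix match on the path, which is why it can be tested independently of the proxy in front of it.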
Sounds good to me :) |
From the discussion it sounds like:
might be possible? @ovflowd do you think it would be, based on what you understand about Cloudflare? |
Yup we can make it happen :) |
@ovflowd good to hear. That seems like a way to show it working in advance. There is still the question of getting agreement from @nodejs/build and @nodejs/tsc that depending on an external site to host the website is something we want to do. |
I think the vast majority of these pages are blog posts and API docs, neither of which are internationalized AFAIK. |
@tniessen the multiplication is still valid because even if the content itself is not translated, the pages are still generated and served in all the available languages. You can simply run the build command of the current website and see the thousands of HTML pages that it outputs 😅 (and that is without incorporating the API docs, so imagine adding those too). That is basically because the static model has no routing, so every one of those pages must be statically available in all those languages, even if the actual translated part is just the headers or some common i18n elements. |
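As a rough illustration of why the output grows multiplicatively under a purely static model, the sketch below enumerates one output path per (locale, page) pair. The function name, page paths, and locale codes are made up for the example; the real site's counts are not stated precisely in this thread.

```typescript
// Hypothetical sketch: with no routing layer, every page must exist once per
// locale, so the number of generated files is pages × locales.
function localizedPaths(pages: string[], locales: string[]): string[] {
  const paths: string[] = [];
  for (const locale of locales) {
    for (const page of pages) {
      paths.push(`/${locale}${page}`);
    }
  }
  return paths;
}

// Illustrative inputs only.
const examplePaths = localizedPaths(
  ["/about/", "/download/"],
  ["en", "de", "fr"],
);
```

Two pages and three locales already yield six files, even if only the shared header strings differ between the copies.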
Ah, I see. As far as I can tell, metalsmith neither caches build outputs nor parallelizes the build. If build times are a concern, if we replace metalsmith by some other build process, that would hopefully be addressed.
Why? Aren't these just static files with no internationalization? |
Still, metalsmith is blazing fast (mostly due to its simplicity compared to a full-blown React framework, but still super props to Metalsmith!!), whereas a full Next.js SCG build takes 2-3 minutes. What we want to address here is not build time, but flexibility in handling things such as i18n. As things stand, yes, the API docs are statically built and not incorporated into the website; I'd ask you to head over to https://nodejs.dev/api/ and see some of the plans for how we intend to incorporate the API docs into the actual website 🙃 |
Ah, thanks for explaining. I can see how switching to a slower build system while also vastly increasing the number of generated resources could be problematic. |
Indeed, here actually ISR/SSR kinda becomes a "must" because the build system is slower, for production at least. But of course that's just one of the reasons of adopting ISR/SSR... But that's kinda the tradeoff from a simpler framework to another that does a plethora of things, for better or for worse. |
The slow build of the main site is only an issue for onboarding the docs that isn't a current concern with the existing website. I'm still not sure that splitting that off to |
@mhdawson I think from the TSC-side, we got a 👍, right? I believe we can forward that to the Foundation to proceed with the request, right? |
I would prefer to see a PoC of cloudflare/next/downloads before committing. Anyhow, I think we would need support from Vercel for the documentation previews for the new Node.js docs, so I wouldn't object moving things forward with them. |
We can do a PoC regardless of what we plan to do with Vercel, but we need access to Vercel first, right? |
Let me know how I can help with the Vercel setup. |
Since the website is now based on Next.js (and since that likely won't change anytime soon), I am not against supporting SSR -- as long as it remains entirely optional and is not tied to any particular vendor. I don't see the need for APIs or anything fancy at the moment, aside maybe from a search feature if people really want that. If we need vendor-specific code, it should be optional (and as little as possible). Assuming this is going to be vendor-independent and given that we already heavily rely on Cloudflare, could we just as well host it there to avoid further complicating our infrastructure setup? I am sure Vercel is an awesome service; I simply have no idea what exactly is being proposed and/or required here since I don't use Next.js. |
Yeah, the idea is still to have no vendor-specific code (the vendor being Vercel) inside the codebase and as few framework-specific imports as possible. Right now we have just a handful, if I recall correctly. But adopting Vercel as our infrastructure would lift a huge burden that currently sits on the very small Build team. Not to mention many cool features, like Git integration, branch previews, better caching, et cetera. We definitely need to check how we integrate that with Cloudflare, and we're working on some PoCs. But yes, for maintainability and future-proofing, the repository aims to be as vendor-independent as possible. |
My take is that there were no objections to the general direction of a hosted environment outside of a machine run by the build WG for the website content. As mentioned by @mcollina, a good next step is a PoC showing the current downloads and website being split up, even if they are still hosted on the same machine. Once that is in place, it should be easy to try shifting the website to Vercel while being able to fall back to the instance we have running on the original machine. |
But this is already virtually happening. Both downloads and the regular website are being served by different nginx blocks. 🤔 But yeah, I think an actual feasible PoC would be to actually set up a testing nginx config with similar infra in a free/staging Vercel environment. We maybe could spin up a machine on our DigitalOcean account. |
From my point of view, the issues we've had yesterday and today showed that we have two singular points of failure. Cloudflare currently cannot keep us online if the DO origin is not working as intended, and the DO origin won't sustain traffic if our Cloudflare configuration is not working as intended. If we add another vendor to the infra setup, we have another singular point of failure. I don't see the downloads going anywhere simply due to their total size (except perhaps Cloudflare R2 but I don't think that's realistic right now), but Cloudflare cache reserve or Argo or so could still make things easier on the DO origin and maybe even sustain outages of the DO origin. That leaves the Next.js website.
My naive understanding is that both Cloudflare Pages and Vercel support Next.js including SSR. So I guess my real question is whether Vercel has advantages over Cloudflare Pages that outweigh adding another vendor and thus potential point of failure to our network. (IIRC Vercel uses Cloudflare infrastructure for serverless anyway?) |
We have another provider involved, Equinix, that hosts the fallover server. This is supposed to take over if the DO origin is overwhelmed. |
At least for the website, these issues are natively mitigated by having all website traffic balanced, served and handled by those providers. I don't know how Cloudflare Pages fares with Next.js. I also feel that starting a whole new investigation into Cloudflare and Cloudflare Pages while we're already in the middle of evaluating one vendor feels unnatural. I don't think having multiple vendors is an issue; it's more about how we handle access to these vendors, how we document the processes and how we do things. A good part of the incident we had yesterday/today happened because we didn't know exactly, to the full extent, what was happening. It took us a lot of playing around and trying things until they worked. I assume Vercel is an on-premise/managed vendor, which would prevent us from relying on funky configuration files (such as nginx) to handle these things. Cloudflare still has a crucial role in splitting the traffic between Website and Downloads (assuming the Website goes to a different origin than the Downloads), which would also reduce the amount and complexity of our nginx configuration. I'm also all in on adopting whatever Cloudflare solution we have for downloads, such as R2 or anything else, if these providers are happy to provide us with the resources that we need.
Sadly, I'm unfamiliar with CF Pages, but a quick Google says yes. I prefer diversifying our vendors rather than keeping everything at Cloudflare, so that if one day they abruptly stop providing us resources, we at least have more options. We also need to factor in the people managing these services, the learning curve for maintaining these vendor-specific services, et cetera. With this last incident, we had a sudden yet very good opportunity to work together and solve many long-standing issues. These origin issues are mitigated, and I believe that even without Cloudflare, our server could now sustain the requests (under heavy stress, but still sustain them). Our web server was on life support and essentially depended on Cloudflare. (I'm assuming that the cache is out of the equation, but not the firewall; we would be hammered pretty hard without the current Page Rules.)
I feel that Vercel has better integration and management of Next.js installations + better building processes + CI for the whole ecosystem; after all, they're the maintainers of Next.js. This might be biased, but Vercel is not locked into Next.js; you can do the same on GitHub Pages or any other static content provider at Vercel. Finally, regardless if we adopt Vercel, Cloudflare Pages, GitHub Pages or whatever, I think we must make a PoC that 100% shows that we can split traffic and have entirely independent origins for downloads and the main website. I think it's easy if somebody gives me the resources to make a test machine (cc @MattIPv4, hehe). |
Being able to run on Cloudflare Pages is interesting; the key question might be how different it is to deploy to Cloudflare Pages versus Vercel. If the answer is not much, then maybe we could deploy to both, with one being the backup for the other, similar to how we have a backup server today? |
That is a very reasonable approach. +1 for that. |
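The "deploy to both, one as backup" idea boils down to trying origins in priority order and using the first healthy one. The sketch below shows that selection step only; the function name, the origin names, and the injected health check are hypothetical, and how health would actually be determined is outside what this thread specifies.

```typescript
// Hypothetical sketch: pick the first origin, in priority order, whose health
// check passes. The health check is injected so the policy stays testable.
function chooseOrigin(
  origins: string[],
  isHealthy: (origin: string) => boolean,
): string | undefined {
  for (const origin of origins) {
    if (isHealthy(origin)) return origin;
  }
  // No origin is healthy; the caller decides what to do (e.g. serve an error).
  return undefined;
}

// Illustrative names only; "primary" and "backup" stand in for two deployments.
const ORIGINS = ["https://primary.example", "https://backup.example"];
```

For example, `chooseOrigin(ORIGINS, (o) => o !== "https://primary.example")` would select the backup deployment while the primary is marked unhealthy.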
Small note: In case you aren't familiar with Vercel's workflow, you can comment directly on the UI when reviewing changes, similar to Figma or Google Docs. This has been an invaluable tool for myself while working on the Next.js documentation. |
+1 to @leerob comment. I completely forgot that Vercel has a neat feature for collaborative review for staging previews. That would benefit us in our day-to-day workflow for the Website Redesign. |
I certainly do not mean to complicate any existing work. My understanding based on your earlier comments was that we so far do not have any vendor-specific code and only use somewhat standard frameworks, so I assumed that using a vendor that claims to support these frameworks and that we've been working with successfully for almost a decade would be a simpler (or, in your words, natural) choice.
In general, I'd agree with that. In this particular case, however, I don't see Vercel (or any other vendor) fully replacing Cloudflare's role, and even if that were possible, it would require a lot of work. So if Cloudflare were to stop providing us with resources (however unlikely that may be given an extremely positive track record), I doubt that this diversification would magically save us. Conversely, if the Next.js/SSR vendor stops providing us with resources, we can simply drop SSR and go back to hosting static content quite literally anywhere. Worst case, we briefly lose server-side search or something.
That sounds good to me. If Vercel really provides an objectively better developer experience, we might as well benefit from it (assuming it integrates well into our existing GitHub workflows). |
Fair argument.
That was never the case. We should not at all remove Cloudflare from the equation. I'm very sorry if I gave this impression.
It might not save us (I doubt it would), but it still helps. (Or at least I see it helping: it might not be helpful for the binaries/dist/downloads, but it still is for the main website + API docs.)
I tend to be inclined towards that. I never specifically used Vercel, but I always see all the fantastic effort put into the DX there. Per their docs (I think Lee also shared some with me), it should fit well with our existing workflows. Well, we wouldn't use a hacky Webhook anymore but actual proper GitHub Actions with proper environment detection. (Not to mention preview branches, which is excellent). |
After a short discussion in today's TSC meeting, I've enabled the Vercel app on nodejs.org. This is for experimentation, but we'll want TSC to weigh in before actually setting up hosting via Vercel. |
FYI, here are some updates on what's going on with Vercel:
|
Hey, you all 👋 I'm super happy to announce that we have a PoC ready! You can test https://vercel.nodejs.org. The PoC has:
This PoC is ready to be tested, and once it reaches approval from the TSC, we can do the switch on the primary DNS. |
cc @Trott and the @nodejs/tsc (FYI we might want to add this to the agenda) |
I think there's still stuff to figure out before we can switch the primary DNS. For example, the current origin rules rewrite to |
That should just be a case of creating |
|
That redirect is setup on the NGINX level, so on Cloudflare, it doesn't matter. Yet we should use something differently. Maybe something such as |
In nodejs/build#3366, we suggested |
The Vercel migration got concluded. Our website is live on Vercel. The summary of all steps is available here: nodejs/build#3366 (comment).
Note that the above is just a virtual hostname; everything is still under nodejs.org and works transparently. Thanks again to Cloudflare for generously upgrading us to Enterprise for free and providing us with R2 credits. |
Hey, I'm opening this issue to publicly track the conversation around an ongoing effort to evaluate a possible migration of the https://nodejs.org website to Vercel infrastructure.
This consideration initially started because we're adopting Next.js (full SCG) with nodejs/nodejs.org#4991 and the fact that by adopting Vercel's infra, we would have a set of benefits.
It is essential to mention that if we decide not to use Vercel infra, we might still want to consider having our Next.js installation run with SSR (Server-Side Rendering) on our Infrastructure maintained by the @nodejs/build WG.
What is this proposal about, and why?
On why adopting Next.js SSR/ISR
… Accept-Language), hreflang, and other features.
… /pages/{twoLetter} directory format of having one repeated page for each locale and adopt a similar design from https://github.com/nodejs/nodejs.dev, where we have a single file for a page and multiple files for content. (Example: here and here).
On why adopting Vercel infra
What is this proposal not about
… dist, downloads, releases, dist.json, generated metadata, other subdirectories, and microsites will remain on our current Build Infrastructure with the same NGINX rules.
What was talked about so far
Next Steps
To be documented