fix: batch sites uploads in groups under 100mb #1195

Merged
merged 4 commits into from
Jun 13, 2022

Conversation

threepointone
Contributor

@threepointone threepointone commented Jun 6, 2022

The bulk KV put API caps the size of a single upload (as specified in https://api.cloudflare.com/#workers-kv-namespace-write-multiple-key-value-pairs). This patch splits Sites uploads into batches that each stay under the 100 MB limit, with sizes measured after the files are base64-encoded.

Fixes #1187


This fix could be improved by moving it to kv.ts:putKVBulkKeyValue() but I couldn't figure out a clean solution. So once this lands, I'll file an issue and make a followup PR. We'll probably need it for people who use wrangler kv:bulk put anyway.
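
For illustration, here is a minimal sketch of size-based batching as described above. It is not the code from this PR: the `KeyValue` shape, the `batchBySize` name, and the exact byte value used for the limit are assumptions, and the size check only counts the encoded values, ignoring keys and JSON overhead in the request body.

```typescript
// Assumed shape of one entry sent to the bulk KV put API (illustrative only).
interface KeyValue {
  key: string;
  value: string; // base64-encoded file contents
  base64: boolean;
}

// Assumed limit: 100 MiB per bulk request, counted on the encoded values.
const MAX_BUCKET_SIZE = 100 * 1024 * 1024;

// Greedily pack entries, in order, into buckets whose summed value size
// does not exceed maxSize. An entry that is larger than maxSize on its own
// still gets a bucket to itself; this sketch does not reject it.
function batchBySize(
  entries: KeyValue[],
  maxSize: number = MAX_BUCKET_SIZE
): KeyValue[][] {
  const buckets: KeyValue[][] = [];
  let current: KeyValue[] = [];
  let currentSize = 0;

  for (const entry of entries) {
    // base64 text is ASCII, so string length equals its size in bytes.
    const size = entry.value.length;
    if (current.length > 0 && currentSize + size > maxSize) {
      buckets.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(entry);
    currentSize += size;
  }

  if (current.length > 0) {
    buckets.push(current);
  }
  return buckets;
}
```

Each returned bucket could then be passed to a bulk put call as its own request.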

@changeset-bot

changeset-bot bot commented Jun 6, 2022

🦋 Changeset detected

Latest commit: 7147ff7

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
wrangler Patch

@github-actions
Contributor

github-actions bot commented Jun 6, 2022

A wrangler prerelease is available for testing. You can install this latest build in your project with:

npm install --save-dev https://prerelease-registry.developers.workers.dev/runs/2489229231/npm-package-wrangler-1195

You can reference the automatically updated head of this PR with:

npm install --save-dev https://prerelease-registry.developers.workers.dev/prs/1195/npm-package-wrangler-1195

Or you can use npx with this latest build directly:

npx https://prerelease-registry.developers.workers.dev/runs/2489229231/npm-package-wrangler-1195 dev path/to/script.js

@EatonZ
Contributor

EatonZ commented Jun 6, 2022

I tried the first test version the bot linked and it doesn't appear to have fixed the issue.🤔

@threepointone
Contributor Author

Oh for real? I thought I'd extensively test this tomorrow, but good to know it already doesn't work lol.

@EatonZ
Contributor

EatonZ commented Jun 7, 2022

Yup, still the same 413 error.

rozenmd
rozenmd previously approved these changes Jun 7, 2022
@rozenmd rozenmd dismissed their stale review June 7, 2022 09:29

Didn't see that the issue wasn't fixed - apologies

@threepointone threepointone marked this pull request as draft June 7, 2022 09:46
@threepointone threepointone changed the title fix: batch sites uploads in groups under 100mb wip - fix: batch sites uploads in groups under 100mb Jun 7, 2022
@EatonZ
Contributor

EatonZ commented Jun 12, 2022

@threepointone Any workaround for this? I need to update my full site soon and this is still blocking. If not, I will probably have to downgrade to Wrangler v1 for the time being.

@threepointone
Contributor Author

We want to land this next week! Hopefully sooner rather than later.

@threepointone
Contributor Author

You should feel free to downgrade if it's blocking you tho, terribly sorry for the inconvenience :(

@rozenmd rozenmd changed the title wip - fix: batch sites uploads in groups under 100mb fix: batch sites uploads in groups under 100mb Jun 13, 2022
@rozenmd rozenmd marked this pull request as ready for review June 13, 2022 14:02
@rozenmd
Contributor

rozenmd commented Jun 13, 2022

Hey @EatonZ, do you mind retrying with the latest test version in the comment above? (npx https://prerelease-registry.developers.workers.dev/runs/2488829472/npm-package-wrangler-1195)

// Since the bulk upload api endpoint stays the same
// We're going to have to clear the mock as soon as it's resolved
// And immediately add a mock for another one
// Welcome to a callback pyramid in 2022
Contributor

I think we could make this a bit less painful if we capture useful stuff in the handler and then check it after runWrangler has completed...

const requests = mockUploadAssetsToKVRequest(kvNamespace.id);
await runWrangler("publish");
expect(requests.uploads).toEqual(expectedAssets);

Either write out the expectedAssets matchers by hand, or generate them based on the assets array.

We would need to tweak the mockUploadAssetsToKVRequest() so that you don't need to specify assets as a parameter, and instead add them to a requests object that should be returned from the call.
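
A rough sketch of the "capture in the handler, assert afterwards" idea, for illustration only. `createUploadRecorder`, `UploadEntry`, and `CapturedRequests` are hypothetical names, not part of wrangler's test helpers, and how the handler gets wired into the real mock layer is left out.

```typescript
// Hypothetical shape of one uploaded entry (illustrative only).
interface UploadEntry {
  key: string;
  value: string;
  base64: boolean;
}

// Everything the mock saw, collected across all bulk requests.
interface CapturedRequests {
  uploads: UploadEntry[];
}

// Hypothetical helper: returns a `requests` object plus a handler that a
// mock layer would call once per bulk upload request body.
function createUploadRecorder(): {
  requests: CapturedRequests;
  handler: (body: UploadEntry[]) => void;
} {
  const requests: CapturedRequests = { uploads: [] };
  const handler = (body: UploadEntry[]): void => {
    // Record entries in arrival order instead of asserting inside the mock.
    requests.uploads.push(...body);
  };
  return { requests, handler };
}
```

With something along these lines, the test body stays flat: run the command, then compare `requests.uploads` against the expected assets, however many bulk requests were made.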

Contributor

I've added a follow-up task for this as it'll require refactoring the other tests too: #1245

// delete all the unused assets
deleteKVBulkKeyValue(accountId, namespace, Array.from(namespaceKeys)),
]);
// sequentially upload each bucket
Contributor

FWIW @threepointone @petebacondarwin: I compared uploading a 150MiB site via this sequential solution, and a Promise.all approach:

  const bucketsToPut = [];
  for (const bucket of uploadBuckets) {
    bucketsToPut.push(putKVBulkKeyValue(accountId, namespace, bucket));
  }
  await Promise.all(bucketsToPut);

The sequential solution took 51 sec while the parallel solution took 38 sec. Could be worth looking into if folks complain it's too slow.

Contributor Author

I can't recollect why I made this sequential, probably to avoid saturating the network. We could make this parallel, and consider adding limited concurrency later if people complain about extremely large payloads (like, GBs).
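
For reference, "limited concurrency" could look roughly like the sketch below, which sits between the sequential loop and a plain Promise.all. `runWithConcurrency` is a hypothetical helper, not something in wrangler, and the right limit would need measuring.

```typescript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight at any time. Rejects as soon as any worker rejects.
async function runWithConcurrency<T>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<unknown>
): Promise<void> {
  let next = 0;

  // Each lane pulls the next unclaimed item until none are left.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const item = items[next++];
      await worker(item);
    }
  }

  // Start at least one lane, and never more lanes than there are items.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
}
```

Under these assumptions, the upload buckets would be the `items` and the worker would wrap the bulk put call for one bucket. A limit of 1 behaves like the sequential loop, and a limit equal to the number of buckets behaves like Promise.all.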

@threepointone
Contributor Author

I can't stamp this because I started the PR 😅 It's otherwise LGTM, feel free to take a call on the recommended changes (or file issues to do them in subsequent PRs).

@threepointone
Contributor Author

Extremely grateful to you @rozenmd!!! Thanks so very much.

@rozenmd rozenmd merged commit 66a85ca into main Jun 13, 2022
@rozenmd rozenmd deleted the batch-sites-uploads-by-size branch June 13, 2022 15:16
@github-actions github-actions bot mentioned this pull request Jun 13, 2022
@EatonZ
Contributor

EatonZ commented Jun 13, 2022

@rozenmd I tested the 2.0.11 release and it works for me now!


Successfully merging this pull request may close these issues.

🐛 BUG: wrangler publish - 413 Request Entity Too Large
4 participants