
S3-to-S3 copy_to exception: "parts must contain etag for each part" #870

Closed
klausbadelt opened this issue Jul 13, 2015 · 3 comments
@klausbadelt

When copying large (50 GB+) objects between buckets in different regions, cross-account, with #copy_to, we receive this exception:

Delivery job 20661 failed (ArgumentError): parts must contain etag for each part
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/client.rb:390:in `validate!'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/client.rb:483:in `validate_parts!'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/client.rb:1871:in `block (2 levels) in <class:V20060301>'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:560:in `build_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:491:in `block (3 levels) in client_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/response.rb:175:in `call'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/response.rb:175:in `build_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/response.rb:114:in `initialize'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:203:in `new'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:203:in `new_response'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:490:in `block (2 levels) in client_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:391:in `log_client_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:477:in `block in client_request'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:373:in `return_or_raise'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/core/client.rb:476:in `client_request'
    (eval):3:in `complete_multipart_upload'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/multipart_upload.rb:280:in `complete'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/multipart_upload.rb:303:in `close'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/s3_object.rb:729:in `multipart_upload'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/s3_object.rb:1353:in `multipart_copy'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/s3_object.rb:915:in `copy_from'
    /usr/local/lib/ruby/gems/2.0.0/gems/aws-sdk-v1-1.64.0/lib/aws/s3/s3_object.rb:1003:in `copy_to'

Also, the transfer is very slow (~ 900 KB/s, from us-east-1 to us-west-). [aws-sdk-v1-1.64.0]
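
For reference, the validation that raises here (validate_parts! in client.rb) appears to require an ETag for every entry passed to complete_multipart_upload. As far as I can tell it expects roughly this shape (illustration only, with made-up bucket and key names; the option names are my reading of the validation code, not our actual code):

# Illustrative only: every :parts entry seems to need an :etag next to its
# :part_number, otherwise validate_parts! raises the ArgumentError above.
s3 = AWS::S3.new
s3.client.complete_multipart_upload(
  bucket_name: 'destination-bucket',   # hypothetical bucket
  key:         'large-object',         # hypothetical key
  upload_id:   upload_id,              # id from initiating the multipart upload
  parts: [
    { part_number: 1, etag: '"etag-of-part-1"' },
    { part_number: 2, etag: '"etag-of-part-2"' }
  ]
)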

@trevorrowe
Member

Can you share a code example of how you are invoking #copy_to? Also, do you receive this error only occasionally on large objects, or is it consistently reproducible?

@klausbadelt
Author

in_uri = URI in_url
out_uri = URI out_url
in_obj  = Kino.s3.buckets[in_uri.host].objects[in_uri.path.sub(/^\//,'')]
filesize = in_obj.content_length.to_i
# Copy S3 -> S3
# direct S3-to-S3 (incl. multipart) copy
out_obj = in_obj.copy_to out_uri.path.sub(/^\//,''),
                         bucket_name: out_uri.host,
                         content_length: filesize,
                         reduced_redundancy: true,
                         use_multipart_copy: true,
                         acl: :bucket_owner_full_control

The exception handler (which produces the backtrace above) looks like this:

rescue Kino::NoRetryError, OAuth2::Error, StandardError => e
  # manually move the message into the dead letter queue
  Delivery.dead_letter_q.send_message(msg.body) if msg rescue nil
  error = "Delivery job #{job && job.id} failed (#{e.class}): #{e.message}\n\t#{e.backtrace.join("\n\t")}"
  Kino.log.error error
  Kino.notify error, "Delivery error"
  job.update_job status: Kino::JOB_STATUS_FAILED, message: "(#{e.class}) #{e.message}" # rescue nil

I skipped some unrelated code (IMHO), but I hope this shows the usage; it's pretty basic, I think. What's interesting to me is that I'm catching an ArgumentError from deep inside the SDK call stack, which seems outside my control.
This now happens practically every time on long-running #copy_to calls (12+ hours) with file sizes around 50 GB and up. Smaller files (< 10 GB) seem fine.
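
For what it's worth, since only the big objects fail, I'm considering a post-copy sanity check in the delivery job, using only the methods already shown above (just a sketch, not in our code yet):

# Sketch only: compare the destination size against the source before
# marking the delivery as done, so a bad copy fails loudly.
if out_obj.content_length.to_i != filesize
  raise Kino::NoRetryError, "S3 copy size mismatch: expected #{filesize} bytes, got #{out_obj.content_length}"
end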

@trevorrowe
Member

Good news: I was able to track down the issue, and I believe I have a fix. I'll publish a bug-fix release with this patch.
