Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

504 Gateway timeout on large file #481

Closed
naveensrinivasan opened this issue Nov 4, 2021 · 18 comments · Fixed by #501 or sigstore/cosign#1081
Closed

504 Gateway timeout on large file #481

naveensrinivasan opened this issue Nov 4, 2021 · 18 comments · Fixed by #501 or sigstore/cosign#1081
Assignees
Labels
bug Something isn't working

Comments

@naveensrinivasan
Copy link
Contributor

Description

We are trying to use rekor to store our GH releases for Scorecard ossf/scorecard#1201.

I ran into an issue with cosign that didn't support timeout on sign-blob sigstore/cosign#990 and am in the process of fixing that issue I ran into another issue with rekor.

HTTP/2.0 504 Gateway Timeout
Content-Length: 160
Content-Type: text/html
Date: Thu, 04 Nov 2021 17:59:15 GMT
Strict-Transport-Security: max-age=15724800; includeSubDomains

<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

Error: signing ./cosign: &{0 } (*models.Error) is not supported by the TextConsumer, can be resolved by supporting TextUnmarshaler interface
main.go:48: error during command execution: signing ./cosign: &{0 } (*models.Error) is not supported by the TextConsumer, can be resolved by supporting TextUnmarshaler interface

The size of the binary that I am trying to upload is 81M.

@naveensrinivasan naveensrinivasan added the bug Something isn't working label Nov 4, 2021
@naveensrinivasan
Copy link
Contributor Author

naveensrinivasan commented Nov 4, 2021

Related to an existing issue #264

@bobcallaway
Copy link
Member

I’m not convinced this is due to #264 since the request hit the server and timed out during processing. I think we should try to fix sigstore/cosign#990 and see if that fixes this.

@naveensrinivasan
Copy link
Contributor Author

naveensrinivasan commented Nov 4, 2021

I’m not convinced this is due to #264 since the request hit the server and timed out during processing. I think we should try to fix sigstore/cosign#990 and see if that fixes this.

I have the fix for the cosign timeout and while running with the fix is where I ran into this issue.

To confirm your doubts I used the rekor-cli to check if it works and it is also running into a similar issue.

➜  temp openssl ecparam -genkey -name prime256v1 > ec_private.pem
➜  temp openssl ec -in ec_private.pem -pubout > ec_public.pem
read EC key
writing EC key
➜  temp openssl dgst -sha256 -sign ec_private.pem -out cosign.sig cosign
➜  temp ./rekor-cli upload --artifact cosign  --signature cosign.sig --pki-format=x509 --public-key=ec_public.pem
Post "https://rekor.sigstore.dev/api/v1/log/entries": context deadline exceeded
➜  temp ./rekor-cli upload --timeout 90s  --artifact cosign  --signature cosign.sig --pki-format=x509 --public-key=ec_public.pem
&{0 } (*models.Error) is not supported by the TextConsumer, can be resolved by supporting TextUnmarshaler interface

And here is tail end of the log with rekor-cli for the same above command with DEBUG=1

gS0VZLS0tLS0KTUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFekdrNjErcExJUlFRZVYzR2xXLytiajVGVURLdwpzUEJvdUtROWIwbXU5M0I2MmxRNEFmU2R1S00vVlVBZEJHZXJ5dDdFV2FUeElxNW1ZL3p6ekxZdkl3PT0KLS0tLS1FTkQgUFVCTElDIEtFWS0tLS0tCg=="}}},"kind":"rekord"}

HTTP/2.0 504 Gateway Timeout
Content-Length: 160
Content-Type: text/html
Date: Thu, 04 Nov 2021 19:20:17 GMT
Strict-Transport-Security: max-age=15724800; includeSubDomains

<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

&{0 } (*models.Error) is not supported by the TextConsumer, can be resolved by supporting TextUnmarshaler interface

DEBUG=1 ./rekor-cli upload --timeout 90s --artifact cosign --signature cosign.sig --pki-format=x509 --public-key=ec_public.pem
Thoughts?

@naveensrinivasan
Copy link
Contributor Author

naveensrinivasan commented Nov 9, 2021

I was able to run the rekor server locally

Here is the server log after it is uploaded

rekor-rekor-server-1         | 2021-11-09T02:05:37.443Z	INFOrestapi/configure_rekor_server.go:144	[56593d4bf6fe/6kNG8UdoVI-000099] "POST http://172.19.0.4:3000/api/v1/log/entries HTTP/1.1" from 172.19.0.1:43450 - 409 153B in 1m44.88365215s

Client log

./rekor-cli upload --timeout 200s  --artifact cosign  --signature cosign.sig --pki-format=x509 --public-key=ec_public.pem --rekor_server http://172.19.0.4:3000
Post "http://172.19.0.4:3000/api/v1/log/entries": EOF

I was able to confirm that it was successful on the server but it didn't respond to the client when it was successful.

Based on what I saw from the log I decide to search for the binary in rekor

./rekor-cli search --artifact cosign --rekor_server http://172.19.0.4:3000
Found matching entries (listed by UUID):
be99689f02ff2cb56b4becba926d2bd3c795ca6779ebb32e2e13aeec1d36847c

With the ☝️ successful match. I decided to search the https://rekor.sigstore.dev and it was able to locate it which confirmed the upload was successful but the message to the client is the issue.

./rekor-cli search --artifact cosign
Found matching entries (listed by UUID):
be99689f02ff2cb56b4becba926d2bd3c795ca6779ebb32e2e13aeec1d36847c

@bobcallaway
Copy link
Member

I was able to run the rekor server locally

Here is the server log after it is uploaded

rekor-rekor-server-1         | 2021-11-09T02:05:37.443Z	INFOrestapi/configure_rekor_server.go:144	[56593d4bf6fe/6kNG8UdoVI-000099] "POST http://172.19.0.4:3000/api/v1/log/entries HTTP/1.1" from 172.19.0.1:43450 - 409 153B in 1m44.88365215s

The 409 error code returned here after 104 seconds feels like the upload must have completed in order for us to have confirmed that there was an entry already in the log for this.

I will try to recreate this locally and see if I can run it down.

@asraa
Copy link
Contributor

asraa commented Nov 10, 2021

Another alternative is to replace the rekorEntry used to upload in cosign with the hash of the artifact rather than the whole payload.

https://github.com/sigstore/cosign/blob/d44259202830948517db361efd85fca6e87776ff/pkg/cosign/tlog.go#L122

@naveensrinivasan
Copy link
Contributor Author

Another alternative is to replace the rekorEntry used to upload in cosign with the hash of the artifact rather than the whole payload.

https://github.com/sigstore/cosign/blob/d44259202830948517db361efd85fca6e87776ff/pkg/cosign/tlog.go#L122

Cosign doesn't provide an option to upload artifact hash https://github.com/sigstore/cosign/blob/d44259202830948517db361efd85fca6e87776ff/cmd/cosign/cli/sign/sign_blob.go#L55

@asraa
Copy link
Contributor

asraa commented Nov 10, 2021

Cosign doesn't provide an option to upload artifact hash https://github.com/sigstore/cosign/blob/d44259202830948517db361efd85fca6e87776ff/cmd/cosign/cli/sign/sign_blob.go#L55

Oops, I think you're misunderstanding my comment. I'm not suggesting changing cosign to sign an artifact hash, I'm suggesting changing the code in cosign linked in my comment so that it uploads to rekor with the hash of the entry, rather than the whole payload.

@bobcallaway is this possible? I'm testing that change but I think sending the rekor entry with just a hash causes problems with SHA mismatch because maybe that SHA is just for consistency

I guess the reason is probably because it needs to check signature validation and so it needs the whole payload to keep in line with all the different signer/verifier impls.

if oldSHA != "" && computedSHA != oldSHA {

@naveensrinivasan
Copy link
Contributor Author

Cosign doesn't provide an option to upload artifact hash https://github.com/sigstore/cosign/blob/d44259202830948517db361efd85fca6e87776ff/cmd/cosign/cli/sign/sign_blob.go#L55

Oops, I think you're misunderstanding my comment. I'm not suggesting changing cosign to sign an artifact hash, I'm suggesting changing the code in cosign linked in my comment so that it uploads to rekor with the hash of the entry, rather than the whole payload.

@bobcallaway is this possible? I'm testing that change but I think sending the rekor entry with just a hash causes problems with SHA mismatch because maybe that SHA is just for consistency

if oldSHA != "" && computedSHA != oldSHA {

Cool, makes sense! Thanks

@asraa
Copy link
Contributor

asraa commented Nov 10, 2021

I don't think that avenue is possible :/
Technically you're signing the hash of the artifact so I assumed it was possible (and rekor only stores the hash). But I think to play nice with all the different verifier methods with different hash requirements for message signing, rekor still wants the whole payload. Unfortunate...

@naveensrinivasan
Copy link
Contributor Author

Ran some pprof heap with the large file

Showing nodes accounting for 267.13MB, 97.25% of 274.69MB total
Dropped 77 nodes (cum <= 1.37MB)
Showing top 10 nodes out of 54
      flat  flat%   sum%        cum   cum%
  106.85MB 38.90% 38.90%   106.85MB 38.90%  encoding/json.unquote (inline)
   80.14MB 29.17% 68.07%    80.14MB 29.17%  encoding/base64.(*Encoding).DecodeString
   80.14MB 29.17% 97.25%    80.14MB 29.17%  reflect.unsafe_NewArray
         0     0% 97.25%   106.85MB 38.90%  encoding/json.(*Decoder).Decode
         0     0% 97.25%   106.85MB 38.90%  encoding/json.(*decodeState).literalInterface
         0     0% 97.25%   107.35MB 39.08%  encoding/json.(*decodeState).object
         0     0% 97.25%   106.85MB 38.90%  encoding/json.(*decodeState).objectInterface
         0     0% 97.25%   107.35MB 39.08%  encoding/json.(*decodeState).unmarshal
         0     0% 97.25%   107.35MB 39.08%  encoding/json.(*decodeState).value
         0     0% 97.25%   106.85MB 38.90%  encoding/json.(*decodeState).valueInterface

It is not surprising it is `encoding JSON which is causing most of the delay.

@naveensrinivasan
Copy link
Contributor Author

naveensrinivasan commented Nov 11, 2021

I did pprof CPU dump on createLogEntry for the start and stop of this function call.

The test was uploading a binary(cosign Linux binary) 81 MB file which is uploading cosign binary

func createLogEntry(params entries.CreateLogEntryParams) (models.LogEntry, middleware.Responder) {

profile001

Here is the pprof CPU I captured and the code with the simple option to store the profile to the disk main...naveensrinivasan:naveen/prof
and if you don't want to clone the branch changes I made to generate the cpu.pprof you can download it from here.
cpu.pprof.zip(Note of caution please don't download random zip files from the internet)

@naveensrinivasan
Copy link
Contributor Author

I was thinking if we could replace the json encoding with https://github.com/json-iterator/go as it is well maintained and keeps the standard API. I also saw this recommendation go-swagger/go-swagger#1868 (comment)

I tried the code change but it didn't make any difference
main...naveensrinivasan:naveen/feat/json-iterator

@naveensrinivasan
Copy link
Contributor Author

naveensrinivasan commented Nov 16, 2021

https://github.com/naveensrinivasan/grpc_upload_test

image

The 81 MB uploads take with the gRPC tests.

2021/11/16 12:36:46  Unary Avg: 1.039287592s
2021/11/16 12:36:46 Stream Avg: 761.814253ms

The uploads with REST take about 180seconds.

@asraa
Copy link
Contributor

asraa commented Nov 19, 2021

@naveensrinivasan just FYI this is how you can upload this hashed type, e.g. on a signature of a file README.md

./rekor-cli upload --type hashedrekord:0.0.1 --artifact-hash $(sha256sum  README.md | awk '{print $1}') --signature README.md.sig --pki-format=x509 --public-key=ec_public.pem

@naveensrinivasan
Copy link
Contributor Author

@naveensrinivasan just FYI this is how you can upload this hashed type, e.g. on a signature of a file README.md

./rekor-cli upload --type hashedrekord:0.0.1 --artifact-hash $(sha256sum  README.md | awk '{print $1}') --signature README.md.sig --pki-format=x509 --public-key=ec_public.pem

Thank you @asraa! AFAIK it is still not supported in cosign. Or if I am wrong how can I use with cosign for keyless signing?

@asraa
Copy link
Contributor

asraa commented Nov 19, 2021

Thank you @asraa! AFAIK it is still not supported in cosign. Or if I am wrong how can I use with cosign for keyless signing?

Hah thank you for reminding me that this was for use through cosign CLI. I'll open a PR there to fix that!

@naveensrinivasan
Copy link
Contributor Author

Thank you @asraa! AFAIK it is still not supported in cosign. Or if I am wrong how can I use with cosign for keyless signing?

Hah thank you for reminding me that this was for use through cosign CLI. I'll open a PR there to fix that!

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants