After an upgrade fails to download, subsequent upgrades will fail #693

Closed
joshdover opened this issue Jul 8, 2022 · 7 comments · Fixed by #752
Labels
bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team v8.4.0

Comments

@joshdover
Contributor

We have reports that after an upgrade fails to download, subsequent upgrade attempts can fail with an error about a 0-byte file when trying to validate the SHA and/or signature of the download.

More details needed. cc @WiegerElastic

@joshdover joshdover added bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team labels Jul 8, 2022
@WiegerElastic

WiegerElastic commented Jul 8, 2022

Here we go.

failed to dispatch actions, error: failed upgrade of agent binary: 2 errors occurred: * package '/Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz' not found: open /Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz: no such file or directory * fetching package failed: unexpected EOF 
2022-06-29T10:27:35+02:00 - message: Application: [5f63b25a-7322-4dd3-bd8c-db6f1b54a4ff]: State changed to FAILED: failed upgrade of agent binary: 2 errors occurred: * package '/Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz.sha512' not found: open /Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz.sha512: no such file or directory * fetching package failed: Get "https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.3.0-darwin-aarch64.tar.gz": context canceled - type: 'ERROR' - sub_type: 'FAILED'
failed to dispatch actions, error: failed upgrade of agent binary: 2 errors occurred: * package '/Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz.sha512' not found: open /Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz.sha512: no such file or directory * fetching package failed: Get "https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.3.0-darwin-aarch64.tar.gz": context canceled 

I think this will be one of your main signals (unexpected EOF):

2022-06-29T07:41:56+02:00 - message: Application: [5f63b25a-7322-4dd3-bd8c-db6f1b54a4ff]: State changed to FAILED: failed upgrade of agent binary: 2 errors occurred: * package '/Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz' not found: open /Library/Elastic/Agent/data/elastic-agent-f44953/downloads/elastic-agent-8.3.0-darwin-aarch64.tar.gz: no such file or directory * fetching package failed: unexpected EOF - type: 'ERROR' - sub_type: 'FAILED'

Directory content at the time:

sh-3.2# pwd
/Library/Elastic/Agent/data/elastic-agent-f44953/downloads
sh-3.2# ls -lhra
total 536856
-rw-r--r--   1 root  wheel   170B Jun 16 10:16 osquerybeat-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 osquerybeat-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel    66M Jun 16 10:16 osquerybeat-8.2.3-darwin-aarch64.tar.gz
-rw-r--r--   1 root  wheel   169B Jun 16 10:16 metricbeat-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 metricbeat-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel    60M Jun 16 10:16 metricbeat-8.2.3-darwin-aarch64.tar.gz
-rw-r--r--   1 root  wheel   168B Jun 16 10:16 heartbeat-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 heartbeat-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel    40M Jun 16 10:16 heartbeat-8.2.3-darwin-aarch64.tar.gz
-rw-r--r--   1 root  wheel   171B Jun 16 10:16 fleet-server-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 fleet-server-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel   7.0M Jun 16 10:16 fleet-server-8.2.3-darwin-aarch64.tar.gz
-rw-r--r--   1 root  wheel   167B Jun 16 10:16 filebeat-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 filebeat-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel    49M Jun 16 10:16 filebeat-8.2.3-darwin-aarch64.tar.gz
-rw-r--r--   1 root  wheel   176B Jun 16 10:16 endpoint-security-8.2.3-darwin-aarch64.tar.gz.sha512
-rw-r--r--   1 root  wheel   488B Jun 16 10:16 endpoint-security-8.2.3-darwin-aarch64.tar.gz.asc
-rw-r--r--   1 root  wheel    39M Jun 16 10:16 endpoint-security-8.2.3-darwin-aarch64.tar.gz
-rw-r-----   1 root  wheel     0B Jun 29 10:27 elastic-agent-8.3.0-darwin-aarch64.tar.gz
drwxr-xr-x   8 root  wheel   256B Jun 29 10:27 ..
drwxr-xr-x  21 root  wheel   672B Jun 29 10:27 .

After I removed the 0 byte file, upgrading was successful.
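
The manual workaround above (find the 0-byte file in the downloads directory and remove it) can be sketched in Go as a small check. This is only an illustration and not part of the agent: `findEmptyFiles` is a hypothetical helper, and the default path is simply the directory from this report. It only lists candidates; removing them is left to the operator.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findEmptyFiles (hypothetical helper) returns the paths of regular files
// directly inside dir whose size is 0 bytes.
func findEmptyFiles(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var empty []string
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			return nil, err
		}
		if info.Mode().IsRegular() && info.Size() == 0 {
			empty = append(empty, filepath.Join(dir, e.Name()))
		}
	}
	return empty, nil
}

func main() {
	// Assumption: default to the directory from this report;
	// pass a different one as the first argument.
	dir := "/Library/Elastic/Agent/data/elastic-agent-f44953/downloads"
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	empty, err := findEmptyFiles(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range empty {
		fmt.Println(p) // review the list, then remove the files by hand
	}
}
```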

I haven't been able to find the actual download for this file in the logs. I created an extract which you can find on my G-drive, here.

@AndersonQ
Member

AndersonQ commented Jul 8, 2022

When an artifact download fails, the agent may not manage to clean up the downloaded files. That leaves 0-byte files on disk, which leads the agent to assume the artifacts were downloaded successfully and move on to the sha512 check. The check then fails because the agent cannot find the checksum in the .sha512 file. Below are the logs for this issue.

09:23:55.710 elastic_agent [elastic_agent][debug] Dispatch 1 actions of types: *fleetapi.ActionUpgrade
09:23:55.710 elastic_agent [elastic_agent][debug] handlerUpgrade: action 'action_id: 2474320a-1949-4ead-afb7-a609ef52c82e, type: UPGRADE' received
09:23:55.710 elastic_agent [elastic_agent][info] 2022-06-28T09:23:55+02:00 - message: Application: [795aefcb-a77e-4bd3-9244-2f01acc50346]: State changed to UPDATING: Update to version '8.2.3' started - type: 'STATE' - sub_type: 'UPDATING'
09:23:55.712 elastic_agent [elastic_agent][debug] Request method: POST, path: /api/fleet/agents/795aefcb-a77e-4bd3-9244-2f01acc50346/acks, reqID: 01G6MK40E0Z772ST007KMP18Q0
09:23:56.857 elastic_agent [elastic_agent][debug] action with id '2474320a-1949-4ead-afb7-a609ef52c82e' was just acknowledged
09:23:56.858 elastic_agent [elastic_agent][error] 2022-06-28T09:23:56+02:00 - message: Application: [795aefcb-a77e-4bd3-9244-2f01acc50346]: State changed to FAILED: failed verification of agent binary: 2 errors occurred:
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
- type: 'ERROR' - sub_type: 'FAILED'
09:23:56.858 elastic_agent [elastic_agent][debug] Failed to dispatch action 'action_id: 2474320a-1949-4ead-afb7-a609ef52c82e, type: UPGRADE', error: failed verification of agent binary: 2 errors occurred:
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
09:23:56.858 elastic_agent [elastic_agent][error] failed to dispatch actions, error: failed verification of agent binary: 2 errors occurred:
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
   * checksum for "elastic-agent-8.2.3-darwin-aarch64.tar.gz" was not found in "/Library/Elastic/Agent/data/elastic-agent-b9a28a/downloads/elastic-agent-8.2.3-darwin-aarch64.tar.gz.sha512"
09:23:56.858 elastic_agent [elastic_agent][debug] 'operator-default-0cddf941' has status 'online'
09:23:56.858 elastic_agent [elastic_agent][debug] 'gateway-8b26c43c' has status 'error'

So far the issue seems to be in the cleanup procedure, which expects the file path to be returned; on failure, however, the download method does not return it.
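
One way to avoid depending on a returned path, as a minimal sketch and not the actual change made in #752, is to let the download function remove its own partial file on any error. The names `saveArtifact`, `failingReader`, and `demo` are hypothetical; the failing reader stands in for a connection that drops mid-download.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// failingReader simulates a download that delivers some bytes and then drops.
type failingReader struct{ sent bool }

func (r *failingReader) Read(p []byte) (int, error) {
	if !r.sent {
		r.sent = true
		return copy(p, "partial"), nil
	}
	return 0, io.ErrUnexpectedEOF
}

// saveArtifact (hypothetical) streams src into dst. On any failure it removes
// dst itself, so cleanup does not rely on the caller receiving a path back.
func saveArtifact(dst string, src io.Reader) (err error) {
	f, err := os.OpenFile(dst, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0o640)
	if err != nil {
		return fmt.Errorf("creating %s: %w", dst, err)
	}
	defer func() {
		if cerr := f.Close(); err == nil {
			err = cerr
		}
		if err != nil {
			os.Remove(dst) // best effort: never leave a partial or empty file
		}
	}()
	if _, err = io.Copy(f, src); err != nil {
		return fmt.Errorf("fetching package failed: %w", err)
	}
	return nil
}

// demo runs one failed download in a temporary directory and reports whether
// the target file was left behind, along with the download error.
func demo() (leftBehind bool, err error) {
	dir, err := os.MkdirTemp("", "downloads")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	dst := filepath.Join(dir, "elastic-agent-8.3.0-darwin-aarch64.tar.gz")
	dlErr := saveArtifact(dst, &failingReader{})
	_, statErr := os.Stat(dst)
	return statErr == nil, dlErr
}

func main() {
	left, err := demo()
	fmt.Println("download error:", err)
	fmt.Println("file left behind:", left)
}
```

With this shape, a later upgrade attempt finds no stale file and starts the download from scratch.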

@WiegerElastic

@AndersonQ
Member

@WiegerElastic, @joshdover there are 2 different errors here:

  • a failure during download that leaves both the .tar.gz and the .tar.gz.sha512 files present but empty (my comment);
  • the .tar.gz being empty and the .tar.gz.sha512 missing (@WiegerElastic's comment).
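
To make the distinction concrete, here is a small Go sketch that tells the two on-disk states apart. It is illustrative only: `fileState`, `diagnose`, and the result strings are hypothetical and do not come from the agent's code.

```go
package main

import "fmt"

// fileState describes what was found on disk for one file.
type fileState struct {
	exists bool
	size   int64
}

// Hypothetical labels for the states discussed in this thread.
const (
	emptyBoth       = "package and checksum both present but empty"
	emptyNoChecksum = "package empty, checksum file missing"
	emptyPackage    = "package empty, checksum file has content"
	missingPackage  = "package missing"
	looksDownloaded = "package has content"
)

// diagnose classifies the state of a package file and its .sha512 file.
func diagnose(pkg, sum fileState) string {
	switch {
	case !pkg.exists:
		return missingPackage
	case pkg.size > 0:
		return looksDownloaded
	case !sum.exists:
		return emptyNoChecksum // the state in @WiegerElastic's comment
	case sum.size == 0:
		return emptyBoth // the state in @AndersonQ's comment
	default:
		return emptyPackage
	}
}

func main() {
	fmt.Println(diagnose(fileState{true, 0}, fileState{true, 0}))
	fmt.Println(diagnose(fileState{true, 0}, fileState{false, 0}))
}
```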

@amolnater-qasource

amolnater-qasource commented Aug 3, 2022

Hi @joshdover
We have revalidated this issue on an 8.4 Kibana cloud environment and made the observations below.

Observations:

As per the attached PR #752:

If artifact retrieval fails, downloaded artifacts are removed.

Could you please confirm whether this is the expected behavior?

Please let us know if we are missing anything here.
Thanks

@gbanasiak
Contributor

@amolnater-qasource Was the subsequent upgrade blocked with a 0-byte file present in your 8.4.0 test? If yes, can we reopen this issue please?

@amolnater-qasource

Hi @gbanasiak
We have observed a different issue on the latest 8.4 Snapshot, where we are unable to trigger an agent upgrade from the Fleet UI.
Hence we have reported a separate issue at elastic/kibana#139006

Build details:
VERSION: 8.4.0 Snapshot
BUILD: 55248
COMMIT: 62ed40b0aacde57988ccce0dfec5f0bb9e8bccf4

Please let us know if we are missing anything.
Thanks
