[Osquery] Cypress - Add SHA512 integrity validation for cached agent#258842
tomsonpl merged 8 commits into elastic:main
…c Agent downloads
|
/ci |
|
Pinging @elastic/security-defend-workflows (Team:Defend Workflows) |
Two things I noticed:
computeFileHash will hang at runtime
await finished(stream.pipe(hash)) never resolves: Hash is a Transform stream, and because nothing reads from its readable side, finished() waits forever. The tests pass only because createHash is fully mocked. I verified locally on Node 22 that it hangs.
Quick fix:
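```ts
import { createHash } from 'crypto';
import * as fs from 'fs';

export const computeFileHash = async (filePath: string): Promise<string> => {
  const hash = createHash('sha512');
  const stream = fs.createReadStream(filePath);
  // Feed the file into the hash chunk by chunk instead of piping into it.
  for await (const chunk of stream) {
    hash.update(chunk);
  }
  return hash.digest('hex');
};
```

Cache breaks when the SHA URL is down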
If fetchExpectedHash fails, expectedHash stays undefined, so no sidecar file gets written. Next run sees the file but no sidecar, treats it as unavailable, and re-downloads. This loops forever while the hash endpoint is unreachable, effectively disabling the cache.
One option: compute the local hash after download and store it as the sidecar regardless, so at least disk-level corruption is caught on reuse even without remote validation.
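A minimal sketch of that option, assuming the sidecar holds only the hex digest. The name validateAndWriteSidecar is hypothetical and is not taken from the PR; the local computeFileHash repeats the quick fix above so the snippet stands alone.

```typescript
import { createHash } from 'crypto';
import * as fs from 'fs';

// Same streaming approach as the quick fix above.
const computeFileHash = async (filePath: string): Promise<string> => {
  const hash = createHash('sha512');
  for await (const chunk of fs.createReadStream(filePath)) {
    hash.update(chunk);
  }
  return hash.digest('hex');
};

// Hypothetical helper: validates against the remote hash when one is
// available, and writes the locally computed hash as the sidecar either way.
export const validateAndWriteSidecar = async (
  filePath: string,
  expectedHash: string | undefined
): Promise<string> => {
  const actualHash = await computeFileHash(filePath);

  // Remote validation only happens when the hash endpoint answered.
  if (expectedHash !== undefined && expectedHash.toLowerCase() !== actualHash) {
    throw new Error(`SHA512 mismatch for ${filePath}`);
  }

  // Assumed sidecar layout: "<file>.sha512" containing the hex digest.
  await fs.promises.writeFile(`${filePath}.sha512`, actualHash, 'utf8');
  return actualHash;
};
```

With this shape, a failed hash fetch (expectedHash undefined) still leaves a sidecar behind, so the next run can reuse the cached file and detect on-disk corruption.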
|
Thanks @szwarckonrad, my implementation wasn't complete without your suggestion. Big thank you! Fixed now :) |
|
Starting backport for target branches: 8.19, 9.2, 9.3 https://github.com/elastic/kibana/actions/runs/23451616679 |
💔 All backports failed |
|
Friendly reminder: Looks like this PR hasn’t been backported yet. |
Summary
Adds SHA512 integrity validation for cached Elastic Agent downloads used by Defend Workflows Cypress tests. Corrupt or truncated tarballs are now automatically detected and re-downloaded, preventing persistent CI failures.
Problem
The Defend Workflows Cypress tests (cy.task('createEndpointHost')) provision Vagrant VMs with Elastic Agent. The agent tarball is pre-downloaded during CI image build and cached at ~/.kibanaSecuritySolutionCliTools/agent_download_storage/.
Example
https://buildkite.com/elastic/kibana-on-merge/builds/91361#019d0abf-4e4e-43e2-a693-44b1023278b4/L2306-L2428
There was no integrity validation anywhere in this chain. If a download was truncated (network hiccup, CDN issue), the partial file was cached and reused for up to 2 days, causing every test run on that CI agent to fail.
The Elastic artifacts API already exposes sha_url alongside url in search responses, but it was completely ignored.
Changes
- SHA512 validation after download (agent_downloads_service.ts): fetch expected hash from artifacts API sha_url, compute SHA512 of downloaded file, compare. On mismatch, retry (up to 3 attempts). On sha_url fetch failure, proceed without validation (best-effort: don't block CI if hash infra is down).
- Cache integrity validation on reuse (agent_downloads_service.ts): cached files now require a .sha512 sidecar file. On cache hit, re-compute hash and compare with sidecar. If mismatch or missing sidecar → delete corrupt file and re-download. This is the key self-healing behavior.
- SHA URL propagation (fleet_services.ts): getAgentDownloadUrl() now returns shaUrl from the artifacts API response. All callers updated to pass it through: agent_downloader CLI (packer cache), create_and_enroll_endpoint_host_ci.ts (Cypress), enrollHostVmWithFleet().
- Vagrantfile defense-in-depth: added gzip -t check before tar -zxf extraction. Catches any corruption that slips through (e.g., SCP transfer issue). Fails with "Agent tarball integrity check failed" instead of cryptic tar errors.
- Sidecar cleanup: .sha512 files are cleaned up alongside their parent tarballs during the existing cache TTL cleanup.
How it works
Test plan
- agent_downloads_service.test.ts: 14 tests covering:
  - isAgentDownloadFromDiskAvailable with/without sidecar
  - fetchExpectedHash parsing and error handling
- agent_downloader.test.ts: 8 tests verify shaUrl passthrough in all scenarios
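For illustration, the kind of parsing that the fetchExpectedHash tests exercise might look like the sketch below. It assumes the sha_url response body uses the sha512sum layout (a 128-character hex digest, whitespace, then the file name); that format and the name parseSha512Body are assumptions, not details confirmed by this PR.

```typescript
// Hypothetical parser for the text returned by sha_url.
// Returns the lowercase hex digest, or undefined when the body does not
// start with a well-formed SHA512 digest (so callers can skip validation).
export const parseSha512Body = (body: string): string | undefined => {
  const candidate = body.trim().split(/\s+/)[0] ?? '';
  return /^[a-f0-9]{128}$/i.test(candidate) ? candidate.toLowerCase() : undefined;
};
```

Returning undefined instead of throwing matches the best-effort behavior described above, where a missing or unusable remote hash does not block the download.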