Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UII] Request diagnostics action has incorrect expiration #183692

Closed
jen-huang opened this issue May 17, 2024 · 1 comment · Fixed by #183788
Closed

[UII] Request diagnostics action has incorrect expiration #183692

jen-huang opened this issue May 17, 2024 · 1 comment · Fixed by #183788
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@jen-huang
Copy link
Contributor

jen-huang commented May 17, 2024

I stumbled on REQUEST_DIAGNOSTICS_TIMEOUT_MS which seems it was intended to be 3 hours, but actually the calculation is missing another * 60, so the diagnostics file action expiration is only 3 minutes at the moment:

const REQUEST_DIAGNOSTICS_TIMEOUT_MS = 3 * 60 * 1000; // 3 hours;

Fix the calculation and double check that we are setting an expiration everywhere we create REQUEST_DIAGNOSTICS actions. This file seems to create some but does not pass in an expiration so it defaults to 1 month:

@jen-huang jen-huang added bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team labels May 17, 2024
@jen-huang jen-huang self-assigned this May 17, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

jen-huang added a commit that referenced this issue May 20, 2024
## Summary

Resolves #183692.

This PR:
- Corrects the expiration so that it is 3 hours
- Makes sure it is applied when bulk requesting diagnostics too
- Adjusts API integration tests to ensure the correct expiration is
applied
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue May 20, 2024
…3788)

## Summary

Resolves elastic#183692.

This PR:
- Corrects the expiration so that it is 3 hours
- Makes sure it is applied when bulk requesting diagnostics too
- Adjusts API integration tests to ensure the correct expiration is
applied

(cherry picked from commit e814a77)
jen-huang added a commit that referenced this issue May 20, 2024
## Summary

Resolves #167366.

This PR introduces the ability to manually delete diagnostics files
before their ILM policy kicks in. This PR:

- Adds a new `DELETE /api/fleet/agents/files/{fileId}` route which
returns `{id: string, deleted: boolean}`
- Deletes the file from `.fleet-fileds-fromhost-data-agent` data stream
when a request is received, and updates the corresponding meta
information in `.fleet-fileds-fromhost-meta-agent` to set it to
`DELETED` status
- This is similar to how the meta information [reconciliation via task
manager](https://github.com/elastic/kibana/blob/9e5d6b3e0fa731a37374c091cd7284d1c41f116e/x-pack/plugins/fleet/server/services/files/index.ts#L148)
is done when the ILM policy deletes diagnostics file
 - Updates the Agent details > Diagnostics tab:
- Surface Delete action in files table for requests that resulted in
files
   - Refactors existing UI around the copy text and generate button
   - Add `Show expired requests` toggle
     - Off by default, which means the following will be shown:
       - Files that are on disk
       - Files being generated
- File requests which errored out but are still not expired (i.e. users
can see errors with recent requests)
     - When toggled on, will additionally show:
       - File requests which errored out AND are expired
- File requests that are just expired (i.e. edge case where a file was
deleted by ILM but the meta info had not yet reconciled)

FYI, the expiration threshold is currently only 3 minutes. This is a
bug, see: #183692

The main reason for adding this toggle is to keep the initial list view
clean. The items in this list are built from all `REQUEST_DIAGNOSTICS`
agent actions that the user submits, which can be on a single agent or
bulk agents.

When a file is deleted manually with this new work, or by the existing
ILM policy, we can correctly flag the associated action as having
`DELETED` files and hide it from view. But when a request errors out or
otherwise results in no files being generated, we still want to keep the
history of the request (we have no precedent of deleting agent
activity). Over time, this history is no longer useful for the user and
just pollutes the table, so it is better to hide these items from the
initial view.

<img width="1314" alt="image"
src="https://github.com/elastic/kibana/assets/1965714/36a2b405-5e58-4444-a114-ab06b842a505">

<img width="1330" alt="image"
src="https://github.com/elastic/kibana/assets/1965714/bd8c40b1-46d9-47e3-880f-5c831b626398">

### Testing
Use the single and bulk request diagnostics feature and test the delete
functionality. Go nuts :)

### Checklist

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
kibanamachine referenced this issue May 20, 2024
…3788) (#183856)

# Backport

This will backport the following commits from `main` to `8.14`:
- [[UII] Fix request diagnostics action expiration threshold
(#183788)](#183788)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Jen
Huang","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-05-20T18:01:23Z","message":"[UII]
Fix request diagnostics action expiration threshold (#183788)\n\n##
Summary\r\n\r\nResolves
https://github.com/elastic/kibana/issues/183692.\r\n\r\nThis PR:\r\n-
Corrects the expiration so that it is 3 hours\r\n- Makes sure it is
applied when bulk requesting diagnostics too\r\n- Adjusts API
integration tests to ensure the correct expiration
is\r\napplied","sha":"e814a77e97be091b34feadd846d69ef38a583665","branchLabelMapping":{"^v8.15.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix","Team:Fleet","v8.14.0","v8.15.0"],"title":"[UII]
Fix request diagnostics action expiration
threshold","number":183788,"url":"https://github.com/elastic/kibana/pull/183788","mergeCommit":{"message":"[UII]
Fix request diagnostics action expiration threshold (#183788)\n\n##
Summary\r\n\r\nResolves
https://github.com/elastic/kibana/issues/183692.\r\n\r\nThis PR:\r\n-
Corrects the expiration so that it is 3 hours\r\n- Makes sure it is
applied when bulk requesting diagnostics too\r\n- Adjusts API
integration tests to ensure the correct expiration
is\r\napplied","sha":"e814a77e97be091b34feadd846d69ef38a583665"}},"sourceBranch":"main","suggestedTargetBranches":["8.14"],"targetPullRequestStates":[{"branch":"8.14","label":"v8.14.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.15.0","branchLabelMappingKey":"^v8.15.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/183788","number":183788,"mergeCommit":{"message":"[UII]
Fix request diagnostics action expiration threshold (#183788)\n\n##
Summary\r\n\r\nResolves
https://github.com/elastic/kibana/issues/183692.\r\n\r\nThis PR:\r\n-
Corrects the expiration so that it is 3 hours\r\n- Makes sure it is
applied when bulk requesting diagnostics too\r\n- Adjusts API
integration tests to ensure the correct expiration
is\r\napplied","sha":"e814a77e97be091b34feadd846d69ef38a583665"}}]}]
BACKPORT-->

Co-authored-by: Jen Huang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants