This repository was archived by the owner on Sep 17, 2024. It is now read-only.
Merged
46 changes: 46 additions & 0 deletions e2e/_suites/ingest-manager/features/ingest-manager.feature
@@ -0,0 +1,46 @@
@ingest
Feature: Enable Fleet user and create initial Kibana setup

Scenario: Enrolling an agent
Given there is a "Fleet" user in Kibana
And the "Fleet" Kibana setup has been created
When the agent binary is installed in the target host
Then the dashboards for the agent are present in Elasticsearch

Contributor Author

I'd like to know the exact data needed here: the ES query

Contributor

the command to run the agent is:
./elastic-agent run

after this command is executed, we can wait a matter of seconds (5-20 seconds?) and then verify the existence of certain folders / data on the host as evidence of it working.
The logs we can check for are relative to the path where the agent was installed, so it would be, for example with a 7.8 agent:
elastic-agent-7.8.0-darwin-x86_64-BC5/data/logs/default/filebeat
elastic-agent-7.8.0-darwin-x86_64-BC5/data/logs/default/metricbeat

and from here:
elastic-agent-7.8.0-darwin-x86_64-BC5/data/run/default/metricbeat--7.8.0/meta.json

  • any non-empty file will suffice for all 3 assertions
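
A minimal Python sketch of the three file checks described above. The helper names and the idea of passing the install directory as a parameter are assumptions for illustration; the relative paths are the ones quoted for a 7.8.0 agent, and each is accepted as either a non-empty file or a directory that contains one.

```python
import os

# Relative paths quoted in the comment above for a 7.8.0 agent.
EXPECTED_PATHS = [
    "data/logs/default/filebeat",
    "data/logs/default/metricbeat",
    "data/run/default/metricbeat--7.8.0/meta.json",
]


def has_non_empty_file(path):
    """Return True if path is a non-empty file, or a directory holding one."""
    if os.path.isfile(path):
        return os.path.getsize(path) > 0
    if os.path.isdir(path):
        for root, _dirs, files in os.walk(path):
            for name in files:
                if os.path.getsize(os.path.join(root, name)) > 0:
                    return True
    return False


def missing_agent_evidence(install_dir):
    """Return the expected paths that do not yet contain any non-empty file."""
    return [
        rel for rel in EXPECTED_PATHS
        if not has_non_empty_file(os.path.join(install_dir, rel))
    ]
```

A step implementation could poll `missing_agent_evidence` until it returns an empty list or the 5-20 second window mentioned above runs out.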

Contributor

And for the Dashboards, let's use the API from Kibana, specifically the Ingest Manager one, to assess this:
/api/ingest_manager/data_streams

  • if you call it prior to any Agent being deployed it should return a list of zero data streams as:
    {
    "data_streams": []
    }

when called after the Agent is running, it will return a list of (currently in 7.8) 20 streams, with a format as:
{
  "data_streams": [
    {},
    {
      "index": "metrics-system.load-default",
      "dataset": "system.load",
      "namespace": "default",
      "type": "metrics",
      "package": "system",
      "package_version": "0.1.0",
      "last_activity": "2020-06-04T18:59:29.693Z",
      "size_in_bytes": 42605308,
      "dashboards": [
        {
          "id": "79ffd6e0-faa0-11e6-947f-177f697178b8-ecs",
          "title": "[Metrics System] Host overview ECS"
        },
        ...
        {
          "id": "5517a150-f9ce-11e6-8115-a7c18106d86a-ecs",
          "title": "[Logs System] SSH login attempts ECS"
        },
        {
          "id": "Filebeat-syslog-dashboard-ecs",
          "title": "[Logs System] Syslog dashboard ECS"
        }
      ]
    },
    ...
    {},
    {}
  ]
}

Let's assert the following:

  • the data_streams call returns more than 1 element in its list.
  • the data_streams call returns a list element with an "index" of "metrics-system.process-default"
  • the list element with "index": "metrics-system.process-default" has a sibling key 'dashboards' whose value is a list
  • the 'dashboards' list has an element with a title of "[Metrics System] Host overview ECS"

I don't think we should walk the whole list here. I understand there is separate automation to confirm this, and walking it would make the test brittle to changes. How does that sound?
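
The four assertions above could be sketched in Python roughly as follows. The function name and the "return a list of failures" convention are assumptions; the input is the already-parsed JSON body of the `/api/ingest_manager/data_streams` response, so no HTTP client is shown.

```python
def check_data_streams(response,
                       index="metrics-system.process-default",
                       dashboard_title="[Metrics System] Host overview ECS"):
    """Return a list of failed checks; an empty list means all checks passed."""
    failures = []
    streams = response.get("data_streams", [])

    # 1. More than one element in the list.
    if len(streams) <= 1:
        failures.append("expected more than one data stream")

    # 2. One element carries the expected index.
    match = next((s for s in streams if s.get("index") == index), None)
    if match is None:
        failures.append("no data stream with index %r" % index)
        return failures

    # 3. That element has a 'dashboards' list.
    dashboards = match.get("dashboards")
    if not isinstance(dashboards, list):
        failures.append("data stream has no 'dashboards' list")
        return failures

    # 4. One dashboard has the expected title; the rest are not inspected.
    if not any(d.get("title") == dashboard_title for d in dashboards):
        failures.append("no dashboard titled %r" % dashboard_title)
    return failures
```

Only one stream and one dashboard are looked up by name, which keeps the check tolerant of changes elsewhere in the list.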

And the agent shows up in Kibana

@mdelapenya mdelapenya Jun 3, 2020

Contributor Author

Is it possible to get this without checking the UI, maybe an API call? I'd like to avoid any UI/DOM interaction if possible

Contributor

Yes, it is. I was using very 'loose' language: 'shows up' and 'in Kibana' can be interpreted in API terms as:
Request URL, GET: /api/ingest_manager/fleet/agents?page=1&perPage=20&showInactive=false
With the presumption that there were zero agents when we started, there should be one item in the list[] that is returned. Response snippet we can use to assert:
{
  "list": [
    {
      "id": "0a17686e-40c5-4a81-86ae-fb41ddd7ea96",
      "active": true,
      "config_id": "f1a077d0-a688-11ea-b905-bd56f880a400",
      "type": "PERMANENT",
      "enrolled_at": "2020-06-04T18:10:49.376Z",
      "user_provided_metadata": {},
      "local_metadata": {},
      "access_api_key_id": "m7SHgHIBm78rI0UKTW-D",
      "current_error_events": [],
      "last_checkin": "2020-06-04T18:34:30.949Z",
      "config_revision": 3,
      "status": "online"
    }
  ],
  "success": true,
  "total": 1,
  "page": 1,
  "perPage": 20
}

I suggest we check only that the ID exists and that the current_error_events[] list is empty.
Asserting status: 'online' would be good, but be aware that the status is likely to be 'error' after the agent is enrolled but before it is 'run'.
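
As a rough Python sketch of that suggestion (the function name is an assumption, and the input is the parsed body of the `fleet/agents` response): it checks for exactly one agent, a non-empty `id`, and an empty `current_error_events` list, and deliberately does not assert on `status`.

```python
def check_enrolled_agent(response):
    """Return failed checks for a fleet/agents response; empty means OK."""
    agents = response.get("list", [])
    # Presumes there were zero agents before the test enrolled one.
    if len(agents) != 1:
        return ["expected exactly one agent, found %d" % len(agents)]

    agent = agents[0]
    failures = []
    if not agent.get("id"):
        failures.append("agent has no id")
    if agent.get("current_error_events") != []:
        failures.append("current_error_events is not an empty list")
    # 'status' is not checked: it may be 'error' between enroll and run.
    return failures
```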

Member

You can call GET /api/ingest_manager/fleet/agents


Scenario: Un-enrolling an agent
Given there is a "Fleet" user in Kibana
And the "Fleet" Kibana setup has been created
And the agent binary is installed in the target host
When the agent is un-enrolled from Kibana

Contributor

I forgot to mention that we'll have to manually terminate the shell / process running on the host as part of the 'tear down' of this scenario, in order to test the re-enrolling and re-starting of the Agent.

Then no new data shows up in Elasticsearch locations using the enrollment token

Contributor Author

I added 'using the enrollment token' to match an existing step below. Is this assumption correct?

Contributor

I'm not sure I would phrase it as 'using' the enrollment token, but it's not entirely wrong. I'd phrase it as: the host / agent is no longer able to send documents into ES (it will still be attempting to send them, running on the host).

Member

Here I think you should say 'using the access token': when an agent enrolls into Fleet, we exchange an enrollment token for an access token (one per agent).
Once you invalidate an enrollment token, the agents already enrolled should continue to work, but you cannot enroll more agents with that enrollment token.

Contributor Author

Thanks for the clarification Nicholas! Please look at L27:33. There is a specific scenario for revoking the enrollment token for an agent. Is that what you mean?

Contributor Author

Mmm, reading your comment, I'd rephrase this second scenario (the one revoking the token) to this:

Scenario: Revoking the enrollment token for an agent
  Given there is a "Fleet" user in Kibana
    And the "Fleet" Kibana setup has been created
    And the agent binary is installed in the target host
    And the agent is un-enrolled from Kibana
  When the enrollment token is revoked
  Then no new data shows up in Elasticsearch locations using the enrollment token
    And the enrolled agent continues to work

And I'd create another use case:

Scenario: A revoked enrollment token cannot enroll more agents
  Given there is an enrollment token
  When the enrollment token is revoked
  Then it's not possible to use the token to enroll more agents

Does it make sense to you?

Contributor Author

BTW, we should clarify what 'the enrolled agent continues to work' means: e.g. it sends data to Elasticsearch, there is an endpoint we can query, a process is running in the host, etc.

Contributor Author

Combining the above two scenarios into one:

Scenario: Revoking the enrollment token for an agent
  Given there is an agent enrolled with an enrollment token
  When the enrollment token is revoked
  Then it's not possible to use the token to enroll more agents
    And the enrolled agent continues to work

Contributor

Thanks so much Nicolas and Manu, I'm learning here too! Knowing now what I do, I'd suggest we really only have one distinct case to test, and I'd phrase it as:

Scenario: Revoking an enrollment token 
  Given the Fleet user is set up and a valid enrollment token exists
  When the enrollment token is revoked
  Then an attempt to enroll a new agent fails

the pre-requisite for the test changes such that the agent is NOT running and is NOT already enrolled.
@mdelapenya what do you think? Honestly, if you can get us the first more straight-forward case I'm happy to work this with the code snippets we have and infrastructure you provide. We need not stress about completing this one case now, the team is fine to take it over.

Contributor Author

I like this scenario, because it's very straight-forward and simple at the same time. I'd replace what we had. wdyt about rephrasing the Given... to Given an agent is enrolled? Or do we want to make it clear for this scenario that we need the fleet user and the existence of a valid token?

Contributor Author

BTW, in what state would the existing agent be? Will it pause? Will it continue to send data?

Contributor Author

What data is not present here? It'd be great to understand more about its nature, to identify when it shows up and when it does not.

Contributor

updated:
a query you can use is as follows:
query the metrics* index and hit the equivalent of KQL:
host.name:"7exl-w10x64l6-d" and @timestamp >= "2020-06-06T01:30:00.948Z"
where the hostname is replaced correctly and the timestamp in question is captured 2 seconds after the unenroll call.

translated into an ES query (forgive me if this is terrible, it's a hacked version from dev tools and I didn't take the time to re-work it much):

  • the same find/replace of the hostname and timestamp values is needed, of course:

GET _search
{
  "version": true,
  "size": 500,
  "docvalue_fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "system.process.cpu.start_time",
      "format": "date_time"
    },
    {
      "field": "system.service.state_since",
      "format": "date_time"
    }
  ],
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "match_phrase": {
                        "host.name": "7exl-w10x64l6-d"
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              },
              {
                "bool": {
                  "should": [
                    {
                      "range": {
                        "@timestamp": {
                          "gte": "2020-06-06T01:50:00.948Z",
                          "time_zone": "America/New_York"
                        }
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ]
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "2020-06-06T01:36:29.564Z",
              "format": "strict_date_optional_time"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}
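
The Dev Tools export above carries some redundancy (nested `bool` wrappers and two overlapping timestamp ranges). A reduced sketch of the same filter, built in Python so the hostname and timestamp can be substituted programmatically, might look like this. The function name and the decision to drop `docvalue_fields` and `version` are assumptions, and the body has not been checked against a live cluster.

```python
def build_no_new_data_query(hostname, since, size=500):
    """Build a search body for documents from hostname at or after since.

    hostname: value to match against host.name
    since:    ISO-8601 timestamp captured shortly after the unenroll call
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter context: both clauses must match, no scoring needed.
                "filter": [
                    {"match_phrase": {"host.name": hostname}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
    }
```

The resulting body would be sent to a search on the `metrics*` index, and the step would pass when the response contains no hits.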

Contributor Author

This query is perfect! :)


Scenario: Enrolling, un-enrolling and re-enrolling an agent
Given there is a "Fleet" user in Kibana
And the "Fleet" Kibana setup has been created
And the agent binary is installed in the target host

Contributor

for this we can query: /api/ingest_manager/fleet/agents
and assume it only has one Agent and so we can capture the 'id' field in list position 0, as:
{ "list": [ { "id": "4a8ba41b-d62b-44dc-903b-1a0593b0d57c",...
and save off the ID to use in next call, which is:

Request URL, POST: /api/ingest_manager/fleet/agents/{id}/unenroll

  • with no payload body; it should return a 200

then we can query the list again and find it empty, with:
Request URL, GET: /api/ingest_manager/fleet/agents?page=1&perPage=20&showInactive=false

response:
{"list":[],"success":true,"total":0,"page":1,"perPage":20}

at which point we can query the inverse state and find the agent in the inactive list with:
/api/ingest_manager/fleet/agents?page=1&perPage=20&showInactive=true

and get a response with just one element in its list, like:

{
"list": [
{
"id": "4a8ba41b-d62b-44dc-903b-1a0593b0d57c",
"active": false,
...
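
The un-enroll sequence above could be sketched in Python as follows. `request(method, path)` is a hypothetical helper, not an existing library call: it is assumed to perform the HTTP call against Kibana and return `(status_code, parsed_json_body)`. Placing the saved ID between `agents/` and `/unenroll` is inferred from the surrounding text.

```python
def unenroll_only_agent(request):
    """Un-enroll the single enrolled agent and verify it becomes inactive.

    request(method, path) is a hypothetical helper returning
    (status_code, parsed_json_body).
    """
    base = "/api/ingest_manager/fleet/agents"
    paging = "?page=1&perPage=20&showInactive="

    # Capture the id in list position 0 (assumes exactly one agent).
    _, body = request("GET", base)
    agent_id = body["list"][0]["id"]

    # Un-enroll with no payload body; a 200 is expected.
    status, _ = request("POST", "%s/%s/unenroll" % (base, agent_id))
    assert status == 200, "unenroll returned %s" % status

    # The active list should now be empty.
    _, active = request("GET", base + paging + "false")
    assert active["list"] == [], "agent is still listed as active"

    # The inverse query should list the same agent as inactive.
    _, inactive = request("GET", base + paging + "true")
    ids = [a["id"] for a in inactive["list"] if a.get("active") is False]
    assert ids == [agent_id], "agent not found in the inactive list"
    return agent_id
```

Passing the HTTP helper in as a parameter keeps the flow testable without a running Kibana.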

And the agent is un-enrolled from Kibana
When the agent is re-enrolled from the host
And the agent runs from the host
Then the agent shows up in Kibana

Contributor Author

We will need the exact thing to check here: an API call, an XPATH element in the UI...

Contributor

we can absolutely get you the API calls and expectations. I don't know all of them off hand and am still digging thru 7.8 testing finding odd bugs, but I will work with the team tomorrow to fill in all of these with haste. we don't have the api documented yet either, so we'll get specifics for this and all similar requests in the branch

Contributor

The re-enroll call is exactly the same as it was prior, and the asserts are the same, except that we can also check the timestamps on the metricbeat and filebeat files to see that they are newer. Newer than exactly what, I'm not 100% sure (there is some period where the Agent is in a state of transition). We could put in a short pause, wait for it to finish unenrolling, then capture that time and use it in the next step?
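
One way to express the "files are newer" check, as a Python sketch. The function name is an assumption, and the reference time is assumed to be an epoch timestamp captured by the test once un-enrolling has finished, as proposed above.

```python
import os


def files_not_newer_than(paths, reference_epoch):
    """Return paths that are missing or last modified at or before the reference."""
    stale = []
    for path in paths:
        if not os.path.exists(path) or os.path.getmtime(path) <= reference_epoch:
            stale.append(path)
    return stale
```

The re-enroll step would pass when this returns an empty list for the metricbeat and filebeat files.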


Scenario: Revoking the enrollment token for an agent
Given there is a "Fleet" user in Kibana
And the "Fleet" Kibana setup has been created
And the agent binary is installed in the target host
And the agent is un-enrolled from Kibana

Contributor

This line, 'And the agent is un-enrolled from Kibana', should be removed: the agent should be running as normal at this point. The purpose of this test is to validate a second, different way to cut off an agent from sending data into ES, which is to revoke the enrollment token. The agent un-enroll is very specific to one agent, whereas revoking the enrollment token can impact a whole set of Agents (whichever were deployed with an enroll command that included the given token). For some context, it's a way to bulk manage Agents.

When the enrollment token is revoked

Contributor

The revoke enrollment token API is this:
Request URL, DELETE: /api/ingest_manager/fleet/enrollment-api-keys/{id}
and should respond with a 200 and:
{"action":"deleted","success":true}

The {id} to use in the above call can be retrieved from this call. There should be only one key listed, so it's the first 'id', in list position 0:
GET: /api/ingest_manager/fleet/enrollment-api-keys?page=1&perPage=20

{
  "list": [
    {
      "id": "59f9eb9a-820d-47cd-abf5-51690b70bb79",
      "active": true,
      "api_key_id": "0cphgHIBjKSx981gocfb",
      "name": "Default (e49bc4fe-c9a2-48bb-bb63-38bdb9fcf87b)",
      "config_id": "de98abb0-a685-11ea-b905-bd56f880a400",
      "created_at": "2020-06-04T17:29:42.347Z"
    }
  ],

After it is revoked, you can run the same query
GET: /api/ingest_manager/fleet/enrollment-api-keys?page=1&perPage=20
and the active state should change to false, please assert on that.
"active": false

And then we can wait 2 seconds and assert nothing new is ingested into ES just as above.
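
The revoke sequence described above, as a Python sketch. As before, `request(method, path)` is a hypothetical helper assumed to return `(status_code, parsed_json_body)`; the endpoints and the expected bodies are the ones quoted in this comment.

```python
def revoke_only_enrollment_key(request):
    """Revoke the single enrollment API key and verify it becomes inactive.

    request(method, path) is a hypothetical helper returning
    (status_code, parsed_json_body).
    """
    base = "/api/ingest_manager/fleet/enrollment-api-keys"
    listing = base + "?page=1&perPage=20"

    # The key id is the first 'id' in the list (assumes only one key exists).
    _, body = request("GET", listing)
    key_id = body["list"][0]["id"]

    # Revoke the key; expect a 200 and the 'deleted' confirmation body.
    status, result = request("DELETE", "%s/%s" % (base, key_id))
    assert status == 200, "revoke returned %s" % status
    assert result == {"action": "deleted", "success": True}

    # The same listing should now report the key as inactive.
    _, after = request("GET", listing)
    key = next(k for k in after["list"] if k["id"] == key_id)
    assert key["active"] is False, "enrollment key is still active"
    return key_id
```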

Then no new data shows up in Elasticsearch locations using the enrollment token

Contributor

There is a final step we should ideally do here: we should attempt to re-enroll the Agent with the exact same string as before, but this time it should fail with an error message on the host CLI, as seen in this usage:

edavis-mbp:elastic-agent-7.8.0-darwin-x86_64-BC5 edavis$ ./elastic-agent enroll https://fc0fe63733904e31ac7aec80b3dbf246.us-central1.gcp.foundit.no:443 R2JSX2dISUJtNzhySTBVS0JXMGM6aEJ6NlZyb0dUUWE3anhBNWRmLU9qQQ==
The Elastic Agent is currently in Experimental and should not be used in production
This will replace your current settings. Do you want to continue? [Y/n]:y
2020-06-04T17:08:33-04:00 DEBUG client.go:178 Request method: POST, path: /api/ingest_manager/fleet/agents/enroll

fail to enroll: fail to execute request to Kibana: Status code: 401, Kibana returned an error: Unauthorized, message: [security_exception] missing authentication credentials for REST request [/_security/_authenticate], with { header={ WWW-Authenticate={ 0="Bearer realm="security"" & 1="ApiKey" & 2="Basic realm="security" charset="UTF-8"" } } }
edavis-mbp:elastic-agent-7.8.0-darwin-x86_64-BC5 edavis$

I suggest that, if we can, we parse this message for the keywords 'fail' and 'unauthorized' (lowercased).
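
A minimal Python sketch of that keyword check. The function name is an assumption, and the input is whatever text the test captured from the `enroll` command.

```python
def enroll_was_rejected(cli_output):
    """Return True if the enroll output contains both expected keywords."""
    text = cli_output.lower()  # compare case-insensitively
    return "fail" in text and "unauthorized" in text
```

Matching on two short keywords rather than the full message keeps the assertion stable if the surrounding wording of the error changes.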

Scenario: Starting the agent starts backend processes
When the agent is started in the host
Then filebeat is started
And metricbeat is started
And endpoint is started

Contributor Author

The BDD step is the same, so we could write just one implementation method, with an input parameter (the process to be present in the target)
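
To illustrate the idea of one implementation with the process as a parameter, here is a small Python sketch. It is not tied to any BDD framework: the pattern, the function name, and the `running_processes` collection (the names of processes found on the target host, gathered elsewhere) are all assumptions.

```python
import re

# One pattern covers "filebeat is started", "metricbeat is stopped", etc.
STEP_PATTERN = re.compile(r"^(?:Then|And) (\w+) is (started|stopped)$")


def check_process_step(step_text, running_processes):
    """Evaluate a '<process> is started|stopped' step against a process list."""
    match = STEP_PATTERN.match(step_text.strip())
    if match is None:
        raise ValueError("unsupported step: %r" % step_text)
    process, expected = match.groups()
    is_running = process in running_processes
    return is_running if expected == "started" else not is_running
```

With this shape, adding another backend process to the scenarios requires only a new step line, not a new implementation.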


Scenario: Stopping the agent stops backend processes
Given an agent is running in a host
When the agent is stopped in the host
Then filebeat is stopped

Contributor

We would probably need something more like: Then there are '2' metricbeat processes, as we will need to check both the monitoring and the ingesting beats.
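
A counting variant of the check could be sketched in Python as below. The function name is an assumption, and `command_lines` is assumed to be a list of process command lines already collected from the host, one string per process.

```python
import os


def count_processes(command_lines, name):
    """Count command lines whose executable base name equals name."""
    count = 0
    for line in command_lines:
        parts = line.split()
        # Compare only the executable's base name, ignoring path and arguments.
        if parts and os.path.basename(parts[0]) == name:
            count += 1
    return count
```

A step such as "there are '2' metricbeat processes" would then compare the returned count with the number in the step text.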

And metricbeat is stopped
And endpoint is stopped