Add code for running connectors in Agent #2787
seanstory
left a comment
There was a problem hiding this comment.
Haven't really read the logic yet (since this is still in draft) but I like the structure/direction.
connectors/agent/component.py
Outdated
| ) | ||
| self.opts = V2Options() | ||
| self.buffer = sys.stdin.buffer | ||
| self.agent_config = ConnectorsAgentConfiguration() |
There was a problem hiding this comment.
It took me a while to realise agent_config == (connector) service_config. Should we rename it to "service_config" or "component_config"? "Agent" refers to the process running the service component, so it's not a very accurate name in this context.
There was a problem hiding this comment.
Yeah I'll try renaming it, but I'm also not sure about this class - maybe we can even pass agent configuration inside, parse it and emit connectors configuration to just separate this out of the checkin handler.
There was a problem hiding this comment.
I renamed it to ConnectorsAgentConfigurationWrapper, changed the logic a bit, and added code comments. I really could not come up with good names for the stuff I wrote here, so I'm open to suggestions.
| f"Error while running services in ConnectorServiceManager: {e}" | ||
| ) | ||
| raise | ||
| finally: |
There was a problem hiding this comment.
should we care about stopping running services here?
There was a problem hiding this comment.
It should already be stopped: it's not possible to reach this point without stopping. I can add an additional step to verify that everything stopped, but I'm not sure it's needed.
jedrazb
left a comment
There was a problem hiding this comment.
I meant to "comment", did you manage to run this with es_agent_client as dependency?
Oh yes, this branch is 100% functioning
|
|
||
| test: .venv/bin/pytest .venv/bin/elastic-ingest | ||
| .venv/bin/pytest --cov-report term-missing --cov-fail-under 92 --cov-report html --cov=connectors --fail-slow=$(SLOW_TEST_THRESHOLD) -sv tests | ||
| .venv/bin/pytest --cov-report term-missing --cov-fail-under 92 --cov-report html --cov=connectors --fail-slow=$(SLOW_TEST_THRESHOLD) -sv tests --ignore tests/agent |
There was a problem hiding this comment.
Ignoring tests/agent here because the repo that we build the package from is private, so it would fail in CI. We can set up CI to pull the repo too, but an occasional user of main could run into problems, and I'd rather keep main clean and functioning.
There was a problem hiding this comment.
I'd challenge that we don't care about 2xx, in operational review discussions we've discussed that bumps in traffic
++
There was a problem hiding this comment.
Sorry what's that quote? :D
There was a problem hiding this comment.
lol looks like a paste error from a thread in https://github.com/elastic/cloud/pull/130868#pullrequestreview-2271519957
| return configuration | ||
|
|
||
|
|
||
| def add_defaults(config): |
There was a problem hiding this comment.
Not sure about method name 🤷
| @@ -0,0 +1 @@ | |||
| elastic-agent-client@git+https://github.com/elastic/python-elastic-agent-client@main | |||
There was a problem hiding this comment.
For now we just pick main; later we'll have a more solid plan, hopefully.
There was a problem hiding this comment.
"Always releasable" enforced here 🤙
| from connectors.agent.component import ConnectorsAgentComponent | ||
|
|
||
|
|
||
| class StubMultiService: |
There was a problem hiding this comment.
Stub makes the test code much cleaner IMO so I went for it instead of a mock
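A minimal sketch of the stub-over-mock idea from the comment above. The method names (`run`, `shutdown`), the recorded flags, and the `asyncio.Event` used to block are assumptions made for illustration; the actual StubMultiService in this PR may be shaped differently.

```python
import asyncio


class StubMultiService:
    """Hypothetical stand-in for MultiService that records calls instead of running services."""

    def __init__(self):
        self._stop = asyncio.Event()
        self.was_started = False
        self.was_shut_down = False

    async def run(self):
        # Block until shutdown() is called, the way a long-running service would.
        self.was_started = True
        await self._stop.wait()

    def shutdown(self, sig):
        # Record the shutdown request and release run().
        self.was_shut_down = True
        self._stop.set()


async def demo():
    service = StubMultiService()
    task = asyncio.create_task(service.run())
    await asyncio.sleep(0)  # give run() a chance to start
    service.shutdown("SIGINT")
    await task
    return service
```

Compared with a mock, the stub keeps the blocking behaviour explicit, so a test can assert on plain attributes instead of inspecting call records.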
jedrazb
left a comment
There was a problem hiding this comment.
LGTM. Couple of questions about error logging and doc improvements suggestion, nothing blocking I think
| the method returns False. | ||
| """ | ||
|
|
||
| # TODO: find a good link to what this object is. |
There was a problem hiding this comment.
Trying to be helpful, so I dug into the proto def: https://github.com/elastic/elastic-agent-client/blob/main/elastic-agent-client.proto#L168
But it seems like source is an arbitrary struct hehe (so we have to look somewhere else for the actual definition) and take a leap of faith for now
There was a problem hiding this comment.
Yeah judging by the protobuf files it's arbitrary:
// Source is the original configuration of this unit configuration object in the agent policy.
// Only standard fields are defined as explicit types, additional fields can be parsed from source.
//
// This source field will almost always contain arbitrary unit configuration fields beyond those
// explicitly defined in this message type.
google.protobuf.Struct source = 1;
There was a problem hiding this comment.
The way I interrogated this during my POC was to just add logging. I'd get a checkin event, deserialize it, re-serialize it as json, then dump it in the logs.
| """ | ||
| msg = ( | ||
| f"This connector component can't handle action requests. Received: {action}" | ||
| ) |
There was a problem hiding this comment.
should we include the error log? I don't see any other "wrapper" logic that would surface this to the user / logs
There was a problem hiding this comment.
It should log the error when called, right?
cc @seanstory will it terminate the process?
There was a problem hiding this comment.
It should log the error. See: https://github.com/elastic/python-elastic-agent-client/blob/ba2525d23ed424c5740d9bbf87baff80a90533e7/elastic_agent_client/service/actions.py#L51-L61
It won't terminate, no. It just sends a message back through the protocol that the requested action failed. See this slack thread: https://elastic.slack.com/archives/C01QQ449KE1/p1719235054818189?thread_ts=1718987763.249969&cid=C01QQ449KE1
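To illustrate the behaviour described in this thread (the error is logged and reported back, the process keeps running), here is a rough sketch. Everything in it is hypothetical: `UnsupportedActionHandler`, `dispatch`, and the result dictionary are illustrative names and are not the elastic-agent-client API, whose real logic lives in the linked actions.py.

```python
import asyncio
import logging

logger = logging.getLogger("agent_actions_sketch")


class UnsupportedActionHandler:
    """Hypothetical handler that rejects every action request."""

    async def handle_action(self, action):
        msg = f"This connector component can't handle action requests. Received: {action}"
        raise NotImplementedError(msg)


async def dispatch(handler, action):
    """Hypothetical dispatcher: log a handler failure and report it instead of crashing."""
    try:
        await handler.handle_action(action)
    except Exception as e:
        # The failure is logged and turned into a response; nothing is re-raised.
        logger.error("Action failed: %s", e)
        return {"status": "FAILED", "error": str(e)}
    return {"status": "SUCCESS"}


result = asyncio.run(dispatch(UnsupportedActionHandler(), "restart"))
```

Under these assumptions the handler itself does not need its own error log, because the caller logs whatever the handler raises.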
| """ | ||
| if self._running: | ||
| msg = f"{self.__class__.__name__} is already running." | ||
| raise ServiceAlreadyRunningError(msg) |
There was a problem hiding this comment.
should we add the error log, I'm not sure we catch-and-log the exception anywhere
There was a problem hiding this comment.
This one is not caught, it'll just crash the caller. Supposedly this case is "never gonna happen", so I'd just keep it like this.
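A small sketch of the guard being discussed, assuming a manager with a `_running` flag. `ServiceManagerSketch` and its `stop` method are hypothetical names for illustration; only the check and the error message mirror the diff above.

```python
class ServiceAlreadyRunningError(Exception):
    """Raised when run() is called on a manager that is already running."""


class ServiceManagerSketch:
    """Hypothetical manager showing the 'already running' guard only."""

    def __init__(self):
        self._running = False

    def run(self):
        if self._running:
            msg = f"{self.__class__.__name__} is already running."
            # Not caught here: the exception propagates to the caller.
            raise ServiceAlreadyRunningError(msg)
        self._running = True

    def stop(self):
        self._running = False
```

Because the exception is not caught at this level, a second `run()` call surfaces as an uncaught error in the caller, which matches the "just crash" behaviour described in the reply.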
seanstory
left a comment
There was a problem hiding this comment.
Bunch of little nits, but I'm ok with addressing them in follow-ups rather than blocking this.
Awesome stuff. I want to see a demo and make some toasts!
| .venv/bin/pytest --cov-report term-missing --cov-fail-under 92 --cov-report html --cov=connectors --fail-slow=$(SLOW_TEST_THRESHOLD) -sv tests | ||
| .venv/bin/pytest --cov-report term-missing --cov-fail-under 92 --cov-report html --cov=connectors --fail-slow=$(SLOW_TEST_THRESHOLD) -sv tests --ignore tests/agent | ||
|
|
||
| test-agent: .venv/bin/pytest .venv/bin/elastic-ingest install-agent |
There was a problem hiding this comment.
I think we should add this to CI now, to avoid forgetting or letting it get too behind in coverage. I think that ElasticMachine should be able to clone the private repo. Hit me up if it's not working.
There was a problem hiding this comment.
If we don't have anything to say here yet, let's just remove the file for now.
| from connectors.agent.config import ConnectorsAgentConfigurationWrapper | ||
| from connectors.agent.protocol import ConnectorActionHandler, ConnectorCheckinHandler | ||
| from connectors.agent.service_manager import ConnectorServiceManager | ||
| from connectors.services.base import MultiService |
There was a problem hiding this comment.
This is using the MultiService from Connectors instead of the one from elastic_agent_client. I think that's ok (and probably for the best), but I just wanted to flag it in case it wasn't intentional. I hadn't tested using ActionsService or CheckinV2Service with the Connectors MultiService.
There was a problem hiding this comment.
It's technically the same code. We need to clean up the python-agent repo by removing classes like this, or move them to an example namespace so that we don't distribute them.
| self.agent_connectors_config_wrapper = agent_connectors_config_wrapper | ||
| self.service_manager = service_manager | ||
|
|
||
| async def apply_from_client(self): |
There was a problem hiding this comment.
This looks a lot like the "Fake example", but I'm worried we'll need to implement more here. This seems to only look at the first unit's config, but doesn't consider log_level, features, or anything else from the checkin service's sync_component or sync_units method.
|
|
||
| main() | ||
|
|
||
| loop.close() |
There was a problem hiding this comment.
nothing's asserted here. Intended?
There was a problem hiding this comment.
Yup. main blocks so if the test fails the test will be killed by a timeout. I can add some asserts for log messages too, but not sure it's needed
There was a problem hiding this comment.
Maybe just a comment then, with what you've written in this thread. 👍
| } | ||
|
|
||
| if source.fields.get("api_key"): | ||
| es_creds["api_key"] = source["api_key"] |
There was a problem hiding this comment.
@artem-shelkovnikov commenting so that it doesn't get lost, by default we would get Fleet-specific api key format, this should do the trick to turn it to our version:
if source.fields.get("api_key"):
    api_key = source["api_key"]
    # if beats_logstash_format
    if ":" in api_key:
        api_key = base64.b64encode(api_key.encode()).decode()
There was a problem hiding this comment.
Thanks for the investigation! I'm gonna add this code as well :)
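The conversion suggested above can be sketched as a standalone helper. The function name `normalize_api_key` is hypothetical, and the sketch assumes that a ":" only ever appears in the Fleet-style `id:secret` form (":" is not part of the base64 alphabet, so an already-encoded key is left untouched).

```python
import base64


def normalize_api_key(api_key: str) -> str:
    """Return the key base64-encoded if it looks like an `id:secret` pair."""
    # Assumption: ":" marks the Fleet-provided (beats/logstash) format.
    if ":" in api_key:
        return base64.b64encode(api_key.encode()).decode()
    # Anything without ":" is treated as already encoded and passed through.
    return api_key


encoded = normalize_api_key("id:secret")
unchanged = normalize_api_key(encoded)
```

Calling the helper twice is safe under this assumption, since the encoded form never contains a ":".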
428b8ad
💔 Failed to create backport PR(s). The backport operation could not be completed due to the following error: The backport PRs will be merged automatically after passing CI. To backport manually run:
Part of https://github.com/elastic/search-team/issues/8096
This PR adds a way to run Connectors Service in Agent.
Entry point is connectors/agent/cli.py: if it's run from an Agent component via the Python in .venv/bin/python, it enables the Connector Service to run within an Agent (docker or not).
This PR adds bare-bones logic: positive cases work, but expect tricky bugs to still be out there for cases we haven't accounted for.
I've tried to keep it very small, and I think there's no way to make it simpler/smaller :)
Additionally, I had to exclude tests for the component from make test because it would break make test for people who have no access to the python agent client repository. That repository will be opened later, so we can revert this part of the changeset then. If you need to run the tests, there is a make target, make test-agent; I kept coverage high for the future.