-
Notifications
You must be signed in to change notification settings - Fork 19
Track the execution state of every integration connection on the backend #1220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
// TODO(ENG-2523): move base conda env creation outside of ConnectIntegration. | ||
if args.Service == shared.Conda { | ||
condaDB, err := database.NewDatabase(DB.Config()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@likawind @hsubbaraj-spiral why do we define a new database here (and in lambda)? I copied this from existing code. I would like to document the reason here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You couldn't pass the DB object to a separate go routine, otherwise it will error since some underlying connection is lost since you lost the parent scope that creates the DB object. Here, you'd need to call this NewDatabase
as a part of go func()
to ensure you initialize the DB and use it within the same scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both testing and editing an integration should update the exec_state. We can track the source of the change to so we can display "Tested on" vs "Edited on" vs "Created On" timestamps. What do ya'll think? cc: @vsreekanti This ofc means that testing an integration and finding out that it is failing may cause SDK operations that are using that integration to fail. But tbh that seems like proper behavior, since execution is going to fail anyways (it'll just fail earlier now).
Maybe "Last Verified" would make sense here. Or even have "Last Used" and make an entry to the table each time the integration is invoked. That would give a sense of stability even without using the testing integration button.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the "Last Used" idea, yeah. By making an entry to the table each time, you mean inserting an entry into the integrations table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good! Have you verified with conda or lambda to ensure we have no regression in async cases?
Blocking mainly for the part initializing DB in go routines. I have a few other comments, which can be addressed as follow ups:
- Why don't we add a new column to track exec state if it applies to all integrations? This reduce the need to deserialize the values
- I don't have impression that you can drop column in sqlite, but if tests are passing it's probably fine.
- I assume these information will show up in UI in the new project. If they do show up, what do we do if the user click 'test connect integration'? Should the exec state be updated if the test-connect failed?
// TODO(ENG-2523): move base conda env creation outside of ConnectIntegration. | ||
if args.Service == shared.Conda { | ||
condaDB, err := database.NewDatabase(DB.Config()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You couldn't pass the DB object to a separate go routine, otherwise it will error since some underlying connection is lost since you lost the parent scope that creates the DB object. Here, you'd need to call this NewDatabase
as a part of go func()
to ensure you initialize the DB and use it within the same scope.
return | ||
condaErr := setupCondaAsync(integrationRepo, integrationObject.ID, publicConfig, condaDB) | ||
if condaErr != nil { | ||
log.Errorf("Conda setup failed: %v", condaErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do we handle updating failure exec state here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside setupCondaAsync()
// the that entry to reflect the failure. | ||
defer func() { | ||
if err != nil { | ||
execution_state.UpdateOnFailure( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm my understanding - this is only useful for sync connections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll make a note of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually no that's not true. It's only useful for the synchronous connection part for all integrations. Even lambda and conda have additional setup that is done after this. I think since the goroutines are the last thing that is done in this method, we can be assured that there are no update races between synch and async. I'll document that.
I initially made a separate table for connections called
I think you can, we have other migrations that do this.
I think both testing and editing an integration should update the exec_state. We can track the source of the change to so we can display "Tested on" vs "Edited on" vs "Created On" timestamps. What do ya'll think? cc: @vsreekanti This ofc means that testing an integration and finding out that it is failing may cause SDK operations that are using that integration to fail. But tbh that seems like proper behavior, since execution is going to fail anyways (it'll just fail earlier now). |
@likawind addressed the database thing. Good to know! I haven't fully tested it E2E but will do that soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR LGTM!
If there's too much work it make sense to keep exec_state in the dict. Generally agree with your proposals on test-connection. Following that, should we update integrations to failure state if we detect any such connection failed during operator execution? |
bbd06a7
to
b4d3694
Compare
some fixes shoot. Maybe it's better to keep it two tables moved it to an integration_connection table, because of editing the migration works. Now to worry about the read and write paths added the drop column one ok now I'm confused. Unwinding back to a single table approach. there are so many thing swrong with this but at least I see a path through now. made some progress. Write path internals in a good state unwound a bunch of stuff prune all the stuff that is unnecessary now Moved the transaction back inside reverted a bunch of stuff back, still need too drop validated column added validated column drop. Ready for testing now. remove dsdk changes last things, lint and block usage of broken integrations. lint done small formatting fixes fixed gh actions made exec_state optional on SDK cr fixed err removed elem update backend
f516f10
to
fb1047e
Compare
@likawind as lead reviewer. @hsubbaraj-spiral as secondary to maybe help answer Qs. The meat of the changes are in
connect_integration.go
.Describe your changes and why you are making these changes
Conda and Lambda integrations already manage the integration lifecycle from registration, setup, to completion. This PR makes the same pattern available for all integrations. This means that when we register an integration:
integration
table entry.integration
table entry will be updated to the appropriate terminal state. For all non-conda and non-lambda integrations, this happens synchronously.The
list_integrations
endpoint now returns the execution state instead of the validated flag. The validated column is dropped from the integrations table, as it is no longer useful. To keep this PR well scoped, we keep the data model of everything else (eg. thecreatedAt
column).This is what the database now looks like (after a couple successful connections and a Lambda integration that failed to setup):

This is what the describe() and list_integrations() SDK methods now look like:

As an immediate follow-up, I error on the SDK if we attempt to preview or publish against any integration that has not been successfully registered. Note that this change is purely on the SDK and Backend, and has no affect on our contract with the UI (failed integration cards will still show up normally for now).
Related issue number (if any)
ENG-2782
Loom demo (if any)
Checklist before requesting a review
python3 scripts/run_linters.py -h
for usage).run_integration_test
: Runs integration testsskip_integration_test
: Skips integration tests (Should be used when changes are ONLY documentation/UI)