[AIP-44] Add internal API definition. #27892
Conversation
potiuk left a comment
I really love how small and succinct it is.
The way we have it is, I think, the absolute minimum amount of code we need to achieve a robust and easy-to-maintain internal API:
- no duplication
- a single source of truth
- we are using our own serialisation, which we will keep evolving (and securing - see #28067 for example, and https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#allowed-deserialization-classes added in 2.5.0)
- it is super easy to add support for new methods (just make it static and decorate it - sketched below).
I think this is a great "founding PR".
A few nits, but other than that - LGTM
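As a minimal sketch of the "make it static and decorate it" pattern mentioned above (the import path, the method signature, and the placeholder body are assumptions for illustration, not the exact code in this PR):

```python
# Sketch only: exposing an existing method over the internal API.
# The import path and the method signature below are assumptions.
from airflow.api_internal.internal_api_call import internal_api_call


class DagFileProcessor:
    @staticmethod
    @internal_api_call
    def update_import_errors(dagbag, session=None):
        # In "local" mode this body runs directly; in "remote" mode the
        # decorator serializes the method name and parameters and sends
        # them to the internal API endpoint instead.
        ...
```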
The primary use for operationId is for code generators to generate method names. In our case, where we dynamically generate the "actual" method names and parameters and pass them via the JSON-RPC construct, it does not really matter (there will always be a single method to call), but I agree with @vincbeck that "json_rpc" is a poor name here.
How about internal_airflow_api?
This is the way the OpenAPI specification describes its conformance with JSON-RPC, so we should keep it in. Various tools that might later be employed can use this information (Swagger UI, for example, to generate documentation and describe the API).
This is not crucial in our case; the Swagger UI and other OpenAPI artifacts are really a "side effect" here. We have no one to consume those artifacts, as our API is purely internal, and most of it is not really "fixed" and described (i.e. the actual methods and parameters are not validated or processed by the API specification, because they are dynamically generated on both the client and the server side). So this is not strictly needed now, but it might be useful in the future if we ever build additional tooling that relies on inspecting the specification and using the JSON-RPC "metadata" under the hood.
BTW, it is perfectly OK that we do not have those methods and parameters described. This is not the "usual" API that you expose and document. Both the client and the server in this communication are guaranteed to share a single source of truth (inspection of the method names and their parameters, serialized using our own serializer).
So we are good here. The OpenAPI specification we have looks cool:
- single endpoint
- JSON-RPC conformance
- no boilerplate growing with every method added
- a single source of truth for the actual "schema" of the passed data
- the data is easy for a human to inspect without extra tools (method names and parameters are sent as JSON-serialized data - see the sketch below)
- guaranteed client-server compatibility
I really like it :)
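To illustrate how easy the payload is to inspect by hand, here is a rough sketch of a client-side call to the single endpoint. The payload keys follow the PR description ("jsonrpc", "method_name", "params"); the host, the example parameters, and the response handling are assumptions.

```python
# Rough, hypothetical sketch of a JSON-RPC call to the internal API endpoint.
import json

import requests

from airflow.serialization.serialized_objects import BaseSerialization

# Hypothetical parameters for the remote method call.
params = {"filename": "/files/dags/example_dag.py"}

body = {
    "jsonrpc": "2.0",
    "method_name": "DagFileProcessor.update_import_errors",
    # Parameters travel as JSON produced by Airflow's own BaseSerialization,
    # the single source of truth shared by client and server.
    "params": json.dumps(BaseSerialization.serialize(params)),
}

response = requests.post("http://localhost:8080/internal/v1/rpcapi", json=body)
response.raise_for_status()
```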
log.debug("Calling method %s.", method_name)
try:
    output = handler(**params)
I just realized something here. Don't we have an issue here? All functions listed in METHODS_MAP are annotated with @internal_api_call, so when we execute these functions here, we'll trigger the decorator again and, as such, send an RPC request. Am I wrong?
Right. I have an issue for that: #28267.
The idea is: some components (like the Scheduler/Webserver/Internal API) will always run these methods locally.
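A rough sketch of that idea, under the assumption that a configuration/component check decides between the two modes (the helper names below are hypothetical, not the actual implementation):

```python
# Hypothetical sketch of the @internal_api_call decorator's two modes.
# The configuration check and the RPC helper are placeholders.
import functools


def internal_api_call(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Components such as the Scheduler, the Webserver, or the Internal API
        # server itself always execute the method locally, which also avoids
        # the recursion concern raised above.
        if runs_methods_locally():
            return func(*args, **kwargs)
        # Everything else serializes the call and sends it to the
        # /internal/v1/rpcapi JSON-RPC endpoint.
        return make_jsonrpc_call(func.__qualname__, args, kwargs)

    return wrapper


def runs_methods_locally() -> bool:
    # Placeholder: in practice this would be derived from Airflow
    # configuration and from which component is running.
    return True


def make_jsonrpc_call(method_name, args, kwargs):
    # Placeholder for the client-side call sketched earlier in the thread.
    raise NotImplementedError
```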
Introduce the Internal API as part of the webserver (a standalone version will come in a follow-up), with a single JSON-RPC endpoint /internal/v1/rpcapi accepting 3 params: "jsonrpc", "method_name" and "params".
Add a decorator for making calls to the Internal API. When a method is decorated, it is executed in one of two modes, depending on configuration.
All communication is serialized as JSON using Airflow's BaseSerialization.
Also decorate the first function, DagFileProcessor.update_import_errors, with the @internal_api_call decorator.
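For completeness, a simplified sketch of the server-side dispatch that the snippet quoted in the review above comes from: the requested method is looked up in METHODS_MAP and called with the deserialized parameters. Apart from the METHODS_MAP name, the endpoint, and the three request fields, everything here is an assumption.

```python
# Simplified, hypothetical sketch of the /internal/v1/rpcapi handler.
import json
import logging

from airflow.serialization.serialized_objects import BaseSerialization

log = logging.getLogger(__name__)

# Populated with every function decorated with @internal_api_call.
METHODS_MAP: dict = {}


def internal_airflow_api(body: dict) -> dict:
    """Handle a single JSON-RPC request posted to /internal/v1/rpcapi."""
    method_name = body["method_name"]
    handler = METHODS_MAP[method_name]
    # Parameters arrive as JSON produced by BaseSerialization on the client.
    params = BaseSerialization.deserialize(json.loads(body["params"]))
    log.debug("Calling method %s.", method_name)
    try:
        output = handler(**params)
        return {"jsonrpc": "2.0", "result": BaseSerialization.serialize(output)}
    except Exception:
        log.exception("Error executing method %s.", method_name)
        return {"jsonrpc": "2.0", "error": f"Error executing method {method_name}."}
```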