Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Allow to run pure Python request middlewares inside a Tower service #1734

Merged
merged 40 commits into from
Sep 22, 2022

Conversation

crisidev
Copy link
Contributor

@crisidev crisidev commented Sep 14, 2022

Motivation and Context

  • Customers want to be able to implement simple middleware directly in Python. This PR aims to add the initial support for it.
  • Improve the idiomatic experience of logging by exposing a handler compatible with Python's stardard library logging module.

Description

Middleware

A middleware is defined as a sync or async Python function that can return multiple values, following these rules:

  • Middleware not returning will let the execution continue without changing the original request.
  • Middleware returning a modified Request will update the original request before continuing the execution.
  • Middleware returning a Response will immediately terminate the request handling and return the response constructed from Python.
  • Middleware raising MiddlewareException will immediately terminate the request handling and return a protocol specific error, with the option of setting the HTTP return code.
  • Middleware raising any other exception will immediately terminate the request handling and return a protocol specific error, with HTTP status code 500.

Middlewares are registered into the Python application and executed in order of registration.

Example:

from sdk import App
from sdk.middleware import Request, MiddlewareException

app = App()

@app.request_middleware
def inject_header(request: Request):
    request.set_header("x-amzn-answer", "42")
    return request

@app.request_middleare
def check_header(request: Request):
    if request.get_header("x-amzn-answer") != "42":
        raise MiddlewareException("Wrong answer", 404)

@app.request_middleware
def dump_headers(request: Request):
    logging.debug(f"Request headers after middlewares: {request.headers()}")

NOTE: this PR only adds support for request middlewares, which are executed before the operation handler. Response middlewares, executed after the operation are tracked here: #1754.

Logging

To improve the idiomatic experience, now logging need to be configured from the Python side by using the standard logging module. This allows customers to opt-out of our tracing based logging implementation and use their own and logging level is now driven by Python.

import logging
from sdk.logging import TracingHandler

logging.basicConfig(level=logging.INFO, handlers=[TracingHandler.handle()])

Checklist

  • I have updated CHANGELOG.next.toml if I made changes to the smithy-rs codegen or runtime crates

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@@ -73,17 +77,15 @@ pub trait PyApp: Clone + pyo3::IntoPy<PyObject> {
.getattr(py, "pid")
.map(|pid| pid.extract(py).unwrap_or(-1))
.unwrap_or(-1);
tracing::debug!("Terminating worker {idx}, PID: {pid}");
println!("Terminating worker {idx}, PID: {pid}");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tracing statements have been moved to println! since at this stage we do not have the tracing subscriber initialized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use with_default with a fallback/default subscriber here? It's probably more effort than it's worth though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have actually fixed this issue in a new commit that I'll push in a bit!

Comment on lines +102 to +109
#{pyo3}::Python::with_gil(|py| {
let pyhandler: &#{pyo3}::types::PyFunction = handler.extract(py)?;
let output = if handler.args == 1 {
pyhandler.call1((input,))?
} else {
pyhandler.call1((input, state.0))?
};
output.extract::<$output>()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for removing tokio::task::block_in_place is that we are still bound to the mutex on the GIL that is taken inside the Python::with_gil closure.

No matter what we do, this synchronous mutex will not allow tokio to continue the execution.

Removing block_in_place yields a significant performance improvement since it removes the overhead of scheduling new tasks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other frameworks using the same approach as Smithy-rs Python are also doing the same: sparckles/Robyn@4ef01e6

@crisidev crisidev requested a review from a team as a code owner September 20, 2022 17:24
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@hlbarber hlbarber self-requested a review September 21, 2022 11:13
Copy link
Contributor

@hlbarber hlbarber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the case that this approach will be less expressive than a general tower::Layer, even after the response side middleware is also implemented?

A general tower::Layer, or more concretely the struct Middleware<S> { inner: S, ... } it applies around a S, has the ability to hold onto state from the request side to the response side.

For example, I might pop a request header, hold onto it while the S does some handling, and then insert it as a response header. Is this a class of behavior we want to support?

You can see this kind of middleware in Django too
https://docs.djangoproject.com/en/4.1/topics/http/middleware/#writing-your-own-middleware

If this were possible to support then request/response middleware would then become a subclass of this more general middleware.

filename = #{tracing}::field::Empty,
lineno = #{tracing}::field::Empty
);
let guard = span.enter();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is handy if you prefer this style https://docs.rs/tracing/latest/tracing/struct.Span.html#method.entered

@@ -73,17 +77,15 @@ pub trait PyApp: Clone + pyo3::IntoPy<PyObject> {
.getattr(py, "pid")
.map(|pid| pid.extract(py).unwrap_or(-1))
.unwrap_or(-1);
tracing::debug!("Terminating worker {idx}, PID: {pid}");
println!("Terminating worker {idx}, PID: {pid}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use with_default with a fallback/default subscriber here? It's probably more effort than it's worth though.

@crisidev
Copy link
Contributor Author

crisidev commented Sep 21, 2022

Is it the case that this approach will be less expressive than a general tower::Layer, even after the response side middleware is also implemented?

A general tower::Layer, or more concretely the struct Middleware<S> { inner: S, ... } it applies around a S, has the ability to hold onto state from the request side to the response side.

This is an initial implementation of a subset of the features I would like to see available in middleware handling, but there are certain features that will be either extremely hard or not possible to do with PyO3: for example there is no way to store generics inside a Python compatible structure. I will handoff this to @unexge after this initial support for middleware is done. I think after it we can sit down together and discuss possible improvement and a more general implementation.

For example, I might pop a request header, hold onto it while the S does some handling, and then insert it as a response header. Is this a class of behavior we want to support?

You can see this kind of middleware in Django too https://docs.djangoproject.com/en/4.1/topics/http/middleware/#writing-your-own-middleware

We already discussed about using a class to express middleware instead of just functions, which will end up with something very similar to what Django does.

If this were possible to support then request/response middleware would then become a subclass of this more general middleware.

FYI, I would like to see most if not all the middleware we vend to customers to be implemented directly in Rust for performance and safety reasons, but I think it would be nice for customers to be able to implement simple middleware directly in Python.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@crisidev crisidev enabled auto-merge (squash) September 22, 2022 16:56
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@crisidev crisidev merged commit e5c8cf3 into main Sep 22, 2022
@crisidev crisidev deleted the crisidev/oxipy-middleware branch September 22, 2022 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request python-server Python server SDK server Rust server SDK
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants