Memory leak when using client.clone() for server (proxy) #1315

Closed
klausi opened this issue Sep 10, 2017 · 4 comments
Labels: A-client (Area: client), C-bug (Category: bug. Something is wrong. This is bad!)

Comments

klausi (Contributor) commented Sep 10, 2017

I'm writing a reverse proxy and use the HTTP client to connect to an upstream server. The code is at https://github.com/klausi/rustnish/blob/goal-06/src/lib.rs#L150. The server is leaking memory, so I must be doing something wrong.

Steps to reproduce:

  1. Make sure you have an upstream HTTP service running locally on port 80, for example the default Apache install on Ubuntu, which will give you a dummy page at http://localhost/

    sudo apt install apache2
    
  2. Run rustnish reverse proxy on port 9090:

    git clone --branch goal-06 git@github.com:klausi/rustnish.git
    cd rustnish
    cargo run --release
    
  3. Get PID of rustnish process and memory usage:

    ps aux | grep rustnish
    

    The 6th column (RSS) is the memory usage of the process in kilobytes, something like 20124 (~20MB)

  4. Fire 1 million requests at the reverse proxy with Apache Bench:

    ab -c 4 -n 1000000 http://localhost:9090/
    

    (This takes ~130 seconds on my computer)

  5. Check memory usage again:

    ps aux | grep rustnish
    

    The 6th column should show something like 284920, which is ~280MB!

So although we do not cache any requests or otherwise keep them around, memory usage keeps growing and is never freed.

One solution to the problem is this patch:

diff --git a/src/lib.rs b/src/lib.rs
index 520dd24..a591b99 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -170,7 +170,6 @@ pub fn start_server_background(
             let http = Http::new();
             let listener = TcpListener::bind(&address, &handle)
                 .chain_err(|| format!("Failed to bind server to address {}", address))?;
-            let client = Client::new(&handle);
 
             let server = listener.incoming().for_each(move |(sock, addr)| {
                 http.bind_connection(
@@ -180,7 +179,7 @@ pub fn start_server_background(
                     Proxy {
                         port: port,
                         upstream_port: upstream_port,
-                        client: client.clone(),
+                        client: Client::new(&handle),
                     },
                 );
                 Ok(())

Avoiding the client.clone() call fixes the memory leak but degrades runtime performance: Apache Bench with 1 million requests took 220 seconds instead of the 130 seconds before. So the client.clone() call seems to be the right thing to do, but are the clones not dropped correctly when a request has been handled?
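
For background, client.clone() is meant to be cheap: every clone hands out the same shared connection pool, which would also explain why the per-connection Client::new variant loses keep-alive reuse and runs slower. Below is a minimal sketch of that sharing, using stand-in types rather than hyper's real internals:

    // Illustrative only, not hyper's source: all clones of a Client hand out
    // the same reference-counted pool, so connections opened through one
    // clone can be reused by every other clone.
    use std::cell::RefCell;
    use std::collections::HashMap;
    use std::rc::Rc;

    struct PoolInner {
        // placeholder for pooled keep-alive connections, keyed by host
        idle: HashMap<String, Vec<()>>,
    }

    #[derive(Clone)]
    struct Client {
        pool: Rc<RefCell<PoolInner>>,
    }

    fn main() {
        let client = Client {
            pool: Rc::new(RefCell::new(PoolInner { idle: HashMap::new() })),
        };

        // One clone per accepted connection, as in the proxy above.
        let per_connection = client.clone();
        assert!(Rc::ptr_eq(&client.pool, &per_connection.pool));

        // A connection "opened" through the clone is visible to the original,
        // which is what makes the cloned variant faster than Client::new().
        per_connection
            .pool
            .borrow_mut()
            .idle
            .insert("http://localhost:80".to_string(), vec![()]);
        assert_eq!(client.pool.borrow().idle.len(), 1);
    }

Because the pool outlives the per-connection clones, anything that accumulates inside it is not freed when a clone is dropped, which fits the numbers above.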

seanmonstar (Member)

The Client contains a pool of connections internally. Are the requests to different hosts? The pool is lazy about ejecting expired connections, so that could be part of it...
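
To make "lazy about ejecting expired connections" concrete, here is a simplified std-only illustration of a keep-alive pool keyed by host, not hyper's pool.rs: expired idle entries are only discarded when the same key is checked out again, never by a background task.

    use std::collections::HashMap;
    use std::time::{Duration, Instant};

    struct Idle {
        // the real pool would also keep the socket here
        returned_at: Instant,
    }

    struct Pool {
        idle: HashMap<String, Vec<Idle>>, // keyed by "scheme://host:port"
        timeout: Duration,
    }

    impl Pool {
        fn put(&mut self, key: String) {
            self.idle.entry(key).or_insert_with(Vec::new).push(Idle {
                returned_at: Instant::now(),
            });
        }

        fn checkout(&mut self, key: &str) -> Option<Idle> {
            let now = Instant::now();
            let conns = match self.idle.get_mut(key) {
                Some(conns) => conns,
                None => return None,
            };
            // Lazy ejection: expired entries are dropped only while searching
            // for a usable connection under this key.
            while let Some(conn) = conns.pop() {
                if now.duration_since(conn.returned_at) < self.timeout {
                    return Some(conn);
                }
            }
            None
        }
    }

    fn main() {
        let mut pool = Pool {
            idle: HashMap::new(),
            timeout: Duration::from_secs(90),
        };
        pool.put("http://localhost:80".to_string());
        assert!(pool.checkout("http://localhost:80").is_some());
        assert!(pool.checkout("http://localhost:80").is_none());
    }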

klausi (Contributor, Author) commented Sep 16, 2017

The requests are always to the same host, http://localhost/ in this example. I will try to find out what happens on a cloned client and what exactly is growing in memory.

seanmonstar added the A-client and C-bug labels on Sep 16, 2017

klausi (Contributor, Author) commented Sep 17, 2017

After spending half an hour adding where T: Debug to all the things in pool.rs, I was able to debug-log the pool after 20 requests:

Pool { inner: RefCell { value: PoolInner { enabled: true, idle: {}, parked: {"http://localhost:80": [Sender {
 inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { 
inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { 
inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { 
inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { 
inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }, Sender { inner: (Weak) }]}, timeout: 
Some(Duration { secs: 90, nanos: 0 }) } } }

The hashmap of parked things is growing linearly with the requests performed. The only call that removes entries from the parked HashMap is in the put() function of Pool, but it is never called. So maybe we need to clean up the parked HashMap in other places as well?
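
A stripped-down model of that growth pattern (std-only and purely illustrative, not hyper's pool.rs): every checkout parks a sender under the host key, and only put() ever prunes that vector, so the entry grows by one sender per request until put() runs for the same key.

    use std::collections::HashMap;
    use std::sync::mpsc::{channel, Receiver, Sender};

    struct Pool {
        // "http://localhost:80" -> senders for checkouts still waiting on an idle conn
        parked: HashMap<String, Vec<Sender<()>>>,
    }

    impl Pool {
        fn park(&mut self, key: String) -> Receiver<()> {
            let (tx, rx) = channel();
            self.parked.entry(key).or_insert_with(Vec::new).push(tx);
            rx
        }

        // Roughly where cleanup happens in the description above: pruning only
        // runs here, and only for the key being put back.
        fn put(&mut self, key: &str) {
            if let Some(waiters) = self.parked.get_mut(key) {
                waiters.retain(|tx| tx.send(()).is_ok());
            }
        }
    }

    fn main() {
        let mut pool = Pool { parked: HashMap::new() };

        for _ in 0..1_000 {
            let _checkout = pool.park("http://localhost:80".to_string());
            // The Receiver is dropped here, e.g. because the request was
            // served over a freshly connected socket instead...
        }
        // ...but every matching Sender is still parked in the pool.
        assert_eq!(pool.parked["http://localhost:80"].len(), 1_000);

        // Only put() prunes the dead senders.
        pool.put("http://localhost:80");
        assert_eq!(pool.parked["http://localhost:80"].len(), 0);
    }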

seanmonstar (Member)

Great work investigating this! Thanks!

The "parked" senders are Checkouts that are "racing" for either a newly connected socket, or an existing socket to become idle, whichever comes first. Your PR is definitely going in the right direction, it just needs to be a little more targeted to only remove the relevant sender, I'll look into it.
