
Memory usage not accurate on Heroku #7

Closed
schneems opened this issue Apr 7, 2014 · 45 comments

Comments

@schneems
Member

schneems commented Apr 7, 2014

From Heroku:

2014-04-07T04:16:14.403204+00:00 app[web.1]: PumaAutoTune (358s): All is well measure#puma.resource_ram_mb=488.5341796875 measure#puma.current_cluster_size=6 measure#puma.max_worker_limit=Infinity
2014-04-07T04:16:15.416413+00:00 heroku[web.1]: source=web.1 dyno=heroku.5385926.fb0a01d2-dbc7-46f1-aae9-d2eddc627596 sample#load_avg_1m=0.15 sample#load_avg_5m=0.06
2014-04-07T04:16:15.416714+00:00 heroku[web.1]: source=web.1 dyno=heroku.5385926.fb0a01d2-dbc7-46f1-aae9-d2eddc627596 sample#memory_total=516.06MB sample#memory_rss=510.52MB sample#memory_cache=0.02MB sample#memory_swap=5.52MB sample#memory_pgpgin=153581pages sample#memory_pgpgout=22883pages

Note that puma.resource_ram_mb=488.5341796875 while memory_total=516.06. I can't find docs on the behavior of PSS and whether swap memory gets reported.

For PumaAutoTune, I'm thinking we'll need to add a fudge factor similar to the one in PumaWorkerKiller, though I would like to get this library as accurate as possible.

This may help a bit: https://www.kernel.org/doc/Documentation/filesystems/proc.txt
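For anyone following along, something like this is what I mean by measuring via smaps: sum the Pss (and optionally Swap) lines from /proc/<pid>/smaps, which are reported in kB per mapping. A minimal sketch, not the gem's actual code; the helper name and field choice are mine:

```ruby
# Sum selected fields (kB) from /proc/<pid>/smaps and return megabytes.
def smaps_total_mb(pid = Process.pid, fields: %w[Pss])
  pattern = /^(#{fields.join('|')}):\s+(\d+)\s+kB/
  kb = File.foreach("/proc/#{pid}/smaps").sum do |line|
    line =~ pattern ? $2.to_i : 0
  end
  kb / 1024.0
end

puts smaps_total_mb                        # PSS only
puts smaps_total_mb(fields: %w[Pss Swap])  # PSS plus swapped-out pages
```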

Related: schneems/puma_auto_tune#9

@schneems
Member Author

schneems commented Apr 7, 2014

I attempted to include swap from smaps and it doesn't look like it does anything:

2014-04-07T04:39:05.280486+00:00 heroku[web.1]: source=web.1 dyno=heroku.5385926.369b952b-5f15-4655-8c40-3cf56ca2de1b sample#load_avg_1m=0.12
2014-04-07T04:39:05.280975+00:00 heroku[web.1]: source=web.1 dyno=heroku.5385926.369b952b-5f15-4655-8c40-3cf56ca2de1b sample#memory_total=573.77MB sample#memory_rss=511.93MB sample#memory_cache=0.00MB sample#memory_swap=61.84MB sample#memory_pgpgin=176571pages sample#memory_pgpgout=45515pages
2014-04-07T04:39:05.281567+00:00 heroku[web.1]: Process running mem=573M(112.1%)
2014-04-07T04:39:05.281777+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2014-04-07T04:39:08.011804+00:00 app[web.1]: PumaAutoTune (248s): All is well measure#puma.resource_ram_mb=500.4775390625 measure#puma.current_cluster_size=5 measure#puma.max_worker_limit=Infinity
2014-04-07T04:39:08.143982+00:00 app[web.2]: PumaAutoTune (248s): All is well measure#puma.resource_ram_mb=500.267578125 measure#puma.current_cluster_size=5 measure#puma.max_worker_limit=Infinity

Here's the commit on my branch that adds in swap: 17e6b9a

@schneems
Member Author

schneems commented Apr 7, 2014

As an experiment I tried putting in a task that looped through all smaps files to get the RAM usage, but this comes out much higher than what Heroku reports:

2014-04-07T18:02:36.494027+00:00 app[web.2]: Experiment: measure#memory_total=268.771484375
2014-04-07T18:02:37.428483+00:00 heroku[web.2]: source=web.2 dyno=heroku.5385926.7c03cf14-af7a-4549-94cf-9bbd4af6e198 sample#memory_total=211.09MB sample#memory_rss=211.09MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=74027pages sample#memory_pgpgout=19988pages

Here's the file: https://gist.github.com/schneems/10025798
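A rough sketch of that kind of whole-box sweep, in case the gist link goes stale (the actual gist may sum different fields; here I assume it walks every readable /proc/<pid>/smaps and adds up the Rss lines):

```ruby
# Sum Rss (kB) across every process visible in /proc and return megabytes.
# Inside a container this over-counts: it sees processes and shared pages
# that Heroku does not bill against the dyno.
def all_processes_rss_mb
  kb = Dir.glob("/proc/[0-9]*/smaps").sum do |file|
    begin
      File.foreach(file).sum { |line| line =~ /^Rss:\s+(\d+)\s+kB/ ? $1.to_i : 0 }
    rescue Errno::EACCES, Errno::ENOENT
      0 # process exited or is not readable by this user
    end
  end
  kb / 1024.0
end
```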

@jeremyhaile

I guess there is no way to get this information directly from Heroku, other than using log drains, right?

I took a look at their API and don't see any memory reporting. I wish Heroku offered its "runtime-metrics" via the API for each dyno.

@jeremyhaile

Have you tried reaching out to Heroku support to see if there is a command you could run to get the same value they are using?

@jeremyhaile

Have you looked at how oink is doing it? Do any of these approaches match Heroku?

https://github.com/noahd1/oink/blob/master/lib/oink/instrumentation/memory_snapshot.rb

@schneems
Member Author

I work for Heroku 😀

So this is actually a Linux concern (http://fabiokung.com/2014/03/13/memory-inside-linux-containers/): Linux is not container-aware when it comes to containers like LXC (which Heroku uses). While you can easily get the memory from outside the container, getting it accurately from within is really, really hard (perhaps impossible?). As you mentioned, we do drain the information via log-runtime-metrics; unfortunately that info is gathered from outside of the container.

Another suggestion someone gave me at RailsConf was to take a look at how Passenger measures memory, as they do something similar to puma_auto_tune (so I've been told).

@jeremyhaile

Ah, I see. What I think would be nice is a way to use the heroku gem to query for the latest memory check. For example, the values that log-runtime-metrics displays in the log could be stored somewhere and then retrieved via the API (i.e., the heroku gem).

Make sense? Obviously this isn't a tiny change, but it would really help with things like puma_auto_tune.

@ksoderstrom
Contributor

I briefly checked the Passenger code, and while they seem to have several memory measurements, this looks interesting: https://github.com/phusion/passenger/blob/master/ext/common/Utils/ProcessMetricsCollector.h#L495

They're using PSS + private dirty RSS + swap. I've been quite busy lately, but will try to get some time to test and see how that would work with puma_auto_tune/puma_worker_killer.
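Roughly, that per-process calculation would look like this (my reading of the linked collector, not a verbatim port; all fields come from /proc/<pid>/smaps and are in kB):

```ruby
# Passenger-style estimate: PSS + Private_Dirty + Swap for one process, in MB.
def passenger_style_mb(pid)
  totals = Hash.new(0)
  File.foreach("/proc/#{pid}/smaps") do |line|
    totals[$1] += $2.to_i if line =~ /^(Pss|Private_Dirty|Swap):\s+(\d+)\s+kB/
  end
  (totals["Pss"] + totals["Private_Dirty"] + totals["Swap"]) / 1024.0
end
```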

@schneems
Member Author

Thanks for the tip. I'm running on a Mac; if you want an example smaps snapshot, check out test/fixtures. It looks like Private_Dirty is the key we would be referencing. Here's a link from some quick googling:

http://stackoverflow.com/questions/17594183/what-does-private-dirty-memory-mean-in-smaps


@ksoderstrom
Contributor

I tested with PSS + dirty + swap, but the results are far from accurate.

2014-04-30T22:26:41.747623+00:00 heroku[web.1]: source=web.1 dyno=heroku.23103076.0038ac92-bc5d-4a8b-84d3-798ead3adfa7 sample#load_avg_1m=0.00
2014-04-30T22:26:41.747883+00:00 heroku[web.1]: source=web.1 dyno=heroku.23103076.0038ac92-bc5d-4a8b-84d3-798ead3adfa7 sample#memory_total=314.22MB sample#memory_rss=314.21MB sample#memory_cache=0.01MB sample#memory_swap=0.00MB sample#memory_pgpgin=271176pages sample#memory_pgpgout=190736pages
2014-04-30T22:26:44.006693+00:00 app[web.1]: PumaAutoTune (179s): All is well puma.resource_ram_mb=509.7822265625 puma.current_cluster_size=5

I've also tested a bit with different calculations for master and child processes. I believe that RSS + dirty + swap for the puma master process, and dirty + swap for the workers should give a pretty accurate result, but so far no luck...
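One way to read that formula, assuming we can list the cluster's PIDs (the helper names here are made up): count the master's full RSS plus swap once, and only the dirty/swapped private pages for each forked worker, since the workers' clean pages are shared with the master via copy-on-write.

```ruby
# Sum selected smaps fields (kB) for one process.
def smaps_sum_kb(pid, fields)
  pattern = /^(#{fields.join('|')}):\s+(\d+)\s+kB/
  File.foreach("/proc/#{pid}/smaps").sum { |line| line =~ pattern ? $2.to_i : 0 }
end

# Estimate total cluster memory in MB.
def cluster_estimate_mb(master_pid, worker_pids)
  kb  = smaps_sum_kb(master_pid, %w[Rss Swap])
  kb += worker_pids.sum { |pid| smaps_sum_kb(pid, %w[Private_Dirty Swap]) }
  kb / 1024.0
end
```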

@schneems
Member Author

schneems commented May 1, 2014

I played with this for a bit. It looks like they're using real_memory = swap + private_dirty (https://github.com/phusion/passenger/blob/master/ext/common/Utils/ProcessMetricsCollector.h#L126).

When I try the same thing (https://gist.github.com/schneems/4ac2983f0abf3e200ec9):

2014-05-01T19:47:38.792533+00:00 app[web.1]: Experiment: measure#puma_memory_total=271.8359375
2014-05-01T19:47:39.230994+00:00 app[web.1]: Experiment: measure#memory_total=367.96875
2014-05-01T19:47:39.226449+00:00 heroku[web.1]: source=web.1 dyno=heroku.5385926.f178766a-0af7-4379-8dc9-f030f419af94 sample#memory_total=320.13MB sample#memory_rss=320.00MB sample#memory_cache=0.07MB sample#memory_swap=0.06MB sample#memory_pgpgin=102888pages sample#memory_pgpgout=20950pages

Neither 271.84 nor 367.97 is close to the Heroku-reported 320 MB. It's interesting, though, that they're both about 15% off and in opposite directions, so if you average them together you get about 320, which is pretty freaking close. This is voodoo math right now, as I have no good reason why we should be doing this average other than that it just happens to land close. Also, this method only works when using a container, as we're grabbing the entire memory of the whole OS.
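Just to spell out the arithmetic:

```ruby
# Purely illustrative "voodoo average" of the two experiment numbers above:
(271.84 + 367.97) / 2  # => 319.905, within a MB of Heroku's reported 320.13
```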

@ksoderstrom
Contributor

But only using swap + private_dirty doesn't make sense, except for comparison between workers. To calculate the total memory size we need to include the shared memory somewhere, either as RSS (once?) or PSS per process. I still don't understand why including RSS in the master process doesn't give better results.

@ksoderstrom
Contributor

I did some more testing, and found something interesting while checking the memory of an instance only running bash:

Running `bash` attached to terminal... up, run.3434
~ $ ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:00 ps-run
    2 ?        S      0:00 bash
    3 ?        R+     0:00 ps ax
~ $ cat /proc/2/smaps | grep 'Rss' | awk '{ s+= $2 } END { print s }'
2052
~ $ cat /proc/2/smaps | grep 'Shared_Clean\|Shared_Dirty' | awk '{ s+= $2 } END { print s }'
1336
~ $ cat /proc/2/smaps | grep 'Private_Clean\|Private_Dirty' | awk '{ s+= $2 } END { print s }'
716
~ $

RSS should be Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty. Private_* is private to that process, and Shared_* is of course shared between multiple processes. In this case, RSS for pid 2 (bash) is 2052 kb, the shared parts are 1336 kb, and the memory private to this process is 716 kb.

My guess would be that this instance would use about 2mb of memory. Sounds reasonable since the RSS for bash is 2 mb, right?

2014-05-01T23:18:40.900237+00:00 heroku[run.3434]: source=run.3434 dyno=heroku.23103076.d44dfb2a-d74e-43a8-8b7c-307fc462ac68 sample#memory_total=1.01MB sample#memory_rss=0.73MB sample#memory_cache=0.28MB sample#memory_swap=0.00MB sample#memory_pgpgin=1395pages sample#memory_pgpgout=1137pages

Turns out that memory_rss is closer to only the private memory of the bash process. Does this mean that the shared memory might actually be shared with other containers or the host? I've tried to find info about this, but haven't found anything yet.

If memory can be shared between containers and/or host, it will be even harder to figure out the memory use for a specific process...
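For completeness, the same breakdown in Ruby (same field names as the shell commands above; values are kB):

```ruby
# Break a process's smaps down into shared vs. private components so the
# pieces can be compared against what log-runtime-metrics reports.
def smaps_breakdown_kb(pid)
  fields = %w[Rss Shared_Clean Shared_Dirty Private_Clean Private_Dirty Swap]
  totals = Hash.new(0)
  File.foreach("/proc/#{pid}/smaps") do |line|
    totals[$1] += $2.to_i if line =~ /^(\w+):\s+(\d+)\s+kB/ && fields.include?($1)
  end
  totals # e.g. {"Rss"=>2052, "Shared_Clean"=>1200, "Private_Dirty"=>640, ...}
end
```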

@allaire

allaire commented Jun 12, 2014

@schneems Would I hit this nasty bug on something other than Heroku? Let's say a basic DigitalOcean VPS running Ubuntu 12.04/14.04?

@schneems
Member Author

Basically, measuring effective memory on any Linux system seems to be impossible :( If you know you have the whole machine, you could watch swap usage and minimize that, but it's not ideal.

schneems added a commit that referenced this issue Jun 16, 2014
@bdmac

bdmac commented Jun 25, 2014

Any update on this lately? I switched back to puma_worker_killer from puma_auto_tune (not that that is all that relevant) and am seeing PWK report memory usage of 819mb when heroku reports 307mb.

@schneems
Member Author

I changed the calculation back to use RSS, which is what Unicorn Worker Killer uses. RSS is a better stopgap measure, as it errs toward reporting more memory used rather than less. It's still not perfect.

We're working on rolling out some tools internally to better expose memory. It's a surprisingly hard problem and will likely take a while.
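A minimal sketch of an RSS read along those lines (not necessarily the exact code the gem ships): VmRSS in /proc/<pid>/status is reported in kB, and the `ps` fallback covers non-Linux machines.

```ruby
# Return the resident set size of a process in MB.
def rss_mb(pid = Process.pid)
  status = "/proc/#{pid}/status"
  kb = if File.exist?(status)
         line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
         line ? line[/\d+/].to_i : 0
       else
         `ps -o rss= -p #{pid}`.to_i # ps reports RSS in kB
       end
  kb / 1024.0
end
```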


@bdmac

bdmac commented Jun 26, 2014

Somewhat strange that it's so tough when Heroku reports on this in the logs (via log-runtime-metrics) and obviously knows when to issue R14 warnings as well. As a potentially terrible solution, could you just periodically parse it out of the logs, haha? Another idea would be to have whatever is monitoring memory write the current usage to an ENV var whenever it outputs to the log?

@schneems
Member Author

The way memory is exposed outside the container is very different from how it's exposed inside. More info: http://fabiokung.com/2014/03/13/memory-inside-linux-containers/

@nathany

nathany commented Jun 26, 2014

@schneems Would this gem still be useful for relative memory usage? Memory jumped by 369MB on one dyno today, and I'm finding it surprisingly difficult to determine which action was the culprit. A lot can happen in the 20 seconds' worth of requests between log-runtime-metrics output. Thinking of adding memory_total to our lograge logs.

@igorbernstein

@maxim

maxim commented Jul 18, 2014

Could you folks clarify whether this issue is irrelevant if hosting without containers (simple VPS like DigitalOcean/Linode)?

@bdmac

bdmac commented Jul 18, 2014

Would also be interested in knowing if this affects PX dynos on Heroku, since they are not multi-tenant.


@schneems
Member Author

@maxim YMMV. RSS and PSS are both approximate measures. Honestly, it doesn't matter where you run: they're only ever indicators of whether you're using more or less memory, never a byte-by-byte measure of exact usage.

@bdmac A PX dyno is a container that takes up the ENTIRE instance, so it will be better, but not perfect. It buys you the ability to use existing OS tools like top to get an accurate view of the resources being consumed. It's impossible to know you're in a container when the container has access to the WHOLE machine, so right now this library doesn't have any special-case code for the PX case.

The issue really comes from the nebulous nature of memory measurement rather than whether it's run in a container or not. What number does DigitalOcean/Linode use to start throttling your app? I.e., how do you know when your app has started swapping to disk because no more physical memory is available?

Hard as it is to believe, memory measurement at the OS level is almost always an approximation at best. Trying to match our approximation to the host's approximation is the hard part. I've never run this in production anywhere other than Heroku; I'd be interested in hearing about experiences.

@maxim

maxim commented Jul 25, 2014

@igorbernstein btw, I was looking into oink and found this, so it's probably not going to work well for puma.

@schneems I believe New Relic also uses RSS for memory measurements, and their graphs always correlate perfectly with me getting "cannot allocate memory" errors, so perhaps my mileage will be pretty spot on? It's not like the function of "restart puma riiight about nnnow" can ever be precise; false positives are not really an issue as long as all the true positives are caught. So, considering the above, is this usable in production?

@maxim

maxim commented Jul 27, 2014

@schneems Heh, I forgot that this isn't the puma_auto_tune repo, but hopefully it was clear that that's what I was referring to.

@maxrosecollins

Has this been fixed yet?

@schneems
Member Author

Has this been fixed yet?

It can't really be "fixed" from inside of a container until containers expose that information.

@fernandokosh

And what about Xen? Any issues?

@pratik60

@schneems The first line of your README for puma_auto_tune says not to use it until this is fixed. Should we just ignore that and use it?

@schneems
Member Author

Use at your own risk. I recommend puma worker killer rolling restarts instead.

@watsonian

Just as an update, I was playing around with this some and it appears as though memory usage is accurate until you start swapping. After that, it fails to account for the swap and constantly shows up as being below the max memory limit because everything over that is getting swapped out.

As such, if you have a very regular memory leak that happens over the course of several minutes, it should be possible to catch it and restart with a fairly decent level of accuracy so long as you leave enough headroom in the percentage (e.g., 0.9 rather than 0.98) and a short enough frequency.

It's also worth noting that up until Oct 28, 2015, the filesystem cache was being included in the memory total reported by the platform (see here), which could have thrown off the calculation before as well.
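A hedged sketch of the workaround that implies: also count VmSwap from /proc/<pid>/status (field availability depends on the kernel), so pages that have been swapped out still count toward the limit.

```ruby
# RSS + swap for one process, in MB, read from /proc/<pid>/status (kB values).
def rss_plus_swap_mb(pid = Process.pid)
  totals = Hash.new(0)
  File.foreach("/proc/#{pid}/status") do |line|
    totals[$1] = $2.to_i if line =~ /^(VmRSS|VmSwap):\s+(\d+)\s+kB/
  end
  (totals["VmRSS"] + totals["VmSwap"]) / 1024.0
end
```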

@jjb

jjb commented Jan 21, 2017

What about an approach where the app processes its own logs using a middleware?

Simplest approach: set up a log drain to send the app's logs back to itself; a middleware picks them up, looks at the memory info, and discards everything else (rough sketch at the end of this comment). Drawbacks:

  1. It generally adds complexity to the production app.
  2. It adds load to the production app (although even with enormous log volume, if it's the first middleware I bet it wouldn't affect the app's performance much).
  3. Logs would be POSTed to random dynos, so a dyno might go several minutes without knowing its own memory info.

More complicated approach, to deal with problem 2: make a standalone app that accepts the log drain and nothing else. It filters out the memory info and sends it back to the production app, which then only needs to handle a couple of POSTs per minute.

Even more complicated approach, to deal with problems 2 and 3: same standalone app as above, but instead of it sending logs back to the production app, the production app queries the standalone app on a schedule. This might offer other interesting opportunities for coordination, although it's getting into territory that is probably anathema to 12-factor.
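To make the simplest approach concrete, a hypothetical Rack middleware; the drain path, the lack of auth, and treating the POST body as plain lines containing sample#memory_total are all assumptions here, not anything Heroku documents for this use.

```ruby
# Accepts the app's own log drain on a made-up path and remembers the last
# sample#memory_total value seen, so other code can consult it.
class MemoryDrain
  LAST = { mb: nil, at: nil }

  def initialize(app)
    @app = app
  end

  def call(env)
    return @app.call(env) unless env["PATH_INFO"] == "/_memory_drain"

    body = env["rack.input"].read
    body.scan(/sample#memory_total=([\d.]+)MB/) do |(mb)|
      LAST[:mb] = mb.to_f
      LAST[:at] = Time.now
    end
    [200, { "Content-Type" => "text/plain" }, ["ok"]]
  end
end

# e.g. in config.ru:  use MemoryDrain
```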

@schneems
Member Author

Check out https://github.com/noahd1/oink; it sounds similar to that proposal.

@ghost

ghost commented May 6, 2018

@schneems any reason why PSS was removed entirely? The gem previously had an option to choose which type to use: GetProcessMem.new(Process.pid, mem_type: 'rss') (https://github.com/schneems/get_process_mem/tree/edd767cff8a7463e756bac323742e711d445380f).

There are a lot of libraries that use PSS as well: https://github.com/propella/multimem (Ruby; it does the same thing but slower than this one), https://www.selenic.com/smem/ (Python), https://github.com/ColinIanKing/smemstat (Python).

There is no problem with PSS. It isn't inaccurate; it's just PSS. For some processes it's better to use RSS, for others PSS. I really don't understand why PSS was removed.

@schneems
Member Author

schneems commented May 6, 2018

@vfreely This would be a better question for a new issue; it's not related to the core issue here other than that I was using PSS to measure. The short answer is that I don't remember. I think maybe it was because it wasn't supported in all OSes, but I'm not 100% positive.

@ghost

ghost commented May 6, 2018

@schneems Got it, thanks for the answer.

@schneems schneems changed the title PSS Memory usage not accurate Memory usage not accurate May 7, 2018
@schneems schneems changed the title Memory usage not accurate Memory usage not accurate on Heroku May 7, 2018
@iGEL

iGEL commented Aug 17, 2018

@schneems I was wondering whether the recent addition (Docker 1.13, January 2017) of the statistics under /sys/fs/cgroup/memory might be of use here?

@schneems
Member Author

Heroku runs on LXC containers; do you know if that cgroup interface is exposed there as well?

@iGEL

iGEL commented Aug 17, 2018

@schneems No, sorry, I don't know, as we don't use it. However, it would be useful, as it seems to be becoming a de facto standard.

In general, this problem also applies to Docker and all other cgroup-based containers, doesn't it?

@schneems
Member Author

The somewhat canonical issue I've been watching is moby/moby#8427 (comment). If that is indeed a de facto standard, then I would post there to see what other people think of its use.

@schneems
Member Author

Kinda, but get_process_mem's job is to find the memory of a specific process, while that file tells us the memory of the whole system. We could maybe add helper methods like GetProcessMem.heroku_private_space_mem and GetProcessMem.heroku_private_space? (so we can know when we're in a private space). I'm not sure what the canonical way of knowing is, though I guess the presence of that file would be a pretty good indication.
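If that interface is present, a hypothetical helper might look like the following. To be clear, this is a standalone sketch: the GetProcessMem method names above don't exist in the gem, and the cgroup v1 mount point is an assumption that depends on how the container wires things up.

```ruby
# Read container-level memory from the cgroup v1 memory controller, if mounted.
module CgroupMem
  USAGE_FILE = "/sys/fs/cgroup/memory/memory.usage_in_bytes"
  LIMIT_FILE = "/sys/fs/cgroup/memory/memory.limit_in_bytes"

  def self.available?
    File.exist?(USAGE_FILE)
  end

  def self.usage_mb
    File.read(USAGE_FILE).to_i / (1024.0 * 1024.0) if available?
  end

  def self.limit_mb
    File.read(LIMIT_FILE).to_i / (1024.0 * 1024.0) if File.exist?(LIMIT_FILE)
  end
end
```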

@ioquatix

@schneems How does Heroku determine the space used by a container?

@schneems
Member Author

How does Heroku determine the space used by a container?

It's measured externally, by the host OS. We don't document the exact mechanism anywhere.

I want to close this issue in the interest of cleaning up issues. It's a much bigger problem than can be solved in just this thread.
