Replies: 11 comments
-
So if we take some of the ideas here, we could set an [...] We would have the most trending fragments in a faster in-memory cache, and all fragments stored on disk would need to be regenerated once in a while (which I think is acceptable).
-
Rails can only have 1 cache, so a combined in-memory cache and file cache isn't possible, unless someone writes it :) I don't think a memory cache is a viable option, since it eats scarce RAM, and would be a separate duplicate copy for each of our 3(?) workers. Redis is great, as is memcached in its own way, but the effort (and monetary cost) of converting to either, just so things will keep working as they do now, doesn't seem defensible. So I'm left thinking we should duct tape up the FileStore to keep working in a sustainable way, and leave it at that until it's a real problem.
-
I agree, it's probably not worth spending much time on caching until performance becomes a problem - IIRC regenerating large competition pages from scratch took less than 3s, I don't know about registration pages, though.
Converting to memcached looks pretty straightforward: http://edgeguides.rubyonrails.org/caching_with_rails.html#activesupport-cache-memcachestore and I assume someone implemented a Redis store as well.
The So: |
-
I like the simplicity of an `expires_at` plus a cron job to clean up expired entries.
I don't have any experience with redis, and am counting on you more experienced web devs to chime in when it's time to bite that bullet :)
-
Both Redis and memcached are pretty easy to deal with, at least compared to
other server software. You can even put them on the same server as the app,
though we should probably get one with more RAM in that case.
As long as you don't use Redis for anything other than caching, developers
don't need to install it. But then you might as well just use memcached.
On Wed, Jan 25, 2017 at 17:35 Jeremy Fleischman wrote:
I like the simplicity of a expires_at plus a cron job to clean up expired
entries.
I don't have any experience with redis, and am counting on you more
experienced web devs to chime on when it's time to bite that bullet :)
-
Could you elaborate on that? I tried googling the difference between memcached and redis, and landed on this SO answer: http://stackoverflow.com/a/11257333. It makes it sound like if you're starting something new, you should just use redis. Are there situations in which you would recommend using memcached instead of redis?
-
memcached is a cache. Period. It does caching very well, and nothing else.
Redis is an alternative data storage solution to MySQL etc. It can do most
anything, including caching. I assumed that the simple specialized tool
would be better than the hack-of-all-trades, but according to that SO
article that is no longer true.
One thing I worry about is the temptation to put some data in Redis, since
it's so simple and cool. The added complexity is invisible at first.
The other problem is making it harder to be a WCA developer if it's used for
non-caching functionality.
We got into Redis because it is the backend to Resque, when DelayedJob
failed us.
On Fri, Jan 27, 2017 at 11:36 PM, Jeremy Fleischman wrote:
As long as you don't use Redis for anything else than caching, developers
don't need to install it. But then you might as well just use memcached.
Could you elaborate on that? I tried googling the difference between
memcached and redis, and landed on this SO answer:
http://stackoverflow.com/a/11257333. It makes it sound like if you're
starting something new, you should just use redis. Are there situations in
which you would recommend using memcached instead of redis?
-
I fear we may have gotten too scared about the memory usage of redis. And if we don't like it, we can switch back. Redis should be a lot better for us than having to constantly clear the cache and remove files, with the server going down in the meantime; if none of us are around when that happens, it won't be going back up any time soon. Redis also has an option to persist storage to disk if we want the cache to survive a server restart. I strongly recommend we switch to something that isn't FileStore.
-
Just a heads up regarding redis, since we recently ran into this issue: https://redis.io/topics/faq#background-saving-fails-with-a-fork-error-under-linux-even-if-i-have-a-lot-of-free-ram (tl;dr: you need at least as much free RAM as Redis is currently using, otherwise saving to disk won't work\*, or you need to enable [...]).
\* (Automatically) compacting the append only file (AOF) will also not work if you don't have enough RAM. Not compacting the log will lead to horribly long startup times.
-
@coder13, we tend to have hundreds of megabytes of cached data on disk. If we switch to Redis, my understanding is that all of that data is going to be stored in memory instead. As I understand it, this isn't a question of how efficient Redis is, it's a fundamental issue with how much stuff we cache.
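
For anyone wanting to put a number on this before deciding, here is a minimal Ruby sketch that adds up the size of every file under a cache directory. The helper name `cache_dir_bytes` and the throwaway demo directory are made up for illustration; the real path is whatever directory the FileStore is configured to write to.

```ruby
require "find"
require "tmpdir"

# Sum the sizes (in bytes) of all regular files under +dir+.
# Hypothetical helper, not part of Rails.
def cache_dir_bytes(dir)
  total = 0
  Find.find(dir) do |path|
    total += File.size(path) if File.file?(path)
  end
  total
end

# Demo on a temporary directory so the sketch runs anywhere.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.cache"), "x" * 1024)
  Dir.mkdir(File.join(dir, "nested"))
  File.write(File.join(dir, "nested", "b.cache"), "y" * 2048)
  puts cache_dir_bytes(dir) # prints 3072
end
```

The total on disk is only a rough guide to how much memory an in-memory store would need for the same entries, since the two store data differently.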
-
You *can* have Redis persist its data to disk, but that's a questionable
thing to do when using it for caching. The advantage is that you have a
full cache if you ever reboot Redis, but I don't know that it's worth the
bother.
Redis will keep as much cached data in memory as you allocate to it. It
will evict old entries as needed to make room for new ones. This is
independent of how much is allocated to the current file system cache.
On Sun, Feb 11, 2018 at 2:53 PM, Jeremy Fleischman wrote:
@coder13 <https://github.com/coder13>, we tend to have hundreds of
megabytes of cached data on disk. If we switch to Redis, my understanding
is that *all* of that data is going to be stored in memory instead. As I
understand it, this isn't a question of how efficient Redis is, it's a
fundamental issue with how much stuff we cache.
-
Here is a dedicated issue to continue the discussion started in #1137:
@viroulep
I had a quick look at the cache directory on prod:
I don't feel comfortable just sending the modification knowing we may multiply this number.
I couldn't find a maximum for the cached content size in the configuration files nor could I find a time where the cache expires.
I think we should modify the production environment to set a maximum to the cache store.
Assuming the eviction policy used is LRU, that would still ensure that "trending" parts of the website stay in the cache.
Reading the Rails doc about caches I noticed we could also have some cache stored in the RAM, but I couldn't find information about how to easily set up a multilevel cache (similar to L1/L2/... in CPUs). If this is actually something doable, having a small memory cache before hitting the filesystem would be a nice optimization to do.
A quick gem search returns this, maybe that's something to look into (not part of this PR though :p)
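
To make the multilevel idea concrete, here is a rough plain-Ruby sketch of a small LRU memory tier sitting in front of a slower store. Everything in it (`LruCache`, `TwoLevelCache`, the `Hash` standing in for the file store) is hypothetical and only illustrates the lookup order; it is not how Rails or any particular gem does it.

```ruby
# A tiny LRU cache. Ruby hashes keep insertion order, so the first key
# is always the least recently used one.
class LruCache
  def initialize(max_entries)
    @max_entries = max_entries
    @data = {}
  end

  def read(key)
    return nil unless @data.key?(key)
    value = @data.delete(key) # re-insert to mark as most recently used
    @data[key] = value
  end

  def write(key, value)
    @data.delete(key)
    @data[key] = value
    # Evict from the front until we are back under the limit.
    @data.delete(@data.keys.first) while @data.size > @max_entries
    value
  end

  def key?(key)
    @data.key?(key)
  end
end

# Two-level lookup: memory tier first, then the slower backing store,
# and only then regenerate the fragment via the block.
class TwoLevelCache
  def initialize(memory, backing)
    @memory = memory
    @backing = backing # anything with [] and []=; a Hash stands in for the file store
  end

  def fetch(key)
    hit = @memory.read(key)
    return hit unless hit.nil?
    value = @backing[key]
    if value.nil?
      value = yield
      @backing[key] = value
    end
    @memory.write(key, value)
  end
end

backing = {}
cache = TwoLevelCache.new(LruCache.new(2), backing)
cache.fetch("a") { "page A" }
cache.fetch("b") { "page B" }
cache.fetch("a") { raise "should not regenerate" } # served from memory
cache.fetch("c") { "page C" } # evicts "b" from memory; "a" was used more recently
```

Note that this toy version treats `nil` as a miss, so `nil` values are never cached, and that the memory tier would still be duplicated per worker process.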
@jfly
The rails caching guide has the following to say about `ActiveSupport::Cache::FileStore`:
Dang! Good catch. This is definitely something we need to fix. Is it even possible to specify a max size on our `ActiveSupport::Cache::FileStore`? My google fu is not being very effective at answering this.
@timhabermaas
I don't think so. You probably need to use more sophisticated solutions like Memcached or Redis.
@larspetrus
You can always just delete everything from the cache directory.
So are these cache files accumulated from when the Rails app first was deployed?
I don't see an expiry date in the code. 1-2 weeks would maybe be the right level. Once everyone who went has looked at a competition, there should be very little traffic.
That's assuming the FileCache removes expired files.
@jfly
We spin up a new server semi-regularly, so this is only since the server was last deployed, which looks like 24 days:
How do you set an expiry date in the code?
@larspetrus
Wow. That's a lot more data than I expected.
I think you just pass in `expires_in: 10.days` to the `cache()` function, but the pages I look at aren't super clear, and I have to leave now.
@timhabermaas
`expires_at` does indeed delete the cached file (and not just ignore it): `handle_expired_entry` calls `delete_entry` of the file store.
So, we could add `expires_at: x` to the `cache` calls, but the expired cache pages need to be hit in order to be deleted. Thanks to the auto-expiring nature [1] of the cache keys this won't happen [2], so the file size gains are effectively 0. Besides that, most of the cached content (probably > 90%) wouldn't actually need an expiration date since e.g. old competitions are completely static.
Assuming the size of the cache directory is a problem [3], here's what we could do: There's `Cache#cleanup` which specifically cleans up expired entries. This could be used in some cron-like fashion to clean up old unused cache keys. We will then need to live with response time spikes every 10 days or so, but that's probably fine.
The LRU strategy @viroulep suggested would be a much nicer solution, though, because it will keep frequently visited pages always cached. Sadly, after digging more into the file store, I still think there's no way to achieve this without in-memory stores. We don't happen to run redis anyway?
[1] We could change the cache keys to `[competition.id, view], expires_in: 10.days` and this problem will go away. But this will lead to stale data and we then have to arbitrarily decide on the expiration date (too high = "why's the competition not yet posted, I uploaded it two days ago!", too low = unnecessary performance hit for all competition pages). Not a fan.
[2] e.g. we won't ever hit the old cache of competitions whose results have been updated, since the `results_updated_at` timestamp is part of the cache key.
[3] The directory should grow linearly with the number of competitions (and now locales). I haven't done any number crunching, but this might work out fine for quite some time depending on the available disk size?
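
The interaction between timestamped cache keys, delete-on-read expiry, and a periodic sweep can be sketched in plain Ruby. This is a toy model, not ActiveSupport code: `ExpiringStore`, its explicit `now:` argument, and the key strings are all invented for illustration, and the `cleanup` method only mimics the idea behind `Cache#cleanup` mentioned above.

```ruby
# Toy stand-in for a cache store with per-entry expiry.
class ExpiringStore
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @entries = {}
  end

  def write(key, value, expires_in:, now:)
    @entries[key] = Entry.new(value, now + expires_in)
  end

  # An expired entry is only deleted when something reads it.
  def read(key, now:)
    entry = @entries[key]
    return nil if entry.nil?
    if entry.expires_at <= now
      @entries.delete(key)
      return nil
    end
    entry.value
  end

  # Sweep every expired entry, whether or not anyone reads it again.
  def cleanup(now:)
    @entries.delete_if { |_key, entry| entry.expires_at <= now }
    nil
  end

  def size
    @entries.size
  end
end

DAY = 24 * 60 * 60
store = ExpiringStore.new
t0 = Time.at(0)

# The key includes the results timestamp, so updating results changes the key.
store.write(["comp-1", "results@day0"], "old page", expires_in: 10 * DAY, now: t0)
store.write(["comp-1", "results@day1"], "new page", expires_in: 10 * DAY, now: t0 + DAY)

later = t0 + 10 * DAY + 3600 # the old entry has expired, the new one has not
store.read(["comp-1", "results@day1"], now: later) # only the new key is ever read
puts store.size # prints 2: the orphaned old entry is still there
store.cleanup(now: later)
puts store.size # prints 1: the sweep removed it
```

The point of the sketch is footnote [2]: since nothing ever reads the old key again, expiry alone never frees it, and only the sweep does.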