v0.40.0: Unexpected disk cache grows #102

Open
nixargh opened this issue Jan 25, 2024 · 3 comments

Comments

nixargh commented Jan 25, 2024

Hi, team!

We used 0.39.1 before, and disk cache consumption was quite stable, around 100 GB of a 1 TB partition.
But with 0.40.0 the cache started to grow continuously and reached about 600 GB in a few hours. Rolling back fixed the issue.
So something has changed, and I don't understand what will happen to geesefs when the cache partition is full.
Update: our system does a lot of writes and moves but few reads, so I believe it is the write cache that grows.

  • Can you shed some light on how the disk cache works?
  • Could you give me some advice on suitable parameters, etc.?

Parameters we use:
/usr/local/bin/geesefs --endpoint 'https://s3.host' --region foo --storage-class STANDARD --uid 5001 --gid 5001 --no-checksum --memory-limit 3072 --read-ahead-large 20 --max-flushers 32 --max-parallel-parts 32 --part-sizes '50' --single-part 50 --cache '/media/cache/s3fuse/%i' -o allow_other --cheap --no-specials '%i' '/media/%i'
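
For anyone trying to reproduce this, here is a minimal sketch for tracking how large the cache directory has become. It assumes the cache lives under /media/cache/s3fuse (the prefix of the `--cache` option above) and relies only on `du` and `awk`. The function names and the 200 GiB threshold are hypothetical, and nothing here is part of geesefs itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of geesefs: print the disk usage of a directory in KiB.
cache_usage_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Hypothetical helper: succeed (exit 0) when the directory uses more than the given KiB limit.
cache_over_limit() {
    [ "$(cache_usage_kb "$1")" -gt "$2" ]
}

# Example: warn when the cache exceeds about 200 GiB (assumed threshold).
CACHE_DIR="${CACHE_DIR:-/media/cache/s3fuse}"
LIMIT_KB=$((200 * 1024 * 1024))
if [ -d "$CACHE_DIR" ] && cache_over_limit "$CACHE_DIR" "$LIMIT_KB"; then
    echo "disk cache at $CACHE_DIR exceeds ${LIMIT_KB} KiB" >&2
fi
```

Run from cron or a systemd timer, this only reports growth. It deliberately does not delete anything, since the thread does not establish whether removing cache files under a running mount is safe.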

Collaborator

vitalif commented Feb 6, 2024

Hi. Sorry for not answering sooner :-)
It's a great surprise to me that there are real users of the disk cache :-)
The disk cache was in fact never a 100% complete implementation: it has always lacked eviction, i.e. the cache size was never limited.
Before 0.40 there was an ugly "popularity" tracking implementation, but I thought it was a complete mess and removed it in 0.40. In fact, some people who tried the disk cache before 0.40 even filed bugs along the lines of "why is the disk cache empty?!", because the idea was that only "popular" files were evicted from memory to disk, while "non-popular" files were simply removed from memory without being copied to disk.
Your issue means that it was limiting the disk cache, so it was useful to some extent :)
I can think about bringing it back, at least in some modified form...
What's your use case, by the way? Do you have a subset of "hot" files?

Author

nixargh commented Feb 6, 2024

Hi, thanks for answering )
As I wrote before, we do a lot of writes and only a few reads, so I believe these are not "hot" files at all but rather intermediate files on their way to S3.

Collaborator

vitalif commented Feb 6, 2024

In that case I suppose it's better for you to disable the disk cache altogether :)
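
A sketch of what that could look like, assuming the mount command from the original report: the same options with only `--cache '/media/cache/s3fuse/%i'` dropped. Whether the remaining values (for example `--memory-limit 3072`) are still appropriate without a disk cache is not confirmed in this thread.

```shell
# Same invocation as in the original report, minus the --cache option.
# '%i' is kept exactly as it appears in the report.
/usr/local/bin/geesefs \
    --endpoint 'https://s3.host' --region foo --storage-class STANDARD \
    --uid 5001 --gid 5001 --no-checksum \
    --memory-limit 3072 --read-ahead-large 20 \
    --max-flushers 32 --max-parallel-parts 32 \
    --part-sizes '50' --single-part 50 \
    -o allow_other --cheap --no-specials \
    '%i' '/media/%i'
```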
