Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear thread local values on UTF8TaxonomyWriterCache.close() #12013

Closed
wants to merge 2 commits into from

Conversation

vigyasharma
Copy link
Contributor

UTF8TaxonomyWriterCache keeps a ThreadLocal BytesRefBuilder, the bytes for which, are not removed on thread close. This leads to memory leaks.

The change uses CloseableThreadLocal instead, and closes out the object on cache.close()
Addresses #12000

@rmuir
Copy link
Member

rmuir commented Dec 12, 2022

Thanks, it looks better to use CloseableThreadLocal than ThreadLocal... but I don't understand what this "cache" is doing and why it actually documents that it never frees memory. Especially as a default!

As a followup can we open an issue to move away from Threadlocal storage here by default? I don't understand why we have this "cache" with this description:

/** A "cache" that never frees memory, and stores labels in a BytesRefHash (utf-8 encoding). */

Also, I'm trying to remove ThreadLocal (and CloseableThreadLocal) usages elsewhere in the codebase (e.g. #11998). These threadlocals are problematic in java apps that have many threads (or high thread churn rate), just as CloseableThreadLocal documents and attempts to workaround.

At a glance, seems to me that using a bounded LRU cache would be much more reasonable...

@rmuir
Copy link
Member

rmuir commented Dec 12, 2022

Also (summoning my inner @uschindler), curious what the perf impact is if we simply remove these caches completely. Maybe they are not relevant to the performance anymore due to faceting changes (e.g. storing labels in docvalues rather than payloads) ? Maybe they are even hurting performance?

@uschindler
Copy link
Contributor

Please remove the cache. Totally useless. It hurts more because the escape analysis can't work.
When tiered compilation stepped in, this is just a useless extra

My rule: Never ever cache object instances to prevent object creation and help GC. This was needed in in Java 6 and java 7, but not anymore. It has opposite problem: it actually creates those objects on heap and stresses GC.

@rmuir
Copy link
Member

rmuir commented Dec 13, 2022

Yeah, I'm highly suspicious of this cache. I dug into this and found that previously stored fields (!) were used for this lookup. So no surprise there was a cache around it, since this is a slow approach!!!

But @gautamworah96 fixed this, and these days BinaryDocValues are used instead: see #10490

IMO there is no sense in caching BinaryDocValues lookups into the heap, it is already mmap'd and the operating system will do this.

@vigyasharma
Copy link
Contributor Author

... but I don't understand what this "cache" is doing and why it actually documents that it never frees memory.

My understanding is that TaxonomyWriterCache caches ordinals for all categories created in the index so far, so that categories use the same ordinals when facet labels are added.

It seems that we need such a cache to get ordinals for newly added categories from documents that are still pending flush.
From TaxonomyWriterCache#put() docstring:

   * <p>The reason why the caller needs to know if part of the cache was cleared is that in that
   * case it will have to commit its on-disk index (so that all the latest category additions can be
   * searched on disk, if we can't rely on the cache to contain them).

However, with faster BinaryDocValue fields, maybe we don't need the "never evicting UTF8TaxonomyWriterCache" anymore. And we could make LruTaxonomyWriterCache as the default?

I can close this PR and make that change if it makes sense. Or we can merge this change and take it up in a separate issue. Are there faceting specific benchmarks that can help validate the change?

@rmuir
Copy link
Member

rmuir commented Jan 8, 2023

OK, crazy that having no cache at all could cause unnecessary amounts of flush/reopens. We want at least a small one.

I can close this PR and make that change if it makes sense. Or we can merge this change and take it up in a separate issue. Are there faceting specific benchmarks that can help validate the change?

+1 to make the LRU cache the default. And please, lets also remove this ThreadLocal-based cache completely in main branch, and deprecate in 9.5

@vigyasharma
Copy link
Contributor Author

Created a separate PR - #12092 to remove support for UTF8TaxonomyWriterCache from main. Will close this PR.

@vigyasharma
Copy link
Contributor Author

PR - #12093 to deprecate UTF8TaxonomyWriterCache in 9.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants