Inter-segment I/O concurrency. #13509

Open
wants to merge 1 commit into
base: main

Commits on Jun 20, 2024

  1. Inter-segment I/O concurrency.

    When searching across multiple segments, one doesn't need to wait until the
    first segment is done collecting to start doing the I/O for terms dictionary
    lookups in the next segment. However, doing so introduces a risk: the search
    on the first segment may visit so much data that it evicts the data that we
    had prefetched for the second segment before we start searching that second
    segment. So we need some way to control the amount of inter-segment
    I/O concurrency that we allow. I went for a threshold on the sum of the max doc
    of the segments for which we do I/O concurrently, the reasoning being that you
    can search many small segments concurrently since they won't load much into the
    page cache anyway, but you need to be more careful with larger segments. This
    heuristic is not perfect as it only looks at what happens in a single thread
    and only looks at `maxDoc` rather than e.g. the on-disk size of data, but I
    would still expect it to work well enough in practice. I opted for a
    conservative default value of 1,000,000. In other words, Lucene will do (part
    of) the I/O concurrently for as many segments as possible, as long as the sum
    of their `maxDoc` values doesn't exceed 1,000,000.
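
The threshold heuristic described above can be illustrated with a minimal sketch. This is not the code from this change: `SegmentBatcher` and `batchByMaxDoc` are hypothetical names, and the sketch assumes that segments are grouped greedily in index order and that a segment larger than the threshold is handled on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene API.
final class SegmentBatcher {

    /**
     * Groups segment ordinals into consecutive batches so that the sum of
     * maxDoc within a batch stays at or below the threshold. A single segment
     * whose maxDoc exceeds the threshold forms a batch of its own (assumption).
     */
    static List<List<Integer>> batchByMaxDoc(int[] maxDocs, long threshold) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long sum = 0;
        for (int ord = 0; ord < maxDocs.length; ord++) {
            // Close the current batch if adding this segment would exceed the budget.
            if (!current.isEmpty() && sum + maxDocs[ord] > threshold) {
                batches.add(current);
                current = new ArrayList<>();
                sum = 0;
            }
            current.add(ord);
            sum += maxDocs[ord];
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        int[] maxDocs = {5_000_000, 300_000, 400_000, 200_000, 250_000};
        // Segments in the same batch would have their I/O started together.
        System.out.println(batchByMaxDoc(maxDocs, 1_000_000)); // prints [[0], [1, 2, 3], [4]]
    }
}
```

In this example the large first segment is searched alone, while the three small segments that follow fit under the 1,000,000 budget and could have their terms dictionary I/O issued together.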
    
    We should do the same for collectors, but we cannot do it at the moment because
    we have a number of implementations that expect a segment to be fully collected
    before `Collector#getLeafCollector` is called on the next segment. So I am
    leaving it for a follow-up change.
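
To illustrate the collector problem mentioned above, here is a small sketch of a collector-like class that relies on the previous segment being fully collected when the next leaf is requested. `PerSegmentCounter` and its methods are hypothetical stand-ins, not Lucene's `Collector` interface; only the ordering assumption is the point.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a collector; not Lucene's Collector interface.
final class PerSegmentCounter {
    final List<Integer> countsPerSegment = new ArrayList<>();
    private int current = -1; // -1 means no segment has been started yet

    /** Starts a new segment; assumes the previous segment is fully collected. */
    Runnable getLeafCollector() {
        if (current >= 0) {
            countsPerSegment.add(current); // flush the previous segment's count
        }
        current = 0;
        return () -> current++; // called once per collected hit
    }

    void finish() {
        if (current >= 0) {
            countsPerSegment.add(current);
        }
        current = -1;
    }

    /** Collects two segments with 2 and 3 hits, creating leaves lazily or eagerly. */
    static List<Integer> run(boolean eager) {
        int[] hits = {2, 3};
        PerSegmentCounter counter = new PerSegmentCounter();
        List<Runnable> leaves = new ArrayList<>();
        if (eager) {
            for (int s = 0; s < hits.length; s++) {
                leaves.add(counter.getLeafCollector()); // all leaves created up front
            }
        }
        for (int s = 0; s < hits.length; s++) {
            Runnable leaf = eager ? leaves.get(s) : counter.getLeafCollector();
            for (int i = 0; i < hits[s]; i++) {
                leaf.run();
            }
        }
        counter.finish();
        return counter.countsPerSegment;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [2, 3]
        System.out.println(run(true));  // prints [0, 5]
    }
}
```

Creating every leaf before collecting any segment silently produces wrong per-segment counts here, which is the kind of breakage that makes this a change for a follow-up.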
    jpountz committed Jun 20, 2024
    3cea643