Skip to content

Conversation

pracucci
Copy link
Contributor

What this PR does:
This is an internal refactoring with no user impact. In the PR #3354 I received the feedback:

feel like error-handling code could be clearer by using errgroup. Also splitting the concerns would help -- one goroutine doing the scan, other N goroutines doing opening. Now both are mixed.

In this PR I'm addressing this feedback splitting the logic between the "scan" and the "opening".

Which issue(s) this PR fixes:
N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still bit more complicated than I would prefer. Suggestion to use errgroup would give us 1) handling of error (no need for errs and errsMx), 2) cancellation of context (right now if opening of one TSDB fails, other will still be tried), 3) simplified waiting (no need for wg).

When using errgroup, I'd suggest to spawn dir-walker as another goroutine, which can return its own error. Dir walker should react on group context cancellation, and by closing channel via defer, it will let tsdb-opening goroutines to stop as well.

Currently this code can return multiple errors – using errgroup will only return first one [I don't think this matters too much], but we can still log them all – as it is now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By returning here, we lose a consumer of queue channel, which can lead to blocking of dir walker, when it tries to send another item to the channel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh good point!

@pracucci pracucci force-pushed the refactor-ingester-tsdb-opening-concurrency branch from 0e92126 to 492dab6 Compare October 27, 2020 08:47
@pracucci
Copy link
Contributor Author

Thanks @pstibrany for your thoughtful code review. Could you take another look, please?

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think this is unnecessary since dir-walker reacts on groupCtx.Err() and closes queue.

@pracucci pracucci force-pushed the refactor-ingester-tsdb-opening-concurrency branch from 492dab6 to 1b0e31d Compare October 27, 2020 14:31
@pracucci pracucci merged commit a62f938 into cortexproject:master Oct 27, 2020
@pracucci pracucci deleted the refactor-ingester-tsdb-opening-concurrency branch October 27, 2020 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants