Fix the IOStats computation #1710
index/scorch/scorch.go
Outdated
result.VisitFields(func(f index.Field) {
	atomic.AddUint64(&s.stats.TotBytesIndexedAfterAnalysis,
		analysisBytes(f.AnalyzedTokenFrequencies()))
	if segment.CollectIOStats {
I would recommend against adding this flag in scorch_segment_api.
Instead, make it a scorch config option and apply it only while:
- Estimating usage within bleve
- Re-reporting, from bleve, the usage reported by zap
Leave zap's computation always on.
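A minimal sketch of that suggestion, assuming a hypothetical `collectIOStats` config option; the type and function names below are invented for illustration and are not bleve's or zap's actual API. The segment layer always counts, and the index layer only copies the count into its stats when the option is on:

```go
package main

import "sync/atomic"

// scorchConfig is a hypothetical stand-in for the index-level config;
// the option name collectIOStats is invented for this sketch.
type scorchConfig struct {
	collectIOStats bool
}

// segmentWriter stands in for the segment (zap) layer, which always counts.
type segmentWriter struct {
	bytesWritten uint64
}

// write records the size of p unconditionally.
func (w *segmentWriter) write(p []byte) {
	atomic.AddUint64(&w.bytesWritten, uint64(len(p)))
}

// indexStats stands in for the stats that the index layer exposes.
type indexStats struct {
	totBytesWritten uint64
}

// reportSegmentBytes copies the segment's counter into the index stats,
// but only when the config option is enabled.
func reportSegmentBytes(cfg scorchConfig, w *segmentWriter, s *indexStats) {
	if !cfg.collectIOStats {
		return
	}
	atomic.AddUint64(&s.totBytesWritten, atomic.LoadUint64(&w.bytesWritten))
}
```

With this shape the segment API needs no flag; only the caller decides whether the numbers are surfaced.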
d359c60 to ff04e75
ca57327 to 1f157d2
Unit test
0b7fd9d to f8f966f
Turns out,
contentFieldMapping.IncludeInAll = true
tmpIndexPath2 := createTmpIndexPath(t)
Cleanup of this path is missing?
Also, do we need to defer it? We could clean up right after the index close (doesn't matter much).
It could also be written as a table-driven test, since only a few values vary across the runs.
Done
5555143 to 2acbf6a
idx.Close()

	return statValError
}
Insert a new line at the end of a method.
		t.Fatal(err)
	}
	cleanupTmpIndexPath(t, tmpIndexPath4)
}
Ditto.
2acbf6a to 351885c
The newly introduced stats:
num_bytes_read_at_query_time - the bytes read from disk while querying
num_bytes_indexed_after_analysis - the bytes written to disk while indexing
are mainly used to track disk utilisation for the current index. Until now, num_bytes_indexed_after_analysis (being renamed to num_bytes_written_at_index_time) counted only the total bytes of the tokens produced by analysing a field's content. However, a user can also store a field, enable doc values, store a term's location information, or include the field's content in the _all field. Each of these options adds to disk utilisation and has to be reflected in the stats. PRs #119 and #125, together with the current one, aim to make these changes.
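To make the accounting concrete, here is an illustrative sketch only. The names (`fieldOptions`, `estimateBytesWritten`, `locationEntrySize`) and the per-option costs are assumptions chosen for the example, not the formula bleve or zap actually uses. It shows how each enabled option adds to a baseline that counts only the analysed tokens:

```go
package main

// locationEntrySize is an assumed fixed cost per stored term location;
// the real on-disk encoding is variable-length and will differ.
const locationEntrySize = 8

// fieldOptions mirrors the mapping options named in the description.
type fieldOptions struct {
	store            bool
	docValues        bool
	includeLocations bool
	includeInAll     bool
}

// token is a simplified analysed token: its term and how often it occurs.
type token struct {
	term        string
	occurrences int
}

// estimateBytesWritten starts from the bytes of the analysed terms (the old
// behaviour) and adds a rough cost for every enabled option.
func estimateBytesWritten(value []byte, tokens []token, opts fieldOptions) uint64 {
	var termBytes, locBytes uint64
	for _, t := range tokens {
		termBytes += uint64(len(t.term))
		locBytes += uint64(t.occurrences) * locationEntrySize
	}
	total := termBytes
	if opts.store {
		total += uint64(len(value)) // the raw field value is kept as well
	}
	if opts.docValues {
		total += termBytes // assumed: terms are written again as doc values
	}
	if opts.includeLocations {
		total += locBytes
	}
	if opts.includeInAll {
		total += termBytes // assumed: terms are indexed again under _all
	}
	return total
}
```

Under these assumptions, a field with every option enabled reports several times the bytes of the token-only baseline, which is the gap the stats change is meant to close.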