-
Notifications
You must be signed in to change notification settings - Fork 107
mt-index-cat: support custom patterns and improve bench-realistic-workload-single-tenant.sh #1042
Conversation
cmd/mt-index-cat/main.go
Outdated
fmt.Println(" the chances need to add up to 100") | ||
fmt.Println(" operation is one of:") | ||
fmt.Println(" pass (passthrough)") | ||
fmt.Println(" <number>rcnw (replace number random consecutive nodes with wildcards") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a confusing sentence replace number random consecutive nodes with wildcards
. I think i understand how it's meant, but hard to read it a few times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about replace <number> random consecutive nodes with wildcards
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that would be much clearer because the number
doesn't look like a word in the sentence
|
||
// random choice between replacing a node with a wildcard, a char with a wildcard, and passthrough | ||
func pattern(in string) string { | ||
mode := rand.Intn(3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't rand
need a seed? otherwise it will keep repeating the same pattern on every run i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mst@mst-nb1:~/abc$ cat main.go
package main
import "math/rand"
import "fmt"
func main() {
fmt.Println(rand.Intn(10))
fmt.Println(rand.Intn(10))
fmt.Println(rand.Intn(10))
fmt.Println(rand.Intn(10))
fmt.Println(rand.Intn(10))
}
mst@mst-nb1:~/abc$ go build main.go
mst@mst-nb1:~/abc$ ./main
1
7
7
9
1
mst@mst-nb1:~/abc$ ./main
1
7
7
9
1
mst@mst-nb1:~/abc$ ./main
1
7
7
9
1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this tool is used for repeated benchmark runs, it seems useful to have a consistent generated workload. no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you could say that, yeah...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're not doing cryptography here so I think this is fine.
// and otherwise, sometimes valid patterns are produced, | ||
// but it's also possible to produce patterns that won't match anything (if '.' was taken out) | ||
chars := rand.Intn(5) | ||
pos := rand.Intn(len(in) - chars) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if len(in)
is <5
then it is possible that this will result be rand.Intn(<=0)
(depending on chars
). That would panic according to: https://golang.org/pkg/math/rand/#Intn
|
||
// round rounds number d to the nearest r-boundary | ||
func round(d, r int64) int64 { | ||
neg := d < 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
round appears to be prepared to deal with negative numbers as if they were positive numbers, but the list of else if
s in roundDuration
appears to assume all numbers are positive. I think for consistency it would be better to either deal with negative numbers correctly or not everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
roundDuration rounds durations. it seems to be implied to me that durations can't be negative.
i see your point though but i would rather not cripple a utility function because it's currently only being called by roundDuration.
we can argue both ways on this but it's not worth the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course a duration can be negative. The sign tells you if the duration is the amount of time before or after the reference time.
Via a comment, at
metrictank/cmd/mt-index-cat/main.go
Line 192 in cda12b7
// set this after doing the query, to assure age can't possibly be negative |
You state that age cant be negative, but that is not true. If the local time where this code is running is wrong, then age can easily be negative. Likewise, if the localtime on the server that is writing to the index is wrong the age can also end up negative.
Both of these are problems that we should not be masking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But looking at the code, negatives already look to be handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true re clock wrong.
also callers of mt-index-cat could technically call the age() function on any integer, which is a 2nd reason how a duration could be negative.
while @woodsaj is correct that we don't seem to break upon negative durations, @replay also brings up a good point that roundDuration doesn't behave as expected when given negative numbers (it should round them also). i will fix
os.Exit(-1) | ||
} | ||
|
||
// we one or more of "<chance> <operation>" followed by an input string at the end. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the we
wasn't supposed to be there
@replay thank you. PTAL |
PR has been stuck for 2 weeks... |
@Dieterbe can you rebase first. |
func patternCustom(in ...interface{}) string { | ||
usage := func() { | ||
fmt.Println("usage of patternCustom:") | ||
fmt.Println("input | patternCustom <chance> <operation>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it needs to be made clear that <chance> <operation>
can be repeated. convention for that is
input | patternCustom <chance> <operation>[ <chance> <operation>...]
|
||
// percentage chance, and function | ||
func patternCustom(in ...interface{}) string { | ||
usage := func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is duplicated in flag.Usage(). It should be split out to its own function to use for both?
patternCustomUsage(withExample bool) string {
out := bytes.NewBuffer()
out.WriteString(fmt.Sprintln("patternCustom: transforms a graphite.style.metric.name into a pattern with wildcards inserted according to rules provided:"))
...
return out.String()
}
fmt.Println("the chances need to add up to 100") | ||
fmt.Println("operation is one of:") | ||
fmt.Println(" pass (passthrough)") | ||
fmt.Println(" <number>rcnw (replace <number> random consecutive nodes with wildcards") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consecutive nodes starting from where?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually mean
replace <number> consecutive nodes, from a random position with, a wildcard
what happens if the number of replacements is > than the (numberOfNodes - startPos)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also needs to be made clear that <number>
is a single digit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i clarified this.
if there's not enough nodes/characters, mt-index-cat will error out. that behavior doesn't need to be explicitly mentioned in the usage though.
fmt.Println("operation is one of:") | ||
fmt.Println(" pass (passthrough)") | ||
fmt.Println(" <number>rcnw (replace <number> random consecutive nodes with wildcards") | ||
fmt.Println(" <number>rccw (replace <number> random consecutive characters with wildcards") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the consecutive characters contained within a single node or will the potentially span multiple nodes.
what if the number of replacements is > (totalChars - startChar)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not node aware. i suppose someone could add a node aware version if the need arises.
if the number is too high, see previous comment.
use 1 simple pattern instead of a mix of various patterns. we simply don't do enough requests to have that become a consistent pattern. Also, we have to filter for which metrics to use. Without filtering we would get all metrics back, e.g.: 1 some.id.of.a.metric.1 1 some.id.of.a.metric.10 1 some.id.of.a.metric.100 1 some.id.of.a.metric.1000 1 some.id.of.a.metric.10000 1 some.id.of.a.metric.100000 ... 1 some.id.of.a.metric.99999 This meant replacing a single char with a '*' could match many more series then we meant to. e.g. the top one, replacing 1 with * would match ALL series, resulting in very unfair benchmark runs. Now we know that we only ever query for 1 or 10 series.
cda12b7
to
f104ca7
Compare
should be good to merge now |
# this selects exactly 10k series that will match the regex, out of which we randomly replace 1 char with a wildcard, resulting in queries usually for 1 series, and sometimes for 10 series (depending on whether the replaced char falls within the dynamic part an the end of the name or not) | ||
# so 20/25 chances for 1 series, 5/25 chances for 10 | ||
# as we execute 100Hz*300s=30k requests, this should give us a plenty high cache hit rate (while still testing the uncached code path) | ||
# in practice, the cache rate sometimes looks fairly low and i'm not sure why. but anyway (seeing about 15% hit partial and the rest are misses), and only in the last minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that sentence is strange but anyway (seeing about 15% hit partial and the rest are misses), and only in the last minute
. what is in the last minute, those hits/misses?
# in practice, the cache rate sometimes looks fairly low and i'm not sure why. but anyway (seeing about 15% hit partial and the rest are misses), and only in the last minute | |
# in practice, the cache rate sometimes looks fairly low and i'm not sure why (seeing about 15% hit partial and the rest are misses). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
No description provided.