Conversation
gbanasiak
left a comment
Integration tests grind to a halt in a test that runs after the newly added streams challenge. I haven't looked into why exactly that happens, but it might be necessary to add cleanup of the items added by the logging-streams challenge to https://github.com/elastic/rally-tracks/blob/master/elastic/logs/tasks/index-setup.json, which is shared between challenges. Note that the integration tests for elastic/logs reuse the same ES cluster. Whatever is left behind by a previous test and not cleaned up by the current test can potentially affect the outcome.
Similarly, the new challenge should remove its data streams.
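For illustration, I mean something along these lines in index-setup.json, using Rally's delete-data-stream operation (a sketch only - the task name and the logs data stream name are placeholders, not actual file contents):

```json
{
  "name": "delete-logging-streams-data-stream",
  "operation": {
    "operation-type": "delete-data-stream",
    "data-stream": "logs"
  }
}
```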
Thanks for the comments, @gbanasiak - I addressed them.
@gbanasiak seems like it still has the same problem. Is it because it's crunching too much data? Based on the params it looks like it will handle 72 GB of data, which seems like a lot. OTOH the other test cases do that too?
gbanasiak
left a comment
The problem is definitely due to the state in which the ES cluster is left after the streams challenge test. The streams challenge test itself takes about a minute to complete, which is fine. The following test was taking around a minute earlier, and now gets stuck for more than 1h.
Please take a look at the further comments. If this doesn't help, the test should be reproduced locally and the state of the ES cluster verified.
@gbanasiak Thanks for the look. I solved this with the reroute because that was the easiest way; maybe there is a better way to represent this. My understanding was that it's https://github.com/elastic/rally-tracks/pull/816/files#diff-352462dbff160a0065b6f5dadc84cf7005070e2353a0663774cbf3b9d62165dbR267 that actually creates these data streams, and the new challenge is not calling the setup task on purpose (because it's all inlined for this special case). I just re-tested this locally and it seems to leave behind a data stream.
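For context, the reroute is just an ingest pipeline with a reroute processor, roughly this shape (a simplified sketch - the single logs destination is illustrative, not necessarily the exact pipeline in the PR):

```json
{
  "description": "Send every document to a single data stream",
  "processors": [
    {
      "reroute": {
        "destination": "logs"
      }
    }
  ]
}
```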
Should we maybe introduce a param similar to the one in rally-tracks/elastic/logs/track.json (line 223 at 24c4b0b) and based on that change the data streams? I tried to contain the necessary changes to the challenge, but maybe that's the wrong way of doing it.
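Roughly what I have in mind, following the Jinja parameter pattern the track already uses (the parameter and data stream names below are made up for illustration, and this assumes the shared task files go through the same Jinja rendering as track.json):

```json
{# hypothetical track parameter, defaulting to today's behaviour #}
{% set p_logging_streams = (logging_streams | default(false)) %}

{# wherever a target data stream is referenced in the shared tasks #}
"data-stream": "{{ 'logs' if p_logging_streams else 'logs-placeholder-default' }}"
```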
@gbanasiak Testing locally I found an issue, let's see whether it works now: I had used the wrong option, and it somehow didn't complain about this, but it also didn't do anything, so I guess it just got the default parameters this way. Also, the documentation seems to be off here - who should I report this to?
The question whether we should change track.json via a parameter, create a completely separate track, or keep it all in the challenge like right now still stands though.
Still times out, but it gets a little further now (two tests instead of one after the streams one) - I don't know enough about Rally to be able to tell what's going on here. Is this new test just pushing it over the edge? I can't reproduce any issue locally; when I run the challenge on my local machine, I'm getting a clean system afterwards. I think I need some knowledgeable eyes here.
A related question - with deleting the data stream after the test, the "final score" that's printed in the console after the race concludes no longer includes the numbers about ingest time and size after merge. Are these captured on the side in the metrics store? I'm interested in those numbers in particular.
@flash1293 Thanks for pointing that out.
More fundamentally, the cleanup is added at the end of the challenge, which is unusual and may affect Rally telemetry taken at the end of the race. In my initial comment I was suggesting to expand the cleanup steps shared between the challenges in index-setup.json instead.
Is the reroute absolutely necessary? Couldn't you achieve the same through conditional modification of the composable templates used in the setup?
Thanks for pointing out the documentation bug, I'll raise a Rally PR for that.
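To sketch what I mean by conditional modification: depending on a track parameter like the one discussed above, the track could install a differently shaped composable template instead of rerouting documents. Purely for illustration (the pattern, priority, and the logsdb setting are placeholders, not a concrete proposal, and whether this really achieves the same as the reroute would still need checking):

```json
{
  "index_patterns": ["logs-*-*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": {
      "index.mode": "logsdb"
    }
  }
}
```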
facepalm - will fix, thanks!
The name of the data stream doesn't matter too much, but we need all of the data from the different data sets in a single data stream - that's the whole idea. Not sure how to achieve that without a reroute. We could list the data streams explicitly, I guess. Forking the track completely is still possible, but I would like to avoid it if we can.
OK, let's try that. Can we use something more telling for the name, though?
Looking at this more closely, the
It didn't work for me; it failed with some error about "missing operands".
Can you confirm it wasn't caused by the first problem noted in #816 (comment) (operation attributes at the wrong JSON document level)?
Co-authored-by: Grzegorz Banasiak <grzegorz.banasiak@elastic.co>
Let's see whether it works with your latest change.
@gbanasiak Looks like there is no templating support for ingest pipelines... do you have another idea? We can also move it back into the challenge as a last resort.
Ah, of course.
Yup, that's the only option, I think. Then the conditional is not needed. At some point we might consider making this challenge runnable in serverless, which is when a corresponding integration test should be added.
@gbanasiak Alright, moved the pipeline back. The plan is to make this available on serverless as well within the next few months; I can adjust once we are there.
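In practice that means the challenge now registers the pipeline itself via Rally's put-pipeline operation, roughly like this (simplified - the task name, pipeline id, and processor list are placeholders rather than the exact PR content):

```json
{
  "name": "create-streams-reroute-pipeline",
  "operation": {
    "operation-type": "put-pipeline",
    "id": "logs-streams-reroute",
    "body": {
      "processors": [
        { "reroute": { "destination": "logs" } }
      ]
    }
  }
}
```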
@flash1293 For backporting: what is the oldest ES version that supports the newly added challenge? |
@gbanasiak It shouldn't be backported; just 9.2 and soon (hopefully) serverless should be supported.
Related to https://github.com/elastic/streams-program/issues/368
Set up the `logs` data stream the way Streams does it and throw the `elastic/logs` dataset at it. This challenge measures indexing performance and storage usage; a comparable challenge to measure query performance is located in https://github.com/elastic/rally-tracks/blob/master/elastic/logs/challenges/logging-chicken.json
The processing pipeline still contains a couple of polyfills for features that haven't landed yet, but this is expected for now - I will keep it in sync with what happens in practice.