Description
From #3547: During conversion of some old data, we found chunks that have duplicate labels written in them. As long as the label values are the same, we can still convert such data.

The fix for #3547 normalizes such labels, but in some situations that is not enough. For example, suppose the original chunk data contains the following chunks:
- Chunk 1: `{"__name__"="up", "instance"="a"}`
- Chunk 2: `{"__name__"="up", "instance"="a", "instance"="a"}`
The fix for #3547 normalizes chunk 2 to `{"__name__"="up", "instance"="a"}`, which is correct, but the "series ID" computed (in Cortex) for these two chunks is different. That means the chunks belong to different series during conversion and are processed at different times. This then causes problems during the final block build, when series are added to the index: blocksconvert tries to add the same label set to the TSDB index multiple times.
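The root cause can be demonstrated with a toy series ID. The `seriesID` function below is a hypothetical stand-in for Cortex's series fingerprint (the real one is different), but it makes the same point: because the ID is derived from the labels as written in the chunk, the duplicated label produces a different hash even though both chunks describe the same series.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// seriesID is an illustrative stand-in for Cortex's series fingerprint:
// a hash over the label pairs exactly as they appear in the chunk,
// computed before any normalization.
func seriesID(labels [][2]string) uint64 {
	h := fnv.New64a()
	for _, l := range labels {
		h.Write([]byte(l[0]))
		h.Write([]byte{0}) // separator to avoid ambiguous concatenation
		h.Write([]byte(l[1]))
		h.Write([]byte{0})
	}
	return h.Sum64()
}

func main() {
	chunk1 := [][2]string{{"__name__", "up"}, {"instance", "a"}}
	chunk2 := [][2]string{{"__name__", "up"}, {"instance", "a"}, {"instance", "a"}}

	// The duplicated label changes the hash, so the two chunks land in
	// different series even though they normalize to the same label set.
	fmt.Println(seriesID(chunk1) != seriesID(chunk2)) // true
}
```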
We can detect this situation when adding series to the index, and either ignore duplicate label sets (which loses data) or merge the chunks into a single series.