Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] Change the kafka-versions file into JSON/YAML format #2040

Closed
tomncooper opened this issue Oct 1, 2019 · 10 comments
Closed

[Enhancement] Change the kafka-versions file into JSON/YAML format #2040

tomncooper opened this issue Oct 1, 2019 · 10 comments

Comments

@tomncooper
Copy link
Contributor

Is your feature request related to a problem? Please describe.

At the moment the kafka-versions file is in a custom format that makes adding information difficult as you then need to check and possibly update every method that reads it.

Describe the solution you'd like

Changing this file to a standard format such as JSON or YAML would mean standard parsing libraries could be used to extract the information in each relevant method and any additional information added later to the kafka versions would be ignored by those methods that do not use it.

Describe alternatives you've considered
Leaving it how it is and avoiding adding any additional information to the file.

Additional context

I am working on the process for upgrading ZK when the new Kafka version requires a newer version. In order to do this robustly I need to add the ZK version information to the kafka-versions file. This would mean changing all the sections of Strimzi that touch this file so I thought we could bite the bullet and do this work to change the file format now to hopefully save time in the future.

@tombentley
Copy link
Member

I guess one consideration for the schema would be the ease of using jq to extract the various parts needed. But how about this as a starting point:

{
  "2.3.0": {
    "sha1sum": ...,
    "protocol": "2.3",
    "format": "2.3",
    "default": true,
    "3rd-party-libs": "2.3.x"
  },
  ...
}

The other choice would be an array of those inner objects (with an extra version key), rather than the outer object.

In the future we could consider adding a url key if, for example, there was a url for downloading kafka nightlies.

@tomncooper
Copy link
Contributor Author

I was thinking of an array, but this is likely to be parsed into a map (with the version as key) at some point anyway so we might as well store it as one.

Do you think JSON and jq rather than YAML and yq are the better way to go? I have no preference either way.

@tombentley
Copy link
Member

I don't have a strong opinion. The build already depends on yq, so in that sense YAML might be better as it's one less build-time requirement. YAML also supports comments, which might occasionally be useful.

@tomncooper
Copy link
Contributor Author

So the YAML version might be something like:

--- 
"2.2.1": 
  3rd-party-libs: 2.2.x
  default: false
  format: 2.2
  protocol: 2.2
  sha1sum: asdajslkdjaskjdlksjdl

"2.3.0": 
  3rd-party-libs: 2.3.x
  default: true
  format: 2.3
  protocol: 2.3
  sha1sum: asdajslkdjaskjdlksjdl

The only issue I can think of is do we already pull in a Java YAML library? Does fabric8 include something to do this? I am thinking specifically of io.strimzi.operator.cluster.model.KafkaVersion.

@tombentley
Copy link
Member

Yes fabric8 depends on jackson's YAML support, which we already use directly in a few places.

@tombentley
Copy link
Member

We could filter the sha1sum and 3rd-party-libs from the file on the CO classpath, since those are not runtime concerns.

@ppatierno
Copy link
Member

tbh I am not a fan of using a potential value of a field as a key (I am referring to using "2.3.0" instead of having an array of objects and then the field "version": "2.3.0"). I think it makes less readable the YAML or JSON. I think it's something Kubernetes team doesn't use as well.

@tombentley
Copy link
Member

I think the reasons behind why kube doesn't do this don't apply here.

@ppatierno
Copy link
Member

True but that representation remains to be less readable to me.

@tomncooper
Copy link
Contributor Author

I have switched to using an array of maps as that is easier to parse in the KafkaVersion file, but I need to see how this effects usingyq in the command line scripts.

scholzj pushed a commit that referenced this issue Oct 12, 2019
* Refactored kafka versions file into YAML format:

* Added additional unit tests for version parsing
* Refactored docker image build script
* Refactored helm chart build script
* Refactored config model build scripts
* Refactored documentation snip scripts and make file entries
* Refactored unit and integration tests

Signed-off-by: Thomas Cooper <[email protected]>

* Added multiple default version check and unit test

Signed-off-by: Thomas Cooper <[email protected]>

* Updated versions field descriptions

Signed-off-by: Thomas Cooper <[email protected]>

* Refactored yaml parsing code

Signed-off-by: Thomas Cooper <[email protected]>

* Refactored all kafka-versions functions into common tools script

Signed-off-by: Thomas Cooper <[email protected]>

* Refactored unit tests to use centralised kafka version string definitions

Signed-off-by: Thomas Cooper <[email protected]>

* Refactored tests to use single method to create kafka versions lookup

Signed-off-by: Thomas Cooper <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants