Skip to content

Conversation

@KyleSchoonover
Copy link
Contributor

Jira

Tests

  • My PR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does

@zcsizmadia
Copy link
Contributor

Nice work. Any reason not to use a 3rd party library, eg. https://github.com/dotnet/command-line-api (still beta, but this will be the MS official parser in the fitire), or https://github.com/commandlineparser/commandline? The reason I am asking is that down the road new features need to be added. E.g. supporting multiple -s and -p files, maybe at the same time. Custom written command line parser can quickly become hard to maintain

@KyleSchoonover
Copy link
Contributor Author

Nice work. Any reason not to use a 3rd party library, eg. https://github.com/dotnet/command-line-api (still beta, but this will be the MS official parser in the fitire), or https://github.com/commandlineparser/commandline? The reason I am asking is that down the road new features need to be added. E.g. supporting multiple -s and -p files, maybe at the same time. Custom written command line parser can quickly become hard to maintain

I agree that using the command line parser would be a good idea assuming we support multiple -s and -p. But this was me spending an hour doing a POC. The only reason I didn't use a library was because it hasn't been touched for 2 years (commandlineparser). As for command-line-api, I haven't looked into it, I personally would wait until it's out of beta. Looks like they are targeting .NetStandard 2.0 which shouldn't cause us compatibility issues. Speaking from experience at MS, dogfooding too early means a lot of rewrites.

@KyleSchoonover
Copy link
Contributor Author

Does anyone know why this would throw for logical and / or? Or does it just throw on it because people don't know how to use it?

@kordos
Copy link
Contributor

kordos commented Mar 10, 2022

Idea with two parts parameter is nice however parsing it looks quite complex and it would require good unit test coverage to prove that all that changes work. if we could use some library for parsing command line arguments then it would be a lot simpler.

@KyleSchoonover
Copy link
Contributor Author

@kordos I can productionize this. I actually want to switch to a "instance" class as opposed to the static Program class. This way we could also allow people to reference the project and build their own execution / automation / etc. I have some time on Friday. So I will submit something then.

@zcsizmadia Do you want me to just bake in your testing and version stuff as well?

@zcsizmadia
Copy link
Contributor

zcsizmadia commented Mar 10, 2022

I would like to discuss first what is the plan with a new command line parsing:

  1. What is the problem we try to solve?
  2. What are the issues and limitations with the current one?
  3. What new features and bug fixes require new command line arguments or new parsing if the current parsing is not sufficient?
  4. Implement the command line parsing in-house or use a 3rd party library?
  5. What use cases we have with a new command line parsing? Examples what a new avrogen could do (via scripting examples)?

One obvious command line requirement I see, which I mentioned earlier, is to support multiple input files (I am working on https://issues.apache.org/jira/browse/AVRO-1724). As far as I can tell right now, the current command line parsing can support it.

I would recommend to discuss these first before we get to actual implementations.

@martin-g @RyanSkraba Any thought on new command line features? Any bugs feature requests might require more advanced command line parsing?

@KyleSchoonover
Copy link
Contributor Author

@zcsizmadia

Please note, this was just a POC. Functional, yes, but more to get an idea of how we could update it. I personally don't like all the functionality lumped into the AvroGen.cs file.

What is the problem we try to solve?

I think breaking up the giant if statement logic. Hopefully making it easier to maintain as new parameters are added.

What are the issues and limitations with the current one?

Testability, in regards that we can't really unit test the functionality.
Extensibility. Not necessarily us extending it, but exposing the appropriate functionality for people to reference it and automate their "AvroGen" process.

What new features and bug fixes require new command line arguments or new parsing if the current parsing is not sufficient?

Bugs:
You can call with additional -s and -p parameters, but it is not handled if someone tries. (generates unexpected results)

Current bug list

I have a feeling it's going to be additional features that are added:
ie: Display versioning, Change how the data is ouput to the directory. Support for multiple input.

Implement the command line parsing in-house or use a 3rd party library?

I would say in house for now. I agree that a 3rd party library could dramatically reduce the maintenance. But we may want this to be a follow up story for when MS (dotnet) releases their version. Then we know it's still being supported.

What use cases we have with a new command line parsing? Examples what a new avrogen could do (via scripting examples)?

I think this is a more technical maintenance gain than a user gain. The bigger user gain would be adding new parameters and better support for existing parameters.

@zcsizmadia
Copy link
Contributor

Great answers.Personally I am heavily in favor of thirs party library. Every project I needed command line parsing, the simplicity, readibility, features of a third party library is very hard to beat. I worked with the 3 major ones, and currently I am using System.CommandLine in my project. I would vote against that just simply because it needsa work between betas. Maintaining a command line parser which is in par with other libraries, is not a trivial task IMO.

My current feature I am working on is the multifile support, and to be honest, with minimal modification the current implementation can be used. However , if new features require more advanced parsing, there will be need for a modern command line parser, either in-house or 3rd party.

@kordos
Copy link
Contributor

kordos commented Mar 11, 2022

I also prefer third party library as in avrogen we should focus on improving that tool instead of writing and maintaining own implementation of command line parser. Although I've never used any in C# however in other languages it was as simple as calling some method like getArgument() or getOption() so it is hard to imagine that .Net don't have stable command line parser...
If we go with building in house command line parser I'm afraid that it will increase entry point for new contributors as some changes may also be introduced to the parser...

@KyleSchoonover
Copy link
Contributor Author

I'm more for implementing the minimum requirements to parse the command line sent. Parsing command line is actually the easy part. Making the data useful is the harder part.

See GetTwoPartArgument and GetTwoPartArguments. All we need is a three part version.

@KyleSchoonover
Copy link
Contributor Author

KyleSchoonover commented Mar 11, 2022

@kordos and @zcsizmadia Updated the POC to be a bit more like what I would expect.

The Generator class is more testable for the functionality
The AvroGen class is mainly parameter definition and parsing now.

And if you are wondering the Usage method and supporting methods took the longest. Stole the WordWrap method from a stack overflow article.

@KyleSchoonover
Copy link
Contributor Author

@martin-g @RyanSkraba @zcsizmadia @kordos
Between the mailing list, jira, and github is anyone against having a Avro-cSharp slack group? Really want to get feedback from @martin-g and @RyanSkraba before going down this route. But I do have a slack group set up. https://join.slack.com/t/avro-csharp/shared_invite/zt-15hamhx29-ONkVjmZy4oexPb8O0IitKg

@kordos
Copy link
Contributor

kordos commented Mar 15, 2022

Great. I'll have check your changes in my spare time

@RyanSkraba
Copy link
Contributor

Hello! There's an ASF slack already! Let me know if you want an invitation to join -- but remember, all decisions should be discussed on the mailing list.

I have to admit, I'm a bit lost around all of the C# PRs!

@martin-g
Copy link
Member

martin-g commented Mar 17, 2022

+1 to use the mailing list for common discussions!
Discussions about specific code changes are better to in in the PR comments, like here.
Otherwise you can find me in ASF Slack #avro channel. But I live in GMT+2 and I don't have many overlapping hours with USA citizens.

@KyleSchoonover
Copy link
Contributor Author

@zcsizmadia Let me know if you are on the mailing list. We should probably have an all up conversation around avrogen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants