add intdot as version scheme #166

Open
tschmidtb51 opened this issue Apr 27, 2022 · 9 comments

Comments

@tschmidtb51
Contributor

I suggest adding a version scheme intdot that is defined as follows:
To compare two versions a and b, do:

def compare(a, b):
    A = a.split('.'); B = b.split('.')   # split each version on '.'
    for i in range(min(len(A), len(B))):
        # compare the shared segments numerically, left to right
        if int(A[i]) > int(B[i]):
            return 'A greater than B'
        elif int(A[i]) < int(B[i]):
            return 'B greater than A'

    # all shared segments are equal: the version with more segments is greater
    if len(A) > len(B):
        return 'A greater than B'
    if len(B) > len(A):
        return 'B greater than A'
    return 'A matches B'

This would cover roughly 80% of the version schemes currently in use that do not follow one of the well-known version schemes.

Thoughts?

Flagging @pombredanne for comments.

@tschmidtb51
Contributor Author

@pombredanne What would be the process to get this included?

@immqu

immqu commented Nov 13, 2024

This scheme would essentially be like semver but would allow more labels, e.g., 1.2.3.4.5, correct?

@tschmidtb51
Contributor Author

More or less: it just splits at the . and compares each part as an integer. So your example is valid, as are 2020.0001 and 2.002.10.1.1.1.1.1.
However, any prerelease or build part (as defined by SemVer) would be ignored...

@immqu

immqu commented Nov 15, 2024

So, we allow any kind of prerelease/build part, like 1.39.2828-alpha and 1.39.2828.pre, but simply ignore it in the comparisons?

@tschmidtb51
Contributor Author

I'm currently not sure what the best way would be. The algorithm in the description does a straight cast to int... Would that be appropriate? Or would 1.1-100 then be interpreted as [1, -99]? That would not be intended. Also, 10e2 should not be parsed as 10*10^2...

So, please provide a suggestion as you go.
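
To make the "straight cast" question concrete, here is a small sketch of what Python's built-in int() does with such segments. try_int is a hypothetical helper used only for illustration, and other languages will behave differently.

```python
def try_int(text):
    """Return int(text), or None when Python's int() rejects the input."""
    try:
        return int(text)
    except ValueError:
        return None

# "1.1-100" splits on '.' into ["1", "1-100"]; the second segment is
# rejected outright rather than being evaluated as 1 - 100.
segments = "1.1-100".split(".")
parsed = [try_int(s) for s in segments]   # [1, None]

# Inputs that int() rejects, and inputs it accepts perhaps surprisingly.
rejected = [try_int("10e2"), try_int("1-100")]                 # [None, None]
accepted = [try_int("-100"), try_int("1_000"), try_int(" 7 ")]  # [-100, 1000, 7]
```

So in Python a bare int() neither computes -99 nor expands 10e2, but it does accept signs, underscores, and surrounding whitespace, which a version scheme probably should not.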

@immqu

immqu commented Nov 15, 2024

The easiest is to write a regex that separates the two groups, see my current draft.
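
The referenced draft is not reproduced in this thread. As a rough sketch of the idea only, assuming the two groups are the leading dotted run of ASCII digits and whatever trails it, a pattern could look like this. INTDOT_RE and split_version are hypothetical names, not taken from the draft.

```python
import re

# Hypothetical pattern: group 1 is the dotted run of ASCII digits,
# group 2 is everything after it (prerelease/build text, possibly empty).
# [0-9] is used instead of \d because \d also matches non-ASCII digits
# for str patterns in Python.
INTDOT_RE = re.compile(r"([0-9]+(?:\.[0-9]+)*)(.*)", re.DOTALL)

def split_version(version):
    """Return (numeric_part, remainder), or None if no leading digits."""
    match = INTDOT_RE.fullmatch(version)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(split_version("1.39.2828-alpha"))  # ('1.39.2828', '-alpha')
print(split_version("1.39.2828.pre"))    # ('1.39.2828', '.pre')
```

Whether the remainder is then ignored or rejected is a separate decision that the pattern does not make.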

@matt-phylum
Contributor

matt-phylum commented Nov 15, 2024

Things to be careful of:

  • Is "0b10" 0 or 2 or 10 or invalid?
  • Is "010" 8 or 10?
  • Is "0x10" 0 or 10 or 16 or invalid?
  • Is "a10" 10?
  • Is "10a" 10? ECMAScript parseInt thinks so.
  • Is "1a0" 1 or 10? parseInt says 1.
  • Is "١٠" (Arabic-Indic digits) 10? It is in Java but not in ECMAScript (formerly Javascript). Implementing this depends on having decent Unicode support. (univers draft bug: "1" < "٠")
  • Is "𐒡𐒠" (Osmanya digits) 10? These are newer characters outside the Unicode BMP so it won't be 10 if your Unicode implementation is old or broken (like Maven's). It's easy to say that broken Unicode implementations are wrong, but Unicode versioning is a much more complicated issue to deal with because most implementations will just use whatever version is provided by the most convenient string implementation and they'll behave almost the same. If vers is supposed to describe an existing version scheme (why else would vers support this?), exact alignment here may be difficult. These characters are valid in Maven versions, but they are treated as letters instead of numbers.
  • Is "2147483648" (2^31) a valid version number? Some languages don't support unsigned integers.
  • Is "4294967296" (2^32) a valid version number? Some languages don't handle 64-bit integers or may perform a lossy conversion to floating point.
  • Is "9007199254740992" (2^53) less than "9007199254740993"? In ECMAScript these can be equal.
  • Is "18446744073709551616" (2^64) a valid version number? Some languages don't support 128-bit integers.
  • Is "01" greater than, less than, or equal to "1"? It's less in lexicographical sorting and greater for some broken digit-by-digit comparisons (univers draft bug), but probably should be equal.
  • Is "1𐒠" greater than, less than, or equal to "10"? This is similar, but the zero is the Osmanya digit 0, and it's very common for Unicode string implementations to report the length of "1𐒠" as 3.

The safest option is to specify that the digits must be ASCII digits and include either a maximum supported size or some test cases with excessively large numbers to catch implementations that have unexpected limits.
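
A minimal sketch of what "ASCII digits only, plus large-number test cases" could look like in Python. The names is_ascii_digits and LARGE are illustrative assumptions; Python integers are arbitrary precision, so the large values here would only catch limits in other implementations.

```python
ASCII_DIGITS = "0123456789"

def is_ascii_digits(segment):
    """True only for a non-empty string made of the characters 0-9."""
    return segment != "" and all(ch in ASCII_DIGITS for ch in segment)

# str.isdigit() alone is not enough: it accepts Arabic-Indic digits.
arabic_indic_ten = "\u0661\u0660"
checks = [is_ascii_digits(s) for s in ["10", "010", "0x10", "10a", arabic_indic_ten]]
# checks == [True, True, False, False, False]

# Boundary values from the list above: 2^31, 2^32, 2^53, 2^53 + 1, 2^64.
LARGE = ["2147483648", "4294967296", "9007199254740992",
         "9007199254740993", "18446744073709551616"]
values = [int(s) for s in LARGE]
strictly_increasing = all(x < y for x, y in zip(values, values[1:]))  # True

# The lossy floating-point case: 2^53 and 2^53 + 1 collapse to one value.
collapsed = float(LARGE[2]) == float(LARGE[3])  # True
```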

@tschmidtb51
Contributor Author

@matt-phylum Thank you for your comments and insights.

I would suggest:

  • ASCII digits-only
  • test cases with excessively large numbers
  • no support for non-digits => parsing breaks off at the first one
  • leading zeros are ignored
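
One possible reading of these four rules, sketched in Python. This is an assumption-laden illustration, not the univers implementation: parse_intdot and compare_intdot are hypothetical names, and "break off at the first one" is interpreted here as "ignore everything from the first non-digit character onward".

```python
from typing import List

ASCII_DIGITS = "0123456789"

def parse_intdot(version: str) -> List[int]:
    """Parse the leading dot-separated ASCII-digit segments into ints."""
    segments = []
    for part in version.split("."):
        digits = ""
        for ch in part:
            if ch not in ASCII_DIGITS:
                break
            digits += ch
        if digits == "":
            break                      # segment has no leading digits: stop
        segments.append(int(digits))   # int() drops leading zeros
        if len(digits) != len(part):
            break                      # non-digit seen: ignore the rest
    return segments

def compare_intdot(a: str, b: str) -> int:
    """Return 1 if a > b, -1 if a < b, 0 if they match."""
    A, B = parse_intdot(a), parse_intdot(b)
    # List comparison is element-wise, and a longer list wins when the
    # shared prefix is equal, as in the algorithm from the description.
    return (A > B) - (A < B)

print(compare_intdot("1.2.3.4.5", "1.2.3.4"))              # 1
print(compare_intdot("01", "1"))                           # 0
print(compare_intdot("1.39.2828-alpha", "1.39.2828.pre"))  # 0
```

Under this reading 1.1-100 parses as [1, 1]. Rejecting such input as invalid instead of truncating it would be an equally defensible choice.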

@immqu

immqu commented Nov 21, 2024

See the univers PR here: aboutcode-org/univers#148
