Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary, octal, and hexadecimal literals and formatting #1968

Merged
merged 11 commits into from
Sep 26, 2020

Conversation

clnhlzmn
Copy link
Contributor

Adds support for binary, octal, and hexadecimal literals.

@clnhlzmn clnhlzmn changed the title Feature/bin oct hex literals Binary, octal, and hexadecimal literals and formatting Sep 18, 2020
@josdejong
Copy link
Owner

Looks really good and straightforward at first sight Colin! Thanks a lot 👍

I'll do some more indepth review soon. Maybe a first feedback: I think we should write a section about these new features in the docs, on the expression parser syntax page and/or maybe also on the numbers page, not sure.

@clnhlzmn
Copy link
Contributor Author

Looks really good and straightforward at first sight Colin! Thanks a lot 👍

Thanks! I am working on some more improvements to this that I will push soon.

@clnhlzmn
Copy link
Contributor Author

clnhlzmn commented Sep 21, 2020

In the last commit I modified the way the literals are parsed and formatted. The literals are parsed as 32 bit 2s complement signed integers. This change causes the following comparisons to be true:

math.evaluate('0xffffffff') === -1
math.evaluate('0o37777777777') === -1
math.evaluate('0b11111111111111111111111111111111') === -1

This is contrary to the way these literals are treated in plain JS where they would be treated as unsigned integers, i.e.:

0xffffffff === 4294967296
0o37777777777 === 4294967296
0b11111111111111111111111111111111 === 4294967296

I believe it makes more sense to treat binary/octal/hex literals as signed integers because that's what the bitwise operators in mathjs and plain JS operate on.

Similarly the format functions bin, oct, and hex now will format -1 as '0b11111111111111111111111111111111', '0o37777777777', and '0xffffffff' respectively. Before the last commit the format functions just used Number.toString(base) without any additional modification which would format -1 as '-1'.

This commit also adds checking that a literal doesn't exceed the size of a 32 bit 2s complement signed integer in the parser and checks that the value to be formatted is an integer and is within the size of such integers. For example, math.evaluate('0x100000000') will throw an SyntaxError and math.hex(2**32) will also throw an error where before the last commit they would not.

I am not sure if it makes sense for this addition to behave this way or not.

Please let me know what you think @josdejong.

@josdejong josdejong mentioned this pull request Sep 23, 2020
@josdejong
Copy link
Owner

I did some more thorough testing, and this really works like a charm. Nice job Christopher!

Some feedbacks:

  1. Parsing as signed integers makes a lot of sense to me.
  2. The checks for "overflow" and integer input are very nice! That prevents nasty surprises
  3. I think we should still write a section in the docs of the expresssion about bin/oct/hex input and formatting.
  4. Should we make the number of bits (currently 32) configurable? Or is this just the most common number of bits? 64 bits doesn't work for regular numbers, so it makes sense, though when using BigNumbers it would be possible. however, making this configurable and using BigNumber under the hood would also complicate matters quite a bit.

@clnhlzmn
Copy link
Contributor Author

I did some more thorough testing, and this really works like a charm. Nice job Christopher!

It's Colin, but thanks!

1. Parsing as signed integers makes a lot of sense to me.

Me too.

2. The checks for "overflow" and integer input are very nice! That prevents nasty surprises

Thanks!

3. I think we should still write a section in the docs of the expresssion about bin/oct/hex input and formatting.

I will get on this as soon as I have a chance.

4. Should we make the number of bits (currently 32) configurable? Or is this just the most common number of bits? 64 bits doesn't work for regular numbers, so it makes sense, though when using BigNumbers it would be possible. however, making this configurable and using BigNumber under the hood would also complicate matters quite a bit.

I do think the number of bits should be configurable. I like the idea of using BigNumbers behind the scenes, but I understand that would be quite a larger undertaking (maybe?). I had an idea that if you use a literal, like 0xff, you get a signed 32 bit int (which could be a BigNumber or a regular number) by default. Maybe then there could be a configuration that changes that default behavior for literals. Then there would also be separate functions that would parse strings as different sized integers for example int8('0xff') would result in -1 and int16('0xff') would give you 255. Then the formatting functions would have to be extended with different versions for different sizes also. That doesn't seem like the best way to do it though.

I think ideally (and I'm pretty sure you mentioned this elsewhere) there would be separate types (Int8, Int16, etc.) behind the scenes and a literal like 0xff will give you Int32 by default and if you want something else you use a constructor with a string like int64('0xff'). Then the format functions would work with those types automatically. As far as I can tell, though, that would require an exponential amount of work in all the other math functions to make those sized types play nice.

@josdejong
Copy link
Owner

Sorry for mixing up your name Colin :(

Ok let's think through making number of bits configurable.

It may be quite easy to implement it as BigNumbers, we'll have to see.

I was also thinking: JavaScript does not have types like Int32, but instead of using BigNumber, we could also use BigInt. One issue though is that none of the mathjs functions yet supports BigInt, so we would need to implement something there first. And Internet Explorer doesn't support BigInt, so we would need to fall back to number and limit the max bits to 32 on this browser for example.

@josdejong
Copy link
Owner

Maybe we should support both number and BigNumber and simply respect the config option {number: 'number' | 'BigNumber'}. In case of number, we would have to limit the config option bits to 32.

@clnhlzmn
Copy link
Contributor Author

I was thinking the types Int32 and friends would be mathjs types that would be implemented using BigNumber under the hood.

@josdejong
Copy link
Owner

Ah, I get what you mean. So we could make new custom data types for it like Int32, Int64, etc. This is possible but then we have to implement support for those types in all functions, which is a lot of work. "Just" using BigNumber/number (depending on config) would make it much easier since all functions already support those data types.

@clnhlzmn
Copy link
Contributor Author

I added a small section in the syntax documentation.

@josdejong
Copy link
Owner

Thanks for adding docs, looks good.

I think this PR is ready to go, unless we want to directly implement support for configurable number of bits and integration with an other type like BigNumber or BigInt or anything. I think this first step will not conflict with those extensions so I expect we can do that safely as a separate, second iteration. What do you think?

@clnhlzmn
Copy link
Contributor Author

I think this is a good first step too. I agree the other improvements can come later as a second step.

@clnhlzmn clnhlzmn marked this pull request as ready for review September 26, 2020 14:06
@josdejong
Copy link
Owner

👍 merging now.

Thanks again!

@josdejong josdejong merged commit f5d843b into josdejong:develop Sep 26, 2020
@clnhlzmn
Copy link
Contributor Author

Thanks!

@josdejong
Copy link
Owner

Published now in v7.3.0 🎉

@clnhlzmn clnhlzmn deleted the feature/bin-oct-hex-literals branch October 12, 2020 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants