Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Octal and Hexadecimal notation in ip""? #9187

Closed
dns2utf8 opened this issue Nov 28, 2014 · 28 comments
Closed

Deprecate Octal and Hexadecimal notation in ip""? #9187

dns2utf8 opened this issue Nov 28, 2014 · 28 comments
Labels
needs decision A decision on this change is needed
Milestone

Comments

@dns2utf8
Copy link

Hi all

I noticed, that julia does some funny stuff in the REPL when you enter the folowing:
julia> ip"001.021.201.012"
the Answer:
ip"1.17.201.10"

Regards

@Keno
Copy link
Member

Keno commented Nov 28, 2014

This is intended. Numbers with leading 0's are interpreted as octal as required by the RFC. Same behaviror, e.g. in ping:

bash$ ping 001.021.201.012
PING 001.021.201.012 (1.17.201.10)

@Keno Keno closed this as completed Nov 28, 2014
@Keno
Copy link
Member

Keno commented Nov 28, 2014

I realized I misspoke. Rather than following an RFC, we follow the behavior of inet_aton:

http://man7.org/linux/man-pages/man3/inet_aton.3.html

Looking back at this I wonder if that was the correct decision. Might be better to throw an error rather than interpret is as octal. Hmm...

@Keno Keno reopened this Nov 28, 2014
@Keno Keno added the needs decision A decision on this change is needed label Nov 28, 2014
@Keno Keno changed the title ip"a.b.c.d" has some bugs Deprecate Octal and Hexadecimal notation in ip""? Nov 28, 2014
@Keno
Copy link
Member

Keno commented Nov 28, 2014

As a datapoint, quoting from https://tools.ietf.org/html/draft-main-ipaddr-text-rep-02

New applications, however, SHOULD accept only the syntax given in section 3.

which is dotted decimal notation. So I guess the two options here are:

  • Keep the current behavior
  • Have ip"001.021.201.012" be an error

In the latter case, we probably also want to deprecate all non-dotted decimal variants of an ipv4 spelling.

@StefanKarpinski
Copy link
Sponsor Member

Hard to say. inet_aton is pretty standard. Then again just sticking with dotted decimal does reduce the possibility of confusion.

@pao
Copy link
Member

pao commented Nov 28, 2014

We're talking about IPv4 addresses only, right? Canonical IPv6 is in hex.

@Keno
Copy link
Member

Keno commented Nov 28, 2014

Yes, that's right.

@sbromberger
Copy link
Contributor

My two cents (since I'm the originator of #9321):

String constructors for IPv4 should be assumed to be in base 10 and should not error on leading 0s. If 0x prefixing works for hex, great. But 0-padded quads should never be interpreted as octal. Right or wrong, there is network equipment that outputs IPv4 addresses in %03d%03d%03d%03d format, and it is intuitive.

@Keno
Copy link
Member

Keno commented Dec 12, 2014

should not error on leading 0s.

As it stands, the leading 0 octal behavior is unfortunately very common, if we do this change I think there does need to be an error by default, so that the issue will be brought to people's attention. I would however not be opposed to a keyword argument that allows leading 0s.

@sbromberger
Copy link
Contributor

As it stands, the leading 0 octal behavior is unfortunately very common

I've not seen octal octets in dotted-quad notation designed for human consumption in the 20+ years I've been doing this. I realize this is just one anecdatum, but I can point to many instances where network equipment represents decimal dotted quad with 0-padding. (A google search for 192.168.000.001 proves enlightening.)

@sbromberger
Copy link
Contributor

As another suggestion: allow octal but require 4 digits per octet: a leading 0 plus 3 significant digits (the same way we'd require 2 digits for hex).

@Keno
Copy link
Member

Keno commented Dec 12, 2014

When I said common, I didn't necessarily mean in output, but in tools that use is as their input. I'd also be weary of introducing another format that does something different from everything else.

@sbromberger
Copy link
Contributor

The issue is that there is no "everything else". At the very least, there are two camps: devices that interpret any length octet that leads with 0 as an octal, and devices that don't. Of the devices that don't, some significant subset actually promote 0-padded decimal octets.

My brain hurts.

@Keno
Copy link
Member

Keno commented Dec 12, 2014

Yes, I agree that there is the two cases you mentioned. But wouldn't having 0010.1.1.1 and 010.1.1.1 (that's what I understood from your proposal above) be different, not be different from either of the two camps?

@sbromberger
Copy link
Contributor

I don't think so: see, e.g., https://en.wikipedia.org/wiki/IPv4#Address%20representations.

@Keno
Copy link
Member

Keno commented Dec 12, 2014

That pages cites inet(3) which precisely has 0010.1.1.1 be the same as 010.1.1.1. Don't get me wrong, I agree that octal or hexadecimal notation are rarely used in practice (thus this issue in the first place).

@sbromberger
Copy link
Contributor

I get it. If we follow inet(3) all it means is that the string constructor (from the referenced PR) would need to do some (relatively) fairly complex validation. I guess that's unavoidable.

I still think we ought to consider warnings whenever a string contains something that will not be interpreted as decimal octets....

@Keno
Copy link
Member

Keno commented Dec 12, 2014

parseipv4 follows inet(3). The question in this issue is whether we want to keep doing that or disallow octal. There's always the option of having a flag, but as @JeffBezanson said in the other issue that seems undesirable. I would be slightly less uncomfortable with adding a flag that allows fixed-width decimal dot, because as you mentioned it is often used as an output.

@sbromberger
Copy link
Contributor

Yes. I think IPv4 addresses should be interpreted consistent with inet(3). I'd keep doing that without any flag, but possibly raise a warning if the string is interpreted as anything other than decimal octet.

I am not in favor of adding a flag that allows fixed-width decimal dot, because then the representation of the object will not match its intrinsic value. Down that road lies madness.

@JeffBezanson
Copy link
Sponsor Member

I'm not sure I see the value of a warning; this would be data-dependent and you wouldn't see it until your code happens to get input in a certain format. To avoid the warning, you'd have to write your own quasi-parser.

@sbromberger
Copy link
Contributor

Just to close out my thoughts on this with another data point: the ipaddress module in Python 3 does not allow 0-padding of IP addresses in creation. (There are lots of other issues with this module, but perhaps this is informative):

>>> import ipaddress
>>> a = ipaddress.ip_address("192.001.002.010")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/seth/dev/python/cpython/Lib/ipaddress.py", line 54, in ip_address
    address)
ValueError: '192.001.002.010' does not appear to be an IPv4 or IPv6 address

@dns2utf8
Copy link
Author

dns2utf8 commented Jan 2, 2015

I tested this address:

julia> ip"042.042.0042.0042"
ip"34.34.34.34"
$ ping 042.042.0042.0042
PING 042.042.0042.0042 (34.34.34.34) 56(84) bytes of data.

My personal opinion is: this is broken.
I would tolerate octale numbers in the following format: 0XXX

So I would like to see this behavior:

julia> ip"042.042.0042.0042"
ip"42.42.34.34"

How ever, mixing formats should probably triger a warning.

EDIT: cleared the late night misstake of 008

@sbromberger
Copy link
Contributor

What am I missing in the last example?

Octal digits are 0-7 inclusive.

@dns2utf8
Copy link
Author

dns2utf8 commented Jan 2, 2015

ah, it is getting late.
Thanks sbromberger.

@sbromberger
Copy link
Contributor

I'd blame Julia's 1-indexing for that mistake :)

@StefanKarpinski StefanKarpinski added this to the 0.6.0 milestone Sep 13, 2016
StefanKarpinski added a commit that referenced this issue Jan 25, 2017
Remove octal/hexadecimal parsing from IPv4. Fixes #9187
@dns2utf8
Copy link
Author

🎉 this was one of my first issues 🎉

kleinhenz pushed a commit to kleinhenz/julia that referenced this issue Jan 26, 2017
kleinhenz pushed a commit to kleinhenz/julia that referenced this issue Jan 26, 2017
@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented May 12, 2017

@Keno "As a datapoint, quoting from https://tools.ietf.org/html/draft-main-ipaddr-text-rep-02"

I think we should follow section 3 (and 4.1 Be Stringent in What You Accept) fully.

I see from tests e.g. this was allowed:

[I'm sorry, tried to quote code, but ``` made worse..]

julia> parse(IPv4, "192.0xFFFF")
ERROR: UndefVarError: ipv4_leading_zero_error not defined
 in parse(::Type{IPv4}, ::String) at ./REPL[335]:10

no longer, but also not useful, but still e.g. allowed:

julia> parse(IPv4, "192.65535")
ip"192.0.255.255"

julia> IPv4("192.11")
ip"192.0.0.11"

@KristofferC
Copy link
Sponsor Member

Please quote your code.

@StefanKarpinski
Copy link
Sponsor Member

@PallHaraldsson: is this a bug report on the deprecation of this being broken?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests

8 participants