Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: ipv4 property #128

Closed
jvanasco opened this issue May 1, 2017 · 3 comments
Closed

proposal: ipv4 property #128

jvanasco opened this issue May 1, 2017 · 3 comments

Comments

@jvanasco
Copy link
Contributor

jvanasco commented May 1, 2017

I wanted to propose a property to support potential IPv4 addresses.

This is definitely offtopic as an IPv4 isn't a TLD - however the input to this library can often be a URL and it is often used as part of URL/HOST validation. Detecting this within the package makes it a bit easier for developers to support handling situations where a the 'domain' is invalid, however it is a valid resource.

A possible implementation is below -- in the absence of a suffix or subdomain, the domain is checked to match 4 groups of integers 0-255.

@property
def ipv4(self):
    """
    Returns the ipv4 if that is what the presented domain/url is

    >>> extract('http://127.0.0.1/path/to/file').ipv4
    '127.0.0.1'
    >>> extract('http://127.0.0.1.1/path/to/file').ipv4
    ''
    >>> extract('http://256.1.1.1').ipv4
    ''
    """
    if not self.suffix and not self.subdomain:
        domain_parts = self.domain.split('.')
        if len(domain_parts) == 4:
            if all([i.isdigit() for i in domain_parts]):
                if all([int(i) < 255 for i in domain_parts]):
                    return self.domain
    return ''
@john-kurkowski
Copy link
Owner

Seems good so far! There's already an IPv4 regex in the source. There should be plenty in the wild. It could shorten that example implementation.

Could you provide some code examples of tasks this makes easier? That's awkward to do without the new property?

@jvanasco
Copy link
Contributor Author

jvanasco commented May 2, 2017

One common use case is to ask "Hey, is this URL-looking thing (probably) a valid URL?"

There are a lot of ways to potentially write that as code, but it often largely boils down to something like...

    if tldextract(candidate).registered_domain or othercode.is_ipv4(candidate):

the idea is to just consolidate it a bit:

    extracted = tldextract(candidate)
    if extracted.registered_domain or extracted.ipv4:

it's not too hard to do the ipv4 stuff in othercode, but doing it in tldextract keeps others from reimplementing things and avoids another parsing of a full URL (if that is the input). in other words, if candidate is a url and not a host, it's considerably cheaper (and more convenient) to have the ipv4 status accessible.

@john-kurkowski
Copy link
Owner

Thank you. Got it.

Before, I was thinking, what could users infer from existing properties, like tldextract(candidate).domain. That alone could indicate a valid URL on their intranet. Not necessarily a valid public URL though. Then I wondered, should this proposal call it

@property
def is_possibly_a_valid_public_url()
    return self.registered_domain or (not self.suffix and not self.subdomain and IPV4_RE.match(self.domain))

But we should only give users the tools to answer what their definition of "possibly" is. We shouldn't define it for them. ipv4 is such a tool. I like your original proposal. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants