Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tldextract --update can be configured to download private domains #65

Merged
merged 3 commits into from
Apr 1, 2015
Merged

Conversation

kristjanr
Copy link
Contributor

fixes #64

@@ -281,7 +281,7 @@ def _add_extra_suffixes(self, suffixes):
return frozenset(suffixes)
return suffixes

TLD_EXTRACTOR = TLDExtract()
TLD_EXTRACTOR = TLDExtract(include_psl_private_domains=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where this is going. I agree we must persist all public/private domains, but only include private results if the user opts into them. However this edit will include private domains for anybody who uses the export extract, just a few lines down. That would undo #19.

@kristjanr
Copy link
Contributor Author

another try to fix #64

@@ -374,12 +374,13 @@ def main():
parser.add_argument('-c', '--cache_file', help='use an alternate TLD definition file')

args = parser.parse_args()
tld_extract = TLDExtract(include_psl_private_domains=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anybody who runs tldextract --update and later uses the same cache file will be forced to have public/private domains in their results, even if they only want public domains.

Ideally, we store metadata in the persisted file so results can be filtered at runtime.

For the 1.x series though, I think I'd be happy with include_psl_private_domains being configurable here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the ideal fix, I opened #66.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 1.x series though, I think I'd be happy with include_psl_private_domains being configurable here.

Bump. Sorry if it wasn't clear, this is the only blocker on this PR. Should be an easy update.

(I don't think anybody's gonna tackle #66 anytime soon. Don't hold your breath for it.)

@kristjanr kristjanr closed this Mar 23, 2015
@kristjanr kristjanr reopened this Mar 23, 2015
@kristjanr
Copy link
Contributor Author

Huh?

@john-kurkowski
Copy link
Owner

If you make include_psl_private_domains configurable, defaulting to false, I can accept this. Otherwise, the PR will inflict private domains on everybody.

@john-kurkowski john-kurkowski changed the title #64 fix https://github.com/john-kurkowski/tldextract/issues/64 tldextract --update can be configured to download private domains Apr 1, 2015
john-kurkowski added a commit that referenced this pull request Apr 1, 2015
tldextract --update can be configured to download private domains

fixes #64
@john-kurkowski john-kurkowski merged commit a421a73 into john-kurkowski:master Apr 1, 2015
@john-kurkowski
Copy link
Owner

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tldextract --update does not download private domains
2 participants