This repository has been archived by the owner on Apr 19, 2024. It is now read-only.

.cat domain breaks link detection #58

Closed
harava opened this issue Nov 9, 2012 · 2 comments

Comments

harava commented Nov 9, 2012

Link detection doesn't work with .cat domains. For example, http://lol.cat/ is incorrectly detected as http://lol.ca.

This also applies to some other sponsored top-level domains.

Contributor

stfnm commented Nov 9, 2012

Yeah, the problem is that the regular expression which is used for URL matching (see here) is quite picky.

The "cat" domain would have to be added explicitly; something like:

((((https?|ftp):\\/\\/)|www\\.)(([0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+)|localhost|([a-zA-Z0-9\\-]+\\.)*[a-zA-Z0-9\\-]+\\.(com|net|org|info|biz|gov|name|edu|cat|[a-zA-Z][a-zA-Z]))(:[0-9]+)?((\\/|\\?)[^ \"]*[^ ,;\\.:\">)])?)|(spotify:[^ ]+:[^ ]+)
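To illustrate why the match stops at `.ca`, here is a small Python sketch. It is not the project's code: it uses the single-backslash form of the pattern quoted above, and it assumes the TLD list before the change is the quoted one minus `cat`. The names `TEMPLATE`, `OLD_TLDS`, `NEW_TLDS` and `first_url` are made up for this example.

```python
import re

# Single-backslash form of the pattern quoted above. "TLDS" is a placeholder
# (hypothetical, for this sketch only) that gets replaced by a TLD alternation.
TEMPLATE = (
    r'((((https?|ftp)://)|www\.)'
    r'(([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)|localhost|'
    r'([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(TLDS))'
    r'(:[0-9]+)?((/|\?)[^ "]*[^ ,;\.:">)])?)|(spotify:[^ ]+:[^ ]+)'
)

# Assumption: the list before the change is the quoted one without "cat".
OLD_TLDS = "com|net|org|info|biz|gov|name|edu|[a-zA-Z][a-zA-Z]"
NEW_TLDS = "com|net|org|info|biz|gov|name|edu|cat|[a-zA-Z][a-zA-Z]"


def first_url(tlds, text):
    """Return the first substring the pattern matches, or None."""
    match = re.search(TEMPLATE.replace("TLDS", tlds), text)
    return match.group(0) if match else None


# No listed TLD matches "cat", so the two-letter fallback takes "ca".
print(first_url(OLD_TLDS, "http://lol.cat/"))         # http://lol.ca
# With "cat" listed ahead of the fallback, the whole TLD is matched.
print(first_url(NEW_TLDS, "http://lol.cat/"))         # http://lol.cat
# Other long TLDs that are not listed are still cut to two letters.
print(first_url(NEW_TLDS, "http://example.museum/"))  # http://example.mu
```

The trailing slash is dropped in each case because the path part of the pattern requires at least one character after the `/`.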

However, I believe it would be better to rewrite this regex completely, as it is far too complex and doesn't come close to covering all the domains that exist nowadays anyway.
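As a rough idea of what such a rewrite could look like (only a sketch in Python, not a proposed patch), the TLD list could be replaced by "any run of two or more letters". The name `LOOSE_URL` is made up for this example, the `spotify:` alternative is left out for brevity, and the pattern assumes ASCII-only host names, so punycode TLDs containing digits or hyphens would still be missed.

```python
import re

# Sketch: same overall shape as the quoted pattern, but the TLD is any run
# of two or more ASCII letters instead of an explicit list.
LOOSE_URL = re.compile(
    r'(?:(?:https?|ftp)://|www\.)'            # scheme or leading "www."
    r'(?:[0-9]+(?:\.[0-9]+){3}'               # dotted-quad address
    r'|localhost'
    r'|(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,})'     # host labels, then a letters-only TLD
    r'(?::[0-9]+)?'                           # optional port
    r'(?:[/?][^ "]*[^ ,;.:">)])?'             # optional path, no trailing punctuation
)


def find_urls(text):
    """Return every substring of text that LOOSE_URL matches."""
    return [m.group(0) for m in LOOSE_URL.finditer(text)]


print(find_urls("see http://lol.cat/ and http://example.museum/a/b."))
# ['http://lol.cat', 'http://example.museum/a/b']
```

The trade-off is that this accepts host names whose last label is not a real TLD, which may or may not be acceptable for link highlighting.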

This is by the way a duplicate of issue #3.

@FauxFaux
Owner

Duplicate of #3.
