The bot currently uses the regex (https?://\S+) to find links. This has a very annoying edge case, which I will break down now:
Discord allows users to wrap links in < and > to suppress embeds automatically (for example <https://example.com>).
If a link is wrapped in <>, Discord correctly recognizes where the URL ends. It also stops the blue link highlighting at the closing >, which lets users type normal text directly after the link without whitespace for separation.
Within the bot, because the regex ends with \S+, we only stop matching the URL once whitespace is found.
If the bot detects an evil tracking parameter and the message author has not put whitespace after the link, the first word of the text that follows is always treated as part of the evil parameter and trimmed.
(In the example the word "haha" was trimmed)
A less relevant side effect, also visible in the example, is that the image gets embedded even though the author deliberately wrapped the link in <> to avoid an embed.
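The failure mode described above can be reproduced in a few lines. This is only a minimal sketch: the message text, the utm_source parameter and the strip_query helper are illustrative assumptions, not the bot's actual cleaning code. Only the regex itself is taken from this issue.

```python
import re

# The regex this issue says the bot currently uses.
CURRENT_URL_RE = re.compile(r"(https?://\S+)")


def strip_query(url: str) -> str:
    """Hypothetical stand-in for the bot's cleaning step: drop everything from '?' on."""
    return url.split("?", 1)[0]


# Assumed example message: a link wrapped in <> with text directly after it.
message = "<https://example.com/image.png?utm_source=evil>haha nice picture"

# \S+ keeps consuming until the first whitespace, so ">haha" ends up in the match.
matched = CURRENT_URL_RE.search(message).group(1)
cleaned_message = message.replace(matched, strip_query(matched))

print(matched)
print(cleaned_message)
```

Under these assumptions the match is https://example.com/image.png?utm_source=evil>haha, so stripping the parameter also removes the closing > and the word "haha". The missing > is why the embed comes back.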
Proposal:
Change the existing URL regex to a new format (see below)
Proposed: https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
This is taken from Stack Overflow and seems accurate enough, I guess.
Other options are available.
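As a quick check of the proposal, the sketch below runs the proposed regex against a message of the same shape. The message text and variable names are assumptions for illustration; the pattern is copied from this issue unchanged.

```python
import re

# Proposed pattern from this issue, split across two raw strings for readability.
PROPOSED_URL_RE = re.compile(
    r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"([-a-zA-Z0-9()@:%_\+.~#?&//=]*)"
)

# Assumed example message: a link wrapped in <> with text directly after it.
message = "<https://example.com/image.png?utm_source=evil>haha nice picture"

# ">" is not in the trailing character class, so the match ends before it.
proposed_match = PROPOSED_URL_RE.search(message).group(0)
print(proposed_match)

# Any other character outside that class also ends the match, e.g. a comma.
truncated_match = PROPOSED_URL_RE.search("https://example.com/a,b").group(0)
print(truncated_match)
```

With this pattern the match stops at the closing >, so "haha" is left alone. The trade-off is that the trailing class is an allow-list: URLs containing characters it does not name, such as the comma in the second example, are cut short.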
Old (location of the current regex): owo/owobot/cogs/simple_commands.py, line 166 in cd915fe