Parse autolinks for urls and emails. #11299

Merged
merged 1 commit into from
Jan 21, 2017
3 changes: 2 additions & 1 deletion base/markdown/Common/Common.jl
@@ -7,4 +7,5 @@ include("inline.jl")
paragraph,

linebreak, escapes, inline_code,
- asterisk_bold, asterisk_italic, image, footnote_link, link]
+ asterisk_bold, asterisk_italic, image, footnote_link, link, autolink]

37 changes: 37 additions & 0 deletions base/markdown/Common/inline.jl
@@ -106,6 +106,43 @@ function footnote_link(stream::IO, md::MD)
end
end

@trigger '<' ->
function autolink(stream::IO, md::MD)
    withstream(stream) do
        startswith(stream, '<') || return
        url = readuntil(stream, '>')
Contributor

Try url = readuntil(stream, '>', match = ' ')

Member Author

Are we hitting this definition already?

function readuntil(stream::IO, delimiter; newlines = false, match = nothing)
    withstream(stream) do
        buffer = IOBuffer()
        count = 0
        while !eof(stream)
            if startswith(stream, delimiter)
                if count == 0
                    return String(take!(buffer))
                else
                    count -= 1
                    write(buffer, delimiter)
                    continue
                end
            end
            char = read(stream, Char)
            char == match && (count += 1)
            !newlines && char == '\n' && break
            write(buffer, char)
        end
    end
end

I guess I haven't fully grokked what match/count do there, but it looked to me like it was suggesting you'd use < as the match character i.e. matching braces.

Contributor

Sorry, I misread it. I think you're right and need to do:

text = readuntil(stream, '>', match = '<')

I originally read it too quickly and thought setting it to ' ' would stop reading at whitespace.
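
To make the match/count behaviour discussed above concrete, here is a minimal standalone sketch. It assumes a hypothetical function `read_balanced` (not part of Base or the Markdown module) that mirrors the quoted helper on a plain `IO`, without `withstream` and without the newline handling. It is only meant to show why `match = '<'` gives brace-style nesting while `match = ' '` does not stop at whitespace.

```julia
# Hypothetical re-creation of the quoted helper, for illustration only.
# `read_balanced` is an assumed name; newline handling is deliberately left out.
function read_balanced(io::IO, delimiter::Char; match::Union{Char,Nothing} = nothing)
    buffer = IOBuffer()
    depth = 0                          # unmatched `match` characters seen so far
    while !eof(io)
        char = read(io, Char)
        if char == delimiter
            # A delimiter at depth 0 ends the read; otherwise it closes a nested opener.
            depth == 0 && return String(take!(buffer))
            depth -= 1
        elseif char == match
            depth += 1                 # an opener: the next delimiter belongs to it
        end
        write(buffer, char)
    end
    return nothing                     # ran out of input before a top-level delimiter
end

# With '<' as the opener, the inner "<b>" pair is skipped over:
read_balanced(IOBuffer("a<b>c>rest"), '>'; match = '<')   # "a<b>c"

# With ' ' as the opener, the space does not stop the read. It is counted as an
# opener, so the first '>' is consumed as its closer and no result is found:
read_balanced(IOBuffer("a b>rest"), '>'; match = ' ')     # nothing
```

The second call is the misreading described in the comment above: passing `' '` turns the space into an opener instead of a stop character.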

Contributor

I think readuntil returns nothing if the input doesn't contain '>'. Try it with e.g. Markdown.parse("<link"). I'd suggest adding url ≡ nothing && return
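
One way to exercise this case is sketched below. It assumes a Julia version that ships the `Markdown` standard library with autolink parsing included. `has_link` is a hypothetical helper written for this sketch, and it assumes the parsed document begins with a paragraph.

```julia
using Markdown

# Hypothetical helper for this sketch: does the first block contain a Link node?
# Assumes the parsed document starts with a Markdown.Paragraph.
has_link(md) = any(x -> x isa Markdown.Link, md.content[1].content)

closed   = Markdown.parse("<https://julialang.org>")  # well-formed autolink
unclosed = Markdown.parse("<link")                     # no closing '>'

has_link(closed)    # should be true when autolinks are parsed
has_link(unclosed)  # should be false: with the guard, parsing falls back to plain text
```

Without the `nothing` guard, the unclosed input would reach `_is_link(nothing)` and raise a `MethodError` instead of falling back to plain text.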

        url ≡ nothing && return
        _is_link(url) && return Link(url, url)
        _is_mailto(url) && return Link(url, url)
        return
    end
end

# This list is taken from the commonmark spec
# http://spec.commonmark.org/0.19/#absolute-uri
const _allowable_schemes = Set(split("coap doi javascript aaa aaas about acap cap cid crid data dav dict dns file ftp geo go gopher h323 http https iax icap im imap info ipp iris iris.beep iris.xpc iris.xpcs iris.lwz ldap mailto mid msrp msrps mtqp mupdate news nfs ni nih nntp opaquelocktoken pop pres rtsp service session shttp sieve sip sips sms snmp soap.beep soap.beeps tag tel telnet tftp thismessage tn3270 tip tv urn vemmi ws wss xcon xcon-userid xmlrpc.beep xmlrpc.beeps xmpp z39.50r z39.50s
adiumxtra afp afs aim apt attachment aw beshare bitcoin bolo callto chrome chrome-extension com-eventbrite-attendee content cvs dlna-playsingle dlna-playcontainer dtn dvb ed2k facetime feed finger fish gg git gizmoproject gtalk hcp icon ipn irc irc6 ircs itms jar jms keyparc lastfm ldaps magnet maps market message mms ms-help msnim mumble mvn notes oid palm paparazzi platform proxy psyc query res resource rmi rsync rtmp secondlife sftp sgn skype smb soldat spotify ssh steam svn teamspeak
things udp unreal ut2004 ventrilo view-source webcal wtai wyciwyg xfire xri ymsgr"))

function _is_link(s::AbstractString)
    '<' in s && return false

    m = match(r"^(.*)://(\S+?)(:\S*)?$", s)
    m ≡ nothing && return false
    scheme = lowercase(m.captures[1])
    return scheme in _allowable_schemes
end

# non-normative regex from the HTML5 spec
const _email_regex = r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$"

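function _is_mailto(s::AbstractString)
    # "mailto:" is 7 characters long, so compare the first 7 characters
    length(s) < 7 && return false
    # slicing strings is a bit risky, but this equality check is safe
    lowercase(s[1:7]) == "mailto:" || return false
    # the address starts right after the 7-character prefix
    return ismatch(_email_regex, s[8:end])
end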

# –––––––––––
# Punctuation
# –––––––––––
3 changes: 2 additions & 1 deletion base/markdown/GitHub/GitHub.jl
@@ -62,4 +62,5 @@ end
github_table, github_paragraph,

linebreak, escapes, en_dash, inline_code, asterisk_bold,
- asterisk_italic, image, footnote_link, link]
+ asterisk_italic, image, footnote_link, link, autolink]

3 changes: 2 additions & 1 deletion base/markdown/Julia/Julia.jl
@@ -11,4 +11,5 @@ include("interp.jl")
blockquote, admonition, footnote, github_table, horizontalrule, setextheader, paragraph,

linebreak, escapes, tex, interp, en_dash, inline_code,
- asterisk_bold, asterisk_italic, image, footnote_link, link]
+ asterisk_bold, asterisk_italic, image, footnote_link, link, autolink]

2 changes: 1 addition & 1 deletion base/markdown/render/html.jl
@@ -26,7 +26,7 @@ const _htmlescape_chars = Dict('<'=>"&lt;", '>'=>"&gt;",
'"'=>"&quot;", '&'=>"&amp;",
# ' '=>"&nbsp;",
)
- for ch in "'`!@\$\%()=+{}[]"
+ for ch in "'`!\$\%()=+{}[]"
_htmlescape_chars[ch] = "&#$(Int(ch));"
end

5 changes: 5 additions & 0 deletions test/markdown.jl
@@ -250,6 +250,11 @@ end
@test md"* World" |> html == "<ul>\n<li><p>World</p>\n</li>\n</ul>\n"
@test md"# title *blah*" |> html == "<h1>title <em>blah</em></h1>\n"
@test md"## title *blah*" |> html == "<h2>title <em>blah</em></h2>\n"
@test md"<https://julialang.org>" |> html == """<p><a href="https://julialang.org">https://julialang.org</a></p>\n"""
@test md"<mailto://[email protected]>" |> html == """<p><a href="mailto://[email protected]">mailto://[email protected]</a></p>\n"""
@test md"<https://julialang.org/not a link>" |> html == "<p>&lt;https://julialang.org/not a link&gt;</p>\n"
@test md"""<https://julialang.org/nota
link>""" |> html == "<p>&lt;https://julialang.org/nota link&gt;</p>\n"
@test md"""Hello

---