Character encoding issue with autolinking #388
Comments
Yup, I've just hit this same issue.
I've hit this same issue, too. It splits my UTF-8 char: the first bytes end up inside the link and the remaining bytes are left outside it. example
I'm having the same issue as well. Any ideas for a fix for this, @vmg?
I think the problem is the same as #358. But why can a UTF-8 char be split...
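On the question of how a UTF-8 char can be split at all: a scanner that walks the input byte by byte has no notion of character boundaries, so it can stop in the middle of a multi-byte sequence. The following is only a minimal Ruby sketch of that effect, not Redcarpet's code; `split_at_byte` is a hypothetical helper introduced here for illustration.

```ruby
# Hypothetical helper (not part of Redcarpet): cut a string at a raw byte
# offset, the way a byte-oriented scanner might, ignoring character boundaries.
def split_at_byte(str, offset)
  head = str.byteslice(0, offset)
  tail = str.byteslice(offset, str.bytesize - offset)
  [head, tail]
end

quote = "\u201C" # left curly quote: one character, three bytes in UTF-8
puts quote.bytes.map { |b| format("%02X", b) }.join(" ")

# Cutting after the first byte leaves two fragments, neither valid on its own.
head, tail = split_at_byte(quote, 1)
puts head.valid_encoding?
puts tail.valid_encoding?
```

Both fragments report an invalid encoding, which matches the symptom described above: the first byte lands inside the link and the rest stays outside it.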
I've traced the code and extracted the function of It's so weird, because after copying the link into the buffer in But if
BTW,
I found the point here: https://github.com/vmg/redcarpet/blob/master/ext/redcarpet/autolink.c#L227
That when passing When I modified the if statement into
See vmg#388 for more details.
Not sure if it is redcarpet related (or upstream-kramdown), but I have the same problem when a header contains a UTF-8 character:
# dupa
## dópa
redcarpet --render with_toc_data test.md
<h1 id="dupa">dupa</h1>
<h2 id="d�pa">dópa</h2>
When jekyll makes a build I get the following exception:
Normally I'd use an
@vmg hope it helps in some way :)
I'm getting invalid byte sequence in UTF-8, trying to render markdown w/ redcarpet on the following char, but only if it's in the (bash) code block. Outside of the codeblock it works fine. The char is on the first line of the code block.
```bash
¢
```
I'm still getting this issue when using autolinking. UTF-8 characters are being split apart when they appear after a piece of text that will be autolinked. For instance:
Is going to cause problems. Is there a fix for this?
Okay, I'll just pull from the repo then. Are there plans for another release?
I have no idea whether this repo is going to merge the patch or not.
Yeah, I realized that. Ugh. Looks like redcarpet has been abandoned - one of us probably should fork it and apply outstanding merge requests. This particular one is a biggy.
@vmg - Any chance of a fix for this? This one is biting me as well. This bug can be easily reproduced like this:
renderer = Redcarpet::Render::HTML.new(with_toc_data: true)
md = Redcarpet::Markdown.new(renderer, no_intra_emphasis: true, tables: true, autolink: true, quote: true)
md.render("“[email protected]“")
# => "<p>“<a href=\"mailto:[email protected]%E2\">[email protected]\xE2</a>\x80\x9C</p>\n"
# irb(main):008:0> md.render("“[email protected]“").valid_encoding?
# => false
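Until a fix lands, a caller could at least detect the damage before passing the HTML on. The sketch below is an assumption about how one might do that in plain Ruby, not something Redcarpet provides: `invalid_utf8_offsets` is a hypothetical helper, and `BROKEN_SAMPLE` only imitates the shape of the output quoted above, with `user@example.com` as a placeholder address.

```ruby
# Hypothetical helper (not a Redcarpet API): return the byte offsets of every
# byte in `html` that does not belong to a valid UTF-8 character.
def invalid_utf8_offsets(html)
  offsets = []
  pos = 0
  html.each_char do |ch|
    # Ruby yields each stray byte as its own one-byte "character".
    offsets << pos unless ch.valid_encoding?
    pos += ch.bytesize
  end
  offsets
end

# Imitation of the broken output from the thread; the address is a placeholder.
BROKEN_SAMPLE = "<p>\xE2\x80\x9C<a href=\"mailto:user@example.com%E2\">" \
                "user@example.com\xE2</a>\x80\x9C</p>\n"

offsets = invalid_utf8_offsets(BROKEN_SAMPLE)
stray = offsets.map { |o| format("%02X", BROKEN_SAMPLE.getbyte(o)) }
puts "valid? #{BROKEN_SAMPLE.valid_encoding?}"
puts "stray bytes: #{stray.join(' ')}"
```

The stray bytes found this way are the three bytes of the closing quote, split across the end of the link, which is consistent with `valid_encoding?` returning `false` in the reproduction.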
Just checked why we are maintaining our own fork as well. @robin850 thanks for your last merges and releases. Do you see any chance to merge this one? Do you need any help?
Not sure what's causing this:
It's fine without autolinking: