Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync with cmark-gfm-0.29.0.gfm.5 #315

Merged
merged 3 commits into from
Jun 8, 2023

Conversation

digitalmoksha
Copy link
Collaborator

This syncs with cmark-gfm-0.29.0.gfm.5 release

@digitalmoksha
Copy link
Collaborator Author

Since the version of cmark-gfm in the vendor directory is at a different level, and thus the extensions.txt is the incorrect version, the autolink tests still fail.

But if you copy the latest version of extensions.txt, they pass.

@digitalmoksha
Copy link
Collaborator Author

@kivikakk I think this one is also ready to go.

5 -> 6 will be simple, but 6 -> 7 is going to be a bear 😢

@kivikakk
Copy link
Owner

kivikakk commented Jun 8, 2023

Thank you! Can confirm autolink tests are fixed from here.

Since we're failing spec in CI anyway right now, I've bumped our vendored cmark-gfm to the latest from upstream. I've also adjusted the CI script to always run all tests (instead of stopping at the first failure), to make it easier to see.

Also, 1c2702d is a minor efficiency thing that you might find good to know. String literals are of type &'static str; a non-mutable string slice with static lifetime (i.e. the entire lifetime of the program, since they're stored in the binary .rodata itself). This avoids us having to allocate and deallocate the strings each time email_match runs.

@kivikakk kivikakk merged commit 79ac222 into kivikakk:main Jun 8, 2023
@kivikakk
Copy link
Owner

kivikakk commented Jun 8, 2023

5 -> 6 will be simple, but 6 -> 7 is going to be a bear 😢

How about, once 5 -> 6 is done, I take 6 -> 7?

@digitalmoksha
Copy link
Collaborator Author

Also, 1c2702d is a minor efficiency thing that you might find good to know.

Thanks - I knew I didn't quite have that right.

How about, once 5 -> 6 is done, I take 6 -> 7?

Oh yeah, that would great 😄

@digitalmoksha
Copy link
Collaborator Author

It looks like we don't need the 5 -> 6 sync. That release is just for the security issue GHSA-cgh3-p57x-9q7q, which deals with the autolink extensions and massive number of open image links.

The use of in_bracket_image0 and in_bracket_image1 only actually makes any difference in cmark_inline_parser_in_bracket, https://github.com/github/cmark-gfm/blob/2d65cd3c4bfbbdddc7accefc76392c16bb0cfb6d/src/inlines.c#L1756-1766, which only gets called in https://github.com/github/cmark-gfm/blob/2d65cd3c4bfbbdddc7accefc76392c16bb0cfb6d/extensions/autolink.c#L279-297.

Since you're parsing autolinks a little differently, comrak doesn't seem to suffer from that specific problem. This would then just be adding useless code.

The pathological tests pass, the latest of which has a test for this. You can also verify using

python3 -c 'print("![l"* 100000 + "\n")' | cargo run -- -e autolink

So I think we can consider 5 -> 6 as done/not needed.

wdyt?

@digitalmoksha digitalmoksha deleted the bw-sync-with-0.29.0.gfm.5 branch June 8, 2023 17:09
@kivikakk
Copy link
Owner

I think that's fine, yep.

I've had a quick look at .6 to .7, and it sure is big, but in fact most of these fixes have already been backported as part of the big stuff I did a few months ago. All the big changes I recognise.

There may be autolink issues remaining given the difference in structure, but (a) some I have certainly fixed already in 2f9b43f and f61e8bb, and (b) none that have since been discovered using the quadratic case fuzzer given us by GitHub in #295.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants