Account for front matter when calculating `sourcepos` #494

SamWilsn · 2024-11-28T21:30:58Z

Hopefully this is sufficient to account for the front matter when calculating sourcepos for subsequent nodes.

SamWilsn · 2024-11-28T21:32:31Z

src/parser/mod.rs

+                    start: nodes::LineColumn { line: 1, column: 1 },
+                    end: nodes::LineColumn {
+                        line: lines,
+                        column: delimiter.chars().count(),


This line I'm particularly unsure about. Are columns supposed to be bytes, characters, or actual columns (i.e. accounting for ZWJ, combining diacritics, etc.)?


SamWilsn · 2024-11-29T15:45:09Z

Narrator: It was not sufficient.

kivikakk

Thank you so much!

kivikakk · 2024-11-29T20:46:09Z

src/parser/mod.rs

+                    start: nodes::LineColumn { line: 1, column: 1 },
+                    end: nodes::LineColumn {
+                        line: lines,
+                        column: delimiter.chars().count(),


This question gave me pause too. Instead of me trying to infer it from some of my own code, let's try to get it from the source (pun intended (video unrelated)). If we ask cmark for sourcepos of some text with minimally non-byte-oriented text …

$ echo 'テスト test' | build/src/cmark --sourcepos
テスト test

… we see it doesn't make any attempt to interpret UTF-8 whatsoever. This is a bit ¯\_(ツ)_/¯, but it accords with our current behaviour, which makes sense given we modelled ours on it totally:

$ echo 'テスト test' | cargo run -- --sourcepos
テスト test

So, for consistency (and ease of your implementation), I'd say let's start with bytes, and I'll open an issue to think about improving this (#495). There are a few places in `parser/mod.rs` where we actually use `self.column` and treat it as a byte count, and while I'm not sure it'd come into play here exactly (they're mostly around leading indent), it's another vote for bytes.
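To make the byte/character distinction concrete, here is a minimal sketch using only the Rust standard library. The helper names `end_column_bytes` and `end_column_chars` are hypothetical and not part of comrak; they only contrast two ways of measuring the same string, using the sample text from the commands above.

```rust
/// Hypothetical helper: 1-based column of the last byte of `text`,
/// counting UTF-8 bytes (the unit chosen in this discussion).
fn end_column_bytes(text: &str) -> usize {
    text.len()
}

/// Hypothetical helper: 1-based column of the last character of `text`,
/// counting Unicode scalar values (what `chars().count()` measures).
fn end_column_chars(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    // Each of the three katakana takes 3 bytes in UTF-8.
    let sample = "テスト test";
    println!("bytes: {}", end_column_bytes(sample)); // prints "bytes: 14"
    println!("chars: {}", end_column_chars(sample)); // prints "chars: 8"

    // An ASCII-only delimiter measures the same either way.
    let delimiter = "---";
    println!(
        "delimiter: {} bytes, {} chars",
        end_column_bytes(delimiter),
        end_column_chars(delimiter)
    );
}
```

For an ASCII-only delimiter such as `---` the two counts agree, so the choice of unit only shows up once non-ASCII text is involved.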

kivikakk · 2024-11-29T20:58:16Z

I'll apply the bytes change (and test adjustment) and merge. :)

See discussion at kivikakk#494 (comment).

SamWilsn · 2024-11-29T21:42:03Z

> it doesn't make any attempt to interpret UTF-8 whatsoever

I totally understand!

This is just me dreading my next step, and I absolutely don't expect any effort on your behalf, but it's going to be a pain to adapt this to work with annotate-snippets, which does do Unicode interpretation 😿
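If a downstream consumer wants character-based columns, one possible bridge is to translate a byte column back into a character column for a given line. This is only a sketch under stated assumptions: columns are 1-based, and a column may point at any byte of a multi-byte character (as an inclusive end column can). `byte_col_to_char_col` is a hypothetical helper, not part of comrak or annotate-snippets.

```rust
/// Hypothetical helper: map a 1-based byte column within `line` to the
/// 1-based column of the character that contains that byte.
/// Returns `None` if the column is 0 or past the end of the line.
fn byte_col_to_char_col(line: &str, byte_col: usize) -> Option<usize> {
    // Convert to a 0-based byte index; column 0 is invalid.
    let idx = byte_col.checked_sub(1)?;
    if idx >= line.len() {
        return None;
    }
    // Count the characters that start at or before the target byte.
    // The last one counted is the character containing that byte.
    Some(
        line.char_indices()
            .take_while(|(start, _)| *start <= idx)
            .count(),
    )
}

fn main() {
    let line = "テスト test";
    // Byte 14 is the final 't', which is the 8th character.
    println!("{:?}", byte_col_to_char_col(line, 14)); // prints "Some(8)"
    // Byte 9 is the last byte of 'ト', the 3rd character.
    println!("{:?}", byte_col_to_char_col(line, 9)); // prints "Some(3)"
    // Columns outside the line are rejected.
    println!("{:?}", byte_col_to_char_col(line, 15)); // prints "None"
}
```

This counts Unicode scalar values only; it does not account for grapheme clusters or display width, which would need more than the standard library.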

kivikakk · 2024-12-04T03:31:08Z

Yes, fair T_T I hope it's not too painful — let me know if I can (try to!) elucidate anything else for you about the base design.

Also, if you ever want a release made with your changes in it, just give me a ping!

SamWilsn · 2024-12-04T04:40:40Z

Appreciate it! I have a few more pull requests to put in, so no rush on a new release.

ryanpeach · 2024-12-17T07:11:59Z

@kivikakk if this is working, my PR ryanpeach/mdlinker#58 would love to see a new release.

kivikakk · 2024-12-17T07:22:43Z

Sure thing!

ryanpeach · 2024-12-17T11:59:39Z

Working for me :)


SamWilsn marked this pull request as draft November 29, 2024 15:00

SamWilsn added 2 commits November 29, 2024 11:00

Account for FrontMatter when calculating sourcepos

7a1bdfa

Make sourcepos! work in static contexts

73925f5

SamWilsn force-pushed the sourcepos-front-matter branch from 1807051 to 73925f5 Compare November 29, 2024 16:00

SamWilsn marked this pull request as ready for review November 29, 2024 16:01

kivikakk approved these changes Nov 29, 2024


Use bytes for sourcepos columns.

17cc1f1

See discussion at kivikakk#494 (comment).

kivikakk enabled auto-merge November 29, 2024 20:59

kivikakk mentioned this pull request Nov 29, 2024

sourcepos is Unicode-naïve. #495

Open

kivikakk merged commit 0c6e6e6 into kivikakk:main Nov 29, 2024
19 checks passed

SamWilsn deleted the sourcepos-front-matter branch November 29, 2024 21:32

ryanpeach mentioned this pull request Dec 17, 2024

Upgrade cormak for better frontmatter support ryanpeach/mdlinker#53

Closed

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

702a934

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

ca2324e

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

7d831b1

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

19ea2fc

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

7e9478c

ryanpeach mentioned this pull request Dec 17, 2024

Frontmatter Upgrade ryanpeach/mdlinker#58

Merged

ryanpeach added a commit to ryanpeach/mdlinker that referenced this pull request Dec 17, 2024

WIP: Removing a lot of extra code in anticipation of kivikakk/comrak#494

f7318a7

Neved4 mentioned this pull request Dec 18, 2024

comrak 0.32.0 Neved4/homebrew-tap#110

Open
