Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEGFAULT? #1

Closed
theHamsta opened this issue Aug 18, 2022 · 14 comments
Closed

SEGFAULT? #1

theHamsta opened this issue Aug 18, 2022 · 14 comments

Comments

@theHamsta
Copy link

I tried to integrate this parser into nvim-treesitter but I encountered a SEFAULT during my editing session

helix-editor/helix#3117 (comment)

ikatyang/tree-sitter-markdown#14 might be useful to set up fuzzing to identify inputs that cause the scanner to crash. "python" in the script needs to be changed to "python2" if you encounter an error when executing the build script in tree-sitter

@aMOPel
Copy link
Owner

aMOPel commented Aug 18, 2022

As the readme says, this parser is very much WIP = Work in Progress.

Naturally you can't use this parser on a full nim file, since some language features can't be parsed yet and completely confuse the parser.

Also I haven't tested this parser whatsoever deployed anywhere, I only used it in my local dev environment.

@aMOPel aMOPel closed this as completed Aug 18, 2022
@aMOPel
Copy link
Owner

aMOPel commented Aug 18, 2022

I made this more obvious in the readme now.

@theHamsta
Copy link
Author

theHamsta commented Aug 18, 2022

This was indeed intended to accompany the WIP status. There is no problem for a parser to be in WIP. Typically, when users are interested they also help to contribute specific features to help to mature the parser.

I opened this issue as SEGFAULTs are typically difficult to detect and typically don't occur when using the parser just in a dev environment. I thought this would be a help during development 🙂

In general, this parser made already a quite solid impression to me, apart from some features that were mentioned in the README

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

Well. Thanks for the heads up then.

I looked into the commit that fixed the issue in the markdown parser and they removed all asserts and wrapped scanner->scan into a try catch. I guess I'll try that too.

Honestly I have only meddled as little as possible with the CPP code (scanner.cc) and reused a lot of the CPP code from the (official) tree-sitter-python repo, so it surprises me a little that things can go south so much.

@aMOPel aMOPel reopened this Aug 19, 2022
@theHamsta
Copy link
Author

theHamsta commented Aug 19, 2022

One of the team members is developing a Github Action to execute the fuzzing automatically https://github.com/vigoux/tree-sitter-fuzz-action. Maybe it is useful. I haven't tested it yet. The fuzzer outputs example snippets that provoke a parser crash.

The fix for the markdown parser didn't really solve the problem. In the end, we switched to a different parser. Many issues we we had with parsers based on the Python scanner were either related to serialization/deserialization or to checking for the final \0 bit.

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

Can you point me to those other parsers that used the python scanner? What did they do to fix it?

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

So the fuzzer revealed that I had a bug in my scanner.cc where it would go in an infinite loop when encountering """ at the end of the file.

Thanks for showing me this tool, quite handy. I have another run going atm and it's approaching # 1000000 does it end somewhen?

@theHamsta
Copy link
Author

No, it never ends. You can just leave it running some time.

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

I let it run for some time now and it didn't throw again.

The fix is pushed, if you encounter another SEGFAULT, let me know.

@aMOPel aMOPel closed this as completed Aug 19, 2022
@theHamsta
Copy link
Author

Turns out that fixing """ solved also many issues with the parser. Thank you!

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

I am curious how well it is running now.

Also, are you with the nvim-treesitter repo and possibly writing queries for highlighting etc.?
I would be very open to feedback on the current AST structure, what nodes are redundant and what nodes could be refined or are fields necessary?

@theHamsta
Copy link
Author

nvim-treesitter/nvim-treesitter#3320 the integration of this parser was requested by nvim and helix users. So far it seems to work well. I've never worked with nim, so I don't know how to interpret more complex structs.

@aMOPel
Copy link
Owner

aMOPel commented Aug 19, 2022

That happened quicker than I expected. Thanks :)

I created a bunch of issues now for the missing features to ease contributions. When the parser is done somewhen, I would also write a bunch of queries for the nvim-treesitter repo, in fact, if you look at the readme, that was my plan all along.

However atm I can't dump a lot of time on this, so it will take a little.

@theHamsta
Copy link
Author

No, worries. I'm appreciating what your work! If you want you can take over the queries PR or create a own whenever you feel your parser is ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants