BadTokenizationError: An unhandled error occurred processing the document #731

Closed
trunneml opened this issue Aug 1, 2023 · 3 comments · Fixed by #890

Comments

@trunneml

trunneml commented Aug 1, 2023

I'm getting a pymarkdown.bad_tokenization_error.BadTokenizationError: An unhandled error occurred processing the document. error when linting one of our Markdown files.

The core reason is:

pymarkdown/container_blocks/container_helper.py", line 48, in reduce_containers_if_required
    assert block_quote_data.current_count != 0 or block_quote_data.stack_count <= 0
AssertionError
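
To make the failing condition explicit, here is a minimal sketch that evaluates the same boolean expression as the assert above. The function name is hypothetical, and reading current_count as "block quote markers counted for the current line" and stack_count as "block quotes still open on the stack" is an assumption based only on the attribute names, not on the PyMarkdown source.

```python
def reduction_precondition_holds(current_count: int, stack_count: int) -> bool:
    """Evaluate the expression from the assert in the traceback.

    Hypothetical helper for illustration; it is not part of PyMarkdown.
    """
    return current_count != 0 or stack_count <= 0


# The expression is only False when current_count is 0 while stack_count
# is still positive, so that is the combination that raises AssertionError.
failing_case = reduction_precondition_holds(current_count=0, stack_count=1)
```

Under that reading, the crash would correspond to a line with no block quote marker arriving while a block quote is still open.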

I created a test Markdown file that triggers the error:

# Headline 1

- Test List

  > 1) Test1
  > 2) Test2

## Headline 2

Some Text

For some reason, adding some text above Headline 2 fixes the problem.
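
For anyone who wants to reproduce this, the following is a rough sketch that writes the failing sample and a workaround variant to disk and, if the pymarkdown command is on the PATH, scans a file with `pymarkdown scan`. The file names, the helper names, and the exact placement of the filler paragraph are assumptions; the report only says that text "above Headline 2" avoids the crash.

```python
import shutil
import subprocess
from pathlib import Path

# The sample from the report, reproduced verbatim.
FAILING_DOC = """\
# Headline 1

- Test List

  > 1) Test1
  > 2) Test2

## Headline 2

Some Text
"""

# One possible reading of the workaround: a paragraph directly above the
# second heading. The filler wording is an assumption.
WORKAROUND_DOC = FAILING_DOC.replace(
    "## Headline 2", "Filler text\n\n## Headline 2"
)


def write_samples(directory: Path) -> dict:
    """Write both variants into `directory` and return their paths."""
    paths = {
        "failing": directory / "failing.md",
        "workaround": directory / "workaround.md",
    }
    paths["failing"].write_text(FAILING_DOC, encoding="utf-8")
    paths["workaround"].write_text(WORKAROUND_DOC, encoding="utf-8")
    return paths


def scan(path: Path):
    """Run `pymarkdown scan` on `path`; return None if it is not installed."""
    executable = shutil.which("pymarkdown")
    if executable is None:
        return None
    return subprocess.run(
        [executable, "scan", str(path)],
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling write_samples on a scratch directory and then scan on each returned path should show whether the installed version still raises the error on the failing variant.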

@jackdewinter
Owner

Just so you know, this is near the top of the list to look at.

@jackdewinter
Owner

...And after fixing other things in the area of this issue, I am finally getting to it. The assert is currently disabled and pending removal, but it prompted me to look at combinations like this one, and there is a slight issue with the close ordering.

Added the test_nested_three_unordered_block_ordered_with_blank_* tests to make sure the tokenization is correct. This may resolve some other issues I am seeing where rules do not properly handle those scenarios and behave as if there is no blank line between the heading and the list. I will create new issues to track that if it ends up being the case.

@jackdewinter
Owner

This ended up being a good issue that uncovered 5 issues.

Created #888 and #889.

  • One issue was that after a block quote reduction, there was no check for whether a list also needed reduction. This has been marked in the code as possibly cyclic and will need deeper nesting to get right. The check needed to be added in both the container helper and the paragraph helper.
  • When that reduction might be needed, the position and the extracted whitespace were not being passed down to those functions. Both are pivotal in determining the list's indent.
  • In the HTML renderer, there were cases where the looseness of a list item was not correct. This was fixed by making sure that a block quote start also checks for looseness, not just the list elements.
