Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

https://github.com/jackdewinter/pymarkdown/issues/818 - Initial work #1129

Merged
merged 11 commits into from
Jul 5, 2024

Conversation

jackdewinter
Copy link
Owner

@jackdewinter jackdewinter commented Jul 5, 2024

Initial block of work on base 2 and 3 nesting scenarios for md031

Encountered ~7 parsing and rehydration issues that needed to be addressed as part of this:

  • Issue 1120
    • within Block-List, thematic break can sometimes not report newlines to the
      list block
  • Issue 1122
    • opening a fenced code block in a Bq-List-Bq was closing the outer BQ
  • Issue 1123
    • in some cases within a Bq-List-Bq, not counting the newlines properly
  • Issue 1124
    • list items within a Bq-List-Bq can have incorrect starting text regarding
      the innermost block
  • Issue 1125
    • parsing of blank lines within Bq-List-Bq does not always add the right
      newlines to the list
  • Issue 1126
    • under some circumstances, with a Bq-List-Bq, thematic break can cause
      the block quote to close
  • Issue 1127
    • rehydration can be wrong with indented blocks in Bq-List-Bq

Summary by Sourcery

This pull request addresses multiple parsing and rehydration issues related to nested block quotes and lists in rule MD031. It includes bug fixes, enhancements to the rule's functionality, and updates to the test suite to cover new scenarios. The changelog has also been updated to reflect these changes.

  • Bug Fixes:
    • Fixed issue where within Block-List, thematic break sometimes did not report newlines to the list block.
    • Resolved issue where opening a fenced code block in a Bq-List-Bq was closing the outer block quote.
    • Corrected newline counting in some cases within a Bq-List-Bq.
    • Fixed incorrect starting text for list items within a Bq-List-Bq regarding the innermost block.
    • Addressed parsing of blank lines within Bq-List-Bq to ensure correct newline addition to the list.
    • Fixed issue where under certain circumstances, a thematic break within a Bq-List-Bq could cause the block quote to close.
    • Resolved rehydration issues with indented blocks in Bq-List-Bq.
  • Enhancements:
    • Refactored test cases for rule MD031 to use parameterized tests for better maintainability and readability.
    • Enhanced the rule MD031 to support automatic fixing of issues detected by the rule.
  • Documentation:
    • Updated changelog to reflect the fixed issues related to rule MD031.
  • Tests:
    • Added new test cases for various nesting scenarios in rule MD031 to ensure proper handling of fenced code blocks within block quotes and lists.
    • Removed outdated test case 'test_extra_044c' and added new test cases to cover additional scenarios.

Copy link
Contributor

sourcery-ai bot commented Jul 5, 2024

Reviewer's Guide by Sourcery

This pull request refactors and enhances the handling of nested block quotes and lists in the pymarkdown project. It introduces new classes and methods for better container adjustment and spacing fixes, updates the rule implementation to support automatic fixes, and adds comprehensive test cases to ensure robust handling of various scenarios. The changelog and coverage report have also been updated to reflect these changes.

File-Level Changes

Files Changes
pymarkdown/plugins/rule_md_031.py
pymarkdown/block_quotes/block_quote_count_helper.py
pymarkdown/block_quotes/block_quote_processor.py
pymarkdown/transform_markdown/transform_list_block.py
pymarkdown/block_quotes/block_quote_non_fenced_helper.py
Refactored and enhanced block quote and list processing logic to handle nested scenarios and added support for automatic fixes.
test/rules/test_md031.py
test/test_markdown_extra.py
test/tokens/test_markdown_token.py
Updated and added new test cases to cover various nested block quote and list scenarios, ensuring comprehensive test coverage.

Tips
  • Trigger a new Sourcery review by commenting @sourcery-ai review on the pull request.
  • Continue your discussion with Sourcery by replying directly to review comments.
  • You can change your review settings at any time by accessing your dashboard:
    • Enable or disable the Sourcery-generated pull request summary or reviewer's guide;
    • Change the review language;
  • You can always contact us if you have any questions or feedback.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jackdewinter - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 7 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 2 issues found
  • 🟡 Documentation: 4 issues found

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

self.__container_token_stack: List[MarkdownToken] = []
self.__pending_container_ends = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider renaming __pending_container_ends for clarity.

The variable name __pending_container_ends might be more descriptive if renamed to something like __pending_container_end_count to better convey that it is a count.

Suggested change
self.__pending_container_ends = 0
self.__pending_container_end_count = 0

@@ -149,6 +152,41 @@ def __handle_bq_whitespace(adjusted_line: str, start_index: int) -> Tuple[str, i
start_index += 1
return adjusted_line, start_index

# pylint: disable=too-many-arguments
@staticmethod
def __xx(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Rename method __xx to a more descriptive name.

The method name __xx is not descriptive. Consider renaming it to something that reflects its functionality.

Suggested change
def __xx(
def parse_block_quote(

if parser_state.token_stack[-1].is_fenced_code_block:
return False, -1
block_quote_character_count = ParserHelper.count_characters_in_text(parser_state.original_line_to_parse[:start_index], ">")
assert parser_state.original_line_to_parse is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider handling the case where original_line_to_parse is None.

While the assert statement ensures original_line_to_parse is not None, it might be more robust to handle this case explicitly, perhaps by raising a custom exception.

Suggested change
assert parser_state.original_line_to_parse is not None
if parser_state.original_line_to_parse is None:
raise ValueError("original_line_to_parse cannot be None")

adjusted_removed_text = original_removed_text[
len(current_leading_spaces) - extra_consumed_whitespace :
]
assert len(current_leading_spaces) <= len(original_block_quote_bleading_spaces)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider removing the commented-out code.

The commented-out code following the assert statement can be removed to keep the codebase clean and maintainable.

@@ -755,6 +751,32 @@ def __get_block_start_index(
avoid_block_starts,
)

@staticmethod
def __get_block_start_index_handle_blank_line(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider renaming method __get_block_start_index_handle_blank_line.

The method name __get_block_start_index_handle_blank_line is quite long. Consider renaming it to something more concise while still descriptive, such as __handle_blank_line_in_block_start.

Suggested change
def __get_block_start_index_handle_blank_line(
def __handle_blank_line_in_block_start(

https://github.com/jackdewinter/pymarkdown/issues/1125
https://github.com/jackdewinter/pymarkdown/issues/1126
https://github.com/jackdewinter/pymarkdown/issues/1127
- [Issue 1120](https://github.com/jackdewinter/pymarkdown/issues/1120)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (documentation): Add a colon after 'Issue 1120'.

For consistency, add a colon after 'Issue 1120'.

Suggested change
- [Issue 1120](https://github.com/jackdewinter/pymarkdown/issues/1120)
[Issue 1120](https://github.com/jackdewinter/pymarkdown/issues/1120):

parser_state.token_stack[stack_index-2].is_list and \
parser_state.token_stack[stack_index-2].matching_markdown_token.line_number != block_leaf_tokens[-1].line_number:
parser_state.token_stack[stack_index-2].matching_markdown_token.add_leading_spaces("")
ContainerBlockNonLeafProcessor.__get_block_start_index_handle_blank_line(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider simplifying the code by keeping the logic inline.

The new code introduces additional complexity without clear benefits. Here are the main concerns:

  1. Increased Indirection: Introducing the __get_block_start_index_handle_blank_line method adds an extra layer of indirection, making the code harder to follow.
  2. Unnecessary Imports: The import for List from typing is not needed and adds to the cognitive load.
  3. Redundant Assertions: The assertion checking if list_token is not None is redundant because cast already ensures the type.
  4. Unnecessary Casting: The use of cast is unnecessary if the type is already known and can be inferred from the context.
  5. Code Duplication: The new method duplicates existing logic, spreading it out and making it harder to maintain.

Consider simplifying the code by keeping the logic inline:

if grab_bag.requeue_line_info:
    POGGER.debug(">>requeuing lines after looking for block start. returning.")
if grab_bag.did_blank:
    assert block_leaf_tokens and block_leaf_tokens[-1].is_blank_line, "should be a blank at the end"
    POGGER.debug(">>already handled blank line. returning.")
    grab_bag.extend_container_tokens_with_leaf_tokens()
    stack_index = len(parser_state.token_stack) - 1
    if stack_index > 2 and parser_state.token_stack[stack_index].is_block_quote and parser_state.token_stack[stack_index-1].is_block_quote and \
       parser_state.token_stack[stack_index-2].is_list:
        list_token = parser_state.token_stack[stack_index-2].matching_markdown_token
        if list_token.line_number != block_leaf_tokens[-1].line_number:
            list_token.add_leading_spaces("")
grab_bag.can_continue = (
    not grab_bag.requeue_line_info and not grab_bag.did_blank
)

This approach keeps the logic in one place, making it easier to follow and maintain.

@@ -590,7 +590,12 @@ def is_inline_image(self) -> bool:
"""
return self.token_name == MarkdownToken._token_inline_image

def adjust_line_number(self, context: PluginModifyContext, adjust_delta:int) -> None:
def adjust_line_number(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider reverting the method signature and docstring to a single line for better readability.

The new code introduces unnecessary complexity and reduces readability. Here are the main issues:

  1. Unnecessary Line Breaks: The method signature and docstring have been split across multiple lines, making the code harder to read and follow. The original code was more concise and easier to understand at a glance.

  2. Redundant Docstring: The added docstring for adjust_line_number is redundant. The method name is self-explanatory, and the original code did not require additional documentation to understand its purpose.

  3. Inconsistent Formatting: The new formatting is inconsistent with the rest of the codebase, which can lead to confusion and make the code harder to maintain.

Here is a revised version that maintains simplicity and readability:

@property
def is_inline_image(self) -> bool:
    """
    Returns whether the current token is an image element.
    """
    return self.token_name == MarkdownToken._token_inline_image

def adjust_line_number(self, context: PluginModifyContext, adjust_delta: int) -> None:
    # By design, tokens can only be modified in fix mode during the token pass.
    if not context.in_fix_mode:
        raise BadPluginFixError(
            f"Token '{self.__token_name}' can only be modified in fix mode."
        )
    if context.is_during_line_pass:
        raise BadPluginFixError(
            f"Token '{self.__token_name}' can only be modified during the token pass in fix mode."
        )
    self.__line_number += adjust_delta

This version keeps the method signature and logic concise, maintains readability, and avoids unnecessary complexity.

Comment on lines +6 to +15
from test.rules.utils import (
execute_configuration_test,
execute_fix_test,
execute_query_configuration_test,
execute_scan_test,
id_test_plug_rule_fn,
pluginConfigErrorTest,
pluginQueryConfigTest,
pluginRuleTest,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Don't import test modules. (dont-import-test-modules)

ExplanationDon't import test modules.

Tests should be self-contained and don't depend on each other.

If a helper function is used by multiple tests,
define it in a helper module,
instead of importing one test from the other.

execute_results.assert_results(
expected_output, expected_error, expected_return_code
)
fixTests = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Convert for loop into list comprehension (list-comprehension)

Copy link

codecov bot commented Jul 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (0d55e32) to head (4a286c4).
Report is 19 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #1129      +/-   ##
===========================================
+ Coverage   99.84%   100.00%   +0.15%     
===========================================
  Files         190       190              
  Lines       19832     19960     +128     
  Branches     2502      2511       +9     
===========================================
+ Hits        19802     19960     +158     
+ Misses         19         0      -19     
+ Partials       11         0      -11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jackdewinter jackdewinter merged commit 8a8fb01 into main Jul 5, 2024
14 of 15 checks passed
@jackdewinter jackdewinter deleted the issue-818 branch July 5, 2024 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant