mettta (Owner) commented on Nov 21, 2025

Release v0.3.0 — Pagination and Rendering Update

Summary

This release introduces major improvements across the entire HTML2PDF4Doc engine: unified slicing for Table/Grid/PRE, updated page-start logic, reworked preview/print layers, improved media handling, and expanded test coverage.


New Features

Unified Splitting & Slicing Engine

  • Unified slicing pipeline for Table / Grid / PRE / TableLike.
  • “Original-as-first” strategy for Table and PRE slicing.
  • Per-slice capacity to make fit/scaling predictable.
  • Normalised slice edges and cut-line adjustments.
  • TD-level measurement and TD-only overflow scaling.
  • Balanced row slicing for Table and Grid.
  • LoopGuard to stop non-progress iterations.
  • Shared recorder for slice metadata (“parts recorder”).
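
The LoopGuard item describes a guard that stops iterations which make no progress. A minimal sketch of that idea follows; `createLoopGuard`, its `maxStalls` parameter, and the numeric progress marker are hypothetical names chosen for illustration and are not taken from the html2pdf source.

```javascript
// Hypothetical sketch, not the html2pdf implementation.
// The guard remembers the furthest progress value seen so far and counts
// consecutive checks that fail to move past it.
function createLoopGuard(maxStalls = 2) {
  let furthest = -Infinity;
  let stalls = 0;
  return {
    // Returns true while the loop may continue, false once it has stalled
    // maxStalls times in a row.
    check(progress) {
      if (progress > furthest) {
        furthest = progress;
        stalls = 0;
      } else {
        stalls += 1;
      }
      return stalls < maxStalls;
    },
  };
}

// Usage: a loop whose cursor never advances ends instead of spinning forever.
function runStalled() {
  const guard = createLoopGuard(2);
  const cursor = 0; // never changes, simulating a split that makes no progress
  let iterations = 0;
  while (guard.check(cursor)) {
    iterations += 1;
  }
  return iterations;
}
```

Counting consecutive stalls rather than total iterations lets long but productive loops run to completion, while a loop that stops advancing ends after a small, fixed number of extra passes.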

Page-Start & Page-Break Logic

  • Updated findBetterPageStart() logic with semantic fallbacks.
  • Candidate search now skips hidden, thin, or service wrappers.
  • Separate handling of first-child and last-child parent context.
  • Unified measurement via getTopForPageStartCandidate().
  • Correct first-page element placement.
  • Reset margins on both sides of page breaks.
  • Improved handling of nested tail chains and long single-child structures.
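
The candidate filtering described above (skipping hidden, thin, or service wrappers) could look roughly like the sketch below. It works on plain objects rather than DOM nodes. The node shape (`hidden`, `serviceWrapper`, `height`), the 1px threshold for "thin", and both function names are assumptions for illustration, not the actual findBetterPageStart() internals.

```javascript
// Hypothetical sketch; the node shape and the threshold are assumptions.
const THIN_HEIGHT_PX = 1;

// A candidate is skipped when it is hidden, marked as a service wrapper,
// or too thin to be a meaningful page start.
function isSkippable(node) {
  return node.hidden === true
    || node.serviceWrapper === true
    || node.height < THIN_HEIGHT_PX;
}

// Returns the first candidate that is not skippable, or null if none remain.
function pickPageStartCandidate(candidates) {
  for (const node of candidates) {
    if (!isSkippable(node)) {
      return node;
    }
  }
  return null;
}

// Example input: only the last candidate is usable.
const exampleCandidates = [
  { id: "hidden-wrapper", hidden: true, height: 40 },
  { id: "thin-wrapper", height: 0 },
  { id: "service", serviceWrapper: true, height: 12 },
  { id: "heading", height: 24 },
];
```

With `exampleCandidates`, `pickPageStartCandidate` returns the `heading` entry. Returning `null` leaves the fallback decision to the caller.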

Preview & Print Layers

  • Introduced the three-layer model: contentFlow, paperFlow, overlayFlow.
  • Paper background moved to a pseudo-element.
  • Header/footer moved to the overlay layer with transparent insets.
  • Header/footer made opaque to prevent border bleed-through.
  • Disabled the mask in print mode to avoid a duplicated-text issue in Blink.
  • Added bottom-edge mask for clean page boundaries.
  • Frontpage inserted into content flow.
  • Root element stays hidden until preview is fully generated.

Inline / Flow / Children Handling

  • Standalone inline wrappers promoted to ComplexTextBlock.
  • Non-flow nodes (absolute, fixed, display:none) are now dropped early.
  • Added getFirstChildrenChain / getLastChildrenChain.
  • Improved thin-wrapper unwrapping via resolveFlowElement.
  • Tail detection stops when encountering a text node.
  • Updated getBottomWithMargin and related vertical rhythm logic.

Media

  • Correct subtraction of baseline gap for inline images.
  • DOM parent fallback for baseline correction.
  • Improved handling of replaced elements.
  • Improved tall image and SVG overflow behaviour.

Grid Engine

  • Full Stage-5 splitting support, including overflow branches.
  • GridAdapter: virtual rows, style caches, line-height caches, shell-height caches.
  • Monotonic grid split behaviour across justify/align variants.
  • Static container support and offset-based row detection.
  • Width locking before splitting.
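
Offset-based row detection can be pictured as grouping grid cells by their top offset. The sketch below uses plain objects with a numeric `top` field. The function name, the cell shape, and the 0.5px tolerance are assumptions for illustration and do not describe the GridAdapter API.

```javascript
// Hypothetical sketch: cells are plain objects with a numeric `top` offset.
// Cells whose tops lie within `tolerance` of the current row's top are
// treated as belonging to the same virtual row.
function groupIntoRows(cells, tolerance = 0.5) {
  const rows = [];
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...cells].sort((a, b) => a.top - b.top);
  for (const cell of sorted) {
    const current = rows[rows.length - 1];
    if (current && Math.abs(cell.top - current.top) <= tolerance) {
      current.cells.push(cell);
    } else {
      rows.push({ top: cell.top, cells: [cell] });
    }
  }
  return rows;
}

// Example input: five cells laid out in three visual rows.
const exampleCells = [
  { id: "a", top: 0 },
  { id: "b", top: 0 },
  { id: "c", top: 20 },
  { id: "d", top: 20.2 },
  { id: "e", top: 40 },
];
```

For `exampleCells` this produces three rows containing two, two, and one cell. The tolerance absorbs sub-pixel differences between cells that sit on the same visual line.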

Table Engine

  • Conservative ROWSPAN fallback.
  • TFOOT ignored in consistency checks.
  • Strict page-start registration for first slices.
  • Balanced row slicing with shell-height accounting.
  • Ultra-small final slices absorbed into the previous slice.
  • TD-level measurement for all split decisions.
  • Two-pass splitPoints with sanitisation.
  • Scaling fallback for problematic TDs.
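
Absorbing an ultra-small final slice into the previous one can be sketched as a post-processing step over the computed slices. Here each slice is modelled as an array of row heights in pixels. The function name and the `minHeight` parameter are hypothetical, and the sketch does not check whether the enlarged slice still fits its page.

```javascript
// Hypothetical sketch: each slice is an array of row heights in px.
// If the last slice is shorter than minHeight, its rows are appended to the
// previous slice. The input is not mutated.
function absorbSmallFinalSlice(slices, minHeight) {
  const result = slices.map((slice) => slice.slice());
  if (result.length < 2) {
    return result; // nothing to merge into
  }
  const last = result[result.length - 1];
  const lastHeight = last.reduce((sum, height) => sum + height, 0);
  if (lastHeight < minHeight) {
    result.pop();
    result[result.length - 1].push(...last);
  }
  return result;
}
```

For instance, `absorbSmallFinalSlice([[100, 100], [5]], 20)` merges the 5px tail into the first slice, while a final slice of 50px is left in place.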

Config & Debug

  • Forced Debug Mode (data-forced-debug-mode).
  • Selenium-friendly forcedModeLog.
  • Normalisation of string boolean config values.
  • Print-mode-only test styles.
  • Structured debug logging.
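
Config values read from data attributes arrive as strings, so a flag such as data-forced-debug-mode="false" would be truthy if used directly. A minimal sketch of string boolean normalisation follows. The function name, the accepted spellings, and the fallback behaviour are assumptions, not the library's documented rules.

```javascript
// Hypothetical sketch; accepted spellings and fallback are assumptions.
// Real booleans pass through, "true"/"false" strings (any case, surrounding
// whitespace ignored) are converted, and anything else yields the fallback.
function normalizeBoolean(value, fallback = false) {
  if (typeof value === "boolean") {
    return value;
  }
  if (typeof value !== "string") {
    return fallback;
  }
  const text = value.trim().toLowerCase();
  if (text === "true") {
    return true;
  }
  if (text === "false") {
    return false;
  }
  return fallback;
}
```

An explicit fallback keeps unexpected values such as an empty string or a typo from silently enabling a feature.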

Testing & Tooling

  • Embedded HTML page source in pytest failures for e2e tests.
  • @focus mode and RUN_ALL override.
  • Extended coverage: PRE, Table, Grid, ROWSPAN, IMG, nested tails, page-start heuristics, monotonic grids.
  • Updated helper functions (assert_element_on_the_page, element_order, direct-children asserts).

Notable Fixes

  • Correct baseline-gap handling for inline images.
  • Fixed false empty-first-slice cases in PRE and Table.
  • Correct parent-context propagation for boundary children.
  • More accurate available-space estimation for page-start.
  • Stable overflow fallback paths for Grid/Table.
  • Correct caption/thead height calculation.
  • Fixed incorrect page-start candidates caused by hidden wrappers.

Breaking Changes

  • Stricter page-start selection may change page boundaries in some documents.

mettta merged commit db03a4b into main on Nov 21, 2025 (5 checks passed)
mettta deleted the version_0.3.0 branch on November 21, 2025 04:14
stanislaw added a commit to strictdoc-project/html2pdf4doc_python that referenced this pull request Nov 21, 2025
This updates to the latest release: mettta/html2pdf#161.

This brings many stability improvements.
stanislaw added a commit to strictdoc-project/html2pdf4doc_python that referenced this pull request Nov 21, 2025
This updates to the latest release: mettta/html2pdf#161.

This brings many stability improvements.

@full_test