Skip to content

Restore sorted cell iteration order with lazy cache#1759

Merged
tonyqus merged 1 commit into
nissl-lab:masterfrom
swyfft-insurance:fix/ks/20260324_sorted-cell-iteration
Mar 24, 2026
Merged

Restore sorted cell iteration order with lazy cache#1759
tonyqus merged 1 commit into
nissl-lab:masterfrom
swyfft-insurance:fix/ks/20260324_sorted-cell-iteration

Conversation

@ken-swyfft
Copy link
Copy Markdown
Contributor

Summary

  • After Replace SortedDictionary with Dictionary in HSSFRow, XSSFRow and SXSSFRow #1753 replaced SortedDictionary with Dictionary in XSSFRow, GetEnumerator() and Cells no longer returned cells in ascending column order — a breaking change for existing consumers who foreach over rows
  • Add a lazily-built sorted cell cache (List<ICell>) that is populated on first read and invalidated on mutation (add/remove/rebuild)
  • Preserves the Dictionary O(1) insert/lookup performance win while restoring the sorted iteration contract
  • Applied consistently to XSSFRow, HSSFRow, and SXSSFRow

Context

Raised by @tonyqus in #1742 (comment) — the SortedDictionaryDictionary change in #1753 means cells returned from GetEnumerator are no longer sorted by column index, which could break existing NPOI-based logic that assumes sorted iteration.

Design

  • Reads dominate writes (sheets are built once, iterated many times), so the sort is cached and reused across multiple iterations
  • Cache is invalidated (null-ed) on any cell mutation — CreateCell, RemoveCell, RebuildCells
  • In-flight enumerators see a stable snapshot (the old list), which is safer than the old SortedDictionary behavior that would throw InvalidOperationException on concurrent modification
  • XSSFRow.OnDocumentWrite() also reuses the cache instead of re-sorting

Test plan

  • TestCellIterationOrderWithSparseColumns passes (net8.0 + net472)
  • All XSSFRow tests pass: 34/34
  • All HSSFRow tests pass: 13/13
  • All SXSSFRow tests pass: 16/16

🤖 Generated with Claude Code

PR nissl-lab#1753 replaced SortedDictionary with Dictionary in XSSFRow for O(1)
insert/lookup, but GetEnumerator() and Cells no longer returned cells
in ascending column order — a breaking change for existing consumers.

Add a lazily-built sorted cell cache (List<ICell>) that is populated on
first read and invalidated on mutation (add/remove/rebuild). This
preserves the Dictionary performance win while restoring the sorted
iteration contract. Applied consistently to XSSFRow, HSSFRow, and
SXSSFRow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tonyqus
Copy link
Copy Markdown
Member

tonyqus commented Mar 24, 2026

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants