Fix VLOOKUP/HLOOKUP/MATCH type coercion to match Excel behavior #1721

Closed
ken-swyfft wants to merge 1 commit into nissl-lab:master from swyfft-insurance:fix/vlookup-type-coercion

ken-swyfft (Contributor) commented Mar 10, 2026

Background

As background, my company (Swyfft) decided to try migrating its solution from using Excel COM to interact with some 50+ very large, very complex spreadsheets, to using NPOI. Given the complexity of these spreadsheets, the conversion process worked much better than we expected, but we discovered two issues that seemed to require some changes/fixes to NPOI itself. The first was #1720; this is the second. I believe that this is technically a breaking change, in that it does change behavior, but it moves NPOI in the direction of behaving as Excel does, so it seemed appropriate.

Summary

Excel's VLOOKUP, HLOOKUP, and MATCH functions perform implicit type coercion when comparing values — for example, 1 = TRUE evaluates to TRUE, and the numeric string "123" matches the number 123. NPOI's formula evaluator currently does strict type comparison, causing these lookups to fail with #N/A when the lookup value and table key have different types but equivalent values.

This PR fixes type coercion in LookupUtils to match Excel's behavior:

  • Numeric ↔ Boolean: TRUE matches 1, FALSE matches 0 (and vice versa)
  • Numeric ↔ String: Numbers match their string representations (e.g., 123 matches "123")
  • Boolean ↔ String: TRUE/FALSE match "TRUE"/"FALSE" (case-insensitive)
  • Empty/Missing ↔ Numeric: Blank cells match 0 in lookups
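
The four rules listed above can be sketched as a single equality check. This is a minimal illustration in Python, not the C# added to LookupUtils.cs: the names `coerced_equal`, `_kind`, and `_parse_number` are hypothetical, `None` stands in for a blank cell, and same-type values fall back to plain equality because the list does not cover them. It models the comparison this PR proposes and makes no claim about what Excel itself does.

```python
import math


def _kind(value):
    """Classify a cell value as blank, bool, number, or string."""
    if value is None:
        return "blank"
    if isinstance(value, bool):  # must precede the int check: bool subclasses int
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported cell value: {value!r}")


def _parse_number(text):
    """Return text as a finite float, or None if it is not numeric."""
    try:
        number = float(text.strip())
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def coerced_equal(lookup, key):
    """Compare two values using the cross-type rules proposed in this PR."""
    kind_a, kind_b = _kind(lookup), _kind(key)
    if kind_a == kind_b:
        return lookup == key  # same-type case: plain equality (assumption)

    by_kind = {kind_a: lookup, kind_b: key}  # lets each rule be written once
    kinds = set(by_kind)

    if kinds == {"number", "bool"}:
        return float(by_kind["number"]) == (1.0 if by_kind["bool"] else 0.0)
    if kinds == {"number", "string"}:
        parsed = _parse_number(by_kind["string"])
        return parsed is not None and parsed == float(by_kind["number"])
    if kinds == {"bool", "string"}:
        expected = "TRUE" if by_kind["bool"] else "FALSE"
        return by_kind["string"].strip().upper() == expected
    if kinds == {"blank", "number"}:
        return float(by_kind["number"]) == 0.0
    return False  # any other cross-type pair is a mismatch
```

With this sketch, `coerced_equal(123, "123")` and `coerced_equal("true", True)` both return `True`, while `coerced_equal("1", True)` returns `False` because the list pairs booleans only with the strings "TRUE" and "FALSE".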

Changes

  • LookupUtils.cs: Added CompareWithTypeCoercion method used by exact-match lookups, and ResolveValueForBinarySearch for sorted-range approximate-match lookups
  • TestVlookupTypeCoercion.cs: 302 lines of tests covering all type coercion scenarios

Context

This was discovered while replacing Excel COM Interop with NPOI's formula evaluator in a production application. The rater workbooks use patterns like IF(UseSar=TRUE, "", VLOOKUP(...)) where UseSar is a boolean cell — strict comparison caused the IF to take the wrong branch.

This builds on #1720 (CSE array boolean arithmetic fix) which was merged previously.

Excel coerces types during lookup comparisons — string "10" matches
numeric key 10, and vice versa. NPOI previously returned TypeMismatch
for all cross-type comparisons, causing #N/A errors where Excel succeeds.

Add TryCoercedCompare() virtual method to LookupValueComparerBase with
overrides in StringLookupComparer and NumberLookupComparer that attempt
numeric parsing before falling back to TypeMismatch.

Includes 7 new tests covering exact match, approximate match, decimal
values, non-numeric strings, same-type comparisons, and HLOOKUP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ken-swyfft (Contributor, Author) commented

Closing to deal with some additional issues.

ken-swyfft closed this Mar 10, 2026

ken-swyfft (Contributor, Author) commented

@tonyqus - I think this can stay closed. If you're wondering what was going on: I was working with Claude to fix some issues in how our solution uses NPOI to read some complex Excel files. Claude convinced itself that it needed this change to make our spreadsheets work. It was fairly convincing, and the change did in fact fix our tests, so I had it create this draft PR. But on looking into it further, it became clear that this is not in fact how Excel handles VLOOKUP coercions, so this change would make NPOI behave differently from Excel. (The complication is that Excel COM can indeed do coercions when you tell it to write a numeric value, but that's probably better handled in calling code than in the library.) So this PR needs to be closed and not merged.
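
One way to read "handled in calling code" is to settle each input's type before it is written to a cell, so that lookup values and table keys already share a type and no coercion is needed during the lookup. The sketch below is a minimal Python illustration under that assumption. The helper name `normalize_cell_input` is hypothetical, it is not part of NPOI or of this PR, and the parsing rules are deliberately simple (no dates, percentages, or locale-specific number formats).

```python
import math


def normalize_cell_input(raw):
    """Choose a typed value (bool, float, or str) for a raw input."""
    # Values that are already typed, or blank, pass through unchanged.
    if raw is None or isinstance(raw, (bool, int, float)):
        return raw

    text = raw.strip()

    # Text spelling a boolean becomes a real boolean.
    upper = text.upper()
    if upper in ("TRUE", "FALSE"):
        return upper == "TRUE"

    # Text that parses as a finite number becomes a number.
    try:
        number = float(text)
    except ValueError:
        return raw  # not numeric: keep the original string
    return number if math.isfinite(number) else raw
```

The calling code would then pass the normalized value to whichever typed cell setter the spreadsheet library offers, instead of always writing a string.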

tonyqus (Member) commented Mar 11, 2026

Thank you for clarifying your use case.

NPOI's embedded formula evaluation engine is a simulation of Excel's formula calculation. It is hard to guarantee that all behavior is identical, and some functions may not be implemented in NPOI yet. However, NPOI can be much faster than Excel COM, because Excel COM is essentially automation of the Microsoft Excel UI running in the background.

Since I'm also from the insurance industry, I understand the complexity of your spreadsheets. This is a good chance to test whether NPOI can meet all your production needs.
