Issue with null rows on fetch #96

Closed
achang-s opened this issue Jan 14, 2021 · 4 comments

Comments


achang-s commented Jan 14, 2021

Hello,

I encountered an issue when processing an Excel file with null rows (not empty rows) between valid rows. While NPOI knows the proper size of the sheet's row collection, the loop condition `while (i <= MaxRowNumber && (row = sheet.GetRow(i)) != null)` on line 420 of `Fetch(ISheet sheet, Type type, Func<string, object, object> valueParser = null)` stops iterating as soon as it hits a null row, even if there are more valid rows later in the sheet.

This is because the loop index walks the row keys rather than the collection's physical index. See the image for reference: GetRow looks up key 19, which is null, so the loop never reaches index 19 of the collection, whose key is 22.
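The behaviour described above can be reproduced with a small model. This is only a sketch in Python, not NPOI code: the sheet is represented as a plain dictionary from row key to row data, and the names `sheet`, `fetch_stop_at_missing`, and `fetch_skip_missing` are hypothetical. It assumes a missing key behaves like `GetRow(i)` returning null.

```python
# Hypothetical model of a sheet: a sparse mapping from row key to row data.
# Keys 0..4 and 7 have rows; keys 5 and 6 have no row object at all.
sheet = {0: "header", 1: "a", 2: "b", 3: "c", 4: "d", 7: "e"}
max_row_number = max(sheet)  # key of the last row, here 7


def fetch_stop_at_missing(sheet, max_row_number):
    """Mimic the reported loop: stop at the first key that has no row."""
    rows = []
    i = 0
    while i <= max_row_number:
        row = sheet.get(i)  # None stands in for a null row
        if row is None:
            break  # rows after the gap are never reached
        rows.append(row)
        i += 1
    return rows


def fetch_skip_missing(sheet, max_row_number):
    """Visit every key up to the last one and skip keys that have no row."""
    rows = []
    for i in range(max_row_number + 1):
        row = sheet.get(i)
        if row is not None:
            rows.append(row)
    return rows


truncated = fetch_stop_at_missing(sheet, max_row_number)
complete = fetch_skip_missing(sheet, max_row_number)
```

In this model `truncated` ends at the row with key 4, while `complete` also contains the row with key 7.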

As a stopgap, in my application I preprocess the file by replacing these null rows with empty rows. Thanks.

Owner

mganss commented Jan 14, 2021

Can you attach a sample Excel file?

Author

achang-s commented Jan 14, 2021

null_test.xlsx

Here is a sample file. Row 6 (collection index 5) appears blank but is a null row; the key that GetRow(i) uses is actually 7.

Also, while creating this test file, another potential issue came up. I believe I did something like deleting rows after the valid rows, and this generated other entries with a high value. In my own workaround for pre-processing the file, I ended up using the sheet's GetRowEnumerator() to traverse the collection instead of the row key.

mganss closed this as completed in 848ebbd Jan 15, 2021
Owner

mganss commented Jan 15, 2021

Thanks. It's fixed in 5.1.253. Regarding the other issue, could you open a new issue and provide steps to repro?

Author

achang-s commented Jan 15, 2021

Thanks @mganss. It seems the other issue I mentioned is no longer a problem now that the iterator is used, since it only iterates over the rows that are physically present. It was only problematic before because my previous preprocessing method also naively used the row key as a loop index. I'm not sure why, but in LibreOffice Calc, deleting rows in a certain way (right-clicking rows that sit between other valid rows, choosing Delete, and selecting "Delete entire row") would set the row numbers of the later rows to over 1,000,000, at least when viewed in NPOI's collection. This can be seen by inspecting the attached file in the debugger.
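To illustrate why enumerating the physically present rows sidesteps the very large row numbers, here is a minimal Python sketch using the same kind of dictionary model. It is not NPOI code; the row keys and the names `rows_by_key` and `rows_by_enumeration` are made up for the example.

```python
# Hypothetical sparse sheet: three ordinary rows, plus two rows whose keys
# ended up above 1,000,000 (the key values here are invented).
sheet = {0: "header", 1: "a", 2: "b", 1_048_570: "c", 1_048_571: "d"}


def rows_by_key(sheet):
    """Probe every key from 0 to the last one and count the lookups made."""
    rows = []
    lookups = 0
    for i in range(max(sheet) + 1):
        lookups += 1
        if i in sheet:
            rows.append(sheet[i])
    return rows, lookups


def rows_by_enumeration(sheet):
    """Visit only the rows that actually exist, in key order."""
    return [sheet[key] for key in sorted(sheet)]


keyed_rows, lookups = rows_by_key(sheet)
enumerated_rows = rows_by_enumeration(sheet)
```

Both functions return the same five rows, but `rows_by_key` performs one lookup per key up to the highest row number, while `rows_by_enumeration` touches only the five rows that exist.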
