Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(csv): support fieldsPerRecord in CsvParseStream #5600

Merged

Conversation

magurotuna
Copy link
Member

Although the constructor of CsvParseStream accepts fieldsPerRecord option (see https://jsr.io/@std/[email protected]/doc/~/CsvParseStreamOptions) that ensures that every record has the specified (or inferred from the first row) number of fields, this option has no effect at all in the current implementation.
To fix this issue, this patch implements the fieldsPerRecord logic in CsvParseStream together with sufficient amount of test cases.

@magurotuna magurotuna requested a review from kt3k as a code owner August 1, 2024 10:28
@github-actions github-actions bot added the csv label Aug 1, 2024
Comment on lines -344 to +411
if ("skipFirstRow" in testCase) {
options.skipFirstRow = testCase.skipFirstRow;
}
if ("columns" in testCase) {
options.columns = testCase.columns;
}
if ("trimLeadingSpace" in testCase) {
options.trimLeadingSpace = testCase.trimLeadingSpace;
}
if ("lazyQuotes" in testCase) {
options.lazyQuotes = testCase.lazyQuotes;
}
if ("fieldsPerRecord" in testCase) {
options.fieldsPerRecord = testCase.fieldsPerRecord;
}
if ("skipFirstRow" in testCase) {
options.skipFirstRow = testCase.skipFirstRow;
}
if ("columns" in testCase) {
options.columns = testCase.columns;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds fieldsPerRecord, and also reorders skipFirstRow and columns to match the order of properties declared in CsvParseStreamOptions so that it will be easy for us to notice the missing field in the future

Copy link

codecov bot commented Aug 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.38%. Comparing base (577fd9a) to head (97e2a30).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5600   +/-   ##
=======================================
  Coverage   96.37%   96.38%           
=======================================
  Files         466      466           
  Lines       37526    37559   +33     
  Branches     5528     5538   +10     
=======================================
+ Hits        36167    36201   +34     
+ Misses       1317     1316    -1     
  Partials       42       42           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines +357 to +379
error: {
klass: Error,
msg:
"Error number of fields line: 1\nNumber of fields found: 3\nExpected number of fields: 2",
},
},
{
name: "bad quote in bare field",
input: `a "word",1,2,3`,
errorMessage: "Error line: 1\nBad quoting",
error: {
klass: SyntaxError,
msg:
'record on line 1; parse error on line 0, column 2: bare " in non-quoted-field',
},
},
{
name: "bad quote in quoted field",
input: `"wo"rd",1,2,3`,
errorMessage: "Error line: 1\nBad quoting",
error: {
klass: SyntaxError,
msg:
'record on line 1; parse error on line 0, column 3: extraneous or missing " in quoted-field',
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous test the expected error messages were not asserted. I fixed this and now the error class and message are checked - this reveals that these actual error messages feel a little bit unnatural, in the sense that the first sentence says "line 1" but the second says "parse error on line 0". I think this is inconsistent indexing.
This is not really related to what I'd like to cover in this PR, but maybe we'd like to address it before v1 too?

Copy link
Member

@kt3k kt3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch and nice fix! Thanks!

@magurotuna magurotuna merged commit b85d219 into denoland:main Aug 1, 2024
13 checks passed
@magurotuna magurotuna deleted the magurotuna/csv-stream-fieldsPerRecord branch August 1, 2024 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants