feat(csv): support `fieldsPerRecord` in `CsvParseStream` #5600

magurotuna · 2024-08-01T10:28:38Z

The CsvParseStream constructor accepts a fieldsPerRecord option (see https://jsr.io/@std/[email protected]/doc/~/CsvParseStreamOptions), which is meant to ensure that every record has the specified number of fields (or the number inferred from the first row). In the current implementation, however, this option has no effect at all.
To fix this, this patch implements the fieldsPerRecord logic in CsvParseStream, together with test cases that cover it.
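
For readers unfamiliar with the option, here is a minimal sketch of the kind of check described above. It is not the code from this patch: `makeFieldCountChecker` is a hypothetical helper, and the exact semantics are an assumption (a positive value is the exact field count, `0` means the count is inferred from the first record, and a negative or missing value disables the check). The error text follows the format that appears in the test expectations of this PR.

```typescript
// Hypothetical helper, NOT the implementation in this PR.
// Assumed semantics: positive = exact count, 0 = infer from the first record,
// negative or undefined = no check.
function makeFieldCountChecker(fieldsPerRecord?: number) {
  // Expected count; undefined until it is known.
  let expected: number | undefined =
    fieldsPerRecord !== undefined && fieldsPerRecord > 0
      ? fieldsPerRecord
      : undefined;

  return (record: string[], lineIndex: number): void => {
    // Checking is disabled: accept records of any length.
    if (fieldsPerRecord === undefined || fieldsPerRecord < 0) return;

    if (expected === undefined) {
      // fieldsPerRecord === 0: the first record seen fixes the expected count.
      expected = record.length;
      return;
    }

    if (record.length !== expected) {
      throw new Error(
        `Error number of fields line: ${lineIndex}\n` +
          `Number of fields found: ${record.length}\n` +
          `Expected number of fields: ${expected}`,
      );
    }
  };
}

// Usage: infer the count from the first record, then validate later ones.
const check = makeFieldCountChecker(0);
check(["a", "b"], 0); // fixes the expected count at 2
```

In a streaming parser a check like this would run once per parsed record, before the record is enqueued downstream, so a malformed row fails the stream at the point where it is read.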

magurotuna · 2024-08-01T10:30:58Z

csv/parse_stream_test.ts

-        if ("skipFirstRow" in testCase) {
-          options.skipFirstRow = testCase.skipFirstRow;
-        }
-        if ("columns" in testCase) {
-          options.columns = testCase.columns;
-        }
        if ("trimLeadingSpace" in testCase) {
          options.trimLeadingSpace = testCase.trimLeadingSpace;
        }
        if ("lazyQuotes" in testCase) {
          options.lazyQuotes = testCase.lazyQuotes;
        }
+        if ("fieldsPerRecord" in testCase) {
+          options.fieldsPerRecord = testCase.fieldsPerRecord;
+        }
+        if ("skipFirstRow" in testCase) {
+          options.skipFirstRow = testCase.skipFirstRow;
+        }
+        if ("columns" in testCase) {
+          options.columns = testCase.columns;
+        }


This adds fieldsPerRecord and also reorders skipFirstRow and columns to match the order of the properties declared in CsvParseStreamOptions, so that a missing field will be easy to notice in the future.

codecov · 2024-08-01T10:32:04Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.38%. Comparing base (577fd9a) to head (97e2a30).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #5600   +/-   ##
=======================================
  Coverage   96.37%   96.38%           
=======================================
  Files         466      466           
  Lines       37526    37559   +33     
  Branches     5528     5538   +10     
=======================================
+ Hits        36167    36201   +34     
+ Misses       1317     1316    -1     
  Partials       42       42


magurotuna · 2024-08-01T10:36:39Z

csv/parse_stream_test.ts

+        error: {
+          klass: Error,
+          msg:
+            "Error number of fields line: 1\nNumber of fields found: 3\nExpected number of fields: 2",
+        },
      },
      {
        name: "bad quote in bare field",
        input: `a "word",1,2,3`,
-        errorMessage: "Error line: 1\nBad quoting",
+        error: {
+          klass: SyntaxError,
+          msg:
+            'record on line 1; parse error on line 0, column 2: bare " in non-quoted-field',
+        },
      },
      {
        name: "bad quote in quoted field",
        input: `"wo"rd",1,2,3`,
-        errorMessage: "Error line: 1\nBad quoting",
+        error: {
+          klass: SyntaxError,
+          msg:
+            'record on line 1; parse error on line 0, column 3: extraneous or missing " in quoted-field',
+        },


The previous test did not assert the expected error messages. I fixed this, so the error class and message are now both checked. This reveals that the actual error messages read a little unnaturally: the first part says "line 1" but the second says "parse error on line 0", which looks like inconsistent indexing.
This is not really related to what I'd like to cover in this PR, but maybe we'd like to address it before v1 too?
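
To make the indexing concern concrete, here is a small sketch of a formatter that renders both line numbers with the same base. It is only an illustration, not the library's code: `formatParseError` and `ParseErrorPosition` are hypothetical names, and it assumes the parser tracks both positions as 0-based indices and that 1-based output is the desired convention. The column is passed through unchanged, since its base is not discussed here.

```typescript
// Hypothetical formatter, NOT the library's implementation.
interface ParseErrorPosition {
  startLine: number; // assumed 0-based line where the record starts
  line: number; // assumed 0-based line where the error was detected
  column: number; // column index, passed through unchanged
}

function formatParseError(pos: ParseErrorPosition, reason: string): string {
  // Convert both line indices with the same rule so they cannot disagree.
  const toOneBased = (n: number): number => n + 1;
  return `record on line ${toOneBased(pos.startLine)}; ` +
    `parse error on line ${toOneBased(pos.line)}, ` +
    `column ${pos.column}: ${reason}`;
}

// Same input position for both numbers yields the same printed line.
const message = formatParseError(
  { startLine: 0, line: 0, column: 2 },
  'bare " in non-quoted-field',
);
```

Routing every line number through one conversion function is the point of the sketch: the two halves of the message can no longer drift apart the way "line 1" and "line 0" do in the messages quoted above.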

kt3k

Oh, good catch and nice fix! Thanks!

support fieldsPerRecord in CsvParseStream

97e2a30

magurotuna requested a review from kt3k as a code owner August 1, 2024 10:28

github-actions bot added the csv label Aug 1, 2024

kt3k approved these changes Aug 1, 2024

magurotuna merged commit b85d219 into denoland:main Aug 1, 2024
13 checks passed

magurotuna deleted the magurotuna/csv-stream-fieldsPerRecord branch August 1, 2024 10:48
