Add JSONL support to dolt table {import,export}#10577
Conversation
There was a problem hiding this comment.
Pull request overview
Adds JSONL export support to dolt table export by introducing a new jsonl data format that reuses existing JSON row serialization, along with unit + integration test coverage and updated CLI help text.
Changes:
- Add
.jsonl/--file-type jsonlsupport tomvdataexport writer selection. - Add Go unit tests verifying newline-delimited JSON output using the existing JSON writer with custom header/separator/footer.
- Add Bats integration tests covering JSONL export (including JSON / binary / quoting edge cases) and update CLI help text.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| integration-tests/bats/export-tables.bats | Adds JSONL export assertions and edge-case coverage via Python JSON parsing. |
| go/libraries/doltcore/table/typed/json/writer_jsonl_test.go | New unit tests validating JSONL output behavior via NewJSONWriterWithHeader. |
| go/libraries/doltcore/mvdata/file_data_loc.go | Adds jsonl parsing + writer creation path for export. |
| go/libraries/doltcore/mvdata/data_loc_test.go | Extends data location tests to include .jsonl / jsonl. |
| go/libraries/doltcore/mvdata/data_loc.go | Introduces JsonlFile data format and wiring for extension inference. |
| go/cmd/dolt/commands/tblcmds/export.go | Updates dolt table export help text to mention JSONL and supported formats. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case "jsonl", ".jsonl": | ||
| return JsonlFile |
There was a problem hiding this comment.
DFFromString now treats jsonl as a valid DataFormat, but FileDataLocation.NewReader does not handle JsonlFile. This means dolt table import --file-type jsonl ... will pass the file-type validation (since DFFromString no longer returns InvalidDataFormat) and then fail later with a generic "unsupported format" error. Either add JSONL import support, or keep jsonl out of import validation (e.g., validate allowed file types per-command) / return a clear error when JsonlFile is used for reading.
| case "jsonl", ".jsonl": | |
| return JsonlFile |
|
|
||
| {{.EmphasisLeft}}.json{{.EmphasisRight}} exports a single JSON object containing a {{.EmphasisLeft}}rows{{.EmphasisRight}} array; {{.EmphasisLeft}}.jsonl{{.EmphasisRight}} exports one JSON object per line. | ||
|
|
||
| See the help for {{.EmphasisLeft}}dolt table import{{.EmphasisRight}} as the options are the same. |
There was a problem hiding this comment.
The help text says "See the help for dolt table import as the options are the same", but export now supports jsonl (and also lists sql / parquet) while table import help and validation do not list / support these formats. This is likely to confuse users; consider either updating the wording to avoid claiming parity with import, or explicitly calling out which --file-type values are valid for export vs import (and that stdout export is limited to csv/psv).
| See the help for {{.EmphasisLeft}}dolt table import{{.EmphasisRight}} as the options are the same. | |
| When writing to stdout (omitting {{.LessThan}}|file{{.GreaterThan}} or using {{.EmphasisLeft}}-{{.EmphasisRight}}), only {{.EmphasisLeft}}csv{{.EmphasisRight}} and {{.EmphasisLeft}}psv{{.EmphasisRight}} are supported. | |
| For information on importing data, see the help for {{.EmphasisLeft}}dolt table import{{.EmphasisRight}}; note that its supported file types and options differ from export. |
|
@coffeegoddd DOLT
|
dolt table exportdolt table {import,export}
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var err error | ||
| r.sampleRow, err = r.ReadSqlRow(context.Background()) | ||
| return err == nil, nil |
There was a problem hiding this comment.
VerifySchema drops any error returned by ReadSqlRow (it returns nil error even when ReadSqlRow fails). This can mask malformed JSON / conversion errors during schema verification and cause later failures with less actionable context. Return the actual err (and only keep sampleRow when err == nil) so callers see the correct failure reason.
| var err error | |
| r.sampleRow, err = r.ReadSqlRow(context.Background()) | |
| return err == nil, nil | |
| sampleRow, err := r.ReadSqlRow(context.Background()) | |
| if err != nil { | |
| return false, err | |
| } | |
| r.sampleRow = sampleRow |
| } | ||
|
|
||
| func (r *JSONLReader) ReadRow(ctx context.Context) (row.Row, error) { | ||
| panic("deprecated") |
There was a problem hiding this comment.
This method panics at runtime. Since JSONLReader is declared to implement table.SqlTableReader, any code path that still calls ReadRow (even indirectly) will crash the process. Prefer returning a non-panicking error (or implement the method properly) to avoid a hard failure.
| panic("deprecated") | |
| return nil, fmt.Errorf("ReadRow is deprecated for JSONLReader; use ReadSqlRow instead") |
|
|
||
| func resolveJSONSchema(dEnv *env.DoltEnv, root doltdb.RootValue, opts interface{}) (schema.Schema, error) { | ||
| var sch schema.Schema | ||
| jsonOpts, _ := opts.(JSONOptions) |
There was a problem hiding this comment.
resolveJSONSchema silently ignores a failed type assertion (jsonOpts, _ := opts.(JSONOptions)). If a non-JSONOptions value is ever passed, this will fall through with zero-values (e.g., empty TableName) and produce misleading downstream errors. Check the ok result and return a clear error when opts is non-nil but not JSONOptions.
| jsonOpts, _ := opts.(JSONOptions) | |
| jsonOpts, ok := opts.(JSONOptions) | |
| if opts != nil && !ok { | |
| return nil, fmt.Errorf("invalid JSON options type: %T", opts) | |
| } |
| if r.closer != nil { | ||
| err := r.closer.Close() | ||
| r.closer = nil | ||
| return err | ||
| } | ||
| return errors.New("already closed") |
There was a problem hiding this comment.
Returning an error on a second Close() makes this reader non-idempotent, which can be surprising in Go (where Close is commonly safe to call multiple times, especially with layered defers). Consider returning nil when already closed to make cleanup more robust.
| if r.closer != nil { | |
| err := r.closer.Close() | |
| r.closer = nil | |
| return err | |
| } | |
| return errors.New("already closed") | |
| if r.closer == nil { | |
| return nil | |
| } | |
| err := r.closer.Close() | |
| r.closer = nil | |
| return err |
|
@coffeegoddd DOLT
|
|
fixes #10573 |
Co-authored-by: James Cor <jcor@ucsd.edu>
Co-authored-by: James Cor <jcor@ucsd.edu>
|
@coffeegoddd DOLT
|
|
@coffeegoddd DOLT
|
Adds JSONL (.jsonl / --file-type jsonl) support for both dolt table export (newline-delimited rows reusing existing JSON serialization) and dolt table import (streaming JSONL reader), with updated CLI help plus unit and Bats integration tests.