{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":830561706,"defaultBranch":"main","name":"datafusion","ownerLogin":"connec","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2024-07-18T14:04:13.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/160652?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1721945783.0","currentOid":""},"activityList":{"items":[{"before":"0a509a2485e1ec058d05316c48b09df81ba1bdee","after":null,"ref":"refs/heads/csv-exec-builder","pushedAt":"2024-07-25T22:16:23.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"}},{"before":"99d158de22caba9c5f290297188d05fec24febec","after":"0a509a2485e1ec058d05316c48b09df81ba1bdee","ref":"refs/heads/csv-exec-builder","pushedAt":"2024-07-24T19:47:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"alamb","name":"Andrew Lamb","path":"/alamb","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/490673?s=80&v=4"},"commit":{"message":"fmt","shortMessageHtmlLink":"fmt"}},{"before":"debfd19e1c2d5bf0afb51c94879981478bf01733","after":"99d158de22caba9c5f290297188d05fec24febec","ref":"refs/heads/csv-exec-builder","pushedAt":"2024-07-24T19:36:32.000Z","pushType":"push","commitsCount":10,"pusher":{"login":"alamb","name":"Andrew Lamb","path":"/alamb","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/490673?s=80&v=4"},"commit":{"message":"Add test that CSVExec options are the same","shortMessageHtmlLink":"Add test that CSVExec options are the same"}},{"before":null,"after":"debfd19e1c2d5bf0afb51c94879981478bf01733","ref":"refs/heads/csv-exec-builder","pushedAt":"2024-07-24T12:37:25.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"chore: replace usage of deprecated `CsvExec::new` with `CsvExec::builder`","shortMessageHtmlLink":"chore: replace usage of deprecated CsvExec::new with `CsvExec::buil…"}},{"before":"35198b6ce912743768f81f361a6f743e386ab306","after":null,"ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-22T07:23:02.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"}},{"before":"4d06432e54426e96b218365234a11adb2e42d8fc","after":"35198b6ce912743768f81f361a6f743e386ab306","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-20T19:19:46.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"fix: always checkout `newlines_in_values.csv` with `LF` line endings\n\nThe default git behaviour of converting line endings for checked out files causes the `csv_files.slt` test to fail when testing `newlines_in_values`. This appears to be due to the quoted newlines being converted to CRLF, which are not then normalised when the CSV is read. Assuming that the sqllogictests do normalise line endings in the expected output, this could then lead to a \"spurious\" diff from the actual output.","shortMessageHtmlLink":"fix: always checkout newlines_in_values.csv with LF line endings"}},{"before":"b9cc96b5036e027619b37562703494aa830d7afc","after":"4d06432e54426e96b218365234a11adb2e42d8fc","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-20T17:23:29.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"Merge remote-tracking branch 'origin/main' into csv-newlines-in-values","shortMessageHtmlLink":"Merge remote-tracking branch 'origin/main' into csv-newlines-in-values"}},{"before":"356f46bd807d0b8bc67a539290fd86d46a6e20c0","after":"b9cc96b5036e027619b37562703494aa830d7afc","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-20T17:21:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"fix: always checkout `*.slt` with LF line endings\n\nThis is a bit of a stab in the dark, but it might fix multiline tests on\nWindows.","shortMessageHtmlLink":"fix: always checkout *.slt with LF line endings"}},{"before":"ed0075d416f6f1890dc529ed37718bc1ef7df033","after":"356f46bd807d0b8bc67a539290fd86d46a6e20c0","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-20T10:32:19.000Z","pushType":"push","commitsCount":15,"pusher":{"login":"alamb","name":"Andrew Lamb","path":"/alamb","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/490673?s=80&v=4"},"commit":{"message":"Merge remote-tracking branch 'apache/main' into csv-newlines-in-values","shortMessageHtmlLink":"Merge remote-tracking branch 'apache/main' into csv-newlines-in-values"}},{"before":"8c2d98d0345a9cfac8cd27fb8d8d5fe7315c099a","after":"ed0075d416f6f1890dc529ed37718bc1ef7df033","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-19T22:06:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"chore: suppress lint on too many arguments for `CsvExec::new`","shortMessageHtmlLink":"chore: suppress lint on too many arguments for CsvExec::new"}},{"before":"34dcdb0199125924cd2aeca7779b5d08d06ad357","after":"8c2d98d0345a9cfac8cd27fb8d8d5fe7315c099a","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-19T21:55:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"fix: typo in config.md","shortMessageHtmlLink":"fix: typo in config.md"}},{"before":"9ca9065bf2f943d4f028883799592e9bac1bc907","after":"34dcdb0199125924cd2aeca7779b5d08d06ad357","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-19T11:39:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"docs: document `datafusion.catalog.newlines_in_values`","shortMessageHtmlLink":"docs: document datafusion.catalog.newlines_in_values"}},{"before":"5321e25df65fb776bbcfb0b1dbc1e31ea2ec36d3","after":"9ca9065bf2f943d4f028883799592e9bac1bc907","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-19T11:31:45.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"test: add/fix sqllogictests for `newlines_in_values`","shortMessageHtmlLink":"test: add/fix sqllogictests for newlines_in_values"}},{"before":null,"after":"5321e25df65fb776bbcfb0b1dbc1e31ea2ec36d3","ref":"refs/heads/csv-newlines-in-values","pushedAt":"2024-07-18T14:04:27.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"connec","name":"Chris Connelly","path":"/connec","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/160652?s=80&v=4"},"commit":{"message":"feat!: support `newlines_in_values` CSV option\n\nThis significantly simplifies the UX when dealing with large CSV files\nthat must support newlines in (quoted) values. By default, large CSV\nfiles will be repartitioned into multiple parallel range scans. This is\ngreat for performance in the common case but when large CSVs contain\nnewlines in values the parallel scan will fail due to splitting on\nnewlines within quotes rather than actual line terminators.\n\nWith the current implementation, this behaviour can be controlled by the\nsession-level `datafusion.optimizer.repartition_file_scans` and\n`datafusion.optimizer.repartition_file_min_size` settings.\n\nThis commit introduces a `newlines_in_values` option to `CsvOptions` and\nplumbs it through to `CsvExec`, which includes it in the test for whether\nparallel execution is supported. This provides a convenient and\nsearchable way to disable file scan repartitioning on a per-CSV basis.\n\nBREAKING CHANGE: This adds new public fields to types with all public\nfields, which is a breaking change.","shortMessageHtmlLink":"feat!: support newlines_in_values CSV option"}}],"hasNextPage":false,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wNy0yNVQyMjoxNjoyMy4wMDAwMDBazwAAAASJewxr","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wNy0yNVQyMjoxNjoyMy4wMDAwMDBazwAAAASJewxr","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wNy0xOFQxNDowNDoyNy4wMDAwMDBazwAAAASC_W9z"}},"title":"Activity · connec/datafusion"}