Discussion: Custom delimiters for unflatten #454

Open
jpmckinney opened this issue Aug 6, 2024 · 0 comments

I notice that some CSVs uploaded to the OCDS Data Review Tool use semicolons as the delimiter instead of commas.

With commas:

ocid,id,date,tag,initiationType,tender/id
ocds-1234567-abc,ocds-1234567-abc-1,2000-01-02T00:00:00Z,tender,tender,abc

With semicolons:

ocid;id;date;tag;initiationType;tender/id
ocds-1234567-abc;ocds-1234567-abc-1;2000-01-02T00:00:00Z;tender;tender;abc

Some possible behaviors:

  1. Leave as is. With the above example, the header is read in as a single field, "ocid;id;date;tag;initiationType;tender", which shows up under additional fields.
  2. Allow a dialect to be passed in. This defers all responsibility to the calling code.
  3. Add a sniff boolean argument. If enabled, flatten-tool sniffs the dialect. The sample size and/or possible delimiters could also be passed in.
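
To make (2) and (3) concrete, here is a minimal sketch using only Python's standard csv module. It is not flatten-tool code: read_rows, its parameters and the SemicolonDialect class are hypothetical names used for illustration, and how unflatten would actually accept a dialect is still open.

```python
import csv
import io


class SemicolonDialect(csv.excel):
    """Option 2: a dialect defined by the calling code (hypothetical name)."""
    delimiter = ";"


def read_rows(text, dialect="excel", sniff=False, sample_size=1024, delimiters=",;"):
    """Hypothetical helper, not part of flatten-tool.

    dialect: option 2, the caller passes a csv dialect (name, class or instance).
    sniff:   option 3, guess the dialect from the first sample_size characters,
             considering only the characters in delimiters.
    """
    if sniff:
        # csv.Sniffer.sniff raises csv.Error if it cannot determine a delimiter.
        dialect = csv.Sniffer().sniff(text[:sample_size], delimiters=delimiters)
    return list(csv.reader(io.StringIO(text), dialect=dialect))


SEMICOLON_CSV = (
    "ocid;id;date;tag;initiationType;tender/id\n"
    "ocds-1234567-abc;ocds-1234567-abc-1;2000-01-02T00:00:00Z;tender;tender;abc\n"
)

as_is = read_rows(SEMICOLON_CSV)                                  # option 1
with_dialect = read_rows(SEMICOLON_CSV, dialect=SemicolonDialect)  # option 2
with_sniff = read_rows(SEMICOLON_CSV, sniff=True)                  # option 3
```

With the default dialect, each row comes back as one field. With either the explicit dialect or sniffing, the header splits into six fields. Sniffing is a heuristic and can fail or guess wrong on short or unusual samples, which is one reason to leave the choice to the calling code.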

For CoVEs, flatten-tool's unflatten is called within lib-cove's convert_spreadsheet, which is called by a CoVE's view. The flattentool_options are derived from the arguments to convert_spreadsheet, except for paths, encoding (utf-8-sig, cp1252, latin_1), metatab_vertical_orientation (True) and convert_titles (True). So, any new arguments added to unflatten will also need to be added to convert_spreadsheet.

I think (2) is best, as it gives the most flexibility to the calling code.

@jpmckinney jpmckinney changed the title Discussion: Custom delimiters for unflatten? Discussion: Custom delimiters for unflatten Aug 6, 2024