Normalize (or rename) headers #396
Comments
A non-csvkit solution would be to first pipe the CSV file through a little script I wrote called header:
cat manual.csv | header -r "some,sane,column,names" | csvsql --query "..."
Hope this helps in the meantime.
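For anyone who cannot install an extra script, the same replace-the-first-row idea can be sketched in Python with only the standard library. The function name replace_header is hypothetical (it is not part of csvkit or dsutils), and the sketch assumes the input already has a header row that should be discarded.

```python
import csv
import io


def replace_header(src, dst, names):
    """Copy CSV rows from src to dst, swapping the first row for `names`."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)        # drop the original header row, if there is one
    writer.writerow(names)    # write the replacement header
    writer.writerows(reader)  # stream the remaining rows through unchanged


if __name__ == "__main__":
    source = io.StringIO("Bad Name,Other (TM)\n1,2\n")
    target = io.StringIO()
    replace_header(source, target, ["some", "sane"])
    print(target.getvalue(), end="")
```

Passing sys.stdin and sys.stdout instead of the StringIO objects would let it sit in a pipeline in front of csvsql.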
Other options if you don't have access to additional utilities:
$ { head -n 1 input.csv | sed -e 's/bad/good/g' ; tail -n +2 input.csv ; } | csvsql
or
I would also like this feature -- specifically for making CSV header rows lowercase when appropriate.
@pudo Want to take a stab at a PR?
I'd be in favor of finding a way to make this part of agate, either as a flag to the table constructor or as a
@smari, in case you want a solution now [edit: I should have paid attention to when you actually commented here], here's a way to lowercase the column names using
PS. I'm not saying anything about whether or not this should be a feature of either
Copying over this comment from #525... In case it helps anyone else, you can lowercase all column names in a Postgres database after you import using this magical incantation from Stack Overflow:
\t on
select 'ALTER TABLE '||'"'||table_name||'"'||' RENAME COLUMN '||'"'||column_name||'"'||' TO ' || lower(column_name)||';'
from information_schema.columns
where table_schema = 'public' and lower(column_name) != column_name
\g /tmp/go_to_lower
\i /tmp/go_to_lower
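To make clearer what that query generates, here is a rough Python sketch that builds the same kind of ALTER TABLE statements from a list of column names. The names quote_ident and lowercase_rename_statements are hypothetical. Unlike the query above, the sketch also quotes the new name, an assumption made so that names containing spaces still produce valid SQL.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def lowercase_rename_statements(table, columns):
    """Return one ALTER TABLE statement per column that is not already lowercase."""
    statements = []
    for name in columns:
        lowered = name.lower()
        if lowered != name:  # same filter as the WHERE clause in the query
            statements.append(
                f"ALTER TABLE {quote_ident(table)} "
                f"RENAME COLUMN {quote_ident(name)} TO {quote_ident(lowered)};"
            )
    return statements


if __name__ == "__main__":
    for statement in lowercase_rename_statements("accounts", ["Id", "name", "Created At"]):
        print(statement)
```

It only builds strings; running them against a database is left to psql or a driver of your choice.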
I'm very much in favor of this feature, but I'm not sure where it should live since it affects so many things. Maybe it's part of
Waiting on wireservice/agate#660 in case agate API changes.
May be able to be inspired from master...culebron:master
With master...culebron:master as a guide, this is my implementation: https://github.com/smnorris/bin/blob/master/shampoo
Another way to lowercase header:
Related: wireservice/agate#668
@jeroenjanssens, I found it in your dsutils repo: https://github.com/jeroenjanssens/dsutils/header
I have a list of account IDs in a bash variable that I want to write to CSV:
Now I can use your header command to set the header like this.
Note that I'm actually using in2csv first to be reasonably sure that the input is a valid, single-column CSV file without a header. in2csv even adds the header "a" here, but I can't find a way to change that value! So I use
Here's what in2csv would output:
And here's what header would output:
It seems like the right place for csvkit to support this would be a new option in the in2csv command that works like the
A way to add a header on a pipe output in a single command without external tooling:
csvcut -c '3,10,7' input.csv \
| ( echo "col1,col2,col3" ; cat - ; ) > output.csv
csvkit 2.0.0 adds Some other potential normalizations implemented here: https://github.com/dannguyen/csvmedkit/blob/main/csvmedkit/utils/csvnorm.py For inspiration, csvmedkit has a csvheader command: https://github.com/dannguyen/csvmedkit/blob/main/csvmedkit/utils/csvheader.py |
This is specifically with regard to csvsql, where loading a CSV file with a header like "Some manually entered - header (TM)" will give you a data structure that is really hard to query. But I think the ability to essentially slugify and transliterate headers would be useful for other tools like csvclean as well.
I tried looking into how this could be done, and it looks like parse_column_identifiers (here) could be an appropriate place, but that would require pulling the option through many intermediate functions. Is there a better place?
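As an illustration of what "slugify and transliterate" could mean for a header, a minimal standard-library sketch follows. The name slugify_header and its separator parameter are hypothetical, not csvkit or agate API. The NFKD-then-ASCII step is an assumption that only covers characters which decompose to ASCII; anything else is dropped.

```python
import re
import unicodedata


def slugify_header(name, separator="_"):
    """Turn a free-form column name into a lowercase ASCII identifier."""
    # Decompose accented characters, then discard whatever is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumeric characters into one separator.
    slug = re.sub(r"[^a-z0-9]+", separator, ascii_name.lower())
    # Trim separators left over at either end.
    return slug.strip(separator)


if __name__ == "__main__":
    print(slugify_header("Some manually entered - header (TM)"))
```

Two different headers can slugify to the same name, and a header with no ASCII-compatible characters becomes an empty string, so a real implementation would still need to deduplicate and handle empty results.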