CSV Column Rename #530

milafrerichs · 2016-01-27T20:54:50Z

Thanks for csvkit and your other cool tools like agate and proof.

I wanted to ask if you would integrate a new part into csvkit.

CSVRename

csvrename would allow you to change the header columns of your dataset. agate has a similar tool for agate.Table and I often have to rename columns.

Right now I'm using the header shell script from the book Data Science for the Command Line Toolkit.
It works, but everybody has to install it by themselves, so I cannot share my data pipelines (Makefiles).

I already created a csvrename, and it would work as follows:

Rename/Replace all headers

Replace all the column headers with new ones, as long as the list of new names has the same length as the number of columns.

csvrename -n e,d,c,b,a

Rename specific column headers

csvrename -n d,c -c b,a

I would potentially add another argument to select the columns by index:

csvrename -n d,c -i 2,1

What do you think? Or do you have another easy way to do it?
Thanks.
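
For illustration, the behavior proposed above could be sketched in Python roughly as follows. This is not the poster's implementation, which is not shown in the thread. The function names `rename_header` and `csvrename` are hypothetical, and treating `-i` as 1-based is an assumption inferred from the `-c b,a` / `-i 2,1` examples. Only the header row is rewritten; data rows stream through unchanged.

```python
import csv
import io


def rename_header(header, new_names, columns=None, indexes=None):
    """Return a renamed copy of `header` (hypothetical helper).

    - No selector: replace every name; lengths must match (-n e,d,c,b,a).
    - columns: rename the named columns, pairwise (-n d,c -c b,a).
    - indexes: rename by position, assumed 1-based (-n d,c -i 2,1).
    """
    if columns is not None and indexes is not None:
        raise ValueError("pass either columns or indexes, not both")

    if columns is None and indexes is None:
        if len(new_names) != len(header):
            raise ValueError("need exactly one new name per column")
        return list(new_names)

    if columns is not None:
        # list.index raises ValueError if a named column does not exist.
        positions = [header.index(name) for name in columns]
    else:
        for i in indexes:
            if not 1 <= i <= len(header):
                raise ValueError("column index out of range: %d" % i)
        positions = [i - 1 for i in indexes]

    if len(positions) != len(new_names):
        raise ValueError("need exactly one new name per selected column")

    renamed = list(header)
    for position, new_name in zip(positions, new_names):
        renamed[position] = new_name
    return renamed


def csvrename(infile, outfile, new_names, columns=None, indexes=None):
    """Rewrite the header row and pass all remaining rows through as-is."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    header = next(reader)
    writer.writerow(rename_header(header, new_names, columns, indexes))
    writer.writerows(reader)  # rows are consumed lazily, one at a time


if __name__ == "__main__":
    source = io.StringIO("a,b,c\n1,2,3\n")
    target = io.StringIO()
    # Equivalent in spirit to: csvrename -n d,c2 -c b,a
    csvrename(source, target, ["d", "c2"], columns=["b", "a"])
    print(target.getvalue(), end="")
```

Because the new names and the selected columns are matched pairwise, the order of the two lists matters: in the demo, `b` becomes `d` and `a` becomes `c2`.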


jpmckinney · 2016-01-27T20:59:50Z

@onyxfish Is your opinion the same as in #310 (comment) ?

onyxfish · 2016-02-05T18:18:17Z

I think so. This still feels like it crosses the line over into the realm of things the command line is a bad environment for. Two comma-separated, quoted lists of column names are just not a very clear way of expressing this behavior, and the length of the commands gets unwieldy very fast.

That being said, I had a need for something like this just this week, so I can feel the pain.

@jpmckinney What do you think?

jpmckinney · 2016-02-05T18:25:34Z

I think the common case would be to rename one or two columns, not all the columns, in which case the length of the command is fine. I have needed this, too, when, for whatever reason, the government changed one header in one file in a set of files.

onyxfish · 2016-02-05T18:28:01Z

It's been my experience that typically when those kinds of changes happen, columns are also inserted and removed, as was the case in #310. For instance, this is the case with Census Bureau County Business Patterns data files, which pick up a new column suddenly in 2008. It's opening the door to that cascade of related "slight tweak" problems that I'm leery of.

jpmckinney · 2016-02-05T18:30:45Z

I prefer #245 over #310 (and #245 would fix the underlying issue that led to #310). I would like a solution for #245.

onyxfish · 2016-02-05T18:35:07Z

That's reasonable. That would also have resolved my issue with the CBP data. Happy to consider that as an extension to agate.Table.merge.

jpmckinney · 2016-02-05T19:20:32Z

csvstack is currently streaming, and I'd like to preserve that.

onyxfish · 2016-02-05T19:44:34Z

Well, for what it's worth, this is now implemented in agate. It should be pretty straightforward to duplicate the logic for the csvkit streaming interface.

jpmckinney · 2016-02-09T03:57:30Z

Noting that the method is agate.Table.rename

jpmckinney · 2016-06-09T01:59:48Z

Thank you for suggesting this new CSV tool. However, the maintainers have decided to not author, merge or maintain new tools; there is simply not enough time to do so. Our focus is instead on making the existing tools as good as possible.

We encourage you to create and maintain your own tool as a separate Python package. You may want to use the agate library, which csvkit uses for most of its CSV reading and writing. Doing so will make it easier to maintain common behavior with csvkit’s tools.

metasoarous · 2017-03-08T05:59:01Z

This is disappointing, quite frankly. This would be a very useful feature for quickly cleaning up data from the command line. I don't at all see how this crosses into "the realm of things the command line is a bad environment for". On the contrary, I've been able to write rather intricate scripts/pipelines for data munging, and this is the sort of thing I hate having to drop down to sed for.

jpmckinney · 2017-03-08T14:37:40Z

@metasoarous That was not the reason for closing the issue - re-read the last comment before closing. If you want this feature, implement it. The maintainers are not your free labor.

metasoarous · 2017-03-08T15:37:35Z

I read the comment before writing. I also didn't demand that anyone work on it. The original poster indicated they'd already created an implementation. I was only making a plea/suggestion that it be considered for inclusion, and am saddened not only about this issue but that you and the rest of the team are categorically opposed to any new tools. It's your project though. Obviously you have the right to do what you like with it. As a user, I just wanted to let you know how I feel about it.

cosmoKenney · 2021-04-22T15:39:43Z

Renaming columns can be done with csvsql. Just alias the names:

select "My First Column" as "FirstColumn", "My Second Column" as "SecondColumn" --...
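
To show what the aliasing does to the output header, here is a minimal sketch that uses Python's standard `sqlite3` module rather than csvsql itself. It does not reproduce csvsql's internals. The helper name `rename_with_sql` and the table name `data` are hypothetical, and every value is loaded as text for simplicity.

```python
import csv
import io
import sqlite3


def rename_with_sql(csv_text, query, table="data"):
    """Load CSV text into an in-memory SQLite table, run `query`,
    and return the result as CSV text (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    try:
        # Quote identifiers so names with spaces are accepted.
        columns = ", ".join('"%s"' % name.replace('"', '""') for name in header)
        conn.execute('CREATE TABLE "%s" (%s)' % (table, columns))
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(
            'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), body
        )

        cursor = conn.execute(query)
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        # cursor.description carries the result column names, i.e. the aliases.
        writer.writerow([column[0] for column in cursor.description])
        writer.writerows(cursor)
        return out.getvalue()
    finally:
        conn.close()


if __name__ == "__main__":
    text = "My First Column,My Second Column\n1,2\n"
    sql = (
        'select "My First Column" as "FirstColumn", '
        '"My Second Column" as "SecondColumn" from data'
    )
    print(rename_with_sql(text, sql), end="")
```

Unlike a header-only rewrite, this approach requires listing every column you want to keep in the `select`, so it gets long for wide files where only one or two names change.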

jpmckinney · 2024-05-02T17:30:05Z

Open issue: #396

jpmckinney added the feature label Jan 27, 2016

onyxfish mentioned this issue Feb 5, 2016

Allow merge to combine tables with different column names/orders wireservice/agate#465

Closed

jpmckinney mentioned this issue Feb 5, 2016

csvstack: Use agate.Table.merge #562

Closed

jpmckinney added the Low Priority label Mar 27, 2016

jpmckinney added new tool and removed feature Low Priority labels Jun 4, 2016

jpmckinney closed this as completed Jun 9, 2016

jpmckinney mentioned this issue Mar 28, 2017

Made a tool to rename columns, want to integrate it #814

Closed

jpmckinney mentioned this issue May 2, 2024

csvsed (or csvgrep with replace) #1057

Open
