Enable easier transformations of multiple columns in DataFrame #342
Comments
Thanks Wes! I didn't realize you'd posted this, but was actually coming to the mailing list to suggest a transform function (much like in R).
Reassignments could be implemented in several ways that I can think of:
a.
where transform can accept similar arguments to DataFrame? E.g.,
b.
c.
Depending on the implementation though, (1) may be better. In R, I believe any replacement of values of a subset will copy/modify the entire data frame and reassign the value to the original symbol, which is what makes it inefficient... so in that case something like
is effectively the same as writing
which is a convenient way of writing
and so on. But if in pandas individual columns, rather than the entire DataFrame, can be modified, then reassignment to the entire DataFrame might not be the best idea. And a (1)-type implementation could be general enough to work around the limitation of "setting on mixed-type frames only allowed with scalar values" (such assignments are allowed in R). I'm not sure if it was a deliberate decision on your part not to allow this, but if not, it could be useful in certain situations. For instance, permitting operations like
Though, to be honest, I've caught a bit of the functional-style bug, so I'm a bit biased against partial reassignment and in favor of returning new values from functions. But I guess reassignment and rebinding is generally the way to go with large data sets... (and it would provide a consistent experience for R users).
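For readers comparing the two styles discussed above, here is a minimal sketch in a recent pandas. It is not the implementation proposed in this thread: `DataFrame.assign` is used as a rough analogue of R's `transform`, and the column names (`a`, `b`, `label`, `b_sq`) are made up for illustration.

```python
import pandas as pd

# Hypothetical mixed-type frame: two numeric columns and one string column
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0], "label": ["x", "y", "z"]}
)

# Functional style: assign returns a new DataFrame and leaves df untouched
transformed = df.assign(a=df["a"] * 10, b_sq=df["b"] ** 2)

# Reassignment style: modify one column of a frame in place
modified = df.copy()
modified["a"] = modified["a"] * 10
```

The functional form is easier to chain, while the in-place form avoids rebinding the whole frame, which is the trade-off raised above for large data sets.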
I implemented option #1 in the above referenced commit. Wasn't very difficult in the end. Can address other kinds of transformations if we want at a later time.
Thanks Wes - sorry for my extremely delayed response. But this is fantastic.
On Mon, Dec 19, 2011 at 6:21 AM, Wes McKinney <
The problem I have now is that I don't have the option to set types when reading data from a SQL query, so it would be good if I could parse different data types for multiple columns. I don't know if something like this has been implemented yet, but it would look something like this:
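As a point of comparison, recent pandas versions let `DataFrame.astype` take a mapping from column name to dtype, which covers this case after the query result is loaded. The sketch below assumes a small hand-built frame standing in for a SQL result; the column names are hypothetical.

```python
import pandas as pd

# Stand-in for a SQL query result where every column arrived as strings
raw = pd.DataFrame(
    {"id": ["1", "2"], "price": ["9.5", "12.0"], "name": ["a", "b"]}
)

# Convert several columns at once; columns not listed keep their dtype
typed = raw.astype({"id": "int64", "price": "float64"})
```

Columns left out of the mapping (here `name`) are passed through unchanged.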
things like
should be possible in a mixed-type DataFrame, per the mailing list discussion.
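A minimal sketch of what multi-column assignment on a mixed-type frame can look like in a recent pandas. This is an illustration of the kind of operation under discussion, not the snippet from the mailing list, and the frame and column names are hypothetical.

```python
import pandas as pd

# Mixed-type frame: two float columns plus a string column
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "label": ["x", "y"]})
cols = ["a", "b"]

# Assign non-scalar values to several columns at once
df[cols] = df[cols] * 2

# Assign to a row subset of the same columns via a boolean mask
df.loc[df["a"] > 2, cols] = 0.0
```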