ENH: Enable unary math operations for pandas, sqlite#1071
ENH: Enable unary math operations for pandas, sqlite#1071cpcloud wants to merge 3 commits intoibis-project:masterfrom
Conversation
0dc84c3 to
13a0422
Compare
d6c9085 to
3322844
Compare
|
@wesm can you take a look here? bit of a rabbit hole with decimals, otherwise just adding unary ops to series. |
wesm
left a comment
There was a problem hiding this comment.
Minor comments, but otherwise LGTM
| if column_name in schema: | ||
| ibis_type = dt.validate_type(schema[column_name]) | ||
| elif dtype == np.object_: | ||
| inferred_dtype = infer_dtype(df[column_name].dropna()) |
There was a problem hiding this comment.
Yikes. I guess we should make a NaN-friendly type inference function someplace (seems like an oversight in infer_dtype originally)
There was a problem hiding this comment.
can u post an issue in pandas tracker about this
| def execute_series_unary_op(op, data, scope=None): | ||
| function = getattr(np, type(op).__name__.lower()) | ||
| if data.dtype == np.dtype(np.object_): | ||
| return data.apply(functools.partial(execute_node, op, scope=scope)) |
There was a problem hiding this comment.
Is Series.apply different from Series.map (in behavior or performance)?
There was a problem hiding this comment.
So, it looks like Series.map accepts a dict, to support a simple CASE-statement-like operation, as well as callables, whereas apply only deals with callables. For callables, both methods use the same underlying function lib.map_infer to call the passed in callable in a Cython loop. Here we could use either since we're only dealing with callables.
ibis/pandas/execution.py
Outdated
| def execute_series_log_with_base(op, data, base, scope=None): | ||
| if data.dtype == np.dtype(np.object_): | ||
| func = np.vectorize(functools.partial(execute_node, op, scope=scope)) | ||
| return pd.Series(func(data, base), index=data.index, name=data.name) |
There was a problem hiding this comment.
Perhaps this bit could be factored out into a helper function since it's repeated a couple times (in case it's useful in future execution rules)
|
|
||
| def _floor_divide(t, expr): | ||
| left, right = map(t.translate, expr.op().args) | ||
| return sa.func.floor(left / right) |
There was a problem hiding this comment.
Does integer division in postgres yield doubles?
There was a problem hiding this comment.
No, it yields integers. This is implemented so that it works regardless of the type of left and right.
Implement decimal for pandas Add SQLite unary ops Fix operations in postgres that require numeric
|
Merging on green. |
Implement decimal for pandas
Add SQLite unary ops
Fix operations in postgres that require numeric