-
If you want to use register_function appropriately, I believe you actually need to call it off of the table's Give that a shot. In either case, a more efficient solution might be to create a hybrid column that you can groupby in Table2 (using the "sum" aggregator), then join on to Table1. Pseudo-code as follows:
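As a rough illustration of that groupby-then-join idea, here is a minimal sketch in plain Python rather than vaex calls, so none of the function names below are vaex API. The column names (`date`, `ticker`, `volume`, `sum_vol`), the sample rows, and the choice to fill unmatched keys with 0 are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical stand-ins for Table1 and Table2; column names are assumed.
table1 = [
    {"date": "2020-01-02", "ticker": "AAA"},
    {"date": "2020-01-02", "ticker": "BBB"},
    {"date": "2020-01-03", "ticker": "AAA"},
]
table2 = [
    {"date": "2020-01-02", "ticker": "AAA", "volume": 100},
    {"date": "2020-01-02", "ticker": "AAA", "volume": 250},
    {"date": "2020-01-02", "ticker": "BBB", "volume": 40},
]


def hybrid_key(row):
    """Combine date and ticker into one groupable key (the "hybrid column")."""
    return (row["date"], row["ticker"])


def sum_by_key(rows, value="volume"):
    """Group rows by the hybrid key and sum `value` within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[hybrid_key(row)] += row[value]
    return dict(totals)


def left_join_sum(left_rows, totals, out="sum_vol"):
    """Left-join per-key totals onto the left table.

    Assumption: keys with no match in `totals` get 0 rather than a missing value.
    """
    return [{**row, out: totals.get(hybrid_key(row), 0)} for row in left_rows]


totals = sum_by_key(table2)              # one pass over Table2
result = left_join_sum(table1, totals)   # one lookup per Table1 row
```

The point of the design is that Table2 is scanned once, instead of being re-filtered for every row of Table1.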
-
Thanks! Great catch on the typo. Your pseudocode is much appreciated for the groupby. My use case involves playing around with different time variations and selection types in general in Table 2, and I am searching for a more modular solution to manipulating selections or filters within groups. Do you think the delayed and execute decorators can play a role in solving this? I haven't played around with them, but I assumed they only work for vaex-standard functions.
-
In that sense, could you not write a modular function that converts the datetime column into a specified time bin (e.g. minute, hour, day), then do the join approach? Given that you're working with financial tick(?) data, I'm assuming you're looking to ultimately optimize time bins in some way to generate a signal of interest. In essence, if you can modularly generate a bin ID, then you can still use the join approach. Would you be able to provide a working example with your starting tables and an example of some of the parameters you would iterate over, along with the expected result for one or two cases? I'm happy to try to generate some working code given the groundwork. It will also be useful to understand to what extent this is a bug with Vaex.
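For what it's worth, the "modularly generate a bin ID" part could look roughly like the sketch below. It is plain Python on individual timestamps, not vaex code, and it assumes naive datetimes. The names `BIN_WIDTHS`, `EPOCH`, `time_bin_id` and `bin_start` are made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical bin widths; the unit names are illustrative only.
BIN_WIDTHS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}

# Assumed reference point for counting bins (naive datetimes only).
EPOCH = datetime(1970, 1, 1)


def time_bin_id(ts, unit="hour", every=1):
    """Return an integer bin ID: the number of whole bins between EPOCH and ts."""
    width = BIN_WIDTHS[unit] * every
    return (ts - EPOCH) // width


def bin_start(bin_id, unit="hour", every=1):
    """Return the datetime at which the given bin begins."""
    return EPOCH + bin_id * (BIN_WIDTHS[unit] * every)


a = datetime(2020, 1, 2, 9, 31, 15)
b = datetime(2020, 1, 2, 9, 59, 59)
same_hour = time_bin_id(a, "hour") == time_bin_id(b, "hour")
five_min_start = bin_start(time_bin_id(a, "minute", every=5), "minute", every=5)
```

Because the result is a plain integer, it can be combined with the ticker into a single key and fed to the same groupby-then-join approach, with `unit` and `every` as the parameters to iterate over.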
-
That sparked an idea to look at how the underlying binby code is written, and I'm thinking of looking further into it (mapping, delayed, execution). The surface-level functionality is identical, but in my use case I'd like to perform unique selections dependent on the value of each row, which is why the bin function hasn't sufficed. In the sample below and in the initial post, I whittle the concept down to just summing the values for simplicity. But I appreciate your investigating this with me.
-
I dug around a bit. I think the issue with using
The actual printout of this attempt is interesting. On my computer it seems to evaluate things in no particular order (or even multiple times!), which I assume is related to the parallelization.
I think you could probably use table1.apply() in some way, but that solution is not vectorized nor parallelized, so it's extremely slow. At that point I would advocate for just looping over the unique (date, ticker) combinations, making a dataframe, then concat'ing them all together at the end. But this is basically brute forcing what you're after. I know you explained how the "sum" case was a simplification of your ideal filtering, but here's how I achieved it using groupby():
-
Hi, short reply (vacation mode now)
In the latest alpha it is, but I think this should be solvable without https://vaex.io/docs/tutorial.html#The-escape-hatch:-apply . I haven't had the time to fully understand your problem, but it seems groupby and join should work, right? If not, why not? Cheers, Maarten
-
PS: I moved this to Discussions; we just opened that, and this is a good candidate for it.
-
I want to create a selection before performing the aggregate groupby. Is the most efficient way of doing that to use the filter function or is there another function with a parameter similar to select that will be better?
-
Table 1
Table 2
Table1[sum vol] = sum of all volume data in Table 2 matching the date and ticker of each corresponding row in Table 1
Note: x is a column full of 1s because I wasn't able to return just an integer without multiplying it against a column