NamedTuples with lots of missing data #36712
It's very unclear what this issue is about with all these fuzzy statements and claims that Julia should somehow announce that you shouldn't use unions in data science or something. Please reopen an issue written in such a way that it is clear what the issue is (what code ran, what was the result, what was the expected result), leaving out irrelevant side comments and ultimatums.
Which comments are irrelevant? I have some code and the result above???
I quoted them. Again, feel free to open a new issue where you leave out that stuff. A good format is:
Ok, well, I still don't see why those comments aren't relevant, but see #36713
I agree this issue lacks clarity:
As for the actual issue, it seems like the request is to just extend some of the inference improvements from #32699 to apply in more cases? These kinds of issues are definitely tricky to spell out exactly; #32699 originally came out at last year's JuliaCon when @JeffBezanson and I sat down for a half hour and walked through some code; it was only then that we were both able to fully communicate and understand what the issue was and what we could reasonably do in Base. I'm not saying that you need a private meeting with Jeff to get anything done, but I'm just trying to emphasize that #32699 was really hard for me to explain or write up in an issue, because there are lots of steps and factors that come into play. I think it'd be helpful in this case to provide as much context as possible: what is the code? what isn't inferring like you'd expect? how is the specific uninferrable case affecting larger code flow?
If you would like context, you can see the benchmarks.jl file of LightQuery on master. This code:
This code runs in 0.4 seconds (beating DataFrames). As soon as I add column 3, which has an eltype of
Ok, now that's a concrete, actionable issue. In the future, please start with that.
Ok, will do
Given that rows in tables are essentially named tuples of a bunch of values, many or most of which are of type `Union{Missing, T}`, I think it would be good to prioritize dealing with these kinds of types in a reasonable way if Julia is aiming to be used for data science. I've been waiting a while for #32699, but it doesn't seem to have helped much. For example:

Also, #31909 means that you can't use union missing data as captures. Given that these types are kind of messy, it would be understandable if Base cannot support them, but if so, please make an announcement so that the data ecosystem can fully shift to alternatives such as DataValues. Apologies if there is an issue for this open already.
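
To make the point about rows as named tuples of `Union{Missing, T}` values concrete, here is a minimal sketch. It is not the LightQuery benchmark discussed in this thread; the names `row1`, `row2`, `Row`, `wide1`, and `wide2` are hypothetical and chosen only for illustration. It shows that each pattern of missing values gives a row its own concrete type, whereas a single row type that covers every pattern has to carry `Union` fields, which is the kind of type inference then needs to handle.

```julia
# Two rows with the same column names but different missingness patterns.
row1 = (a = 1, b = missing)
row2 = (a = missing, b = 2.0)

# Each pattern produces a distinct concrete NamedTuple type.
same_type = typeof(row1) == typeof(row2)

# A row type wide enough for every pattern needs Union-typed fields.
# (Hypothetical alias, for illustration only.)
Row = NamedTuple{(:a, :b), Tuple{Union{Missing, Int}, Union{Missing, Float64}}}

# Constructing through the wide type keeps the Union field types.
wide1 = Row((1, missing))
wide2 = Row((missing, 2.0))
same_wide_type = typeof(wide1) == typeof(wide2)

println("narrow row types equal: ", same_type)
println("wide row types equal:   ", same_wide_type)
```

With two columns there are already four possible narrow row types, and the count doubles with each additional column that can be missing, so a table with many such columns presumably cannot rely on one concrete type per pattern.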