-
Notifications
You must be signed in to change notification settings - Fork 31
Make ID creation functions jittable #905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Some of the tests are failing, because I still have to adjust the expected values to the new IDs, but others, like |
|
Thanks! We cannot ask that of users (think of doing some query on the data and then remembering that you have to fix |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
It was only problematic for the In 4fbb227 I encountered some test where parents were married to their children, I assumed that was a mistake and fixed them. |
|
Thanks a lot!!
Great, thanks! And agreed. However, we should check these things upfront. Should we add this to the
For sure. @MImmesberger, do we still have that open issue for sanity checks? If so, could you please add this to the list? |
Apologies, this was a bit cryptic I guess. I changed the base to the current "HEAD" so the issue becomes apparent. In the tests, we are currently pre-computing the ids so we have the num_segments parameter. Search for It would be best if we could get rid of this altogether. But we need the How about changing the group ids upon return, so that they are consecutive numbers? We could then simply pass the number of observations as Other ideas welcome! |
hmgaudecker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! Just a few small suggestions.
|
Thanks for the clarification, I understand the problem now. |
|
The thing is timing. The best value would be the maximum of the group ids. But we only know that once we have computed them. To avoid that, I'd use the length of the data, which is an upper bound when redefining the contents as suggested above. |
|
Your idea is probably the best then, I did not realize that segment_sum was implemented so that it returns an array the size of the highest segment_id. I will change all ID's to be consecutive numbers starting at 0. Using the length of the inputs as num_segments will be fine, I don't think using a little bit too much memory there will be a problem. |
Me neither! Did you, @timmens ?
Excellent! |
|
This is my last update. |
Sure, this was just meant FYI! |
for more information, see https://pre-commit.ci
|
Thanks! Changing the two I think there are two ways out:
What do you think? Other ideas welcome, ofc! |
for more information, see https://pre-commit.ci
|
I think the first solution is good. Might make it easier for people to write their own ID functions that depend on other ID's, it's much easier if you know they are consecutively numbered starting at 0. It's also probably just better if num_segments does not get too big.
There are now nearly all tests on GETTSIM failing, on the first look unrelated to ID's, should I try to fix this or should this be tackled in a different PR? |
|
Excellent! Apologies I didn't manage to look much into it today. Just had a brief look at the failures, many indeed seem to be because we ares using some constructs that Jax cannot handle. Will improve with #908 / #914, though I could use a hint on what to do to make lookup work ( This as an aside, some test failures do seem related. This one, for example: |
|
In this specific case you might be able to convert the dictionary to an array, use a base year e.g. 1900 and then use I now updated |
|
Looks great and I think it can be merged once the comments in the code are addressed. We'll probably want to postpone this mechanism here:
until we are in the process of redesigning the interface (early June). However, please revert the changes to the test cases that changed id's to 0-based consecutive numbers, so that we don't forget. Please also open an issue outlining the solution strategy. |
|
I think I addressed all your comments and I also reverted the test cases, that test the ID's. Was this correct, or was I supposed to revert the reordering and the changes to the test execution too? |
|
Perfect! We just want to support these cases in Jax, so we should keep them around. Test execution is perfect as is. |
What problem do you want to solve?
Make the ID creation functions in Gettsim and Mettsim jittable.
Todo