-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Implement JSON serialization and deserialization for Info objects #13489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
drammock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff looks more or less as I was imagining. My only question is why the separation of __getstate__ and to_dict (and similar for __setstate__ and from_dict). What's the advantage of splitting like that? Why not call _make_serializable within __getstate__ and omit the to_dict method? (and similar question for _restore_objects within __setstate__)
agramfort
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
hey @drammock, Good and hard question, I think my logic was to try to disturb the current mne.info API as little as possible.
and the JSON only supports primitives (
|
I don't think that's right, pickling sometimes also requires some conversions; see https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled. Maybe another way to put my question would be this: what is the disadvantage of doing everything inside |
|
hmmm, i think it is possible @drammock. I find it particularly more interesting to have a dedicated function to make the behavior more explicit. But I understand that mne info is a dictionary and should behave as such, without needing an extra function. or even assume that we always gonna have json_safe=True, and just serialize everything as json, I think that would potentially break the API with the .fiff save function, but I'm really not sure, but it would be not super complicated to fix. By indicating your preference, I can implement what is most convenient here; I don't have a strong preference. |
I don't understand. Are you saying "I want something in the public API (not a dunder)"? Or are you saying "I want to support pickling, and I want pickling to behave differently than serialization intended for JSON"? Or something else? What I've been trying to express all this time is that if we need JSON-compatible serialization (which you seem to want) then why not implement it entirely using the built-in dunders? Why do we need separate to/from_dict methods, and why do those need to behave differently than the dunders? A JSON-compatible serialization should ALSO be pickle-compatible. So why not put the code all in one place?
Yes, of course we need to properly reassemble the object from its serialized form. What I'm saying is that all the code to do that can live inside
saving to FIFF uses a completely different code path ( |
|
I finally understood what you were saying from the beginning! Sorry for my mental delay, now everything is serialization in the setstate and getstate |
|
Now, the function may not seem necessary, but I'm keeping it to maintain version control of the MNE, especially considering the significant governance agenda within MNE Info. I'm afraid that without this version marker, we might not be able to read a JSON file in 2 or 4 years, but I could be wrong. |
|
So, answering the question, I think we should have a public function that makes things more straightforward. |
FWIW we have been writing to/from |
|
Ah, okay, I can drop them the version reference, no problem for me. Do you think I should remove the function call to_dict as well and only leave from_dict? |
larsoner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to send a couple of inline comments
Regarding making functions public, the fact that this doesn't work:
>>> json.dumps(mne.create_info(1, 1000., "eeg"))
...
TypeError: Object of type ndarray is not JSON serializable
suggests that it is worth having something public so people can do .to_dict
>>> json.dumps(mne.create_info(1, 1000., "eeg").to_dict())
'{"acq_pars": null, "acq_stim": null, "ctf_head_t": null, "description": null, "dev_head_t": null, "dev_ctf_t": null, "dig": null, "experimenter": null, "utc_offset": null, "device_info": null, "file_id": null, "highpass": 0.0, "hpi_subsystem": null, "kit_system_id": null, "helium_info": null, "line_freq": null, "lowpass": 500.0, "meas_date": null, "meas_id": null, "proj_id": null,
...
and having a to_dict/from_dict pair makes sense
Co-authored-by: Eric Larson <[email protected]>
|
@larsoner, @drammock, many thanks for the iterations. I'm happier with the code; it's much cleaner and uses all the good and excellent design. I feel like I'm always learning from you, thank you so much for your time reviewing! If you see any additional points, please let me know so I can address them 🙏🏽 :) |
|
I removed the mne_version as previously requested. |
larsoner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay one minor tweak: I don't know if to_dict is specific enough. I think to_json_dict is probably better, paired with from_json_dict or similar.
And one major one that I only now just realized: I think the _mne_type gynmastics should actually only be done in from_json_dict, and to_json_dict. __setstate__ and __getstate__ didn't need these before, so having them now seems unnecessary as well. On main I can pickle.dumps(info) and it works, and the result is shorter than on this PR. This makes me worry that we will incur unnecessary conversion overhead in things like parallel processing (which uses pickling) by routing all __setstate__ and __getstate__ mechanics through the _mne_type conversions.
argh. I think @larsoner has a good point, sorry that this is the opposite of what I pushed for you to do @bruAristimunha |
|
I like to_json_dict / from_json_dict
… Message ID: ***@***.***>
|
larsoner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay this looks nicely isolated now -- thanks @bruAristimunha !
Reference issue (if any)
Fixes #13487.
What does this implement/fix?
Introduce methods to convert Info objects to and from JSON-serializable dictionaries, enhancing data handling and storage capabilities. Adjust tests and pre-commit hooks accordingly.
Additional information