-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to read and manipulate file meta #184
Conversation
b175e5b
to
eb7d76d
Compare
Remaining questions: 1. Do file meta fields overlap with fields in the dicom? If so, we need to be checking each field if it is a filemeta or not before replace. As implemented now, the fields are parsing over equivalently and it is assumed that file meta fields do not appear in the dicom and vice versa. 2. Do we need to make any special changes to the file meta, e.g., perhaps the size? I have never manipulated it before, and I am hoping that pydicom save handles these changes, but if not we should do it manually. Signed-off-by: vsoch <[email protected]>
eb7d76d
to
39d8e01
Compare
Hi @vsoch, Many thanks for working on this very quickly! I like the idea of having a section within the deid.dicom recipe file. Just like we have %Header section (https://pydicom.github.io/deid/examples/recipe/), maybe we could have something like %File_meta section for the file meta. My intuition would be that we wouldn't require any changes to be done to the file meta unless the user has that as part of the deid.dicom recipe. I am not sure TBH about that... Many thanks again! Regards Adil |
Yes that's correct! The PR here adds
If this is the case, then my strategy was right on! Do you want to test the branch here? The last remaining bit is if we need to do something special for save. Let's wait for @wetzelj feedback then ping some of the other maintainers to ask. |
I also like the idea of having a separate %file_meta section. We would, however, probably want to ensure that standard %header actions don't overlap and act against filemeta attributes (thinking of expanders and %values based rules). I don't think we would want a recipe set up as below to remove the MediaStorageSOPInstanceUID that it just replaced:
Instead, if we did want to remove fields ending with UID from both sections of the file, we'd want to build a recipe like:
|
Oh yes that's perfect! |
I am wondering why do we need to have |
That was just a sample to indicate the way we would/would not want the actions to be applied - that wasn't anything we'd really want to do. :) |
That’s a good point - but @wetzelj to go in the other direction if fields are unique why even have the second section? A rule to replace all ending with UID should hit those in file meta, why should they be special? But I do think we perhaps want to protect some of the fields like the size and transfer syntax from change? |
Oh I see, that makes sense |
My initial intuitive understanding was that because the data elements part of the file_meta are inherently separate for the dicom file data elements, it would probably make sense to have a seperate section for them within the recipe (to access the dicom data elements it is enough to do |
I guess you're right. It probably doesn't make much sense to split it out. My statement was coming from my line of thinking that filemeta fields were special, and should be somewhat protected from "bad"/overly wide recipes... but really, that's not the case. Honestly, I'm with @adildahlan and don't have a strong opinion on this either. |
We can actually handle this easily because the dicom.file_meta is just another kind of dataset, so we add it as a dataset to parse (and save fields for). Since the fields we add to our lookup data structure are the same ones that point back into the dicom and they are accessed easily, we can actually just parse them akin to any other header. Okay so here is my thoughts for the recipe:
So those are my thoughts for moving forward - remove the special section and treat the fields like any other, and then protect ones that shouldn't be changed. Let me know what you think! |
Sounds perfect! |
Agreed. This solution will be great! |
okay fantastic! I'll get in a PR after the end of the work day. Thanks for the quick, fun discussion y'all! |
instead of having a special section, we can include file-meta parsing alongside the header parsing, as the names are unique. The one tiny detail is that we want to be careful with fields that should not be touched, so those are now added to the set of "skip." This set also includes other like PixelData that was not previously being loaded, which I would consider a bug (that should now be fixed) Signed-off-by: vsoch <[email protected]>
Okay @wetzelj @adildahlan this should be ready for review! Let me know if there are tests you think we should add, or if your use case is not addressed. |
Hi @vsoch ! And I have my recipe as follows:
And in my code I define each of these new variables as done in. this example code https://github.com/pydicom/deid/blob/master/examples/dicom/recipe/deid-dicom-example.py However, I am getting this error I hope you could help me figure out what's wrong Many thanks! Adil |
Oh interesting, looks like an issue with save. Can you share with me a dummy example to reproduce the issue? Are you just using the provided cat.dcm but updated the deid recipe? |
Hi @vsoch ! Oh I see. I am using the data I downloaded from the following link. I would need to check if I am allowed to share the code as it is a private project I am collaborating on. Can you please try running your script with the dataset in the link above? If you don't get the same error maybe then I can check with my supervisor if I can share bits of the code with you. Apologies about that... Many thanks! |
it's okay I was able to reproduce on save:
Let me take a look, I might need to ask for help from the other maintainers of pydicom on this one! |
Oh perfect! Many thanks! Temperarly, I found a way to go around the issue with anonymising the MediaStorageSOPInstanceUID. I did so by running the anonymisation script without having in the recipe the MediaStorageSOPInstanceUID in it. Then afterwards I loop through all the anonymised dicom files and if the MediaStorageSOPInstanceUID dataelement doesn't match the SOPInstanceUID then I change the former to the later and save the dicom file as shown below.
This worked so I guess saving the dicoms file after changing the MediaStorageSOPInstanceUID works but probably not with having MediaStorageSOPInstanceUID in the recipe. Many thanks for all your help! |
there is currently a bug that when we add (or otherwise manipulate a field) we assume retrieving the field from the main dicom. We actually need to check if the field is from the file meta, and change behavior based on that. So these changes should fix a bug of running the example and trying to save the dicom. Signed-off-by: vsoch <[email protected]>
Signed-off-by: vsoch <[email protected]>
hey @adildahlan so I figured it out! What was happening is that for any action that uses the |
Hi @vsoch, it is working!!! |
Woot! Let's merge and release then. |
This is the start of work to read/parse file_meta to address #183. I made some time this evening and I think the implementation was easier than I anticipated, but I have a few questions / concerns to address!
Remaining questions:
Do file meta fields overlap with fields in the dicom?
The simplest implementation I could do is to look for another section in the recipe,
filemeta
, that gets parsed just like the header section. I simply createDicomField
with is_filemeta = True. However, I realized that if the identifiers you find in file_meta are unique to that (and vice versa with the dicom) then we don't need to check this value - we can simply replace/add/ etc based on the unique identifier just as we would. So this is how I implemented it to start. However, if it's the case that the same Field can appear in file meta OR the regular dicom, then we'll need to add flag actions that are just for filemeta, and add checks all over the place that essentially say:TLDR: If fields are not unique to the sections we need to be checking each field if it is a filemeta or not before any action is taken. As implemented now, the fields are parsing over equivalently and it is assumed that file meta fields do not appear in the dicom and vice versa.
Do we need to make any special changes to the file meta, e.g., perhaps the size?
I have never manipulated it before, and I am hoping that pydicom save handles these changes, but if not we should do it manually. We can check with some of the other maintainers of the pydicom org that know pydicom well if we aren't sure here, and after we've addressed the point above.
cc @wetzelj and @adildahlan for your feedback!
Signed-off-by: vsoch [email protected]
Description
Related issues: # (issue)
Please include a summary of the change(s) and if relevant, any related issues above.
Checklist
Open questions
See above!