-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
jitter_timestamp with variable name change #85
Conversation
JITTER command in recipe generates an error that points to this file. jitter_timestamp was called in line 121 with "item=dicom" but the function itself now asks for parameter name "dicom" so changed line 121 to dicom=dicom Also the code for jitter adding datetime together with simple + in line 160 feels unusual for date operation. Added new code to utilize datetime package to do the jitter and updates the dicom field. In dicom/utils/action.py there is a get_timestamp func which is imported by actions.py and seems to deal with jitter but looks to be heavier than necessary since JITTER operates in +/- days only based on Documentation. Neither current nor old jitter_timestamp uses the get_timestamp function.
hey @howardpchen this PR is really great and I'm so glad you contributed it! I'll have some time tomorrow to give it a proper review, so stay tuned! :L) |
Added fix to handle when the DICOM tag is blank or doesn't exist, and added support for both DA (Date) and DT (timestamp) DICOM value representations for JITTER. Will handle NEMA DICOM non-compliant dates as long as it can be parsed by `dateutil.parse`. JITTER will ignore non-date, non-timestamp fields and throw a warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed new commit - should address the issues raised including empty and None
fields. Uses the DICOM DataElement's Value Representation to determine whether JITTER should output a date or timestamp, and should be able to support most common date or timestamp inputs like "YYYY-MM-DD" and "YYYY/MM/DD". The output is always in DICOM-compliant format.
@vsoch Please review, and thank you!
I think this was a casualty of when I tried to reorganize the functions, removing the identifiers module that had the right "get_timestamp" function, here --> https://github.com/pydicom/deid/blob/38c6ed91591ba5a340079d6ec4b3c52107e03348/deid/identifiers/utils.py Take a look at the original function, I'll put it here too: def get_timestamp(item_date, item_time=None, jitter_days=None, format=None):
'''get_timestamp will return (default) a UTC timestamp
with some date and (optionall) time. A different format can be
provided to change default behavior. eg: "%Y%m%d"
'''
if format is None:
format = "%Y-%m-%dT%H:%M:%SZ"
if item_date in ['',None]:
bot.warning("No date in header, cannot create timestamp.")
return None
if item_time is None:
item_time = ""
timestamp = dateutil.parser.parse("%s%s" %(item_date,item_time))
if jitter_days is not None:
jitter_days = int(float(jitter_days))
timestamp = timestamp + timedelta(days=jitter_days)
return timestamp.strftime(format) Actually, it's still there, under deid.utils.actions. The issue is with the jitter_timestamp function, which also used to be the following: # Timestamps
def jitter_timestamp(field,value,item):
'''if present, jitter a timestamp in dicom
field "field" by number of days specified by "value"
The value can be positive or negative.
'''
value = to_int(value)
original = item.get(field,None)
if original is not None:
jittered = get_timestamp(item_date=original,
jitter_days=value,
format="%Y%m%d")
bot.debug("JITTER %s + (%s): %s" %(original,
value,
jittered))
item[field] = jittered
return item This would explain why "item" is the argument instead! This function is a lesser version of what you've implemented (it assumes a date format) so let's move forward with your solution. |
deid/dicom/actions.py
Outdated
new_value = original | ||
bot.warning("JITTER not supported for %s with VR=%s" % (field, | ||
dcmvr)) | ||
dicom = update_tag(dicom, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! My last comment is here - if we haven't changed the value, then we don't need to update the tag. I would set new_value to None, and then only update_tag if new_value isn't equal to None. You could also check if the new_value is equal to the original (and not update the tag if so). Does that make sense? Beautiful commenting here, by the way!
Also, let's please add a test! You can add it to this file 👉 https://github.com/pydicom/deid/blob/master/deid/tests/test_dicom_utils.py#L41 since it's a util. You can follow the others for example - add something like this: def test_jitter_timestamp(self):
from deid.dicom.actions import jitter_timestamp
dicom = get_dicom(self.dataset)
print("Test test_jitter_timestamp")
print("Case 1: ...")
# write your test cases here, use self.assertTrue and self.assertEqual(a,b) |
Added tests for jitter_timestamp. Noticed the test cases currently doesn't do JITTER (might have been how it fell through the crack?) so added that too.
field=field, | ||
value=new_value) | ||
if (new_value is not None and new_value is not original): | ||
# Only update if there's something to update AND there's been change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here let's do != instead of writing it out. The difference is subtle - "is" as you have here relates to two objects being the exact same object. But if we use equals (either == or !=) we allow the two to be equivalent, but not necessarily the same object in memory. We want the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok
@@ -160,6 +160,61 @@ def generate_uid(item, value, field): | |||
updated = perform_action(dicom=dicom, action=ACTION, item=item) | |||
self.assertEqual(updated.PatientID, "pancakes") | |||
|
|||
def test_jitter_timestamp(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow this is totally fantastic! :D
I'll take care of bumping the version and releasing after we finish this up! Your tests just ran and passed! |
Great work! Merging. |
https://pypi.org/project/deid/0.1.25/ And @howardpchen I added you as a contributor! https://github.com/pydicom/deid/blob/master/AUTHORS.md |
JITTER command in recipe generates an error that points to this file.
jitter_timestamp was called in line 121 with "item=dicom" but the
function itself now asks for parameter name "dicom" so changed line 121
to dicom=dicom
Also the code for jitter adding datetime together with simple + in line
160 feels unusual for date operation. Added new code to utilize
datetime package to do the jitter and updates the dicom field.
In dicom/utils/action.py there is a get_timestamp func which is imported
by actions.py and seems to deal with jitter but looks to be heavier than
necessary since JITTER operates in +/- days only based on Documentation.
Neither current nor old jitter_timestamp uses the get_timestamp
function.
Description
Related issues: #86
Please include a summary of the change(s) and if relevant, any related issues above.
Checklist
Open questions
Questions that require more discussion or to be addressed in future development: