How to encode or anonymize the DICOM Files

Raluca M. Sandu edited this page Oct 3, 2019 · 2 revisions

3.10.2019 Solved Major Bug

  • don't add '-' to the patient name: the XML cannot read it and gibberish characters are generated, which makes the XML impossible to open --> no CT SeriesInstanceUID can be read --> no matching between the segmentation and source CT images

Rewrite the following DICOM Tags:

  1. PatientName
  2. PatientID
  3. PatientBirthDate
  4. InstitutionName
  5. InstitutionAddress

Code

    import os

    import pydicom

    # rootdir is the patient folder that we want anonymized;
    # patient_name, patient_id and patient_dob must be defined beforehand
    for subdir, dirs, files in os.walk(rootdir):
        for file in sorted(files):  # sort files by name
            dcm_file = os.path.normpath(os.path.join(subdir, file))
            try:
                dataset = pydicom.dcmread(dcm_file)
            except Exception as e:
                print(repr(e))
                continue  # not a DICOM file
            # overwrite the identifying tags and save the file in place
            dataset.PatientName = patient_name
            dataset.PatientID = patient_id
            dataset.PatientBirthDate = patient_dob
            dataset.InstitutionName = "None"
            dataset.InstitutionAddress = "None"
            dataset.save_as(dcm_file)

Example

  • PatientID : "B01"
  • PatientName: "MAVB01"
  • PatientBirthDate: "19470101"

DON'T

  • don't add dashes, underscores or other non-alphanumeric characters to the PatientID or PatientName
  • don't add umlauts or other special characters from the German, Swedish, Danish, etc. alphabets
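
The rules above can be checked before any tag is written. The following is only a minimal sketch, assuming that "safe" means plain ASCII letters and digits; the function names is_safe_identifier and sanitize_identifier are hypothetical helpers and not part of the original script or of pydicom.

```python
import re
import unicodedata

# Assumption: only ASCII letters and digits are safe in PatientID / PatientName.
_SAFE = re.compile(r"[A-Za-z0-9]+")


def is_safe_identifier(value: str) -> bool:
    """Return True if value is non-empty and contains only ASCII letters and digits."""
    return _SAFE.fullmatch(value) is not None


def sanitize_identifier(value: str) -> str:
    """Strip accents where possible and drop every remaining unsafe character."""
    # NFKD splits an accented letter into a base letter plus a combining mark
    decomposed = unicodedata.normalize("NFKD", value)
    # Dropping non-ASCII removes the combining marks; letters that have no
    # ASCII decomposition are removed entirely
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Remove dashes, underscores, spaces and any other non-alphanumeric character
    return re.sub(r"[^A-Za-z0-9]", "", ascii_only)


if __name__ == "__main__":
    for candidate in ("MAVB01", "MAV-B01", "MAV_B01"):
        print(candidate, is_safe_identifier(candidate), sanitize_identifier(candidate))
```

Checking patient_id and patient_name with is_safe_identifier before running the anonymization loop would reject a value such as "MAV-B01" early, rather than producing an unreadable XML later.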