Generating Data Objects#
While PyDicer primarily deals with converting data objects from DICOM, there are instances where you may want to generate a new data object and have it integrated into your PyDicer dataset.
Some examples of when you may want to do this are:
- Generate a new dose grid with EQD2 correction applied.
- Generate a structure set of auto-segmented structures.
- Generate a Pseudo-CT image from an MRI.
In this guide, we show you how to generate new data objects so that you can perform tasks such as those described above.
[1]:
try:
    from pydicer import PyDicer
except ImportError:
    !pip install pydicer
    from pydicer import PyDicer

from pathlib import Path

import SimpleITK as sitk

from pydicer.utils import fetch_converted_test_data, load_object_metadata, read_simple_itk_image

working_directory = fetch_converted_test_data("./testdata_hnscc", dataset="HNSCC")
pydicer = PyDicer(working_directory)
Working directory %s aready exists, won't download test data.
Generate Dose Objects#
In the following cell, we:
- Iterate over each dose grid in our dataset
- Load the dose grid using SimpleITK
- Apply EQD2 dose correction (hard coded for demonstration purposes)
- Save the corrected dose as a new object in our dataset
Once the dose object is saved, it will be picked up whenever you compute DVHs and dose metrics (see the sketch at the end of this section).
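For reference, the correction applied in the cell below is the standard EQD2 (equivalent dose in 2 Gy fractions) formula, evaluated voxel-wise on the total dose grid:

$$\mathrm{EQD2} = D \times \frac{d + \alpha/\beta}{2 + \alpha/\beta}, \qquad d = \frac{D}{n}$$

where $D$ is the total dose in a voxel, $n$ is the number of planned fractions, $d$ is the dose per fraction, and $\alpha/\beta$ is the tissue-specific ratio (hard coded to 2 here for demonstration purposes).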
[2]:
alpha_beta = 2

df = pydicer.read_converted_data()
df_doses = df[df["modality"] == "RTDOSE"]

for _, dose_row in df_doses.iterrows():

    if "EQD2_ab" in dose_row.hashed_uid:
        # This is an already scaled dose
        continue

    df_linked_plan = df[df["sop_instance_uid"] == dose_row.referenced_sop_instance_uid]
    linked_plan = df_linked_plan.iloc[0]
    ds_plan = load_object_metadata(linked_plan)

    # Read the planned fractions from the plan object
    fractions = int(ds_plan.FractionGroupSequence[0].NumberOfFractionsPlanned)
    print(f"{dose_row.patient_id} has {fractions} fractions")

    # Load the dose grid
    dose_path = Path(dose_row.path).joinpath("RTDOSE.nii.gz")
    dose = sitk.ReadImage(str(dose_path))
    dose = sitk.Cast(dose, sitk.sitkFloat64)

    dose_id = f"{dose_row.hashed_uid}_EQD2_ab{alpha_beta}"

    if len(df_doses[df_doses.hashed_uid == dose_id]) > 0:
        print(f"Already converted dose for {dose_id}")
        continue

    # Apply the EQD2 correction
    eqd2_dose = dose * (((dose / fractions) + alpha_beta) / (2 + alpha_beta))

    # Save off the new dose grid
    try:
        print(f"Saving dose grid with ID: {dose_id}")
        pydicer.add_dose_object(
            eqd2_dose, dose_id, dose_row.patient_id, linked_plan, dose_row.for_uid
        )
    except SystemError:
        print(f"Dose object {dose_id} already exists!")
HNSCC-01-0199 has 33 fractions
Saving dose grid with ID: c16e76_EQD2_ab2
HNSCC-01-0176 has 20 fractions
Saving dose grid with ID: 833a74_EQD2_ab2
HNSCC-01-0176 has 20 fractions
Saving dose grid with ID: bf3fba_EQD2_ab2
HNSCC-01-0019 has 21 fractions
Saving dose grid with ID: 309e1a_EQD2_ab2
Now we can load our data objects, and check that our new dose grids are stored alongside our converted data.
[3]:
df = pydicer.read_converted_data()
df[df.modality=="RTDOSE"]
[3]:
|   | sop_instance_uid | hashed_uid | modality | patient_id | series_uid | for_uid | referenced_sop_instance_uid | path |
|---|---|---|---|---|---|---|---|---|
| 1 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.264264397186... | c16e76 | RTDOSE | HNSCC-01-0199 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.233527028792... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.870916135819... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.287865632112... | testdata_hnscc/data/HNSCC-01-0199/doses/c16e76 |
| 4 | c16e76_EQD2_ab2 | c16e76_EQD2_ab2 | RTDOSE | HNSCC-01-0199 | c16e76_EQD2_ab2 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.870916135819... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.287865632112... | testdata_hnscc/data/HNSCC-01-0199/doses/c16e76... |
| 9 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.169033525924... | 833a74 | RTDOSE | HNSCC-01-0176 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.279793773343... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.706719210726... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.470253980284... | testdata_hnscc/data/HNSCC-01-0176/doses/833a74 |
| 10 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.267291308489... | bf3fba | RTDOSE | HNSCC-01-0176 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.283706688235... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.566662631858... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.173917268454... | testdata_hnscc/data/HNSCC-01-0176/doses/bf3fba |
| 15 | 833a74_EQD2_ab2 | 833a74_EQD2_ab2 | RTDOSE | HNSCC-01-0176 | 833a74_EQD2_ab2 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.706719210726... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.470253980284... | testdata_hnscc/data/HNSCC-01-0176/doses/833a74... |
| 16 | bf3fba_EQD2_ab2 | bf3fba_EQD2_ab2 | RTDOSE | HNSCC-01-0176 | bf3fba_EQD2_ab2 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.566662631858... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.173917268454... | testdata_hnscc/data/HNSCC-01-0176/doses/bf3fba... |
| 18 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.242809596262... | 309e1a | RTDOSE | HNSCC-01-0019 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.777975715563... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.290727775603... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.254865609982... | testdata_hnscc/data/HNSCC-01-0019/doses/309e1a |
| 21 | 309e1a_EQD2_ab2 | 309e1a_EQD2_ab2 | RTDOSE | HNSCC-01-0019 | 309e1a_EQD2_ab2 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.290727775603... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.254865609982... | testdata_hnscc/data/HNSCC-01-0019/doses/309e1a... |
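The EQD2-corrected grids are now treated like any other RTDOSE object by downstream analysis. As a minimal, hedged sketch of how you might confirm this (it assumes the analyse module provides compute_dvh and compute_dose_metrics as described in the PyDicer documentation; the D95/V5 points are illustrative only):

[ ]:

# Sketch only: compute DVHs for every dose object in the dataset, including the
# newly added EQD2-corrected grids, then request a couple of example dose metrics.
# The functions and arguments below are assumptions based on the PyDicer analyse
# module; adjust to match your installed version.
pydicer.analyse.compute_dvh()
df_dose_metrics = pydicer.analyse.compute_dose_metrics(d_point=[95], v_point=[5])
df_dose_metrics.head()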
Generate Structure Set Objects#
In this example, we:
- Iterate over each CT image in our dataset
- Load the CT image using SimpleITK and apply a threshold to segment bones
- Save the segmented bones as a new structure set object
Note: this specific functionality is also provided by the auto-segmentation inference module. If you are generating auto-segmentations, it is recommended that you use that module instead.
[4]:
bone_threshold = 300  # Set threshold at 300 HU

df = pydicer.read_converted_data()
df_cts = df[df["modality"] == "CT"]

for idx, ct_row in df_cts.iterrows():

    # Load the image
    img = read_simple_itk_image(ct_row)

    # Apply the threshold
    bone_mask = img > bone_threshold

    # Save the mask in a new structure set
    structure_set_id = f"bones_{ct_row.hashed_uid}"
    new_structure_set = {
        "bones": bone_mask
    }

    try:
        print(f"Saving structure set with ID: {structure_set_id}")
        pydicer.add_structure_object(
            new_structure_set,
            structure_set_id,
            ct_row.patient_id,
            ct_row,
        )
    except SystemError:
        print(f"Structure Set {structure_set_id} already exists!")
Saving structure set with ID: bones_72b0f9
Saving structure set with ID: bones_c4ffd0
Saving structure set with ID: bones_8e0da9
Saving structure set with ID: bones_ec4aec
Saving structure set with ID: bones_33c44a
Saving structure set with ID: bones_b281ea
Let’s load our data objects to see if we have our new structure sets.
[5]:
df = pydicer.read_converted_data()
df[df.modality=="RTSTRUCT"]
[5]:
|   | sop_instance_uid | hashed_uid | modality | patient_id | series_uid | for_uid | referenced_sop_instance_uid | path |
|---|---|---|---|---|---|---|---|---|
| 3 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.166429645421... | 06e49c | RTSTRUCT | HNSCC-01-0199 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.243934637013... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.870916135819... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.240263316258... | testdata_hnscc/data/HNSCC-01-0199/structures/0... |
| 5 | bones_72b0f9 | bones_72b0f9 | RTSTRUCT | HNSCC-01-0199 | bones_72b0f9 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.870916135819... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.240263316258... | testdata_hnscc/data/HNSCC-01-0199/structures/b... |
| 14 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.403955456521... | cbbf5b | RTSTRUCT | HNSCC-01-0176 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.276897558084... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.120880328745... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.334001018535... | testdata_hnscc/data/HNSCC-01-0176/structures/c... |
| 15 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.323156708629... | 6d2934 | RTSTRUCT | HNSCC-01-0176 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.495627765798... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.310630617866... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.469610481459... | testdata_hnscc/data/HNSCC-01-0176/structures/6... |
| 18 | bones_c4ffd0 | bones_c4ffd0 | RTSTRUCT | HNSCC-01-0176 | bones_c4ffd0 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.120880328745... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.334001018535... | testdata_hnscc/data/HNSCC-01-0176/structures/b... |
| 19 | bones_8e0da9 | bones_8e0da9 | RTSTRUCT | HNSCC-01-0176 | bones_8e0da9 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.216161306702... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.107072817915... | testdata_hnscc/data/HNSCC-01-0176/structures/b... |
| 20 | bones_ec4aec | bones_ec4aec | RTSTRUCT | HNSCC-01-0176 | bones_ec4aec | 1.3.6.1.4.1.14519.5.2.1.1706.8040.216161306702... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.133948865586... | testdata_hnscc/data/HNSCC-01-0176/structures/b... |
| 21 | bones_33c44a | bones_33c44a | RTSTRUCT | HNSCC-01-0176 | bones_33c44a | 1.3.6.1.4.1.14519.5.2.1.1706.8040.310630617866... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.469610481459... | testdata_hnscc/data/HNSCC-01-0176/structures/b... |
| 25 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.168221415040... | 7cdcd9 | RTSTRUCT | HNSCC-01-0019 | 1.3.6.1.4.1.14519.5.2.1.1706.8040.103450757970... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.290727775603... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.418136430763... | testdata_hnscc/data/HNSCC-01-0019/structures/7... |
| 27 | bones_b281ea | bones_b281ea | RTSTRUCT | HNSCC-01-0019 | bones_b281ea | 1.3.6.1.4.1.14519.5.2.1.1706.8040.290727775603... | 1.3.6.1.4.1.14519.5.2.1.1706.8040.418136430763... | testdata_hnscc/data/HNSCC-01-0019/structures/b... |
We can also run the visualise module. Use the force=False flag to ensure that only the newly generated objects are visualised.
[6]:
pydicer.visualise.visualise(force=False)
100%|██████████| 24/24 [00:46<00:00, 1.94s/objects, visualise]
Take a look inside the testdata_hnscc/data directory for the new structure set folders. See the visualised snapshot to check that our bone segmentation worked!
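A simple HU threshold will also pick up unrelated high-density voxels (contrast, couch, noise). If you want to tidy the mask before adding it as a structure object, here is a minimal sketch using standard SimpleITK filters; the clean_bone_mask helper, the kernel radius and the minimum component size are illustrative assumptions, not part of PyDicer.

[ ]:

def clean_bone_mask(img, bone_threshold=300, min_voxels=1000):
    # Illustrative helper (not part of PyDicer): threshold a CT image and tidy
    # the resulting bone mask before passing it to add_structure_object.
    mask = sitk.Cast(img > bone_threshold, sitk.sitkUInt8)

    # Fill small holes and smooth the boundary with a morphological closing
    mask = sitk.BinaryMorphologicalClosing(mask, [2, 2, 2])

    # Drop small disconnected islands, keeping only components of at least min_voxels voxels
    components = sitk.RelabelComponent(sitk.ConnectedComponent(mask), minimumObjectSize=min_voxels)
    return sitk.Cast(components > 0, sitk.sitkUInt8)

You would then build the structure set inside the loop above as new_structure_set = {"bones": clean_bone_mask(img)}.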
Generate Image Objects#
In this example, we:
- Iterate over each CT image in our dataset
- Load the CT image using SimpleITK and apply a Laplacian sharpening image filter to it
- Save the result as a new image object
[7]:
df = pydicer.read_converted_data()
df_cts = df[df["modality"] == "CT"]

for idx, ct_row in df_cts.iterrows():

    # Load the image
    img = read_simple_itk_image(ct_row)

    # Sharpen the image
    img_sharp = sitk.LaplacianSharpening(img)

    # Save the sharpened image
    img_id = f"sharp_{ct_row.hashed_uid}"

    try:
        print(f"Saving image with ID: {img_id}")
        pydicer.add_image_object(
            img_sharp,
            img_id,
            ct_row.modality,
            ct_row.patient_id,
            for_uid=ct_row.for_uid
        )
    except SystemError:
        print(f"Image {img_id} already exists!")
Saving image with ID: sharp_72b0f9
Saving image with ID: sharp_c4ffd0
Saving image with ID: sharp_8e0da9
Saving image with ID: sharp_ec4aec
Saving image with ID: sharp_33c44a
Saving image with ID: sharp_b281ea
Now we can visualise the images and produce snapshots once more. Find the sharpened images in the working directory. Can you see the difference between the sharpened CT and the original?
[8]:
pydicer.visualise.visualise(force=False)
100%|██████████| 30/30 [00:02<00:00, 13.53objects/s, visualise]
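If you would rather quantify the change than eyeball the snapshots, the following is a rough sketch that compares one sharpened CT with its original. It assumes read_simple_itk_image can also load the generated image objects (they are saved in the same NIfTI layout as converted images) and uses the 72b0f9 image purely as an example.

[ ]:

# Sketch only: mean absolute voxel difference between a sharpened CT and its original.
df = pydicer.read_converted_data()
df_imgs = df[df["modality"] == "CT"]

original_row = df_imgs[df_imgs.hashed_uid == "72b0f9"].iloc[0]
sharp_row = df_imgs[df_imgs.hashed_uid == "sharp_72b0f9"].iloc[0]

img_original = sitk.Cast(read_simple_itk_image(original_row), sitk.sitkFloat64)
img_sharp = sitk.Cast(read_simple_itk_image(sharp_row), sitk.sitkFloat64)

stats = sitk.StatisticsImageFilter()
stats.Execute(sitk.Abs(img_sharp - img_original))
print(f"Mean absolute difference: {stats.GetMean():.2f} HU")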