Working with Structures#
In PyDicer, structure sets which have been converted from the DICOM RTSTRUCT modality are stored within the directory structure. A common issue when working with real-world datasets is that structure names are often inconsistent, so names must be standardised before the data can be analysed.
In PyDicer, structure name standardisation is achieved by defining structure mapping dictionaries, which can be stored globally (applied to all structure sets) or locally (a specific mapping per structure set or per patient).
In this guide we present some examples of how to define such structure name mappings and introduce the StructureSet class, which simplifies loading and working with structure objects.
[1]:
try:
    from pydicer import PyDicer
except ImportError:
    !pip install pydicer
    from pydicer import PyDicer
from pathlib import Path
import SimpleITK as sitk
from pydicer.utils import fetch_converted_test_data, add_structure_name_mapping
from pydicer.dataset.structureset import StructureSet
Setup PyDicer#
Here we load the LCTSC data, which has already been converted. It is downloaded into the testdata_lctsc directory. We also initialise a PyDicer object.
[2]:
working_directory = fetch_converted_test_data("./testdata_lctsc", dataset="LCTSC")
pydicer = PyDicer(working_directory)
Working directory %s aready exists, won't download test data.
Load Structures with StructureSet#
With the StructureSet class, we can load the structures in a structure set, with the structure name as the key and the SimpleITK Image of the mask as the value.
In the following cell, we create a StructureSet object, determine the names of the structures in that structure set, and iterate over each structure, printing the sum of all voxel values in the mask (for demonstration purposes).
[3]:
# Load the converted data
df = pydicer.read_converted_data()
df_structs = df[df.modality=="RTSTRUCT"]
# Create a StructureSet for the first row
struct_row = df_structs.iloc[0]
structure_set = StructureSet(struct_row)
structure_names = structure_set.structure_names
print(f"Structure names: {structure_names}")
for structure in structure_names:
    # Indexing the StructureSet by name returns the mask as a SimpleITK Image
    mask = sitk.GetArrayFromImage(structure_set[structure])
    print(f"Mask voxel sum for {structure}: {mask.sum()}")
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
Mask voxel sum for Lung_L: 52528
Mask voxel sum for Esophagus: 1151
Mask voxel sum for Lung_R: 68074
Mask voxel sum for SpinalCord: 4157
Mask voxel sum for Heart: 23857
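Assuming each mask holds only 0 and 1 values, the voxel sum printed above is a voxel count, and multiplying it by the volume of one voxel gives a physical volume. The following is a minimal sketch of that arithmetic. The helper name mask_volume_cc and the spacing value are hypothetical and chosen only for illustration; in practice the spacing would be read from the SimpleITK Image with GetSpacing().

```python
from math import prod


def mask_volume_cc(voxel_count, spacing_mm):
    """Convert a voxel count to a volume in cubic centimetres.

    spacing_mm is the (x, y, z) voxel spacing in millimetres.
    """
    voxel_volume_mm3 = prod(spacing_mm)
    return voxel_count * voxel_volume_mm3 / 1000.0  # 1 cc = 1000 mm^3


# Hypothetical spacing, NOT taken from the LCTSC test data
example_spacing = (1.0, 1.0, 3.0)

# Voxel count for Lung_L, copied from the output above
volume = mask_volume_cc(52528, example_spacing)
print(f"Volume: {volume:.3f} cc")
```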
In the next cell, we iterate over all our structure sets, and print out the names of the structures available. Notice that for some structure sets, structures aren’t named consistently. In the next section we will resolve this with a structure name mapping.
[4]:
df = pydicer.read_converted_data()
df_structs = df[df.modality=="RTSTRUCT"]
for idx, struct_row in df_structs.iterrows():
    structure_set = StructureSet(struct_row)
    structure_names = structure_set.structure_names
    print(f"Patient: {struct_row.patient_id}, Structure names: {structure_names}")
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-007, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-006, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Test-S1-102, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-002, Structure names: ['Esophagus', 'SpinalCord', 'Lung_Right', 'Heart', 'Lung_Left']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-008, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-004, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SC', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-005, Structure names: ['Esophagus', 'Lung_R', 'SpinalCord', 'Heart', 'L_Lung']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-001, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Test-S1-101, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id default
Patient: LCTSC-Train-S1-003, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
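With more structure sets than this, spotting inconsistent names by eye becomes tedious. One possible approach, sketched here in plain Python using the names copied from the output above, is to count how often each name occurs. Names that occur only once are likely variations that need a mapping. The variable names are illustrative and not part of PyDicer.

```python
from collections import Counter

# The name list shared by seven of the ten structure sets in the output above
standard = ["Lung_L", "Esophagus", "Lung_R", "SpinalCord", "Heart"]

# All ten structure sets: seven standard ones plus the three that differ
structure_sets = [standard] * 7 + [
    ["Esophagus", "SpinalCord", "Lung_Right", "Heart", "Lung_Left"],  # LCTSC-Train-S1-002
    ["Lung_L", "Esophagus", "Lung_R", "SC", "Heart"],  # LCTSC-Train-S1-004
    ["Esophagus", "Lung_R", "SpinalCord", "Heart", "L_Lung"],  # LCTSC-Train-S1-005
]

name_counts = Counter(name for names in structure_sets for name in names)
for name, count in name_counts.most_common():
    print(f"{name}: {count}")

# Names seen in only one structure set are candidates for the mapping
singletons = sorted(name for name, count in name_counts.items() if count == 1)
print(f"Candidates for mapping: {singletons}")
```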
Add Structure Name Mapping#
Structure name mappings are defined as Python dictionaries, with the standardised structure name as the key, and the value a list of name variations which should map to the standardised name.
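To make the dictionary layout concrete, here is a small sketch of how such a mapping can be turned into a reverse lookup from any name variation to its standardised name. It assumes exact, case-sensitive string matching, which is a simplification, and it is not PyDicer's own implementation. The function name build_reverse_lookup is hypothetical.

```python
def build_reverse_lookup(mapping):
    """Map every variation (and each standard name itself) to its standard name."""
    lookup = {}
    for standard_name, variations in mapping.items():
        for name in [standard_name, *variations]:
            # Guard against a variation being listed under two standard names
            if name in lookup and lookup[name] != standard_name:
                raise ValueError(f"{name!r} is listed under two standard names")
            lookup[name] = standard_name
    return lookup


example_mapping = {
    "Lung_L": ["Lung_Left"],
    "SpinalCord": ["SC"],
    "Heart": [],  # no variations, but the name itself is still kept
}

lookup = build_reverse_lookup(example_mapping)
print(lookup["SC"])
print(lookup.get("Liver"))  # a name that is not in the mapping
```

An empty list, as used for Heart, simply declares that the standardised name has no known variations.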
Use the add_structure_name_mapping function to add a mapping. A mapping_id may be supplied to distinguish between different mappings. If no mapping_id is supplied, a default mapping id is used.
If a structure_set_row or patient_id is supplied, the mapping will be stored at the corresponding level. If neither is supplied, the mapping will be stored globally for all structure sets in the dataset.
[5]:
mapping = {
    "Esophagus": [],
    "Heart": [],
    "Lung_L": ["Lung_Left"],
    "Lung_R": ["Lung_Right"],
    "SpinalCord": ["SC"],
}
[6]:
pydicer.add_structure_name_mapping(mapping)
pydicer.utils - INFO - Adding mapping for project in testdata_lctsc/.pydicer
The default mapping has been saved. You can find the saved mapping in the testdata_lctsc/.pydicer/.structure_set_mappings directory.
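The log output elsewhere in this guide refers to files such as default.json in that directory. As a rough sketch only, and assuming the file simply holds the mapping dictionary serialised as JSON (an assumption, not confirmed here), a round trip with the standard library looks like this. It writes to a temporary directory rather than a real PyDicer working directory.

```python
import json
import tempfile
from pathlib import Path

mapping = {"Lung_L": ["Lung_Left"], "SpinalCord": ["SC"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Mirror the directory layout named in the text above
    mapping_dir = Path(tmp_dir) / ".pydicer" / ".structure_set_mappings"
    mapping_dir.mkdir(parents=True)
    mapping_file = mapping_dir / "default.json"

    # Assumption: the file content is the mapping dictionary as JSON
    mapping_file.write_text(json.dumps(mapping, indent=2))
    restored = json.loads(mapping_file.read_text())

print(restored == mapping)
```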
Now we can check our StructureSet objects to confirm the names are mapped properly.
[7]:
for idx, struct_row in df_structs.iterrows():
    structure_set = StructureSet(struct_row)
    structure_names = structure_set.structure_names
    print(f"Patient: {struct_row.patient_id}, Structure names: {structure_names}")
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-007, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-006, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Test-S1-102, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-002, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-008, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-004, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-005, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-001, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Test-S1-101, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/default.json
Patient: LCTSC-Train-S1-003, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
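When writing a mapping, it can help to check which of the original names it does not cover. Note that L_Lung, seen for patient LCTSC-Train-S1-005 in the earlier output, is not listed as a variation in the mapping defined above. The sketch below is plain Python rather than a PyDicer API, assumes exact string matching, and uses the hypothetical helper name find_unmapped_names.

```python
def find_unmapped_names(raw_names, mapping):
    """Return the raw names that are neither a standard name nor a listed variation."""
    known = set(mapping)
    for variations in mapping.values():
        known.update(variations)
    return sorted(set(raw_names) - known)


mapping = {
    "Esophagus": [],
    "Heart": [],
    "Lung_L": ["Lung_Left"],
    "Lung_R": ["Lung_Right"],
    "SpinalCord": ["SC"],
}

# Every distinct name seen before the mapping was added
raw_names = [
    "Lung_L", "Lung_R", "Esophagus", "SpinalCord", "Heart",
    "Lung_Left", "Lung_Right", "SC", "L_Lung",
]

unmapped = find_unmapped_names(raw_names, mapping)
print(f"Not covered by the mapping: {unmapped}")
```

Adding any reported name to the appropriate variation list closes the gap.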
Subsets of Structures#
Structure name mappings are also useful if you only want to work with a subset of the available structures. Simply leave the unwanted structures out of the mapping entirely, and they won’t be loaded as part of the StructureSet.
In this example, we use a mapping_id of struct_subset to keep this mapping separate from the mapping defined above.
[8]:
mapping_id = "struct_subset"
sub_mapping = {
    "Lung_L": ["Lung_Left"],
    "Lung_R": ["Lung_Right"],
}
pydicer.add_structure_name_mapping(sub_mapping, mapping_id=mapping_id)
for idx, struct_row in df_structs.iterrows():
    structure_set = StructureSet(struct_row, mapping_id=mapping_id)  # Provide the mapping_id!
    structure_names = structure_set.structure_names
    print(f"Patient: {struct_row.patient_id}, Structure names: {structure_names}")
pydicer.utils - INFO - Adding mapping for project in testdata_lctsc/.pydicer
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-007, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-006, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Test-S1-102, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-002, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-008, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-004, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-005, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-001, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Test-S1-101, Structure names: ['Lung_L', 'Lung_R']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/.pydicer/.structure_set_mappings/struct_subset.json
Patient: LCTSC-Train-S1-003, Structure names: ['Lung_L', 'Lung_R']
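The subsetting idea can be illustrated without PyDicer: names that resolve through the mapping are kept under their standardised name, and everything else is dropped. This is a simplified sketch assuming exact string matching, with the hypothetical helper name standardise_names; it is not how StructureSet is implemented.

```python
def standardise_names(raw_names, mapping):
    """Return standard names for raw names covered by the mapping, dropping the rest."""
    lookup = {
        name: standard_name
        for standard_name, variations in mapping.items()
        for name in [standard_name, *variations]
    }
    return [lookup[name] for name in raw_names if name in lookup]


sub_mapping = {
    "Lung_L": ["Lung_Left"],
    "Lung_R": ["Lung_Right"],
}

# Original names for LCTSC-Train-S1-002, copied from the earlier output
raw_names = ["Esophagus", "SpinalCord", "Lung_Right", "Heart", "Lung_Left"]

selected = standardise_names(raw_names, sub_mapping)
print(selected)
```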
Local Structure Name Mappings#
Next, we specify a mapping for only one specific structure set. We will use local_mapping as the mapping_id. In the output you will see that only one structure set has had the mapping applied.
[9]:
mapping_id = "local_mapping"
mapping = {
    "Esophagus": [],
    "Heart": [],
    "Lung_L": ["Lung_Left"],
    "Lung_R": ["Lung_Right"],
    "SpinalCord": ["SC"],
}
struct_row = df[(df.patient_id=="LCTSC-Train-S1-002") & (df.modality=="RTSTRUCT")].iloc[0]
# Only adding mapping for one structure set
pydicer.add_structure_name_mapping(
    mapping,
    mapping_id=mapping_id,
    structure_set_row=struct_row,
)
for idx, struct_row in df_structs.iterrows():
    structure_set = StructureSet(struct_row, mapping_id=mapping_id)  # Provide the mapping_id!
    structure_names = structure_set.structure_names
    print(f"Patient: {struct_row.patient_id}, Structure names: {structure_names}")
pydicer.utils - INFO - Adding mapping local_mapping for structure set f036b8
pydicer.utils - INFO - Adding mapping for stucture_set in testdata_lctsc/data/LCTSC-Train-S1-002/structures/f036b8
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-007, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-006, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Test-S1-102, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - DEBUG - Using mapping file in testdata_lctsc/data/LCTSC-Train-S1-002/structures/f036b8/.structure_set_mappings/local_mapping.json
Patient: LCTSC-Train-S1-002, Structure names: ['Esophagus', 'Heart', 'Lung_L', 'Lung_R', 'SpinalCord']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-008, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-004, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SC', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-005, Structure names: ['Esophagus', 'Lung_R', 'SpinalCord', 'Heart', 'L_Lung']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-001, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Test-S1-101, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
pydicer.dataset.structureset - WARNING - No mapping file found with id local_mapping
Patient: LCTSC-Train-S1-003, Structure names: ['Lung_L', 'Esophagus', 'Lung_R', 'SpinalCord', 'Heart']
Notice that the mapping has only been applied to the structure set for patient LCTSC-Train-S1-002.
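The log messages in this guide show two storage locations: one under .pydicer for global mappings and one inside the structure set's own directory for local mappings. The sketch below only reconstructs those two paths as they appear in the logs. The function names are hypothetical, and the layout is inferred from the output rather than taken from PyDicer's documentation, so it may differ between versions.

```python
from pathlib import PurePosixPath

MAPPINGS_DIR = ".structure_set_mappings"


def global_mapping_path(working_dir, mapping_id="default"):
    """Path of a global mapping file, as seen in the log output."""
    return PurePosixPath(working_dir) / ".pydicer" / MAPPINGS_DIR / f"{mapping_id}.json"


def structure_set_mapping_path(working_dir, patient_id, struct_id, mapping_id="default"):
    """Path of a mapping file stored for a single structure set, as seen in the log output."""
    return (
        PurePosixPath(working_dir)
        / "data"
        / patient_id
        / "structures"
        / struct_id
        / MAPPINGS_DIR
        / f"{mapping_id}.json"
    )


global_path = global_mapping_path("testdata_lctsc", "struct_subset")
local_path = structure_set_mapping_path(
    "testdata_lctsc", "LCTSC-Train-S1-002", "f036b8", "local_mapping"
)
print(global_path)
print(local_path)
```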
Using Mappings in PyDicer#
Once mappings are defined, they can be used when you:
- Compute Dose Metrics
- Fetch Radiomics Features
- Analyse Auto-segmentations
- Prepare data for nnUNet training
Check out the documentation for those modules to see where you can supply your mapping_id to have the structure set standardisation applied. If you have used the default mapping_id, the standardisation will be applied automatically.