PyDicer#

class pydicer.tool.PyDicer(working_directory='.')#

The PyDicer class provides easy access to all the key PyDicer functionality.

Parameters:

working_directory (str|pathlib.Path, optional) – Directory in which data is stored. Defaults to “.”.

Variables:

convert – Instance of ConvertData
visualise – Instance of VisualiseData
dataset – Instance of PrepareDataset
analyse – Instance of AnalyseData
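
A typical session can be sketched with only the methods documented on this page. This is a minimal sketch, assuming PyDicer is installed; `convert_dicom_folder` and both of its arguments are hypothetical names chosen for illustration. The import is kept inside the function so the snippet loads even where the package is absent.

```python
from pathlib import Path


def convert_dicom_folder(working_directory, dicom_directory):
    """Sketch: point PyDicer at a DICOM folder and run the full pipeline."""
    # Imported lazily; the module path follows the class signature above.
    from pydicer.tool import PyDicer

    project = PyDicer(working_directory=Path(working_directory))
    project.add_input(Path(dicom_directory))  # a str path is accepted as well
    project.run_pipeline()

    # Return the summary of converted objects for inspection.
    return project.read_converted_data()
```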

add_dose_object(*args, **kwargs) → DataFrame#

Add a generated dose object to the project.

Parameters:

dose (sitk.Image) – A SimpleITK.Image of the dose volume to add.
dose_id (str) – The unique ID of the new dose object.
patient_id (str) – The ID of the patient of this object.
linked_plan (str|pd.Series, optional) – The hashed_uid or the Pandas DataFrame row of the RTPLAN object to link to. If None the new object won’t be linked. Defaults to None.
for_uid (str, optional) – Frame Of Reference UID for new data object. If not provided it will be extracted from the linked plan (if available). Defaults to None.
datasets (list|str, optional) – The name(s) of the dataset(s) to add the object to. Defaults to None.

Raises:

ValueError – Raised when the patient ID does not exist.
SystemError – Raised when a linked_plan is provided but can’t be found for this patient.

add_image_object(*args, **kwargs) → DataFrame#

Add a generated image object to the project.

Parameters:

image (sitk.Image) – The SimpleITK.Image object to save.
image_id (str) – The unique ID of the new image object.
modality (str) – The modality of the image being saved.
patient_id (str) – The ID of the patient of this object.
linked_image (str|pd.Series, optional) – The hashed_uid or the Pandas DataFrame row of the image object to link to. If None the new object won’t be linked. Defaults to None.
for_uid (str, optional) – Frame Of Reference UID for new data object. If not provided it will be extracted from the linked image (if available). Defaults to None.
datasets (list|str, optional) – The name(s) of the dataset(s) to add the object to. Defaults to None.

Raises:

ValueError – Raised when the patient ID does not exist.
SystemError – Raised when a linked_image is provided but can’t be found for this patient.
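
The parameters above can be combined as in the following sketch, which stores a derived image and links it to the image it was computed from. The helper name `add_derived_image` is hypothetical, and the `modality` and `patient_id` column names on the converted DataFrame row are assumptions rather than something this page confirms.

```python
def add_derived_image(project, source_row, derived_image, image_id):
    """Sketch: store a generated image and link it to its source image.

    `project` is expected to be a PyDicer instance, `source_row` a row of the
    converted DataFrame and `derived_image` a SimpleITK.Image.
    """
    return project.add_image_object(
        image=derived_image,
        image_id=image_id,
        # Column names below are assumptions about the converted DataFrame.
        modality=source_row["modality"],
        patient_id=source_row["patient_id"],
        # With a linked image, for_uid can be left unset and is extracted from it.
        linked_image=source_row,
    )
```

Passing the row itself as `linked_image` avoids looking up the hashed_uid separately, since either form is documented as accepted.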

add_input(input_obj: Union[str, Path, InputBase])#

Add an input location containing DICOM data. Must be a str, pathlib.Path or InputBase object, such as:

- FileSystemInput
- DICOMPacsInput
- OrthancInput
- WebInput

Parameters:

input_obj (str|pathlib.Path|InputBase) – The Input object, derived from InputBase, or a str/pathlib.Path pointing to the folder containing the DICOM files.

add_object(*args, **kwargs) → DataFrame#

Add a generated object to the project.

Parameters:

object_id (str) – The unique ID of the new object.
patient_id (str) – The ID of the patient for which the object is being added.
object_type (str) – The type of object, must be one of “image”, “structure”, “plan” or “dose”.
modality (str) – The modality of the object (as per the DICOM standard)
for_uid (str, optional) – The Frame of Reference UID. Defaults to None.
referenced_sop_instance_uid (str, optional) – The SOP Instance UID of the object referenced by the generated object. Defaults to None.
datasets (list|str, optional) – The name(s) of the dataset(s) to add the object to. Defaults to None.

Raises:

ValueError – Raised if object_type is not “image”, “structure”, “plan” or “dose”.
ValueError – Raised if the patient does not exist in the project.
SystemError – Raised if the generated object does not yet exist on the file system.
SystemError – Raised if an object with this ID already exists in the project.

add_structure_name_mapping(*args, **kwargs) → DataFrame#

Specify a structure name mapping dictionary in which each key is a standardised structure name and each value is a list of the structure name variations to map to that standard name.

If a structure_set_row is provided, the mapping will be stored only for that specific structure. Otherwise, working_directory must be provided, then it will be stored at project level by default, or at the patient level if patient_id is also provided.

Parameters:

mapping_dict (dict) – Dictionary object with the standardised structure name (str) as the key and a list of the various structure names to map as the value.
mapping_id (str, optional) – The ID to refer to this mapping as. Defaults to DEFAULT_MAPPING_ID.
patient_id (str, optional) – The ID of the patient to which this mapping belongs. Defaults to None.
structure_set_row (pd.Series, optional) – The row of the converted structure set to which this mapping belongs. Defaults to None.

Raises:

SystemError – Ensure working_directory or structure_set is provided.
ValueError – All keys in mapping dictionary must be of type str.
ValueError – All values in mapping dictionary must be a list of str entries.
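
The shape of a valid mapping_dict is easiest to see in a small example. The sketch below uses a hypothetical helper, `check_mapping_dict`, that mirrors the two ValueError conditions listed above; it is not part of PyDicer, and the structure names are illustrative values only.

```python
def check_mapping_dict(mapping_dict):
    """Hypothetical helper mirroring the two ValueError conditions above."""
    for standard_name, variations in mapping_dict.items():
        if not isinstance(standard_name, str):
            raise ValueError("All keys in mapping dictionary must be of type str.")
        if not isinstance(variations, list) or not all(
            isinstance(name, str) for name in variations
        ):
            raise ValueError(
                "All values in mapping dictionary must be a list of str entries."
            )
    return mapping_dict


# Standardised name -> names as they might appear in the data (illustrative).
mapping_dict = check_mapping_dict(
    {
        "Heart": ["heart", "HEART", "Hrt"],
        "Lung_L": ["Left Lung", "Lung_Left", "L Lung"],
    }
)
```

A dictionary of this shape is what the mapping_dict parameter expects.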

add_structure_object(*args, **kwargs) → DataFrame#

Add a generated structure object to the project.

Parameters:

structures (dict) – A dict object containing structure names as keys and the SimpleITK.Image of the corresponding structure mask as values.
structure_id (str) – The unique ID of the new structure object.
patient_id (str) – The ID of the patient of this object.
linked_image (str|pd.Series, optional) – The hashed_uid or the Pandas DataFrame row of the image object to link to. If None the new object won’t be linked. Defaults to None.
for_uid (str, optional) – Frame Of Reference UID for new data object. If not provided it will be extracted from the linked image (if available). Defaults to None.
datasets (list|str, optional) – The name(s) of the dataset(s) to add the object to. Defaults to None.

Raises:

ValueError – Raised when the patient ID does not exist.
SystemError – Raised when a linked_image is provided but can’t be found for this patient.

get_structures_linked_to_dose(*args, **kwargs) → DataFrame#

A function to read the data from the quarantine summary.

Returns:

A DataFrame summarising the contents of the quarantine.

Return type:

pd.DataFrame

preprocess(force: bool = True)#

Preprocess the DICOM data in preparation for conversion

Parameters:

force (bool, optional) – When True, all DICOM data will be re-processed (even if it has already been preprocessed). Defaults to True.

read_all_segmentation_logs(*args, **kwargs) → DataFrame#

Read all auto-segmentation logs in a dataset.

Parameters:

dataset_name (str) – The name of the dataset to read for.
segment_id (str) – The ID of the auto-segmentation run.
modality (str) – The modality of the images to read logs for.

Returns:

The pandas DataFrame object with all logs for the dataset.

Return type:

pd.DataFrame

read_converted_data(*_, **kwargs) → DataFrame#

Read the converted data frame from the supplied data directory.

Parameters:

dataset_name (str, optional) – The name of the dataset for which to extract converted data. Defaults to “data”.
patients (list, optional) – The list of patients for which to read converted data. If None is supplied then all data will be read. Defaults to None.
join_working_directory (bool, optional) – If True, the path in the data frame returned will be adjusted to the location of the working_directory. If False the path will be relative to the working_directory.

Returns:

The DataFrame with the converted data objects.

Return type:

pd.DataFrame

read_preprocessed_data() → DataFrame#

Reads the pydicer preprocessed data

Raises:

SystemError – Raised when preprocessed data doesn’t yet exist.

Returns:

The preprocessed data.

Return type:

pd.DataFrame

read_quarantined_data() → DataFrame#

A function to read the data from the quarantine summary.

Returns:

A DataFrame summarising the contents of the quarantine.

Return type:

pd.DataFrame

run_pipeline(patient: Optional[Union[list, str]] = None, force: bool = True)#

Runs the entire conversion pipeline, including computation of DVHs and first-order radiomics.

Parameters:

patient (str|list, optional) – A patient ID or list of patient IDs for which to run the pipeline. Defaults to None.
force (bool, optional) – When True, all steps are re-processed even if the output files have previously been generated. Defaults to True.
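
The two parameters can be combined to process a subset of patients without regenerating existing output. This is a minimal sketch: `rerun_missing_outputs` is a hypothetical helper, and `project` is assumed to be a PyDicer instance.

```python
def rerun_missing_outputs(project, patient_ids):
    """Sketch: process selected patients, skipping output that already exists."""
    # A single ID may be passed as a str, several IDs as a list.
    patient = patient_ids[0] if len(patient_ids) == 1 else list(patient_ids)

    # force=False leaves previously generated output files untouched.
    project.run_pipeline(patient=patient, force=False)
```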

segment_dataset(*args, **kwargs) → DataFrame#

Run an auto-segmentation function across all images of a given modality in a dataset.

Parameters:

segment_id (str) – The ID to be given to track the results of this segmentation.
segmentation_function (Callable) – The function to call to run the segmentation. Expects a SimpleITK.Image as input and returns a dict object with structure names as keys and SimpleITK.Image masks as values.
dataset_name (str, optional) – The name of the dataset to run auto-segmentation on. Defaults to CONVERTED_DIR_NAME, which runs across all available images.
modality (str, optional) – The modality of the image to run on. Defaults to “CT”.
force (bool, optional) – If True, the segmentation will be re-run for each image even if it was already previously run. Defaults to False.

segment_image(*args, **kwargs) → DataFrame#

Run an auto-segmentation function on an image. Provide the image row of the converted DataFrame to auto-segment; the segmentation results will be saved as a new object within the patient’s data.

The segmentation_function provided should accept a SimpleITK image as input and return a dict with structure names as keys and SimpleITK images as values.

If your segmentation algorithm requires further customisation, consider wrapping it in a function to match this signature. For example, to run TotalSegmentator, you can define a wrapper function like:

```python
import tempfile
from pathlib import Path

import SimpleITK as sitk


def run_total_segmentator(input_image: sitk.Image) -> dict:
    # Import within function since this is an optional dependency
    from totalsegmentator.python_api import (  # pylint: disable=import-outside-toplevel
        totalsegmentator,
    )

    results = {}

    with tempfile.TemporaryDirectory() as working_dir:
        working_dir = Path(working_dir)

        # Save the temporary image file for total segmentator to find
        input_dir = working_dir.joinpath("input")
        input_dir.mkdir()
        input_file = input_dir.joinpath("img.nii.gz")
        sitk.WriteImage(input_image, str(input_file))

        # Prepare a temporary folder for total segmentator to store the output
        output_dir = working_dir.joinpath("output")
        output_dir.mkdir()

        # Run total segmentator
        totalsegmentator(input_dir, output_dir)

        # Load the output masks into a dict to return
        for mask_file in output_dir.glob("*.nii.gz"):
            mask = sitk.ReadImage(str(mask_file))

            # Check if the mask is empty, total segmentator stores empty mask files for
            # structures that aren't within FOV
            if sitk.GetArrayFromImage(mask).sum() == 0:
                continue

            # Strip the extension to obtain the structure name
            structure_name = mask_file.name.replace(".nii.gz", "")
            results[structure_name] = mask

    return results
```
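
The wrapper above derives each structure name from its mask file name. Two details are easy to get wrong here: `str.replace` requires both the old and the new substring, and `Path.stem` removes only the last suffix (`.gz`). The hypothetical helper below, which is not part of PyDicer, sketches one way to strip the double extension using only the standard library.

```python
from pathlib import Path


def structure_name_from_mask(mask_file):
    """Return the file name with a trailing '.nii.gz' removed."""
    name = Path(mask_file).name
    suffix = ".nii.gz"
    # Slice rather than replace so only a trailing occurrence is removed.
    return name[: -len(suffix)] if name.endswith(suffix) else name
```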

Parameters:

image_row (pd.Series) – The image row of the converted DataFrame to use for segmentation.
segment_id (str) – The ID to be given to track the results of this segmentation.
segmentation_function (Callable) – The function to call to run the segmentation. Expects a SimpleITK.Image as input and returns a dict object with structure names as keys and SimpleITK.Image masks as values.
dataset_name (str, optional) – The name of the dataset to add the segmented structure set to. Defaults to None.
force (bool, optional) – If True, the segmentation will be re-run. Defaults to False.

Raises:

TypeError – The segmentation function returned the wrong type (requires a dict).
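
The return-type contract can be checked before a function is handed to segment_image. The sketch below is an illustration only: `call_segmentation` and `threshold_stub` are hypothetical names, and the stub returns its input unchanged where a real algorithm would return SimpleITK.Image masks.

```python
def call_segmentation(segmentation_function, image):
    """Hypothetical helper: run the function and enforce the dict contract."""
    result = segmentation_function(image)
    if not isinstance(result, dict):
        raise TypeError(
            f"Segmentation function returned {type(result).__name__}, expected dict"
        )
    return result


def threshold_stub(image):
    # Stand-in for a real algorithm; a real function would return
    # SimpleITK.Image masks as the dict values.
    return {"BODY": image}
```

Running a candidate function through such a check on a single image first may surface a wrong return type before a long run over a whole dataset.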

set_verbosity(verbosity: int)#

Sets the verbosity of the tool’s output to stdout (console). When 0 (not set), the tool will display a progress bar. Other values indicate Python’s built-in logging levels:

- DEBUG: 10
- INFO: 20
- WARNING: 30
- ERROR: 40
- CRITICAL: 50

Example:

```python
pd = PyDicer(working_directory)
pd.set_verbosity(logging.INFO)
```

Parameters:

verbosity (int) – The Python log level
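
The integer values listed for set_verbosity are those of the constants in Python’s standard logging module, so either form can be passed. The sketch below shows the correspondence; `LEVELS` and `verbosity_from_name` are hypothetical names, useful for instance when the level comes from a configuration file as text.

```python
import logging

# The documented integers are the values of the standard logging constants.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def verbosity_from_name(name):
    """Hypothetical helper: translate a level name into its integer value."""
    return LEVELS[name.upper()]
```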

update_logging()#

Resets the configured loggers. Should be called after every change to the logging configuration.