Utils

pydicer.utils.add_structure_name_mapping(mapping_dict: dict, mapping_id: str = 'default', working_directory: Optional[Path] = None, patient_id: Optional[str] = None, structure_set_row: Optional[Series] = None)

Specify a structure name mapping dictionary in which each key is a standardised structure name and each value is a list of the structure name variants to map to that standard name.

If a structure_set_row is provided, the mapping is stored only for that specific structure set. Otherwise, working_directory must be provided, and the mapping is stored at the project level by default, or at the patient level if patient_id is also provided.

Parameters:
  • mapping_dict (dict) – Dictionary object with the standardised structure name (str) as the key and a list of the various structure names to map as the value.

  • mapping_id (str, optional) – The ID to refer to this mapping as. Defaults to DEFAULT_MAPPING_ID.

  • working_directory (Path, optional) – The working directory for this project. Required if structure_set_row is None. Defaults to None.

  • patient_id (str, optional) – The ID of the patient to which this mapping belongs. Defaults to None.

  • structure_set_row (pd.Series, optional) – The row of the converted structure set to which this mapping belongs. Defaults to None.

Raises:
  • SystemError – Raised if neither working_directory nor structure_set_row is provided.

  • ValueError – All keys in mapping dictionary must be of type str.

  • ValueError – All values in mapping dictionary must be a list of str entries.
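The expected shape of mapping_dict, and the two ValueError conditions listed above, can be illustrated with a small stand-alone sketch. The helper validate_mapping and the structure names used here are hypothetical and are not part of pydicer; the sketch only mirrors the documented rules.

```python
def validate_mapping(mapping_dict):
    """Hypothetical helper (not part of pydicer): check a mapping dictionary."""
    for standard_name, variants in mapping_dict.items():
        # Every key must be the standardised structure name as a str.
        if not isinstance(standard_name, str):
            raise ValueError("All keys in mapping dictionary must be of type str")
        # Every value must be a list of str name variants.
        if not isinstance(variants, list) or not all(
            isinstance(variant, str) for variant in variants
        ):
            raise ValueError(
                "All values in mapping dictionary must be a list of str entries"
            )
    return True


# Illustrative mapping: standard name -> list of names seen in the data.
mapping = {
    "Brainstem": ["brainstem", "BRAIN_STEM", "Brain Stem"],
    "SpinalCord": ["cord", "Spinal_Cord", "SPINAL CORD"],
}
is_valid = validate_mapping(mapping)

try:
    validate_mapping({"Brainstem": "brainstem"})  # value is a str, not a list
    rejected = False
except ValueError:
    rejected = True
```

A dictionary of this shape is what would be passed as mapping_dict, together with either working_directory or structure_set_row.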

pydicer.utils.copy_doc(copy_func, remove_args=None)

Copies the doc string of the given function to another. This function is intended to be used as a decorator.

Arguments listed in remove_args are removed from the copied docstring.

This function was adapted from: https://stackoverflow.com/questions/68901049/copying-the-docstring-of-function-onto-another-function-by-name

def foo():
    '''This is a foo doc string'''
    ...

@copy_doc(foo)
def bar():
    ...
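
For readers unfamiliar with the pattern, the following is a minimal sketch of how a docstring-copying decorator can be written. The name copy_doc_sketch is hypothetical, this is not pydicer's implementation, and the remove_args handling is deliberately left out.

```python
def copy_doc_sketch(copy_func):
    """Return a decorator that copies the doc string of copy_func."""

    def decorator(func):
        # Overwrite the decorated function's doc string with the source one.
        func.__doc__ = copy_func.__doc__
        return func

    return decorator


def foo():
    """This is a foo doc string"""


@copy_doc_sketch(foo)
def bar():
    ...
```

After decoration, bar.__doc__ holds the same text as foo.__doc__.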

pydicer.utils.determine_dcm_datetime(ds: Dataset, require_time: bool = False) -> datetime

Get a date/time value from a DICOM dataset. Will attempt to pull from the SeriesDate/SeriesTime fields first. Will fall back to StudyDate/StudyTime or InstanceCreationDate/InstanceCreationTime if not available.

Parameters:
  • ds (pydicom.Dataset) – DICOM dataset

  • require_time (bool) – Flag to require the time component along with the date

Returns:

The date/time

Return type:

datetime
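
The fallback order described above can be sketched without pydicom by using a plain dictionary of DICOM keyword strings. The function pick_datetime is hypothetical. Skipping a date-only pair when require_time is set, and returning None when nothing matches, are assumptions made for illustration rather than documented behaviour.

```python
from datetime import datetime

# Date/time keyword pairs in the documented order of preference.
FIELD_PAIRS = [
    ("SeriesDate", "SeriesTime"),
    ("StudyDate", "StudyTime"),
    ("InstanceCreationDate", "InstanceCreationTime"),
]


def pick_datetime(tags, require_time=False):
    """Hypothetical helper: tags maps DICOM keywords to DA/TM strings."""
    for date_key, time_key in FIELD_PAIRS:
        date_str = tags.get(date_key)
        time_str = tags.get(time_key)
        if not date_str:
            continue
        if time_str:
            # Keep HHMMSS only (fractional seconds dropped, short values padded).
            hhmmss = time_str[:6].ljust(6, "0")
            return datetime.strptime(date_str + hhmmss, "%Y%m%d%H%M%S")
        if not require_time:
            return datetime.strptime(date_str, "%Y%m%d")
        # Assumption: with require_time set, a date-only pair is skipped.
    return None


study_only = pick_datetime({"StudyDate": "20200131", "StudyTime": "142530.500"})
date_only = pick_datetime({"SeriesDate": "20200131"})
```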

pydicer.utils.download_and_extract_zip_file(zip_url: str, output_directory: Union[str, Path])

Downloads a zip file from the URL specified and extracts the contents to the output directory.

Parameters:
  • zip_url (str) – The URL of the zip file.

  • output_directory (str|pathlib.Path) – The path in which to extract the contents.

pydicer.utils.fetch_converted_test_data(working_directory: Optional[Union[str, Path]] = None, dataset: str = 'HNSCC') -> Path

Fetch some public data which has already been converted using PyDicer. Useful for unit testing as well as examples.

Parameters:
  • working_directory (str|pathlib.Path, optional) – The working directory in which to place the test data. Defaults to None.

  • dataset (str, optional) – The name of the dataset to fetch, either HNSCC or LCTSC. Defaults to “HNSCC”.

Returns:

The path to the working directory.

Return type:

pathlib.Path

pydicer.utils.get_iterator(iterable, length: Optional[int] = None, unit: str = 'it', name: Optional[str] = None)

Get the appropriate iterator based on the level of verbosity configured.

Parameters:
  • iterable (iterable) – The list or iterable to iterate over.

  • length (int, optional) – The length of the iterator. If None, the len() function will be used to determine the length (only works for list/tuple). Defaults to None.

  • unit (str, optional) – The unit string to display in the progress bar. Defaults to “it”.

  • name (str, optional) – The name to display in the progress bar. Defaults to None.

Returns:

The appropriate iterable object.

Return type:

iterable

pydicer.utils.get_structures_linked_to_dose(working_directory: Path, dose_row: Series) -> DataFrame

Get the structure sets which are linked to a dose object.

Parameters:
  • working_directory (Path) – The PyDicer working folder.

  • dose_row (pd.Series) – The row from the converted data describing the dose object.

Returns:

The data frame containing the structure sets linked to dose_row.

Return type:

pd.DataFrame

pydicer.utils.hash_uid(uid: str, truncate: int = 6) -> str

Hash a UID and truncate it

Parameters:
  • uid (str) – The UID to hash

  • truncate (int, optional) – The number of leading characters to keep. Defaults to 6.

Returns:

The hashed and truncated UID

Return type:

str
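
As a rough illustration of hashing and truncating a UID, the sketch below uses SHA-256 from the standard library. The hash algorithm is an assumption, since this page does not state which one pydicer uses, and hash_uid_sketch and the example UID are hypothetical.

```python
import hashlib


def hash_uid_sketch(uid, truncate=6):
    """Hypothetical helper: hash a UID and keep the leading characters."""
    # Assumption: SHA-256 hex digest; the real function may hash differently.
    digest = hashlib.sha256(uid.encode("utf-8")).hexdigest()
    return digest[:truncate]


# Illustrative UID only, not taken from any dataset.
short_id = hash_uid_sketch("1.2.840.113619.2.55.3.604688119")
```

The same UID always gives the same short identifier, while a shorter truncate value raises the chance of two UIDs colliding.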

pydicer.utils.load_dvh(row: Series, struct_hash: Optional[Union[list, str]] = None) -> DataFrame

Loads an object’s Dose Volume Histogram (DVH)

Parameters:
  • row (pd.Series) – The row of the converted DataFrame for an RTDOSE

  • struct_hash (list|str, optional) – The struct_hash (or list of struct_hashes) to load DVHs for. When None, all DVHs for the RTDOSE will be loaded. Defaults to None.

Raises:

ValueError – Raised if the object described in the row is not an RTDOSE

Returns:

The DataFrame containing the DVH for the row

Return type:

pd.DataFrame

pydicer.utils.load_object_metadata(row: Series, keep_tags: Optional[Union[list, str]] = None, remove_tags: Optional[Union[list, str]] = None) -> Dataset

Loads the object’s metadata

Parameters:
  • row (pd.Series) – The row of the converted DataFrame for which to load the metadata

  • keep_tags (str|list, optional) – DICOM tag keywords to keep when loading data. If set, all other tags will be removed. Defaults to None.

  • remove_tags (str|list, optional) – DICOM tag keywords to remove when loading data. If set, all other tags will be kept. Defaults to None.

Returns:

The dataset object containing the original DICOM metadata

Return type:

pydicom.Dataset

pydicer.utils.map_structure_name(struct_name: str, struct_map_dict: dict) -> str

Function to map a structure’s name according to a mapping dictionary

Parameters:
  • struct_name (str) – The structure name to be mapped. If the name is remapped according to the mapping file, then the structure NIfTI file is renamed with the mapped name.

  • struct_map_dict (dict) – the mapping dictionary

Returns:

the mapped structure name

Return type:

str
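
A lookup of this kind can be sketched as follows, using the dictionary shape described for add_structure_name_mapping (standard name as key, list of variants as value). The function map_name_sketch is hypothetical. Exact, case-sensitive matching and returning unmapped names unchanged are assumptions, not confirmed behaviour of pydicer.

```python
def map_name_sketch(struct_name, struct_map_dict):
    """Hypothetical helper: return the standard name for struct_name."""
    for standard_name, variants in struct_map_dict.items():
        # Assumption: exact, case-sensitive comparison against each variant.
        if struct_name == standard_name or struct_name in variants:
            return standard_name
    # Assumption: names without a mapping are returned unchanged.
    return struct_name


struct_map = {
    "Brainstem": ["brainstem", "BRAIN_STEM"],
    "SpinalCord": ["cord", "Spinal_Cord"],
}
mapped = map_name_sketch("BRAIN_STEM", struct_map)
unmapped = map_name_sketch("GTV", struct_map)
```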

pydicer.utils.parse_patient_kwarg(patient: Union[list, str]) -> list

Helper function to prepare patient list from kwarg used in functions throughout pydicer.

Parameters:

patient (list|str) – The patient ID or list of patient IDs. If None, all patients in dataset_directory are returned.

Raises:
  • ValueError – Not all patient IDs in the list are of type str

  • ValueError – patient was not list, str or None.

Returns:

The list of patient IDs to process or None if patient is None

Return type:

list
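
The normalisation described above (a single ID, a list of IDs, or None) can be sketched in a few lines. The function parse_patient_sketch is hypothetical and only mirrors the documented Raises and Returns entries; it is not pydicer's implementation.

```python
def parse_patient_sketch(patient):
    """Hypothetical helper: normalise a patient argument to a list or None."""
    if patient is None:
        # Per the Returns entry: None is passed through unchanged.
        return None
    if isinstance(patient, str):
        # A single patient ID becomes a one-element list.
        return [patient]
    if isinstance(patient, list):
        if not all(isinstance(patient_id, str) for patient_id in patient):
            raise ValueError("All patient IDs must be of type str")
        return patient
    raise ValueError("patient must be a list, a str or None")


single = parse_patient_sketch("HNSCC-01-0019")
several = parse_patient_sketch(["HNSCC-01-0019", "HNSCC-01-0176"])

try:
    parse_patient_sketch(["HNSCC-01-0019", 42])  # mixed types in the list
    rejected = False
except ValueError:
    rejected = True
```

The patient IDs shown are placeholders chosen for the example.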

pydicer.utils.read_converted_data(working_directory: Path, dataset_name: str = 'data', patients: Optional[list] = None, join_working_directory: bool = True) -> DataFrame

Read the converted data frame from the supplied data directory.

Parameters:
  • working_directory (Path) – Working directory for project

  • dataset_name (str, optional) – The name of the dataset for which to extract converted data. Defaults to “data”.

  • patients (list, optional) – The list of patients for which to read converted data. If None is supplied then all data will be read. Defaults to None.

  • join_working_directory (bool, optional) – If True, the path in the data frame returned will be adjusted to the location of the working_directory. If False the path will be relative to the working_directory.

Returns:

The DataFrame with the converted data objects.

Return type:

pd.DataFrame

pydicer.utils.read_preprocessed_data(working_directory: Path) -> DataFrame

Reads the pydicer preprocessed data

Parameters:

working_directory (Path) – Working directory for project

Raises:

SystemError – Error raised when preprocessed data doesn’t yet exist

Returns:

The preprocessed data

Return type:

pd.DataFrame

pydicer.utils.read_simple_itk_image(row: Series) -> Image

Reads the SimpleITK Image object given a converted dataframe row.

Parameters:

row (pd.Series) – The row of the data frame for which to load the SimpleITK Image.

Returns:

The loaded image. Returns None if the image was not found.

Return type:

SimpleITK.Image