nnUNet

class pydicer.dataset.nnunet.NNUNetDataset(working_directory: Union[str, Path], nnunet_id: int, nnunet_name: str, nnunet_description: str = '', dataset_name: str = 'data', image_modality: str = 'CT', mapping_id: str = 'default')
add_testing_cases(testing_cases: List[str])

Add testing cases only. This can be useful for analysing additional data after a model has been trained.

Parameters:

testing_cases (list) – A list of case IDs to add to the testing set.

check_dataset()

Check to see that this dataset has been prepared properly.

Expects exactly one image of the configured modality and one structure set.

check_duplicates_train_test()

Check the images in the train and test sets to determine if there are any inadvertent duplicates.

This can be useful because, when datasets are anonymised multiple times, the same patient's data might appear under different anonymised patient IDs. It is best to find this out before training the model so that these cases can be removed from the training or testing set.

Raises:

SystemError – Raised if split_dataset has not yet been run.
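The documentation above does not state how duplicates are detected. As a minimal sketch of the idea, assuming images can be compared by hashing their raw voxel bytes, the following groups case IDs whose content is identical. The function name `find_duplicate_cases` and the case IDs are hypothetical and are not part of pydicer.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_duplicate_cases(case_voxels: Dict[str, bytes]) -> List[List[str]]:
    """Group case IDs whose image content is byte-for-byte identical."""
    cases_by_digest = defaultdict(list)
    for case_id, voxels in case_voxels.items():
        digest = hashlib.sha256(voxels).hexdigest()
        cases_by_digest[digest].append(case_id)
    # Only groups holding more than one case ID are duplicates.
    return [sorted(ids) for ids in cases_by_digest.values() if len(ids) > 1]


# Hypothetical data: "anon_007" is "anon_001" anonymised a second time.
train = {"anon_001": b"\x00\x01\x02\x03", "anon_002": b"\x09\x08\x07\x06"}
test = {"anon_007": b"\x00\x01\x02\x03"}

duplicates = find_duplicate_cases({**train, **test})
print(duplicates)
```

An exact hash only catches identical voxel data; images that were resampled or re-encoded between anonymisation runs would need a more tolerant comparison.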

check_overlapping_structures()

Determine if any of the structures are overlapping. The nnUNet does not support overlapping structures. If any overlapping structures exist, voxels will be assigned to the smallest structure by default, or to the largest structure if assign_overlap_to_largest is True.
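To illustrate what an overlap check involves, here is a minimal sketch that compares every pair of binary masks and counts the voxels they share. It assumes masks are flat lists of 0/1 values of equal length; the function name `find_overlaps` and the structure names are hypothetical, and pydicer's own check may work differently.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def find_overlaps(masks: Dict[str, List[int]]) -> List[Tuple[str, str, int]]:
    """Return (name_a, name_b, shared_voxel_count) for each overlapping pair."""
    overlaps = []
    # Sort by structure name so that the output order is deterministic.
    for (name_a, mask_a), (name_b, mask_b) in combinations(sorted(masks.items()), 2):
        shared = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
        if shared:
            overlaps.append((name_a, name_b, shared))
    return overlaps


# Hypothetical structures on a 6-voxel image.
masks = {
    "Heart": [0, 1, 1, 1, 0, 0],
    "Lung_L": [0, 0, 0, 1, 1, 1],
    "Cord": [1, 0, 0, 0, 0, 0],
}

overlaps = find_overlaps(masks)
print(overlaps)
```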

check_structure_names() → DataFrame

Prepare a DataFrame to indicate which structures are available/missing for each patient in the dataset.

Returns:

DataFrame indicating structures available.

Return type:

pd.DataFrame

generate_training_scripts(script_directory: Union[str, Path] = '.', folds: Union[str, list] = 'all', models: Optional[Union[list, str]] = None, script_header: Optional[list] = None) → Path

Generate the bash scripts needed to train the nnUNet.

Parameters:
  • script_directory (Union[str, Path], optional) – Directory in which to place the generated script. Defaults to “.”.

  • folds (Union[str, list], optional) – The nnUNet folds to train. Defaults to “all”.

  • models (Union[str, list], optional) – The nnUNet models to train. If None, defaults to [“2d”, “3d_lowres”, “3d_fullres”].

  • script_header (list, optional) – An optional list of header lines that will be inserted at the beginning of the script. This is useful if you need to activate a Python environment containing nnUNet prior to training. Defaults to None.

Raises:

FileNotFoundError – Raised when script_directory does not exist.

Returns:

The path to the script file generated.

Return type:

Path
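The following sketch shows how a script of this kind might be assembled from the parameters described above: header lines first, then one command per model and fold. It is not pydicer's implementation. The function name `write_training_script`, the file name `train.sh`, the activation path, and the `command_template` placeholder (a plain `echo`, not a real nnUNet command) are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def write_training_script(
    script_directory: Union[str, Path] = ".",
    folds: Union[str, list] = "all",
    models: Optional[Union[list, str]] = None,
    script_header: Optional[list] = None,
    # Placeholder only: substitute the training command for your nnUNet version.
    command_template: str = "echo train model={model} fold={fold}",
    script_name: str = "train.sh",  # hypothetical file name
) -> Path:
    script_directory = Path(script_directory)
    if not script_directory.exists():
        raise FileNotFoundError(f"{script_directory} does not exist")

    # Normalise the arguments so that both are lists.
    if models is None:
        models = ["2d", "3d_lowres", "3d_fullres"]
    if isinstance(models, str):
        models = [models]
    if isinstance(folds, str):
        folds = [folds]

    lines = ["#!/bin/bash"]
    lines += list(script_header or [])  # e.g. environment activation
    for model in models:
        for fold in folds:
            lines.append(command_template.format(model=model, fold=fold))

    script_path = script_directory / script_name
    script_path.write_text("\n".join(lines) + "\n")
    return script_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_training_script(
        tmp,
        folds=[0, 1],
        models="3d_fullres",
        script_header=["source /path/to/venv/bin/activate"],  # hypothetical path
    )
    script_text = path.read_text()

print(script_text)
```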

prep_label_map_from_one_hot(image: Image, structure_set: StructureSet) → Image

Prepare a label map from a structure set. Since overlapping structures aren’t supported in a label map, voxels will be assigned to the larger structure if assign_overlap_to_largest is True or the smaller structure if assign_overlap_to_largest is False.

Parameters:
  • image (sitk.Image) – The image corresponding to the structure set.

  • structure_set (StructureSet) – The structure set from which to create the label map.

Returns:

The label map.

Return type:

sitk.Image
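The overlap rule described above can be sketched by painting structures into the label map in size order, so that the structure meant to win an overlap is painted last. This is an illustration on flat 0/1 lists rather than sitk.Image objects. The function name `label_map_from_masks`, the 1-based label numbering, and the handling of equally sized structures are assumptions, not pydicer's documented behaviour.

```python
from typing import Dict, List


def label_map_from_masks(
    masks: Dict[str, List[int]], assign_overlap_to_largest: bool = False
) -> List[int]:
    """Merge binary masks into one label map (0 = background).

    Labels are 1-based and follow the insertion order of `masks`.
    """
    labels = {name: index + 1 for index, name in enumerate(masks)}
    sizes = {name: sum(mask) for name, mask in masks.items()}
    # Later structures overwrite earlier ones, so the winner is painted last:
    # largest-first order lets the smallest win, and vice versa.
    paint_order = sorted(
        masks, key=lambda name: sizes[name], reverse=not assign_overlap_to_largest
    )

    n_voxels = len(next(iter(masks.values())))
    label_map = [0] * n_voxels
    for name in paint_order:
        for voxel, inside in enumerate(masks[name]):
            if inside:
                label_map[voxel] = labels[name]
    return label_map


# Hypothetical structures: Heart (3 voxels) and Lung_L (2 voxels) share voxel 3.
masks = {"Heart": [0, 1, 1, 1, 0, 0], "Lung_L": [0, 0, 0, 1, 1, 0]}

to_smallest = label_map_from_masks(masks)
to_largest = label_map_from_masks(masks, assign_overlap_to_largest=True)
print(to_smallest)
print(to_largest)
```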

prepare_dataset() → Path

Prepare the dataset on the file system, ready for nnUNet training.

Raises:
  • SystemError – Raised if split_dataset hasn’t yet been run.

  • SystemError – Raised if check_structure_names has detected missing structures for patients.

Returns:

The folder in which the nnUNet dataset has been prepared.

Return type:

Path

split_dataset(training_cases: Optional[List[str]] = None, testing_cases: Optional[List[str]] = None, patients: Optional[List[str]] = None, **kwargs)

Split the dataset into training and testing cases. The cases can be supplied explicitly; if they are not supplied, the split is done using sklearn’s train_test_split. Keyword arguments passed to this function are passed on to train_test_split.

Parameters:
  • training_cases (List[str], optional) – A list of training cases. If supplied, train_test_split is not used. Defaults to None.

  • testing_cases (List[str], optional) – A list of testing cases. Can only be supplied if training_cases is also supplied. Defaults to None.

  • patients (List[str], optional) – A subset of patients to use for train_test_split. If None, all patients will be used. Defaults to None.

Raises:
  • AttributeError – Raised when testing_cases is set but training_cases is not.

  • ValueError – Raised when a training or testing case is not present in the dataset.
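The argument rules and exceptions above can be sketched as follows. To keep the example dependency-free, it shuffles with the standard library's `random.Random` instead of sklearn's train_test_split, so it is a stand-in rather than the actual implementation. The function name `split_cases`, the `test_size` default, and the choice to hold out every non-training case when only training_cases is given are assumptions.

```python
import random
from typing import List, Optional, Tuple


def split_cases(
    all_cases: List[str],
    training_cases: Optional[List[str]] = None,
    testing_cases: Optional[List[str]] = None,
    test_size: float = 0.2,  # assumed default for this sketch
    random_state: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    if testing_cases is not None and training_cases is None:
        raise AttributeError("testing_cases requires training_cases to be supplied")

    if training_cases is None:
        # No explicit split given: shuffle and hold out a fraction for testing.
        shuffled = sorted(all_cases)
        random.Random(random_state).shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_size))
        return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])

    if testing_cases is None:
        # Assumption: anything not used for training is held out for testing.
        testing_cases = [c for c in all_cases if c not in training_cases]

    for case in list(training_cases) + list(testing_cases):
        if case not in all_cases:
            raise ValueError(f"{case} is not present in the dataset")
    return list(training_cases), list(testing_cases)


# Hypothetical case IDs.
cases = [f"p{i}" for i in range(10)]
explicit = split_cases(cases, training_cases=["p0", "p1"], testing_cases=["p2"])
random_train, random_test = split_cases(cases, random_state=42)
print(explicit)
print(len(random_train), len(random_test))
```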

train(script_directory: Union[str, Path] = '.', in_screen: bool = True)

Start the nnUNet training script. Note this function might be useful in certain circumstances, but training should mostly be managed and monitored from the terminal.

See nnUNet documentation for further information on the training process.

Parameters:
  • script_directory (Union[str, Path], optional) – Directory containing the generated training script. Defaults to “.”.

  • in_screen (bool, optional) – If True, script will be started using the screen utility. This runs training in the background and allows you to log out of the system. If False this script will run within the current session (not recommended). Defaults to True.

Raises:

FileNotFoundError – Raised if the training script hasn’t yet been generated with the generate_training_scripts function.
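As a rough sketch of what starting a script with the screen utility can look like, the following builds the command line without executing it. The function name `build_train_command`, the script name `train.sh`, and the session name are hypothetical, and whether pydicer starts screen in detached mode is an assumption. The `-d -m` flags start a detached screen session and `-S` names it.

```python
import tempfile
from pathlib import Path
from typing import List, Union


def build_train_command(
    script_directory: Union[str, Path] = ".",
    in_screen: bool = True,
    script_name: str = "train.sh",  # hypothetical file name
    session_name: str = "nnunet_training",  # hypothetical session name
) -> List[str]:
    script_path = Path(script_directory) / script_name
    if not script_path.exists():
        raise FileNotFoundError(f"{script_path} has not been generated yet")

    if in_screen:
        # Detached, named session: training continues after logging out.
        return ["screen", "-dmS", session_name, "bash", str(script_path)]
    # Runs in the current session and stops if that session ends.
    return ["bash", str(script_path)]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train.sh").write_text("#!/bin/bash\n")
    command = build_train_command(tmp)
    foreground_command = build_train_command(tmp, in_screen=False)

print(command)
```

The resulting list could be passed to `subprocess.run(command)`. A session started this way can later be reattached from the terminal with `screen -r nnunet_training`.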