Preprocessing

class pydicer.preprocess.data.PreprocessData(working_directory)

Class for preprocessing the data into a dictionary that holds it in a structured hierarchy.

preprocess(input_directory: Union[Path, list], force: bool = True) → DataFrame

Preprocess information about the data located in an input working directory.

Parameters:

input_directory (Path|list) – The directory (or list of directories) containing the DICOM input data
force (bool, optional) – When True, all files will be preprocessed. Otherwise, only files that have not already been scanned will be preprocessed. Defaults to True.

Returns: res_dict (pd.DataFrame): contains a row for each DICOM file that was preprocessed, with the following columns:

patient_id: PatientID field from the DICOM header
study_uid: StudyInstanceUID field from the DICOM header
series_uid: SeriesInstanceUID field from the DICOM header
modality: Modality field from the DICOM header
sop_class_uid: SOPClassUID field from the DICOM header
sop_instance_uid: SOPInstanceUID field from the DICOM header
for_uid: FrameOfReferenceUID field from the DICOM header
file_path: The path to the file (as a pathlib.Path object)
slice_location: The real-world location of the slice (used for imaging modalities)
referenced_uid: For RTSTRUCT and RTDOSE files, the series UID referenced by this DICOM file; for RTPLAN files, the SOPInstanceUID of the referenced structure set.
referenced_for_uid: The ReferencedFrameOfReferenceUID referenced by this DICOM file
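
As an illustration of how the columns above might be used, the following minimal sketch groups files by series_uid and orders each series by slice_location. It avoids any dependency by using plain dicts as hypothetical stand-ins for rows of the returned DataFrame. The sample values, the helper name files_by_series, and the assumption that slice_location is None for non-image objects are all illustrative and not part of pydicer.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical stand-in rows; real values would come from the DataFrame
# returned by preprocess(). Only a few of the documented columns are shown.
rows = [
    {"patient_id": "PAT1", "series_uid": "1.2.3", "modality": "CT",
     "file_path": Path("ct_2.dcm"), "slice_location": 2.5},
    {"patient_id": "PAT1", "series_uid": "1.2.3", "modality": "CT",
     "file_path": Path("ct_1.dcm"), "slice_location": 0.0},
    # Assumption: non-image objects carry no slice location (None here).
    {"patient_id": "PAT1", "series_uid": "1.2.4", "modality": "RTSTRUCT",
     "file_path": Path("rs.dcm"), "slice_location": None},
]


def files_by_series(rows):
    """Group file paths by series_uid, ordering slices by slice_location."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["series_uid"]].append(row)

    result = {}
    for series_uid, series_rows in grouped.items():
        # Rows without a slice location sort first; the rest sort ascending.
        series_rows.sort(
            key=lambda r: (r["slice_location"] is not None,
                           r["slice_location"] or 0.0)
        )
        result[series_uid] = [r["file_path"] for r in series_rows]
    return result


series_files = files_by_series(rows)
```

With the actual DataFrame, a similar grouping could presumably be expressed with pandas operations such as sort_values and groupby on the same column names.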

scan_file(file: Union[str, Path]) → dict

Scan a DICOM file.

Parameters:

file (pathlib.Path|str) – The path to the file to scan.

Returns:

A dict containing the scanned information, or None if the file couldn’t be scanned.

Return type:

dict
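
Because scan_file may return None, callers looping over many files will usually want to skip those results. The sketch below shows one way to do that, and also mimics the documented force behaviour by optionally skipping files that were scanned before. It is not pydicer's implementation: scan_files, fake_scan and already_scanned are hypothetical names, and fake_scan merely stands in for scan_file so the example runs without any DICOM data.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Set


def scan_files(
    files: Iterable[Path],
    scan_fn: Callable[[Path], Optional[dict]],
    already_scanned: Set[Path],
    force: bool = True,
) -> List[dict]:
    """Collect scan results, skipping unreadable files.

    When force is False, files listed in already_scanned are skipped too.
    """
    results = []
    for file in files:
        if not force and file in already_scanned:
            continue  # scanned in an earlier run
        info = scan_fn(file)
        if info is None:
            continue  # the file couldn't be scanned
        results.append(info)
    return results


def fake_scan(file: Path) -> Optional[dict]:
    """Hypothetical stand-in for scan_file: only '.dcm' files are readable."""
    if file.suffix != ".dcm":
        return None
    return {"file_path": file}


files = [Path("a.dcm"), Path("notes.txt"), Path("b.dcm")]
seen = {Path("a.dcm")}

all_results = scan_files(files, fake_scan, already_scanned=seen, force=True)
new_results = scan_files(files, fake_scan, already_scanned=seen, force=False)
```

Here all_results holds entries for both .dcm files, while new_results omits a.dcm because it is treated as already scanned.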