patientImaging

class PDCP.patientImaging(codesconfig=None)

Bases: object

A class that contains all the required functions to prepare and process patient records from an Orthanc server.

Methods Summary

adapt_dataset_from_bytes(blob)

A function that changes a bytes array into a pydicom FileDataset

check_if_organs_in_rtstruct(list_of_organs)

A function that checks if a list of organs is not found in the rtstruct

collect_pids_orthanc([codesconfig, …])

Send requests to the orthanc server to collect patient ids, and save outputs to a csv file.

export_2_matlab(filename, patdict)

A function to save the dictionary generated into a matlab file.

export_2_pickle(filename, patdict)

A function to save the dictionary generated into a pickle file.

generate_imaging_dataframe_threading(…[, save])

This function targets the Orthanc server and retrieves the summaries of all the patient related instances in the server to a pandas dataframe.

generate_orthanc_files_summaries(…)

A function that collects a set of patients images summaries from the orthanc server.

generate_patients_data(firstversion)

A function that collects a set of patients required and verified files (CT slices, rt structs, etc.) from the orthanc server.

get(orthanc, adict)

An abstract function.

get_ct_nifti(adict, notes)

A function that creates CT nifti file.

get_ct_threading(orthanc, adict, notes)

A function that collects the patient’s CT instances through a set of threads.

get_instance_details(orthanc, …)

A function that retrieves an orthanc file simplified tags associated with an instance identifier.

get_masks_nifti(adict, notes)

A function that uses a convert RTSTRUCT function developed by RF & PC to convert rt struct ROIs to nifti masks.

get_pid(orthanc, orthanc_identifier)

Get the patient id from the orthanc server.

get_rtdoses(orthanc, adict, notes)

A function that retrieves the patients associated RTDOSES with the selected study.

get_rtplan(orthanc, adict, notes)

A function that retrieves the patients associated RTPLAN.

get_rtstruct(orthanc, adict, notes)

A function that collects the patient rt struct files.

imagefile_summary(patient_id, df)

A function that summarizes the records in a patient pandas dataframe

imagefiles_summaries(thedir)

A function that reloads all the patients summaries into a dataframe.

load(pid, notes, df, ctnifti_path, …)

An abstract function.

load_ct_nifti_2_numpyarray(adict, notes)

A function used to load the patient CT nifti file into a 3D numpy array.

load_doses_2_numpyarray(adict, notes)

A function used to load the patient dicom RTDOSES to a list of 3D numpy arrays.

load_nifti_mask_2_numpyarray(pid, …)

A function used to load a patient mask into a 3D numpy array

load_nifti_masks(adict, notes)

A function used to load the patient nifti masks to a list of 3D numpy arrays.

load_roi_names_from_rtstuct(rtstruct_path, notes)

A function that can be used to extract ROI names from the patient rtstruct file.

load_url(url)

A function that uses the requests module to target the orthanc server.

loadpatientnotes(pid)

A function that loads the patient notes based on the patient’s identifier.

prepare_patient_directory(adict[, remove_old])

A function that prepares the patient directories.

purge(adir, pattern)

A function that removes any file with a keyword in the variable pattern from a directory.

recommendation(patientnotes)

A function that recommends the patient inclusion in the study based on a list of patient notes.

remove_phantom_studies(df, patient_notes)

A function that identifies phantom studies and removes them.

remove_unused_rtstructs(df, notes)

A function that removes rtstructs with no target volumes related to any of the possible studies.

savepatientnotes(pid, thekey, patientnotes)

A function that adds a list of patient notes with a key to the patient notes JSON file.

search_for_code(notes, code)

A function that searches for a code in a list of notes associated with a patient

select_and_combine_dosegrids(doseGrids, notes)

A function used to select the dose grids.

verify_initial(PatId, notes)

A function that checks the dataframe that summarizes patient imaging files (produced by generate_imaging_dataframe_threading()).

verify_study(df, notes[, modality])

Within this function, the links between different modalities are identified to find connections.

Methods Documentation

adapt_dataset_from_bytes(blob)

A function that changes a bytes array into a pydicom FileDataset

Parameters

blob (bytes array) – a string that consists of bytes

Returns

dataset – A pydicom object

Return type

FileDataset

check_if_organs_in_rtstruct(list_of_organs)

A function that checks if a list of organs is not found in the rtstruct

Parameters

list_of_organs (list of organs) – the list of organs found in the rtstruct and to be checked.

Returns

boolean – a boolean value to indicate that the rtstruct contains usable target volumes

Return type

boolean

static collect_pids_orthanc(codesconfig=None, outputcsv='final_ids_remoteorthanc.csv', saveids_asstr=True)

Send requests to the orthanc server to collect patient ids, and save outputs to a csv file.

Parameters
  • codesconfig (dict) – configuration file with details about the targeted cohort.

  • outputcsv (str) – path to the location of the output file.

  • saveids_asstr (bool) – a boolean to save the retrieved ids as strings.

export_2_matlab(filename, patdict)

A function to save the dictionary generated into a matlab file.

export_2_pickle(filename, patdict)

A function to save the dictionary generated into a pickle file.
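
As an illustration of the pickle export, the following is a minimal sketch that assumes patdict is an ordinary picklable dictionary. The helper name export_to_pickle_sketch and the example keys are hypothetical and are not part of PDCP.

```python
import os
import pickle
import tempfile


def export_to_pickle_sketch(filename, patdict):
    """Write a patient dictionary to `filename` using pickle (illustrative only)."""
    with open(filename, "wb") as handle:
        pickle.dump(patdict, handle, protocol=pickle.HIGHEST_PROTOCOL)


# Hypothetical patient dictionary; the real keys depend on the child class.
example_patdict = {"pid": "0001", "notes": ["_SUCCESS"], "spacing": (1.0, 1.0, 2.5)}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "patient_0001.pkl")
    export_to_pickle_sketch(path, example_patdict)
    # Reload the file to confirm the dictionary survives the round trip.
    with open(path, "rb") as handle:
        reloaded = pickle.load(handle)
```

Pickle files should only be reloaded from trusted locations, since unpickling can execute arbitrary code.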

generate_imaging_dataframe_threading(orthanc, OrthancId, PatId, save=True)

This function targets the Orthanc server and retrieves the summaries of all the patient related instances in the server to a pandas dataframe.

This function uses threading to get the patient’s summaries.

While collecting the studies, any study that does not contain all the required modalities listed in the codes function is removed.

Parameters
  • orthanc (pyorthanc Orthanc variable) – connection to the Orthanc server (pyorthanc variable). Not used (the requests module is used instead). Kept as a parameter in case users want to switch back to pyorthanc.

  • OrthancId (str) – Patient orthanc identifier

  • PatId (str/int) – Patient identifier

  • save (boolean) – A variable to indicate if to save the patient files. default True

Returns

  • df (pandas dataframe) – A pandas dataframe that contains the summaries of the patients instances with simplified tags

  • comment (str) – A str with a comment that determines if the image summaries collection was successful (_SUCCESS) or not (_ERROR, _PATIENTFILESNOTFOUND)
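
The removal of incomplete studies described above can be sketched with plain Python records. This is only an illustration under assumptions: the required modalities used here (CT, RTSTRUCT, RTDOSE) and the record keys StudyInstanceUID and Modality are chosen for the example, and keep_complete_studies is a hypothetical helper; the actual list comes from the codes configuration.

```python
from collections import defaultdict

# Assumed list of required modalities (hypothetical values).
REQUIRED_MODALITIES = {"CT", "RTSTRUCT", "RTDOSE"}


def keep_complete_studies(records, required=REQUIRED_MODALITIES):
    """Return only the records whose study contains every required modality."""
    modalities_per_study = defaultdict(set)
    for record in records:
        modalities_per_study[record["StudyInstanceUID"]].add(record["Modality"])
    # A study is complete when the required set is a subset of what was found.
    complete = {
        uid for uid, found in modalities_per_study.items() if required <= found
    }
    return [r for r in records if r["StudyInstanceUID"] in complete]


records = [
    {"StudyInstanceUID": "study-A", "Modality": "CT"},
    {"StudyInstanceUID": "study-A", "Modality": "RTSTRUCT"},
    {"StudyInstanceUID": "study-A", "Modality": "RTDOSE"},
    {"StudyInstanceUID": "study-B", "Modality": "CT"},  # incomplete study
]
kept = keep_complete_studies(records)
```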

generate_orthanc_files_summaries(orthanc_ids, patients_ids)

A function that collects a set of patients images summaries from the orthanc server.

Parameters
  • orthanc_ids (list) – a list of orthanc ids

  • patients_ids (list) – a list of patient ids

Returns

  • pids (list) – the patient ids

  • comments (list) – a list of the patients comments (_SUCCESS, _ERROR, or _PATIENTIMAGESNOTFOUND)

generate_patients_data(firstversion)

A function that collects a set of patients required and verified files (CT slices, rt structs, etc.) from the orthanc server.

It is expected that for each child, another function is created.

get(orthanc, adict)

An abstract function.

get_ct_nifti(adict, notes)

A function that creates CT nifti file.

Parameters
  • adict (dictionary) – a dictionary with a key to the CT instances directory, i.e. adict['ct_directory']

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the CT nifti file generation task.

Return type

list

get_ct_threading(orthanc, adict, notes)

A function that collects the patient’s CT instances through a set of threads. With threading, the files might be returned out of order. In addition, the instances associated with the CT might not be stored in the correct order. For this reason, each set of CT instances is checked based on the SliceLocation and ImagePositionPatient tags.

Parameters
  • orthanc (pyorthanc Orthanc instance) – a pyorthanc Orthanc class instance, used to get the CT instance identifiers.

  • adict (dictionary) – a dict with the paths to patient files location (where to save CT instances).

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the CT collection task, if any.

Return type

list
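
The ordering problem described for get_ct_threading can be illustrated with a small standard-library sketch. It is not the PDCP implementation: fetch_instance_stub stands in for the real server request, the instance identifiers are made up, and only SliceLocation is used as the sort key.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_instance_stub(instance_id):
    """Stand-in for a server request; returns a fake slice description."""
    slice_number = int(instance_id.split("-")[1])
    return {"id": instance_id, "SliceLocation": 2.5 * slice_number}


def collect_slices(instance_ids, max_workers=4):
    """Fetch slices concurrently, then restore a deterministic order."""
    slices = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_instance_stub, iid) for iid in instance_ids]
        # as_completed yields results in completion order, not submission order.
        for future in as_completed(futures):
            slices.append(future.result())
    # Sort by position so the volume is assembled consistently.
    return sorted(slices, key=lambda s: s["SliceLocation"])


ordered = collect_slices(["ct-3", "ct-1", "ct-2", "ct-0"])
ordered_ids = [s["id"] for s in ordered]
```

Sorting after collection keeps the threads independent of each other, so no locking is needed while the instances are downloaded.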

get_instance_details(orthanc, instance_identifier)

A function that retrieves the simplified tags of an Orthanc file associated with an instance identifier. Uses the Python requests module to retrieve a bytes array.

Parameters
  • orthanc (pyorthanc Orthanc variable) – Connection to the Orthanc server. Not used and can be left empty (it was used originally in the first version). Kept for users who want to retrieve the tags without the requests module.

  • instance_identifier (str) – Instance orthanc identifier

Returns

  • dictionary (dictionary) – A dictionary with the simplified tag associated with the instance

  • status (int) – An integer HTTP status code that indicates whether the request to the server was successful, e.g. 200 or 401.

get_masks_nifti(adict, notes)

A function that uses a convert RTSTRUCT function developed by RF & PC to convert rt struct ROIs to nifti masks. It saves the created files to the patient directory.

Parameters
  • adict (dictionary) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the masks generation process.

Return type

list

static get_pid(orthanc, orthanc_identifier)

Get the patient id from the orthanc server.

Parameters
  • orthanc (pyorthanc instance) – a pyorthanc instance with connections to the orthanc server.

  • orthanc_identifier (str) – Orthanc identifier

Returns

  • patient_id (str/int) – the patient id

  • orthanc_identifier (str) – Orthanc identifier

get_rtdoses(orthanc, adict, notes)

A function that retrieves the patient’s RTDOSES associated with the selected study. It saves the collected files to the patient directory pid/RTDOSE/UID.dcm. It also exports the file to a nifti file. Note that all RTDOSE files are saved in the same folder, unlike the RTSTRUCTS, where each struct has a separate folder.

Parameters
  • adict (dictionary) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the RTDOSE retrieve task.

Return type

list

get_rtplan(orthanc, adict, notes)

A function that retrieves the patients associated RTPLAN.

Parameters
  • adict (dictionary) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the RTPLAN retrieve task.

Return type

list

get_rtstruct(orthanc, adict, notes)

A function that collects the patient rt struct files. RTSTRUCTS collected based on instances

Parameters
  • orthanc (pyorthanc Orthanc instance) – not used anymore, used originally to retrieve the rtstructs, before moving to requests.

  • adict (dictionary) – a dictionary with a key to the path to save the retrieved RTSTRUCTS, i.e. adict['rtstructs_directory']

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the rtstruct collection task.

Return type

list

imagefile_summary(patient_id, df)

A function that summarizes the records in a patient pandas dataframe

Parameters
  • patient_id (int/str) – patient identifier

  • df (pandas dataframe) – a dataframe with the patient’s image summaries saved as rows in the dataframe

Returns

di – a dictionary with the dataframe summary (i.e. number of CTs, number of studies)

Return type

dictionary

imagefiles_summaries(thedir)

A function that reloads all the patients summaries into a dataframe.

Parameters

thedir (str) – path to the directory with the patients files summaries.

Returns

thesummary – a dataframe with the patients files summaries (i.e number of CTs, number of studies, etc.)

Return type

pandas dataframe

load(pid, notes, df, ctnifti_path, masks_directory, rtdoses_directory)

An abstract function.

load_ct_nifti_2_numpyarray(adict, notes)

A function used to load the patient CT nifti file into a 3D numpy array.

Parameters
  • adict (dict) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

  • thect (3D numpy array) – The collected 3D array

  • spacing (tuple) – The CT image spacing. An empty array is returned if an error occurred.

  • notes (list) – a list of the patient’s notes with updates on loading the CT nifti file.

load_doses_2_numpyarray(adict, notes)

A function used to load the patient dicom RTDOSES to a list of 3D numpy arrays.

Parameters
  • adict (dict) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

  • doseGrids (list) – The list of 3D numpy arrays representing each dose grid

  • notes (list) – a list of the patient’s notes with updates on loading the dose grids.

load_nifti_mask_2_numpyarray(pid, nifti_mask_path, notes)

A function used to load a patient mask into a 3D numpy array

Parameters
  • pid (int) – patient id

  • nifti_mask_path (str) – path to the nifti mask

  • notes (list) – a list of patient notes

Returns

  • thect (3D numpy array) – The collected 3D array mask

  • sn (str) – name of the mask

  • notes (list) – a list of the patient’s notes with updates on reading the mask from the nifti file.

load_nifti_masks(adict, notes)

A function used to load the patient nifti masks to a list of 3D numpy arrays.

Parameters
  • adict (dict) – A dictionary that contains the patient details.

  • notes (list) – a list of patient notes

Returns

  • list_of_masks (list) – The list of 3D numpy arrays

  • name_of_masks (list) – The list of the OARs and TV

  • roi_masks (dict) – A dict with keys (mask0, mask1, mask2, …) representing each ROI in name_of_masks, i.e. the ROI name at position 0 in the list name_of_masks is mask0, and so on.

  • notes (list) – a list of the patient’s notes with updates on loading the nifti masks.

load_roi_names_from_rtstuct(rtstruct_path, notes)

A function that can be used to extract ROI names from the patient rtstruct file.

Parameters
  • rtstruct_path (str) – Path to the RTSTRUCT file in the patient directory.

  • notes (list) – a list of patient notes

Returns

notes – a list of the patient’s notes with updates about the RTSTRUCT ROIs.

Return type

list

load_url(url)

A function that uses the requests module to target the orthanc server.

Parameters

url (str) – a url to a file location (usually an instance file)

Returns

  • content (request content) – an attribute with the patient content (with instances it should be a bytes array).

  • status_code (int) – the HTTP response code (e.g. 200, 401)

loadpatientnotes(pid)

A function that loads the patient notes based on the patient’s identifier.

Parameters

pid (int/str) – Patient identifier

Returns

data – Patient dictionary with various types of notes, e.g. collection, retrieval, verification, etc.

Return type

dictionary

prepare_patient_directory(adict, remove_old=True)

A function that prepares the patient directories. i.e. CT directory, RTSTRUCTs directory, etc.

Parameters
  • adict (dictionary) – a patient dictionary to add the paths to

  • remove_old (boolean) – an attribute used to specify if the old directory should be removed

Returns

adict – patient dictionary with the updated paths to the directories where data collected from the Orthanc server will be saved.

Return type

dictionary
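
A rough sketch of the directory preparation, using only the standard library. The keys ct_directory and rtstructs_directory come from the parameter descriptions elsewhere on this page; the root argument, the pid key, the folder names, and the helper name prepare_directories_sketch are assumptions made for illustration.

```python
import os
import shutil
import tempfile


def prepare_directories_sketch(adict, root, remove_old=True):
    """Create per-patient folders and record their paths in `adict`."""
    patient_root = os.path.join(root, str(adict["pid"]))
    if remove_old and os.path.isdir(patient_root):
        shutil.rmtree(patient_root)  # discard data from an earlier run
    for key, folder in (("ct_directory", "CT"), ("rtstructs_directory", "RTSTRUCT")):
        path = os.path.join(patient_root, folder)
        os.makedirs(path, exist_ok=True)
        adict[key] = path
    return adict


with tempfile.TemporaryDirectory() as root:
    adict = prepare_directories_sketch({"pid": "0001"}, root)
    created = sorted(os.listdir(os.path.join(root, "0001")))
```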

purge(adir, pattern)

A function that removes any file with a keyword in the variable pattern from a directory. It is used with patients where file/directory should be removed.

Parameters
  • adir (str) – a directory

  • pattern (str) – a str value, where all filenames that contain this keyword will be removed.
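
One way such a purge could work is sketched below. It assumes a plain substring match on the file name (whether PDCP treats pattern as a substring or as a regular expression is not stated here), and purge_sketch and the file names are hypothetical.

```python
import os
import tempfile


def purge_sketch(adir, pattern):
    """Delete every regular file in `adir` whose name contains `pattern`."""
    removed = []
    for name in os.listdir(adir):
        path = os.path.join(adir, name)
        if pattern in name and os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return sorted(removed)


with tempfile.TemporaryDirectory() as adir:
    # Create a few empty example files to purge from.
    for name in ("ct_0001.nii.gz", "mask_heart.nii.gz", "mask_lung.nii.gz"):
        open(os.path.join(adir, name), "w").close()
    removed = purge_sketch(adir, "mask")
    remaining = os.listdir(adir)
```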

recommendation(patientnotes)

A function that recommends the patient inclusion in the study based on a list of patient notes.

Parameters

patientnotes (list) – list of patient’s notes

Returns

  • str – recommendation (SUCCESS, REVIEW, or EXECLUDE)

  • codes (list) – list of patient codes collected from the list of notes.

remove_phantom_studies(df, patient_notes)

A function that identifies phantom studies and removes them.

Parameters
  • df (pandas dataframe) – patient files summary dataframe

  • patient_notes (list) – list with the patients notes in the initial verification task.

Returns

  • df (pandas dataframe) – a dataframe with no phantom studies

  • patient_notes (list) – list with the patients notes in the initial verification task.

remove_unused_rtstructs(df, notes)

A function that removes rtstructs with no target volumes related to any of the possible studies, i.e. any rtstruct without a keyword such as ptv, ctv, heart, or lung is removed.

Parameters
  • df (pandas dataframe) – patient image files summary dataframe

  • notes (list) – patient verification notes list

Returns

  • df (pandas dataframe) – patient image files summary dataframe with unused rtstructs removed, if any

  • notes (list) – patient verification notes list with updated notes, if any
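
The keyword check described above might look like the following sketch. It assumes a case-insensitive substring match on ROI names and works on a plain dictionary rather than the pandas dataframe; the keyword tuple repeats only the examples given in the description, and has_usable_roi and the ROI names are hypothetical.

```python
# Keywords named in the description above; the full list used by PDCP may differ.
KEYWORDS = ("ptv", "ctv", "heart", "lung")


def has_usable_roi(roi_names, keywords=KEYWORDS):
    """True if any ROI name contains one of the keywords (case-insensitive)."""
    return any(key in name.lower() for name in roi_names for key in keywords)


# Hypothetical ROI names per RTSTRUCT identifier.
rtstructs = {
    "struct-1": ["PTV_60", "Heart", "Lung_L"],
    "struct-2": ["Couch", "BODY"],
}
# Keep only the structure sets that mention at least one keyword.
kept = {uid: rois for uid, rois in rtstructs.items() if has_usable_roi(rois)}
```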

savepatientnotes(pid, thekey, patientnotes)

A function that adds a list of patient notes with a key to the patient notes JSON file.

Parameters
  • pid (int/str) – Patient identifier

  • thekey (str) – The type of the notes to be saved

  • patientnotes (list) – A list of strs to save to the patient’s notes file
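
The keyed notes file can be pictured as a JSON object that maps each key to a list of strings. The sketch below assumes one file per patient named <pid>.json inside a notes directory, and assumes that saving under an existing key replaces the earlier list; both are illustration choices, as are the helper names.

```python
import json
import os
import tempfile


def save_notes_sketch(notes_dir, pid, thekey, patientnotes):
    """Store `patientnotes` under `thekey` in <notes_dir>/<pid>.json."""
    path = os.path.join(notes_dir, f"{pid}.json")
    data = {}
    if os.path.exists(path):
        with open(path) as handle:
            data = json.load(handle)
    data[thekey] = patientnotes  # replaces earlier notes stored under this key
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)


def load_notes_sketch(notes_dir, pid):
    """Return the notes dictionary for `pid`."""
    with open(os.path.join(notes_dir, f"{pid}.json")) as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as notes_dir:
    save_notes_sketch(notes_dir, "0001", "collection", ["_SUCCESS"])
    save_notes_sketch(notes_dir, "0001", "verification", ["CT found"])
    notes = load_notes_sketch(notes_dir, "0001")
```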

search_for_code(notes, code)

A function that searches for a code in a list of notes associated with a patient

Parameters
  • notes (list) – list of patient’s notes

  • code (str) – a code to search for in a list of strs

Returns

True if the code is in the list of strs, otherwise False.

Return type

boolean
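
A minimal sketch of such a search, assuming each note is a string and that a code counts as found when it appears anywhere inside a note. PDCP may instead require an exact match; search_for_code_sketch and the note texts are hypothetical.

```python
def search_for_code_sketch(notes, code):
    """Return True if `code` occurs inside any note string."""
    return any(code in note for note in notes)


notes = ["_SUCCESS: CT collected", "_ERROR: RTDOSE missing"]
found_error = search_for_code_sketch(notes, "_ERROR")
found_missing = search_for_code_sketch(notes, "_PATIENTFILESNOTFOUND")
```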

select_and_combine_dosegrids(doseGrids, notes)

A function used to select the dose grids. This function is used to handle the logic for combining dose grids.

Parameters
  • doseGrids (list) – The list of 3D numpy arrays representing each dose grid

  • notes (list) – a list of patient notes

Returns

  • doseGrid (3D numpy array) – The final 3D dose grid

  • notes (list) – a list of the patient’s notes with updates on loading the dose grids.

verify_initial(PatId, notes)

A function that checks the dataframe that summarizes patient imaging files (produced by generate_imaging_dataframe_threading()). It reports details about the modalities listed in required_modalities in the codes function.

Parameters
  • PatId (int/str) – an int/str that represents the patient identifier

  • notes (list) – a list that contains the notes associated with the verification, and is used to append new notes while verifying the patient files.

Returns

  • df (Pandas dataframe) – A pandas dataframe with the verified imaging records, i.e the required modalities

  • notes (list) – list of notes appended while verifying the patient’s imaging files.

verify_study(df, notes, modality='CT')

Within this function, the links between different modalities are identified to find connections.

In each of the child classes, the logic that connects the required modalities is implemented.