Getting Started

This notebook provides a basic example of running the PyDicer pipeline on some test data.

[1]:
try:
    from pydicer import PyDicer
except ImportError:
    !pip install pydicer
    from pydicer import PyDicer

from pathlib import Path

from pydicer.input.test import TestInput

Setup working directory

First we’ll create a directory for our project. Change the directory location to a folder on your system where you’d like PyDicer to work with this data.

[2]:
directory = Path("./data")
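
If the chosen folder may not exist yet, it can be created up front. The sketch below shows one way to do this with the standard library. The helper name make_working_directory is hypothetical and is not part of PyDicer.

```python
from pathlib import Path


def make_working_directory(location):
    """Return an absolute Path for `location`, creating the folder if needed.

    Hypothetical helper for illustration only; it is not part of PyDicer.
    """
    directory = Path(location).expanduser().resolve()
    # parents=True creates missing parent folders; exist_ok=True tolerates reruns
    directory.mkdir(parents=True, exist_ok=True)
    return directory


directory = make_working_directory("./data")
print(directory)
```

Resolving to an absolute path makes it easier to see exactly where the converted data will end up, whichever folder the notebook was started from.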

Create a PyDicer object

The PyDicer class provides all the functionality needed to run the pipeline and work with the data stored and converted in your project directory.

[3]:
pydicer = PyDicer(directory)

Fetch some data

A TestInput class is provided in pydicer to download some sample data to work with. Several other input classes exist if you’d like to retrieve DICOM data for conversion from somewhere else. See the docs for information on how these work.

[4]:
dicom_directory = directory.joinpath("dicom")
test_input = TestInput(dicom_directory)
test_input.fetch_data()

# Add the input DICOM location to the pydicer object
pydicer.add_input(dicom_directory)

Run the pipeline

The run_pipeline function runs the entire PyDicer pipeline on the test DICOM data. This includes:

- Preprocessing the DICOM data (data which can’t be handled or is corrupt will be placed in quarantine)
- Converting the data to NIfTI format (see the output in the data directory)
- Visualising the data (PNG files will be placed alongside the converted NIfTI files)
- Computing radiomics features (results are stored in a CSV alongside the converted structures)
- Computing dose volume histograms (results are stored alongside the converted dose data)

Note that the entire pipeline can be quite time-consuming to run. Depending on your project’s dataset, you will likely want to run only portions of the pipeline, with finer control over each step. For this reason, we run the pipeline for only one patient here as a demonstration.

[5]:
pydicer.run_pipeline(patient="HNSCC-01-0019")
100%|██████████| 1309/1309 [00:03<00:00, 403.32files/s, preprocess]
100%|██████████| 4/4 [00:50<00:00, 12.65s/objects, convert]
100%|██████████| 3/3 [00:15<00:00,  5.16s/objects, visualise]
100%|██████████| 1/1 [00:14<00:00, 14.99s/objects, Compute Radiomics]
100%|██████████| 1/1 [00:09<00:00,  9.76s/objects, Compute DVH]

Prepare a dataset

Datasets extracted in DICOM format can often be messy and require some cleaning up after conversion. Exactly which data objects to extract for the clean dataset will differ by project, but here we use a fairly common approach: the rt_latest_dose preparation function, which selects the latest dose for each patient together with the structure set and image linked to it.

The resulting dataset is stored in a folder with your dataset name (clean for this example).

See the dataset preparation example for a more detailed description of how this works.

[6]:
pydicer.dataset.prepare(dataset_name="clean", preparation_function="rt_latest_dose")
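
The idea of keeping the latest object per patient, plus whatever it links to, can be sketched in plain Python. This is only an illustration of the selection logic, not PyDicer's implementation. The record fields (patient, uid, modality, date, linked_uid), the sample values and the function name select_latest_with_linked are all invented for the example.

```python
from datetime import date

# Hypothetical records: the field names and values are invented for illustration
records = [
    {"patient": "A", "uid": "img1", "modality": "CT", "date": date(2020, 1, 2), "linked_uid": None},
    {"patient": "A", "uid": "img2", "modality": "CT", "date": date(2020, 3, 1), "linked_uid": None},
    {"patient": "A", "uid": "s1", "modality": "RTSTRUCT", "date": date(2020, 1, 5), "linked_uid": "img1"},
    {"patient": "A", "uid": "s2", "modality": "RTSTRUCT", "date": date(2020, 3, 4), "linked_uid": "img2"},
    {"patient": "B", "uid": "img3", "modality": "CT", "date": date(2021, 6, 1), "linked_uid": None},
    {"patient": "B", "uid": "s3", "modality": "RTSTRUCT", "date": date(2021, 6, 3), "linked_uid": "img3"},
]


def select_latest_with_linked(records, modality):
    """Keep the newest `modality` object per patient and the object it links to."""
    by_uid = {record["uid"]: record for record in records}

    # Track the most recent object of the requested modality for each patient
    latest = {}
    for record in records:
        if record["modality"] != modality:
            continue
        current = latest.get(record["patient"])
        if current is None or record["date"] > current["date"]:
            latest[record["patient"]] = record

    # Add each selected object followed by the object it references, if known
    selected = []
    for record in latest.values():
        selected.append(record)
        linked = by_uid.get(record["linked_uid"])
        if linked is not None:
            selected.append(linked)
    return selected


clean = select_latest_with_linked(records, "RTSTRUCT")
print([record["uid"] for record in clean])
```

For patient A the older structure set s1 and its image img1 are dropped, which is the kind of clean-up a prepared dataset is meant to provide.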

Analyse the dataset

The pipeline computes first-order radiomics features by default, as well as dose volume histograms. Here we extract the results into a pandas DataFrame for analysis.

Check out the Compute Radiomics and the Dose Metrics examples for further details on how to use these functions.

[7]:
# Display the DataFrame of radiomics computed
df_radiomics = pydicer.analyse.get_all_computed_radiomics_for_dataset(dataset_name="clean")
df_radiomics
[7]:
Contour Patient ImageHashedUID StructHashedUID ResampledPixelSpacing NormalisationScale firstorder|10Percentile firstorder|90Percentile firstorder|Energy firstorder|Entropy ... firstorder|Mean firstorder|Median firstorder|Minimum firstorder|Range firstorder|RobustMeanAbsoluteDeviation firstorder|RootMeanSquared firstorder|Skewness firstorder|TotalEnergy firstorder|Uniformity firstorder|Variance
25 +1 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -999.0 339.0 5.178130e+11 4.785593 ... -127.630179 18.0 -1024.0 4000.0 276.691595 565.017962 0.216289 1.481475e+12 0.076182 302955.835075
1 -.3 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -68.0 669.0 1.587794e+11 4.415742 ... 142.809678 41.0 -1024.0 4000.0 77.747524 410.946993 1.886750 4.542715e+11 0.111299 148482.826970
16 Brain HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 24.0 70.0 6.420357e+08 1.698910 ... 45.375629 40.0 -849.0 2151.0 9.055957 57.813075 10.351606 1.836879e+09 0.427362 1283.403884
27 Brainstem HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 14.0 46.0 1.860914e+07 1.275139 ... 30.442222 30.0 -26.0 581.0 7.063362 33.431514 3.682653 5.324118e+07 0.491624 190.937232
2 CTV63 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -65.0 308.0 1.346678e+10 4.124500 ... 58.805603 37.0 -1014.0 3990.0 36.643247 287.047711 0.524907 3.852878e+10 0.125626 78938.289561
17 CTV63_Sep HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -409.0 498.2 8.517524e+09 5.069875 ... 37.578143 32.0 -1014.0 3990.0 89.611043 407.993479 0.163450 2.436883e+10 0.061470 165046.562407
0 CTV70 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -34.0 209.0 5.101866e+09 3.585289 ... 69.823863 38.0 -997.0 2572.0 24.191215 211.758612 1.987960 1.459655e+10 0.164658 39966.337967
18 Cord HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 5.0 85.0 5.136484e+07 2.646141 ... 48.951795 49.0 -296.0 1058.0 14.980855 71.085212 2.263382 1.469560e+08 0.233730 2656.829157
3 Cord_EXPANDED HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 24.0 754.0 5.387667e+09 4.928948 ... 276.231702 123.0 -296.0 3272.0 177.244691 417.346112 1.615239 1.541424e+10 0.061246 97873.824272
20 External HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -78.0 582.0 1.663208e+11 4.426843 ... 121.129199 37.0 -1024.0 4000.0 62.170197 389.791941 2.028668 4.758476e+11 0.104930 137265.474901
24 GTV1 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -3.0 65.0 3.553947e+08 2.430806 ... 37.678033 37.0 -997.0 2297.0 12.273112 83.549794 -0.213768 1.016792e+09 0.290775 5560.934010
26 LT_Parotid HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -76.0 33.0 6.793782e+06 2.747940 ... -19.772190 -19.0 -119.0 208.0 24.155809 45.366267 0.094309 1.943717e+07 0.165854 1667.158645
5 Lt_Deep_Prtd HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -70.0 52.0 4.208941e+06 2.855015 ... -9.347047 -11.0 -119.0 221.0 26.206174 45.135812 -0.002344 1.204188e+07 0.152867 1949.874234
11 Lt_Sup_Prtd HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -80.0 11.0 2.932703e+06 2.486043 ... -31.238202 -27.0 -112.0 178.0 24.082085 46.869833 -0.060561 8.390531e+06 0.193892 1220.955994
9 Mandible HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 68.0 1294.0 1.492566e+10 5.931930 ... 637.881065 602.0 -188.0 1851.0 328.164625 787.427714 0.236129 4.270266e+10 0.018566 213150.151566
13 Optic_Nerve HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -52.0 121.0 1.702145e+07 3.379679 ... 27.759629 17.0 -850.0 1437.0 25.985759 110.185458 -0.087479 4.869876e+07 0.149776 11370.238228
28 Oral_Avoid HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -64.0 1044.0 1.299995e+10 5.093911 ... 257.075139 56.0 -1024.0 4000.0 177.615572 600.257436 2.282101 3.719315e+10 0.063132 294221.363063
29 PTV57 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -83.0 74.0 2.141580e+09 3.655007 ... 9.569312 30.0 -974.0 2389.0 36.424098 204.728314 -1.066404 6.127108e+09 0.132085 41822.111014
19 PTV63 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -270.0 429.0 8.911092e+09 4.852886 ... 36.173846 32.0 -1014.0 3990.0 68.189480 384.016550 0.549360 2.549484e+10 0.072524 146160.163747
23 PTV70 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -55.0 387.0 1.495341e+10 4.153192 ... 79.991129 39.0 -1000.0 3976.0 43.904178 306.295730 1.086361 4.278205e+10 0.126097 87418.493331
6 Post_Neck HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -85.0 216.5 3.456349e+09 3.744445 ... 68.016919 27.0 -143.0 1555.0 46.872956 237.810678 3.155396 9.888695e+09 0.117753 51927.617163
4 RT_Parotid HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -61.0 27.0 1.763875e+07 2.571701 ... -12.798874 -16.0 -124.0 1345.0 17.355544 64.339575 9.353950 5.046488e+07 0.214621 3975.769687
7 Ring HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -999.0 339.0 5.178130e+11 4.785593 ... -127.630179 18.0 -1024.0 4000.0 276.691595 565.017962 0.216289 1.481475e+12 0.076182 302955.835075
22 Rt_Deep_Prtd HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -72.6 17.6 1.353914e+07 2.596270 ... -15.225243 -20.0 -124.0 1448.0 17.602131 93.611962 9.152671 3.873580e+07 0.216523 8531.391337
8 Rt_Sup_Prtd HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -65.0 27.0 8.699281e+06 2.627386 ... -14.784448 -17.0 -115.0 1206.0 18.682879 56.091120 8.311626 2.488884e+07 0.200511 2927.633827
12 chiasm HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 7.0 51.0 1.122952e+07 1.907659 ... 35.686332 28.0 -88.0 1223.0 6.397863 80.989482 9.099615 3.212790e+07 0.383310 5285.781869
10 cold HNSCC-01-0019 b281ea 7cdcd9 NaN NaN 41.0 287.0 2.882986e+07 3.265029 ... 119.388430 75.0 18.0 645.0 43.849018 162.707419 2.117025 8.248290e+07 0.163774 12220.107157
30 cool_off HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -88.0 54.0 8.915266e+06 2.894287 ... -28.218110 -41.0 -115.0 203.0 34.733117 59.244808 0.523930 2.550678e+07 0.149391 2713.685499
15 ctv_57 HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -78.0 65.0 9.135206e+08 3.351470 ... -7.309885 27.0 -972.0 1988.0 34.376180 171.730129 -3.093002 2.613603e+09 0.148683 29437.802764
21 hotspot HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -187.3 1282.3 2.445281e+09 6.115974 ... 375.515514 196.0 -964.0 2508.0 321.413652 660.328969 0.522511 6.996004e+09 0.020100 295022.446264
14 larynx_avoidance HNSCC-01-0019 b281ea 7cdcd9 NaN NaN -975.0 43.0 4.655352e+08 4.605136 ... -369.268506 -85.5 -1000.0 1237.0 353.504851 559.714876 -0.451007 1.331907e+09 0.063251 176921.512696

31 rows × 24 columns
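
To help read the columns above, the sketch below computes a few first-order features from a short list of voxel intensities. It assumes the usual first-order definitions (Energy as the sum of squared intensities with no intensity shift, TotalEnergy as Energy scaled by the voxel volume, and a population Variance). It is not the code PyDicer runs, and the function name first_order_features and the sample intensities are hypothetical.

```python
import math


def first_order_features(intensities, voxel_volume_mm3):
    """Compute a handful of first-order features for one structure.

    Illustrative only: assumes the standard definitions with no intensity shift.
    """
    n = len(intensities)
    ordered = sorted(intensities)
    mean = sum(ordered) / n
    middle = n // 2
    # Median: middle value for odd n, average of the two middle values for even n
    median = ordered[middle] if n % 2 else (ordered[middle - 1] + ordered[middle]) / 2
    energy = sum(value ** 2 for value in ordered)
    return {
        "Mean": mean,
        "Median": median,
        "Minimum": ordered[0],
        "Range": ordered[-1] - ordered[0],
        "Energy": energy,
        "TotalEnergy": voxel_volume_mm3 * energy,
        "RootMeanSquared": math.sqrt(energy / n),
        "Variance": sum((value - mean) ** 2 for value in ordered) / n,
    }


# Hypothetical intensities for a four-voxel structure with 2 mm^3 voxels
features = first_order_features([10.0, 20.0, 30.0, 40.0], voxel_volume_mm3=2.0)
for name, value in features.items():
    print(f"{name}: {value}")
```

Under these definitions Variance equals RootMeanSquared squared minus Mean squared. The Brainstem row above is consistent with that: 33.431514² − 30.442222² ≈ 190.94, which matches its reported variance.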

[8]:
# Extract the D95, D50 and V3 dose metrics
df_dose_metrics = pydicer.analyse.compute_dose_metrics(dataset_name="clean", d_point=[95, 50], v_point=[3])
df_dose_metrics
[8]:
patient struct_hash dose_hash label cc mean D95 D50 V3
0 HNSCC-01-0019 7cdcd9 309e1a +1 4640.553474 29.679710 0.030835 25.358893 3709.925652
1 HNSCC-01-0019 7cdcd9 309e1a -.3 2689.948082 43.784290 11.384142 43.358867 2688.291550
2 HNSCC-01-0019 7cdcd9 309e1a Brain 549.576759 24.043829 7.657684 22.339093 549.553871
3 HNSCC-01-0019 7cdcd9 309e1a Brainstem 47.636032 39.998913 28.935185 39.202222 47.636032
4 HNSCC-01-0019 7cdcd9 309e1a CTV63 467.602730 71.274086 66.177108 71.589748 467.602730
5 HNSCC-01-0019 7cdcd9 309e1a CTV63_Sep 146.395683 68.858795 64.433568 69.122560 146.395683
6 HNSCC-01-0019 7cdcd9 309e1a CTV70 325.512886 72.304520 69.484503 72.385685 325.512886
7 HNSCC-01-0019 7cdcd9 309e1a Cord 29.082298 24.179092 3.371394 32.130000 28.092384
8 HNSCC-01-0019 7cdcd9 309e1a Cord_EXPANDED 88.497162 23.978754 3.222574 31.201053 84.640503
9 HNSCC-01-0019 7cdcd9 309e1a External 3131.858826 41.149075 8.677738 39.766531 3123.699188
10 HNSCC-01-0019 7cdcd9 309e1a GTV1 145.660400 72.551830 69.922256 72.579562 145.660400
11 HNSCC-01-0019 7cdcd9 309e1a LT_Parotid 9.444237 37.419106 20.986429 34.608333 9.444237
12 HNSCC-01-0019 7cdcd9 309e1a Lt_Deep_Prtd 5.910873 45.722030 28.243333 44.633333 5.910873
13 HNSCC-01-0019 7cdcd9 309e1a Lt_Sup_Prtd 3.819466 26.882912 19.387500 26.795833 3.819466
14 HNSCC-01-0019 7cdcd9 309e1a Mandible 68.870544 46.695267 21.463077 42.843662 68.870544
15 HNSCC-01-0019 7cdcd9 309e1a Optic_Nerve 4.011154 9.785028 4.719756 8.280000 4.011154
16 HNSCC-01-0019 7cdcd9 309e1a Oral_Avoid 103.225708 29.481693 18.202899 30.124706 103.222847
17 HNSCC-01-0019 7cdcd9 309e1a PTV57 146.183968 57.981964 53.899716 58.101947 146.183968
18 HNSCC-01-0019 7cdcd9 309e1a PTV63 172.883034 65.743240 60.450535 65.955415 172.883034
19 HNSCC-01-0019 7cdcd9 309e1a PTV70 456.015587 71.606650 67.455984 71.783420 456.015587
20 HNSCC-01-0019 7cdcd9 309e1a Post_Neck 174.854279 38.756176 30.159714 38.474556 174.854279
21 HNSCC-01-0019 7cdcd9 309e1a RT_Parotid 12.190819 25.072630 15.400455 22.722973 12.190819
22 HNSCC-01-0019 7cdcd9 309e1a Ring 4640.553474 29.679710 0.030835 25.358893 3709.925652
23 HNSCC-01-0019 7cdcd9 309e1a Rt_Deep_Prtd 4.420280 32.257150 21.456250 30.916667 4.420280
24 HNSCC-01-0019 7cdcd9 309e1a Rt_Sup_Prtd 7.910728 20.038557 14.269444 20.693590 7.910728
25 HNSCC-01-0019 7cdcd9 309e1a chiasm 4.898071 7.427364 5.240769 6.989552 4.898071
26 HNSCC-01-0019 7cdcd9 309e1a cold 3.115654 71.219050 69.963571 71.269500 3.115654
27 HNSCC-01-0019 7cdcd9 309e1a cool_off 7.266998 62.732640 56.155556 63.781250 7.266998
28 HNSCC-01-0019 7cdcd9 309e1a ctv_57 88.623047 59.087975 56.294831 58.537787 88.623047
29 HNSCC-01-0019 7cdcd9 309e1a hotspot 16.044617 57.649550 48.448571 59.036364 16.036034
30 HNSCC-01-0019 7cdcd9 309e1a larynx_avoidance 4.251480 51.509174 43.743333 52.757143 4.251480
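
As a reading aid for the D95, D50 and V3 columns, the sketch below computes the same kind of metrics by simple voxel counting. It assumes doses are in Gy and that V3 is an absolute volume in cc, which fits the table (for structures that lie entirely above 3 Gy, such as Brainstem, V3 equals the cc column). PyDicer derives its metrics from the stored dose volume histograms, so this is not its implementation. The function names d_metric and v_metric and the sample doses are hypothetical.

```python
import math


def d_metric(doses_gy, percent):
    """Minimum dose (Gy) received by the hottest `percent` % of the voxels."""
    ordered = sorted(doses_gy, reverse=True)
    # Number of voxels that make up `percent` % of the structure (at least one)
    count = max(1, math.ceil(len(ordered) * percent / 100))
    return ordered[count - 1]


def v_metric(doses_gy, threshold_gy, voxel_volume_cc):
    """Volume (cc) of the voxels receiving at least `threshold_gy`."""
    return sum(1 for dose in doses_gy if dose >= threshold_gy) * voxel_volume_cc


# Hypothetical per-voxel doses for a ten-voxel structure with 0.5 cc voxels
doses_gy = [0.5, 2.0, 4.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]

d95 = d_metric(doses_gy, 95)
d50 = d_metric(doses_gy, 50)
v3 = v_metric(doses_gy, 3, voxel_volume_cc=0.5)
print(d95, d50, v3)
```

D95 can never exceed D50, because covering more of the volume means including colder voxels. Every row of the table above shows the same ordering.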