Getting Started
This notebook provides a basic example to run the PyDicer pipeline using some test data.
[1]:
try:
    from pydicer import PyDicer
except ImportError:
    !pip install pydicer
    from pydicer import PyDicer
from pathlib import Path
from pydicer.input.test import TestInput
Setup working directory
First we’ll create a directory for our project. Change the directory
location to a folder on your system where you’d like PyDicer to work with this data.
[2]:
directory = Path("./data")
Create a PyDicer object
The PyDicer class provides all functionality to run the pipeline and work with the data stored and converted in your project directory.
[3]:
pydicer = PyDicer(directory)
Fetch some data
A TestInput class is provided in pydicer to download some sample data to work with. Several other input classes exist if you’d like to retrieve DICOM data for conversion from somewhere else; see the docs for information on how these work.
[4]:
dicom_directory = directory.joinpath("dicom")
test_input = TestInput(dicom_directory)
test_input.fetch_data()
# Add the input DICOM location to the pydicer object
pydicer.add_input(dicom_directory)
Run the pipeline
The run_pipeline function runs the entire PyDicer pipeline on the test DICOM data. This includes:
- Preprocessing the DICOM data (data which can’t be handled or is corrupt will be placed in Quarantine)
- Converting the data to Nifti format (see the output in the data directory)
- Visualising the data (png files will be placed alongside the converted Nifti files)
- Computing Radiomics features (results are stored in a csv alongside the converted structures)
- Computing Dose Volume Histograms (results are stored alongside converted dose data)
Note that the entire pipeline can be quite time-consuming to run. Depending on your project’s dataset, you will likely want to run only portions of the pipeline with finer control over each step. For this reason we only run the pipeline for one patient here as a demonstration.
[5]:
pydicer.run_pipeline(patient="HNSCC-01-0019")
100%|██████████| 1309/1309 [00:03<00:00, 403.32files/s, preprocess]
100%|██████████| 4/4 [00:50<00:00, 12.65s/objects, convert]
100%|██████████| 3/3 [00:15<00:00, 5.16s/objects, visualise]
100%|██████████| 1/1 [00:14<00:00, 14.99s/objects, Compute Radiomics]
100%|██████████| 1/1 [00:09<00:00, 9.76s/objects, Compute DVH]
Prepare a dataset
Datasets which are extracted in DICOM format can often be a bit messy and require some cleaning up after conversion. Exactly which data objects to extract for the clean dataset will differ by project. Here we use a fairly common approach, selected by the rt_latest_dose preparation function in the cell below: take the latest dose for each patient together with the structure set and image linked to it.
The resulting dataset is stored in a folder with your dataset name (clean for this example).
See the dataset preparation example for a more detailed description of how this works.
[6]:
pydicer.dataset.prepare(dataset_name="clean", preparation_function="rt_latest_dose")
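The idea of keeping "the latest object of one type, plus whatever it links to" can be illustrated without PyDicer. The following is a minimal sketch, not PyDicer's implementation: the record layout, the field names (`uid`, `modality`, `date`, `refers_to`) and the helper `latest_with_links` are all hypothetical, and the single `refers_to` link is a simplification of how DICOM objects reference each other.

```python
from datetime import date

# Hypothetical records: field names are illustrative, not PyDicer's schema.
records = [
    {"patient": "A", "uid": "ct1", "modality": "CT", "date": date(2020, 1, 1), "refers_to": None},
    {"patient": "A", "uid": "ss1", "modality": "RTSTRUCT", "date": date(2020, 1, 2), "refers_to": "ct1"},
    {"patient": "A", "uid": "ss2", "modality": "RTSTRUCT", "date": date(2020, 2, 1), "refers_to": "ct1"},
    {"patient": "A", "uid": "d1", "modality": "RTDOSE", "date": date(2020, 1, 5), "refers_to": "ss1"},
    {"patient": "A", "uid": "d2", "modality": "RTDOSE", "date": date(2020, 2, 5), "refers_to": "ss2"},
]


def latest_with_links(records, modality):
    """Pick the newest record of `modality` per patient, then follow its links."""
    by_uid = {r["uid"]: r for r in records}

    # Find the most recent record of the requested modality for each patient.
    latest = {}
    for r in records:
        if r["modality"] != modality:
            continue
        best = latest.get(r["patient"])
        if best is None or r["date"] > best["date"]:
            latest[r["patient"]] = r

    # Walk the chain of references (dose -> structure set -> image here).
    selected = []
    for r in latest.values():
        while r is not None:
            selected.append(r["uid"])
            r = by_uid.get(r["refers_to"])
    return selected


clean_uids = latest_with_links(records, "RTDOSE")
print(clean_uids)
```

With the toy records above, the older dose `d1` and structure set `ss1` are left out of the selection, which is the kind of clean-up a prepared dataset is meant to provide.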
Analyse the dataset
The pipeline computes first-order radiomics features by default, as well as dose volume histograms. Here we extract the results into a Pandas DataFrame for analysis.
Check out the Compute Radiomics and the Dose Metrics examples for further details on how to use these functions.
[7]:
# Display the DataFrame of radiomics computed
df_radiomics = pydicer.analyse.get_all_computed_radiomics_for_dataset(dataset_name="clean")
df_radiomics
[7]:
| Contour | Patient | ImageHashedUID | StructHashedUID | ResampledPixelSpacing | NormalisationScale | firstorder|10Percentile | firstorder|90Percentile | firstorder|Energy | firstorder|Entropy | ... | firstorder|Mean | firstorder|Median | firstorder|Minimum | firstorder|Range | firstorder|RobustMeanAbsoluteDeviation | firstorder|RootMeanSquared | firstorder|Skewness | firstorder|TotalEnergy | firstorder|Uniformity | firstorder|Variance |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
25 | +1 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -999.0 | 339.0 | 5.178130e+11 | 4.785593 | ... | -127.630179 | 18.0 | -1024.0 | 4000.0 | 276.691595 | 565.017962 | 0.216289 | 1.481475e+12 | 0.076182 | 302955.835075 |
1 | -.3 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -68.0 | 669.0 | 1.587794e+11 | 4.415742 | ... | 142.809678 | 41.0 | -1024.0 | 4000.0 | 77.747524 | 410.946993 | 1.886750 | 4.542715e+11 | 0.111299 | 148482.826970 |
16 | Brain | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 24.0 | 70.0 | 6.420357e+08 | 1.698910 | ... | 45.375629 | 40.0 | -849.0 | 2151.0 | 9.055957 | 57.813075 | 10.351606 | 1.836879e+09 | 0.427362 | 1283.403884 |
27 | Brainstem | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 14.0 | 46.0 | 1.860914e+07 | 1.275139 | ... | 30.442222 | 30.0 | -26.0 | 581.0 | 7.063362 | 33.431514 | 3.682653 | 5.324118e+07 | 0.491624 | 190.937232 |
2 | CTV63 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -65.0 | 308.0 | 1.346678e+10 | 4.124500 | ... | 58.805603 | 37.0 | -1014.0 | 3990.0 | 36.643247 | 287.047711 | 0.524907 | 3.852878e+10 | 0.125626 | 78938.289561 |
17 | CTV63_Sep | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -409.0 | 498.2 | 8.517524e+09 | 5.069875 | ... | 37.578143 | 32.0 | -1014.0 | 3990.0 | 89.611043 | 407.993479 | 0.163450 | 2.436883e+10 | 0.061470 | 165046.562407 |
0 | CTV70 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -34.0 | 209.0 | 5.101866e+09 | 3.585289 | ... | 69.823863 | 38.0 | -997.0 | 2572.0 | 24.191215 | 211.758612 | 1.987960 | 1.459655e+10 | 0.164658 | 39966.337967 |
18 | Cord | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 5.0 | 85.0 | 5.136484e+07 | 2.646141 | ... | 48.951795 | 49.0 | -296.0 | 1058.0 | 14.980855 | 71.085212 | 2.263382 | 1.469560e+08 | 0.233730 | 2656.829157 |
3 | Cord_EXPANDED | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 24.0 | 754.0 | 5.387667e+09 | 4.928948 | ... | 276.231702 | 123.0 | -296.0 | 3272.0 | 177.244691 | 417.346112 | 1.615239 | 1.541424e+10 | 0.061246 | 97873.824272 |
20 | External | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -78.0 | 582.0 | 1.663208e+11 | 4.426843 | ... | 121.129199 | 37.0 | -1024.0 | 4000.0 | 62.170197 | 389.791941 | 2.028668 | 4.758476e+11 | 0.104930 | 137265.474901 |
24 | GTV1 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -3.0 | 65.0 | 3.553947e+08 | 2.430806 | ... | 37.678033 | 37.0 | -997.0 | 2297.0 | 12.273112 | 83.549794 | -0.213768 | 1.016792e+09 | 0.290775 | 5560.934010 |
26 | LT_Parotid | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -76.0 | 33.0 | 6.793782e+06 | 2.747940 | ... | -19.772190 | -19.0 | -119.0 | 208.0 | 24.155809 | 45.366267 | 0.094309 | 1.943717e+07 | 0.165854 | 1667.158645 |
5 | Lt_Deep_Prtd | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -70.0 | 52.0 | 4.208941e+06 | 2.855015 | ... | -9.347047 | -11.0 | -119.0 | 221.0 | 26.206174 | 45.135812 | -0.002344 | 1.204188e+07 | 0.152867 | 1949.874234 |
11 | Lt_Sup_Prtd | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -80.0 | 11.0 | 2.932703e+06 | 2.486043 | ... | -31.238202 | -27.0 | -112.0 | 178.0 | 24.082085 | 46.869833 | -0.060561 | 8.390531e+06 | 0.193892 | 1220.955994 |
9 | Mandible | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 68.0 | 1294.0 | 1.492566e+10 | 5.931930 | ... | 637.881065 | 602.0 | -188.0 | 1851.0 | 328.164625 | 787.427714 | 0.236129 | 4.270266e+10 | 0.018566 | 213150.151566 |
13 | Optic_Nerve | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -52.0 | 121.0 | 1.702145e+07 | 3.379679 | ... | 27.759629 | 17.0 | -850.0 | 1437.0 | 25.985759 | 110.185458 | -0.087479 | 4.869876e+07 | 0.149776 | 11370.238228 |
28 | Oral_Avoid | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -64.0 | 1044.0 | 1.299995e+10 | 5.093911 | ... | 257.075139 | 56.0 | -1024.0 | 4000.0 | 177.615572 | 600.257436 | 2.282101 | 3.719315e+10 | 0.063132 | 294221.363063 |
29 | PTV57 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -83.0 | 74.0 | 2.141580e+09 | 3.655007 | ... | 9.569312 | 30.0 | -974.0 | 2389.0 | 36.424098 | 204.728314 | -1.066404 | 6.127108e+09 | 0.132085 | 41822.111014 |
19 | PTV63 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -270.0 | 429.0 | 8.911092e+09 | 4.852886 | ... | 36.173846 | 32.0 | -1014.0 | 3990.0 | 68.189480 | 384.016550 | 0.549360 | 2.549484e+10 | 0.072524 | 146160.163747 |
23 | PTV70 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -55.0 | 387.0 | 1.495341e+10 | 4.153192 | ... | 79.991129 | 39.0 | -1000.0 | 3976.0 | 43.904178 | 306.295730 | 1.086361 | 4.278205e+10 | 0.126097 | 87418.493331 |
6 | Post_Neck | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -85.0 | 216.5 | 3.456349e+09 | 3.744445 | ... | 68.016919 | 27.0 | -143.0 | 1555.0 | 46.872956 | 237.810678 | 3.155396 | 9.888695e+09 | 0.117753 | 51927.617163 |
4 | RT_Parotid | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -61.0 | 27.0 | 1.763875e+07 | 2.571701 | ... | -12.798874 | -16.0 | -124.0 | 1345.0 | 17.355544 | 64.339575 | 9.353950 | 5.046488e+07 | 0.214621 | 3975.769687 |
7 | Ring | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -999.0 | 339.0 | 5.178130e+11 | 4.785593 | ... | -127.630179 | 18.0 | -1024.0 | 4000.0 | 276.691595 | 565.017962 | 0.216289 | 1.481475e+12 | 0.076182 | 302955.835075 |
22 | Rt_Deep_Prtd | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -72.6 | 17.6 | 1.353914e+07 | 2.596270 | ... | -15.225243 | -20.0 | -124.0 | 1448.0 | 17.602131 | 93.611962 | 9.152671 | 3.873580e+07 | 0.216523 | 8531.391337 |
8 | Rt_Sup_Prtd | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -65.0 | 27.0 | 8.699281e+06 | 2.627386 | ... | -14.784448 | -17.0 | -115.0 | 1206.0 | 18.682879 | 56.091120 | 8.311626 | 2.488884e+07 | 0.200511 | 2927.633827 |
12 | chiasm | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 7.0 | 51.0 | 1.122952e+07 | 1.907659 | ... | 35.686332 | 28.0 | -88.0 | 1223.0 | 6.397863 | 80.989482 | 9.099615 | 3.212790e+07 | 0.383310 | 5285.781869 |
10 | cold | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | 41.0 | 287.0 | 2.882986e+07 | 3.265029 | ... | 119.388430 | 75.0 | 18.0 | 645.0 | 43.849018 | 162.707419 | 2.117025 | 8.248290e+07 | 0.163774 | 12220.107157 |
30 | cool_off | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -88.0 | 54.0 | 8.915266e+06 | 2.894287 | ... | -28.218110 | -41.0 | -115.0 | 203.0 | 34.733117 | 59.244808 | 0.523930 | 2.550678e+07 | 0.149391 | 2713.685499 |
15 | ctv_57 | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -78.0 | 65.0 | 9.135206e+08 | 3.351470 | ... | -7.309885 | 27.0 | -972.0 | 1988.0 | 34.376180 | 171.730129 | -3.093002 | 2.613603e+09 | 0.148683 | 29437.802764 |
21 | hotspot | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -187.3 | 1282.3 | 2.445281e+09 | 6.115974 | ... | 375.515514 | 196.0 | -964.0 | 2508.0 | 321.413652 | 660.328969 | 0.522511 | 6.996004e+09 | 0.020100 | 295022.446264 |
14 | larynx_avoidance | HNSCC-01-0019 | b281ea | 7cdcd9 | NaN | NaN | -975.0 | 43.0 | 4.655352e+08 | 4.605136 | ... | -369.268506 | -85.5 | -1000.0 | 1237.0 | 353.504851 | 559.714876 | -0.451007 | 1.331907e+09 | 0.063251 | 176921.512696 |
31 rows × 24 columns
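To make a few of these columns concrete, here is a minimal sketch of how some first-order features relate to the voxel intensities inside a contour. It assumes the usual textbook definitions and ignores details such as resampling or intensity shifting, so it is an illustration rather than the code PyDicer runs; the function name `first_order_features` and the toy intensities are made up for this example.

```python
import math
import statistics


def first_order_features(values):
    """Compute a handful of first-order statistics from voxel intensities."""
    n = len(values)
    energy = sum(v * v for v in values)  # sum of squared intensities
    return {
        "Mean": statistics.fmean(values),
        "Median": statistics.median(values),
        "Minimum": min(values),
        "Range": max(values) - min(values),
        "Energy": energy,
        "RootMeanSquared": math.sqrt(energy / n),
        "Variance": statistics.pvariance(values),  # population variance
    }


features = first_order_features([10, 20, 30, 40])
for name, value in features.items():
    print(f"{name}: {value}")
```

Under these definitions RootMeanSquared² equals Mean² plus Variance. The Brainstem row above is consistent with that: 33.43² ≈ 30.44² + 190.94.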
[8]:
# Extract the D95, D50 and V3 dose metrics
df_dose_metrics = pydicer.analyse.compute_dose_metrics(dataset_name="clean", d_point=[95, 50], v_point=[3])
df_dose_metrics
[8]:
| patient | struct_hash | dose_hash | label | cc | mean | D95 | D50 | V3 |
---|---|---|---|---|---|---|---|---|---|
0 | HNSCC-01-0019 | 7cdcd9 | 309e1a | +1 | 4640.553474 | 29.679710 | 0.030835 | 25.358893 | 3709.925652 |
1 | HNSCC-01-0019 | 7cdcd9 | 309e1a | -.3 | 2689.948082 | 43.784290 | 11.384142 | 43.358867 | 2688.291550 |
2 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Brain | 549.576759 | 24.043829 | 7.657684 | 22.339093 | 549.553871 |
3 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Brainstem | 47.636032 | 39.998913 | 28.935185 | 39.202222 | 47.636032 |
4 | HNSCC-01-0019 | 7cdcd9 | 309e1a | CTV63 | 467.602730 | 71.274086 | 66.177108 | 71.589748 | 467.602730 |
5 | HNSCC-01-0019 | 7cdcd9 | 309e1a | CTV63_Sep | 146.395683 | 68.858795 | 64.433568 | 69.122560 | 146.395683 |
6 | HNSCC-01-0019 | 7cdcd9 | 309e1a | CTV70 | 325.512886 | 72.304520 | 69.484503 | 72.385685 | 325.512886 |
7 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Cord | 29.082298 | 24.179092 | 3.371394 | 32.130000 | 28.092384 |
8 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Cord_EXPANDED | 88.497162 | 23.978754 | 3.222574 | 31.201053 | 84.640503 |
9 | HNSCC-01-0019 | 7cdcd9 | 309e1a | External | 3131.858826 | 41.149075 | 8.677738 | 39.766531 | 3123.699188 |
10 | HNSCC-01-0019 | 7cdcd9 | 309e1a | GTV1 | 145.660400 | 72.551830 | 69.922256 | 72.579562 | 145.660400 |
11 | HNSCC-01-0019 | 7cdcd9 | 309e1a | LT_Parotid | 9.444237 | 37.419106 | 20.986429 | 34.608333 | 9.444237 |
12 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Lt_Deep_Prtd | 5.910873 | 45.722030 | 28.243333 | 44.633333 | 5.910873 |
13 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Lt_Sup_Prtd | 3.819466 | 26.882912 | 19.387500 | 26.795833 | 3.819466 |
14 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Mandible | 68.870544 | 46.695267 | 21.463077 | 42.843662 | 68.870544 |
15 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Optic_Nerve | 4.011154 | 9.785028 | 4.719756 | 8.280000 | 4.011154 |
16 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Oral_Avoid | 103.225708 | 29.481693 | 18.202899 | 30.124706 | 103.222847 |
17 | HNSCC-01-0019 | 7cdcd9 | 309e1a | PTV57 | 146.183968 | 57.981964 | 53.899716 | 58.101947 | 146.183968 |
18 | HNSCC-01-0019 | 7cdcd9 | 309e1a | PTV63 | 172.883034 | 65.743240 | 60.450535 | 65.955415 | 172.883034 |
19 | HNSCC-01-0019 | 7cdcd9 | 309e1a | PTV70 | 456.015587 | 71.606650 | 67.455984 | 71.783420 | 456.015587 |
20 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Post_Neck | 174.854279 | 38.756176 | 30.159714 | 38.474556 | 174.854279 |
21 | HNSCC-01-0019 | 7cdcd9 | 309e1a | RT_Parotid | 12.190819 | 25.072630 | 15.400455 | 22.722973 | 12.190819 |
22 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Ring | 4640.553474 | 29.679710 | 0.030835 | 25.358893 | 3709.925652 |
23 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Rt_Deep_Prtd | 4.420280 | 32.257150 | 21.456250 | 30.916667 | 4.420280 |
24 | HNSCC-01-0019 | 7cdcd9 | 309e1a | Rt_Sup_Prtd | 7.910728 | 20.038557 | 14.269444 | 20.693590 | 7.910728 |
25 | HNSCC-01-0019 | 7cdcd9 | 309e1a | chiasm | 4.898071 | 7.427364 | 5.240769 | 6.989552 | 4.898071 |
26 | HNSCC-01-0019 | 7cdcd9 | 309e1a | cold | 3.115654 | 71.219050 | 69.963571 | 71.269500 | 3.115654 |
27 | HNSCC-01-0019 | 7cdcd9 | 309e1a | cool_off | 7.266998 | 62.732640 | 56.155556 | 63.781250 | 7.266998 |
28 | HNSCC-01-0019 | 7cdcd9 | 309e1a | ctv_57 | 88.623047 | 59.087975 | 56.294831 | 58.537787 | 88.623047 |
29 | HNSCC-01-0019 | 7cdcd9 | 309e1a | hotspot | 16.044617 | 57.649550 | 48.448571 | 59.036364 | 16.036034 |
30 | HNSCC-01-0019 | 7cdcd9 | 309e1a | larynx_avoidance | 4.251480 | 51.509174 | 43.743333 | 52.757143 | 4.251480 |
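As a reminder of what these metrics mean: Dx is conventionally the minimum dose received by the hottest x% of a structure's volume, and Vx is the volume receiving at least a dose of x. The sketch below computes both directly from a list of per-voxel doses. It is a simplified illustration under those conventional definitions, not PyDicer's implementation (which works from the computed dose volume histogram); the helper names `d_metric` and `v_metric`, the toy doses and the voxel volume are assumptions made for the example.

```python
import math


def d_metric(doses, percent):
    """Minimum dose received by the hottest `percent` % of voxels."""
    ordered = sorted(doses, reverse=True)
    # Number of voxels needed to cover the requested share of the volume.
    n = max(1, math.ceil(len(ordered) * percent / 100))
    return ordered[n - 1]


def v_metric(doses, threshold, voxel_volume_cc):
    """Volume (in cc) of voxels receiving at least `threshold` dose."""
    return sum(1 for d in doses if d >= threshold) * voxel_volume_cc


# Toy structure: ten voxels of 0.5 cc each, doses in Gy.
doses = [1.0, 2.0, 3.5, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

d95 = d_metric(doses, 95)
d50 = d_metric(doses, 50)
v3 = v_metric(doses, 3.0, voxel_volume_cc=0.5)
print(d95, d50, v3)
```

This also explains a pattern in the table above: for structures lying entirely in a region receiving at least 3 Gy, such as Brainstem, V3 equals the total volume in the cc column.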