Auto-segmentation Inference & Analysis#

Open In Colab

A common task when working the medical imaging data, it to run an auto-segmentation model (inference) on the images in your dataset. If you have manual definitions of the same structures available in your dataset, you will typically want to compare the auto and the manual segmentations, computing metrics and produce plots and visualisation.

This example notebook will guide you through the process of performing model inference and analysis of those structures. We will use a single atlas-based segmentation model for demonstration purposes.

Warning: The auto-segmentation results produced by the example in this notebook are poor. The atlas based segmentation function is optimised for runtime is purely provided to demonstrate how to run and analyse an auto-segmentation model.

[1]:
try:
    from pydicer import PyDicer
except ImportError:
    !pip install pydicer
    from pydicer import PyDicer

import os
import logging

from pathlib import Path

import SimpleITK as sitk

from platipy.imaging.registration.utils import apply_transform
from platipy.imaging.registration.linear import linear_registration
from platipy.imaging.registration.deformable import fast_symmetric_forces_demons_registration

from pydicer.utils import fetch_converted_test_data

from pydicer.generate.segmentation import segment_image, read_all_segmentation_logs

from pydicer.analyse.compare import (
    compute_contour_similarity_metrics,
    get_all_similarity_metrics_for_dataset,
    prepare_similarity_metric_analysis
)

Setup PyDicer#

For this example, we will use the LCTSC test data which has already been converted using PyDicer. We also initialise our PyDicer object.

[2]:
working_directory = fetch_converted_test_data("./lctsc_autoseg", dataset="LCTSC")
pydicer = PyDicer(working_directory)

Prepare Atlas#

Since we will use a single atlas-based segmentation model, we must split our data, selecting one case as our atlas and the remaining cases as the validation set. We use the PyDicer dataset preparation module to create these subsets of data.

[3]:
df = pydicer.read_converted_data()

# Specify the patient ID to use as the atlas case
atlas_dataset = "atlas"
atlas_case  = "LCTSC-Train-S1-001"
df_atlas = df[df.patient_id==atlas_case]

# And the remaining cases will make up our validation set
validation_dataset = "validation"
df_validation = df[df.patient_id!=atlas_case]

# Use the dataset preparation module to prepare these two data subsets
pydicer.dataset.prepare_from_dataframe(atlas_dataset, df_atlas)
pydicer.dataset.prepare_from_dataframe(validation_dataset, df_validation)

Define Segmentation Function#

Now that our atlas and validation sets are ready, we will define our function which will run our simple single atlas-based auto-segmentation model for us. This example uses the atlas-based segmentation tools available in PlatiPy.

In a real-world scenario, you will have your own segmentation model you wish to apply. You should integrate this model into such a function which accepts an image and returns a dictionary of structures.

To get started, we recommend you try running the TotalSegmentator model on your CT data. PyDicer already has a function ready which runs this model, check out run_total_segmentator.

[4]:
def single_atlas_segmentation(img):
    """Segment an image using a single atlas case

    Args:
        img (SimpleITK.Image): The SimpleITK image to segment.

    Returns:
        dict: The segmented structure dictionary
    """

    # Load the atlas case image
    atlas_img_row = df_atlas[df_atlas.modality=="CT"].iloc[0]
    atlas_img = sitk.ReadImage(str(Path(atlas_img_row.path).joinpath("CT.nii.gz")))

    # Load the atlas case structures
    atlas_structures = {}
    atlas_struct_row = df_atlas[df_atlas.modality=="RTSTRUCT"].iloc[0]
    for struct_path in Path(atlas_struct_row.path).glob("*.nii.gz"):
        struct_name = struct_path.name.replace(".nii.gz", "")
        atlas_structures[struct_name] = sitk.ReadImage(str(struct_path))

    # Use a simple linear (rigid) registration to align the input image with the atlas image
    img_ct_atlas_reg_linear, tfm_linear = linear_registration(
        fixed_image = img,
        moving_image = atlas_img,
        reg_method='similarity',
        metric='mean_squares',
        optimiser='gradient_descent',
        shrink_factors=[4, 2],
        smooth_sigmas=[2, 0],
        sampling_rate=1.0,
        number_of_iterations=50,
    )

    # Perform a fast deformable registration
    img_ct_atlas_reg_dir, tfm_dir, dvf = fast_symmetric_forces_demons_registration(
        img,
        img_ct_atlas_reg_linear,
        ncores=4,
        isotropic_resample=True,
        resolution_staging=[4],
        iteration_staging=[20],
    )

    # Combine the two transforms
    tfm_combined = sitk.CompositeTransform((tfm_linear, tfm_dir))

    # Apply the transform to the atlas structures
    auto_segmentations = {}
    for s in atlas_structures:
        auto_segmentations[s] = apply_transform(
            atlas_structures[s],
            reference_image=img,
            transform=tfm_combined
        )

    return auto_segmentations

Run Auto-segmentation#

The segment_dataset function will run over all images in our dataset and will pass the images to a function we define for segmentation. We pass in the name of our validation_dataset so that only the images in this dataset will be segmented.

[5]:
segment_id = "atlas" # Used to generate the ID of the resulting auto-segmented structure sets

pydicer.segment_dataset(segment_id, single_atlas_segmentation, dataset_name=validation_dataset)
Segmentation: 100%|██████████| 9/9 [01:00<00:00,  6.70s/it]

We can use PyDicer’s visualisation module to produce snapshots of the auto-segmentations produced.

[6]:
pydicer.visualise.visualise(force=False)
100%|██████████| 29/29 [00:08<00:00,  3.23objects/s, visualise]

Read Segmentation Logs#

After running the auto-segmentation on across the dataset, we can fetch the logs to confirm that everything went well using the read_all_segmentation_logs function. This will also let us inspect the runtime of the segmentation. In case something went wrong, we can use these logs to help debug the issue.

[7]:
# Read the segmentation log DataFrame
df_logs = read_all_segmentation_logs(working_directory)
df_logs
[7]:
patient_id segment_id img_id start_time end_time total_time_seconds success_flag error
0 LCTSC-Train-S1-006 atlas 88c5ef 2024-02-02 12:56:47.701381 2024-02-02 12:56:54.341045 6.639664 True NaN
1 LCTSC-Train-S1-008 atlas dd0026 2024-02-02 12:56:54.362806 2024-02-02 12:57:00.569891 6.207085 True NaN
2 LCTSC-Test-S1-102 atlas 914d57 2024-02-02 12:57:00.594239 2024-02-02 12:57:07.559670 6.965431 True NaN
3 LCTSC-Train-S1-005 atlas 738c1a 2024-02-02 12:57:07.584231 2024-02-02 12:57:13.693452 6.109221 True NaN
4 LCTSC-Train-S1-004 atlas d91c84 2024-02-02 12:57:13.718177 2024-02-02 12:57:20.695170 6.976993 True NaN
5 LCTSC-Train-S1-002 atlas aa38e6 2024-02-02 12:57:20.716338 2024-02-02 12:57:28.057675 7.341337 True NaN
6 LCTSC-Train-S1-003 atlas 2bf2f9 2024-02-02 12:57:28.079891 2024-02-02 12:57:34.929087 6.849196 True NaN
7 LCTSC-Test-S1-101 atlas 666be6 2024-02-02 12:57:34.953609 2024-02-02 12:57:41.277923 6.324314 True NaN
8 LCTSC-Train-S1-007 atlas 5adf40 2024-02-02 12:57:41.300846 2024-02-02 12:57:47.937181 6.636335 True NaN
[8]:
# Use some Pandas magic to produce some stats on the segmentation runtime
df_success = df_logs[df_logs.success_flag]
agg_stats = ["mean", "std", "max", "min", "count"]
df_success[["segment_id", "total_time_seconds"]].groupby("segment_id").agg(agg_stats)
[8]:
total_time_seconds
mean std max min count
segment_id
atlas 6.672175 0.405371 7.341337 6.109221 9

Auto-segmentation Analysis#

Now that our auto-segmentation model has been run on our validation set, we can compare these structures to the manual structures available on this dataset. PyDicer provides functionality to compute similarity metrics, but we must first prepare a DataFrame containing our auto structure sets (df_target) and a separate DataFrame with our manual structure sets (df_reference).

[9]:
df = pydicer.read_converted_data(dataset_name=validation_dataset)
df_structs = df[df.modality=="RTSTRUCT"]

df_reference = df_structs[~df_structs.hashed_uid.str.startswith(f"atlas_")]
df_target = df_structs[df_structs.hashed_uid.str.startswith(f"atlas_")]
[10]:
df_reference
[10]:
sop_instance_uid hashed_uid modality patient_id series_uid for_uid referenced_sop_instance_uid path
1 1.3.6.1.4.1.14519.5.2.1.7014.4598.161109536301... ceb111 RTSTRUCT LCTSC-Train-S1-006 1.3.6.1.4.1.14519.5.2.1.7014.4598.161109536301... 1.3.6.1.4.1.14519.5.2.1.7014.4598.325217978839... 1.3.6.1.4.1.14519.5.2.1.7014.4598.263565217029... lctsc_autoseg/data/LCTSC-Train-S1-006/structur...
4 1.3.6.1.4.1.14519.5.2.1.7014.4598.225487254421... 48a970 RTSTRUCT LCTSC-Train-S1-008 1.3.6.1.4.1.14519.5.2.1.7014.4598.225487254421... 1.3.6.1.4.1.14519.5.2.1.7014.4598.243364093992... 1.3.6.1.4.1.14519.5.2.1.7014.4598.217118546740... lctsc_autoseg/data/LCTSC-Train-S1-008/structur...
7 1.3.6.1.4.1.14519.5.2.1.7014.4598.110977663386... 6c6ea4 RTSTRUCT LCTSC-Test-S1-102 1.3.6.1.4.1.14519.5.2.1.7014.4598.110977663386... 1.3.6.1.4.1.14519.5.2.1.7014.4598.408067568497... 1.3.6.1.4.1.14519.5.2.1.7014.4598.235489581364... lctsc_autoseg/data/LCTSC-Test-S1-102/structure...
10 1.3.6.1.4.1.14519.5.2.1.7014.4598.214803404117... 68d663 RTSTRUCT LCTSC-Train-S1-005 1.3.6.1.4.1.14519.5.2.1.7014.4598.214803404117... 1.3.6.1.4.1.14519.5.2.1.7014.4598.964812114328... 1.3.6.1.4.1.14519.5.2.1.7014.4598.140943693489... lctsc_autoseg/data/LCTSC-Train-S1-005/structur...
13 1.3.6.1.4.1.14519.5.2.1.7014.4598.595315284787... 61758b RTSTRUCT LCTSC-Train-S1-004 1.3.6.1.4.1.14519.5.2.1.7014.4598.595315284787... 1.3.6.1.4.1.14519.5.2.1.7014.4598.313160975008... 1.3.6.1.4.1.14519.5.2.1.7014.4598.141349678572... lctsc_autoseg/data/LCTSC-Train-S1-004/structur...
16 1.3.6.1.4.1.14519.5.2.1.7014.4598.291449913947... f036b8 RTSTRUCT LCTSC-Train-S1-002 1.3.6.1.4.1.14519.5.2.1.7014.4598.291449913947... 1.3.6.1.4.1.14519.5.2.1.7014.4598.145984743865... 1.3.6.1.4.1.14519.5.2.1.7014.4598.318848546630... lctsc_autoseg/data/LCTSC-Train-S1-002/structur...
19 1.3.6.1.4.1.14519.5.2.1.7014.4598.127494043495... 8e34f9 RTSTRUCT LCTSC-Train-S1-003 1.3.6.1.4.1.14519.5.2.1.7014.4598.127494043495... 1.3.6.1.4.1.14519.5.2.1.7014.4598.803594135292... 1.3.6.1.4.1.14519.5.2.1.7014.4598.159497309342... lctsc_autoseg/data/LCTSC-Train-S1-003/structur...
22 1.3.6.1.4.1.14519.5.2.1.7014.4598.929007819506... cc682f RTSTRUCT LCTSC-Test-S1-101 1.3.6.1.4.1.14519.5.2.1.7014.4598.280355341349... 1.3.6.1.4.1.14519.5.2.1.7014.4598.171234424242... 1.3.6.1.4.1.14519.5.2.1.7014.4598.333558838494... lctsc_autoseg/data/LCTSC-Test-S1-101/structure...
25 1.3.6.1.4.1.14519.5.2.1.7014.4598.284338872489... ed6686 RTSTRUCT LCTSC-Train-S1-007 1.3.6.1.4.1.14519.5.2.1.7014.4598.284338872489... 1.3.6.1.4.1.14519.5.2.1.7014.4598.187709044743... 1.3.6.1.4.1.14519.5.2.1.7014.4598.168227681755... lctsc_autoseg/data/LCTSC-Train-S1-007/structur...
[11]:
df_target
[11]:
sop_instance_uid hashed_uid modality patient_id series_uid for_uid referenced_sop_instance_uid path
2 atlas_88c5ef atlas_88c5ef RTSTRUCT LCTSC-Train-S1-006 atlas_88c5ef 1.3.6.1.4.1.14519.5.2.1.7014.4598.325217978839... 1.3.6.1.4.1.14519.5.2.1.7014.4598.263565217029... lctsc_autoseg/data/LCTSC-Train-S1-006/structur...
5 atlas_dd0026 atlas_dd0026 RTSTRUCT LCTSC-Train-S1-008 atlas_dd0026 1.3.6.1.4.1.14519.5.2.1.7014.4598.243364093992... 1.3.6.1.4.1.14519.5.2.1.7014.4598.217118546740... lctsc_autoseg/data/LCTSC-Train-S1-008/structur...
8 atlas_914d57 atlas_914d57 RTSTRUCT LCTSC-Test-S1-102 atlas_914d57 1.3.6.1.4.1.14519.5.2.1.7014.4598.408067568497... 1.3.6.1.4.1.14519.5.2.1.7014.4598.235489581364... lctsc_autoseg/data/LCTSC-Test-S1-102/structure...
11 atlas_738c1a atlas_738c1a RTSTRUCT LCTSC-Train-S1-005 atlas_738c1a 1.3.6.1.4.1.14519.5.2.1.7014.4598.964812114328... 1.3.6.1.4.1.14519.5.2.1.7014.4598.140943693489... lctsc_autoseg/data/LCTSC-Train-S1-005/structur...
14 atlas_d91c84 atlas_d91c84 RTSTRUCT LCTSC-Train-S1-004 atlas_d91c84 1.3.6.1.4.1.14519.5.2.1.7014.4598.313160975008... 1.3.6.1.4.1.14519.5.2.1.7014.4598.141349678572... lctsc_autoseg/data/LCTSC-Train-S1-004/structur...
17 atlas_aa38e6 atlas_aa38e6 RTSTRUCT LCTSC-Train-S1-002 atlas_aa38e6 1.3.6.1.4.1.14519.5.2.1.7014.4598.145984743865... 1.3.6.1.4.1.14519.5.2.1.7014.4598.318848546630... lctsc_autoseg/data/LCTSC-Train-S1-002/structur...
20 atlas_2bf2f9 atlas_2bf2f9 RTSTRUCT LCTSC-Train-S1-003 atlas_2bf2f9 1.3.6.1.4.1.14519.5.2.1.7014.4598.803594135292... 1.3.6.1.4.1.14519.5.2.1.7014.4598.159497309342... lctsc_autoseg/data/LCTSC-Train-S1-003/structur...
23 atlas_666be6 atlas_666be6 RTSTRUCT LCTSC-Test-S1-101 atlas_666be6 1.3.6.1.4.1.14519.5.2.1.7014.4598.171234424242... 1.3.6.1.4.1.14519.5.2.1.7014.4598.333558838494... lctsc_autoseg/data/LCTSC-Test-S1-101/structure...
26 atlas_5adf40 atlas_5adf40 RTSTRUCT LCTSC-Train-S1-007 atlas_5adf40 1.3.6.1.4.1.14519.5.2.1.7014.4598.187709044743... 1.3.6.1.4.1.14519.5.2.1.7014.4598.168227681755... lctsc_autoseg/data/LCTSC-Train-S1-007/structur...

Compute Similarity#

We use the compute_contour_similarity_metrics function to compute the metrics comparing our target structures to our reference structures.

We can specify which metrics we want to compute, in this example we compute the Dice Similarity Coefficient (DSC), Hausdorff Distance, Mean Surface Distance and the Surface DSC.

Structure names must match exactly, so we use a structure name mapping to standardise our structure names prior to computing the similarity metrics.

[12]:
# Add our structure name mapping
mapping_id = "nnunet_lctsc"
mapping = {
    "Esophagus": [],
    "Heart": [],
    "Lung_L": ["L_Lung", "Lung_Left"],
    "Lung_R": ["Lung_Right"],
    "SpinalCord": ["SC"],
}
pydicer.add_structure_name_mapping(mapping, mapping_id)

# Specify the metrics we want to compute
compute_metrics = ["DSC", "hausdorffDistance", "meanSurfaceDistance", "surfaceDSC"]

# Compute the similarity metrics
compute_contour_similarity_metrics(
    df_target,
    df_reference,
    segment_id,
    compute_metrics=compute_metrics,
    mapping_id=mapping_id
)
100%|██████████| 9/9 [00:21<00:00,  2.35s/structure sets, Compare Structures]

Fetch the similarity metrics#

Here we fetch the metrics computed and output some stats. Note that if a segmentation fails, surface metrics will return NaN and will be excluded from these stats.

[13]:
# Fetch the similarity metrics
df_metrics = get_all_similarity_metrics_for_dataset(
    working_directory,
    dataset_name=validation_dataset,
    structure_mapping_id=mapping_id
)

# Aggregate the stats using Pandas
df_metrics[
    ["segment_id", "structure", "metric", "value"]
    ].groupby(
        ["segment_id", "structure", "metric"]
    ).agg(
        ["mean", "std", "min", "max", "count"]
    )

[13]:
value
mean std min max count
segment_id structure metric
atlas Esophagus DSC 0.023485 0.039155 0.000000 0.109864 9
hausdorffDistance 86.016542 31.242434 51.074932 138.589307 9
meanSurfaceDistance 50.829617 29.069703 13.329453 88.431997 9
surfaceDSC 0.044144 0.076819 0.000000 0.222843 9
Heart DSC 0.245349 0.213440 0.000000 0.485516 9
hausdorffDistance 81.796660 28.022026 45.554259 127.344050 9
meanSurfaceDistance 34.477665 22.412489 13.660061 71.395360 9
surfaceDSC 0.136241 0.112819 0.000000 0.297727 9
Lung_L DSC 0.345206 0.227535 0.042621 0.665717 9
hausdorffDistance 107.038653 31.328493 67.318771 166.238581 9
meanSurfaceDistance 33.207495 20.589999 15.643577 64.321409 9
surfaceDSC 0.261660 0.173089 0.034036 0.567455 9
Lung_R DSC 0.435620 0.211995 0.120374 0.654082 9
hausdorffDistance 93.095879 30.999991 58.515670 158.412996 9
meanSurfaceDistance 27.582301 16.213033 12.244833 58.727983 9
surfaceDSC 0.302265 0.149149 0.084824 0.468548 9
SpinalCord DSC 0.059047 0.133917 0.000000 0.398152 9
hausdorffDistance 107.062020 37.777640 65.512324 153.971275 9
meanSurfaceDistance 50.480604 35.360977 9.644323 122.360613 9
surfaceDSC 0.086375 0.184924 0.000000 0.543595 9

Perform Analysis#

There are various plots and visualisations you may wish to produce following computation of similarity metrics. The prepare_similarity_metric_analysis function will generate several useful plots which will serve as a useful starting point when analysing your auto-segmentation results.

Plots are generated in a directory you provide. In this example, plots and tables (.csv) are output in the testdata_lctsc/analysis/atlas directory. Run the following cell, then navigate to that directory to explore the results.

[14]:
analysis_output_directory = working_directory.joinpath(
    "analysis",
    segment_id
)
analysis_output_directory.mkdir(parents=True, exist_ok=True)

prepare_similarity_metric_analysis(
    working_directory,
    analysis_output_directory=analysis_output_directory,
    dataset_name=validation_dataset,
    structure_mapping_id=mapping_id
)
../_images/_examples_AutoSegmentation_24_0.png
../_images/_examples_AutoSegmentation_24_1.png
../_images/_examples_AutoSegmentation_24_2.png
../_images/_examples_AutoSegmentation_24_3.png
[ ]: