Dataset


NOTE: TRAINING DATA NOT AVAILABLE YET.  ZENODO LINK TO TRAINING DATA WILL BE PLACED HERE IN JUNE. 


Patient Cohorts

The cohort consists of patients with histologically proven head and neck cancer (HNC) who underwent radiotherapy (RT) at The University of Texas MD Anderson Cancer Center (MDACC). Most patients had oropharyngeal cancer (OPC) or cancer of unknown primary.


Imaging Data

The imaging data are T2-weighted (T2w) anatomical sequences of the head and neck region acquired at MDACC. Data will be a mix of fat-suppressed and non-fat-suppressed images. All patients were immobilized using a thermoplastic mask. Raw images were automatically extracted from a centralized institutional imaging repository (Evercore). Images include pre-RT (1-3 weeks before the start of RT) and mid-RT (2-4 weeks intra-RT) scans; an example is shown below. The pre-RT and mid-RT images for a given patient will be consistently either both fat-suppressed or both non-fat-suppressed.


Figure 1. Example of pre-RT and mid-RT T2w (non-fat-suppressed) scans for a patient. 


Segmentation Information

Two structures are segmented: the primary gross tumor volume (abbreviated GTVp), with at most 1 per patient (can be 0), and metastatic lymph nodes (abbreviated GTVn), with a variable number per patient (can be 0).

Multiple physician expert observers (n = 3 to 4) have independently segmented GTVp and GTVn structures for all cases (pre-RT and mid-RT) based on the MR images provided. Based on recent literature from our group (PMID: 36761036), a minimum of 3 annotators is suggested to yield acceptable segmentations of these structures when combined via the simultaneous truth and performance level estimation (STAPLE) algorithm. Therefore, we have collected independent segmentations from at least 3 annotators for each structure. All annotators were medical doctors with at least 2 years of experience in head and neck cancer segmentation. Final verification of segmentation quality was performed by experienced radiation oncology faculty members with more than 10 years of experience. Segmentations were combined via the STAPLE algorithm to yield the final ground truth segmentation for each case; an illustrative example is shown below. Note: For this challenge, we will only provide STAPLE consensus segmentations. Individual observer segmentations will not be provided but will be made publicly available upon the challenge's completion (i.e., via TCIA).


Figure 2. Example of the STAPLE consensus process combining multiple segmentations into a single final consensus segmentation. 
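
For readers unfamiliar with STAPLE, the following is a minimal sketch of the binary expectation-maximization procedure behind it, written in pure Python on flattened 0/1 masks. It is not the implementation used to build the challenge labels. The function name binary_staple, the initial sensitivity/specificity of 0.9, the global foreground prior, and the 0.5 threshold are all assumptions made for illustration; in practice one would more likely use an existing library implementation on 3D volumes.

```python
from math import prod


def binary_staple(ratings, prior=None, n_iter=50, tol=1e-6):
    """Minimal binary STAPLE (EM) sketch on flattened masks.

    ratings: list of raters, each a list of 0/1 voxel decisions.
    Returns (w, sens, spec): per-voxel foreground probability and the
    estimated sensitivity and specificity of each rater.
    """
    n_raters, n_vox = len(ratings), len(ratings[0])
    if prior is None:
        # Assumption: global foreground prior = mean of all rater decisions.
        prior = sum(map(sum, ratings)) / (n_raters * n_vox)
    sens = [0.9] * n_raters  # assumed starting values
    spec = [0.9] * n_raters
    w = [0.0] * n_vox
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly foreground.
        new_w = []
        for i in range(n_vox):
            a = prior * prod(
                sens[j] if ratings[j][i] else 1 - sens[j] for j in range(n_raters)
            )
            b = (1 - prior) * prod(
                1 - spec[j] if ratings[j][i] else spec[j] for j in range(n_raters)
            )
            new_w.append(a / (a + b) if a + b > 0 else 0.0)
        # M-step: re-estimate each rater's sensitivity and specificity.
        fg = sum(new_w)
        bg = n_vox - fg
        for j in range(n_raters):
            hit = sum(wi for wi, d in zip(new_w, ratings[j]) if d)
            rej = sum(1 - wi for wi, d in zip(new_w, ratings[j]) if not d)
            sens[j] = hit / fg if fg else 1.0
            spec[j] = rej / bg if bg else 1.0
        change = max(abs(x - y) for x, y in zip(new_w, w))
        w = new_w
        if change < tol:
            break
    return w, sens, spec


# Three hypothetical observers segmenting the same 8-voxel strip.
ratings = [
    [0, 1, 1, 1, 0, 0, 0, 0],  # observer 1
    [0, 1, 1, 0, 0, 0, 0, 0],  # observer 2 (under-segments)
    [0, 1, 1, 1, 1, 0, 0, 0],  # observer 3 (over-segments)
]
w, sens, spec = binary_staple(ratings)
consensus = [int(p >= 0.5) for p in w]  # assumed threshold of 0.5
```

Unlike a plain majority vote, the estimated sensitivity and specificity weight each observer, so an observer who systematically under- or over-segments has less influence on the consensus.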


Each voxel in the final label mask takes one of 3 possible values: background = 0, GTVp = 1, GTVn = 2 (when a patient has multiple lymph nodes, they are merged into one single label). A visual example is shown below. 


Figure 3. A visual example of the mask labeling scheme for this challenge. Background = 0, primary gross tumor volume (GTVp) = 1, metastatic node gross tumor volume (GTVn) = 2. Visualization performed in 3D Slicer. 
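
The labeling scheme can be sketched in a few lines of pure Python on flattened masks. The function build_label_mask and its arguments are hypothetical, and the rule that GTVp takes precedence where structures overlap is an assumption made only so the example is well defined; the challenge description does not state how overlaps are handled.

```python
from collections import Counter

BACKGROUND, GTVP, GTVN = 0, 1, 2


def build_label_mask(n_vox, gtvp=None, gtvn_nodes=()):
    """Merge an optional GTVp mask and any number of GTVn masks.

    n_vox: number of voxels in the flattened volume.
    gtvp: flattened 0/1 list, or None when the patient has no primary.
    gtvn_nodes: iterable of flattened 0/1 lists, one per lymph node.
    """
    labels = [BACKGROUND] * n_vox
    # All lymph nodes share the single GTVn label.
    for node in gtvn_nodes:
        for i, v in enumerate(node):
            if v:
                labels[i] = GTVN
    # Assumption: GTVp wins wherever it overlaps a node.
    if gtvp is not None:
        for i, v in enumerate(gtvp):
            if v:
                labels[i] = GTVP
    return labels


# Hypothetical 6-voxel case with one primary and two separate nodes.
mask = build_label_mask(
    6,
    gtvp=[0, 1, 1, 0, 0, 0],
    gtvn_nodes=[[0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 1]],
)
counts = Counter(mask)  # voxels per label
```

A patient with neither structure simply yields an all-background mask, which matches the "can be 0" cases described above.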


Data Pre-Processing 

Anonymized DICOM files (images and structure files) will be converted to NIfTI format (.nii.gz) for ease of use by participants. All images will be cropped from the top of the clavicles to the bottom of the nasal septum (oropharynx region to shoulders), allowing for more consistent image fields of view and removal of identifiable facial structures. 


Training and Test Data 

The same patient cases will be used for the training and test sets of both tasks of this challenge. Therefore, we plan to release a single training dataset that can be used to construct solutions for either segmentation task. A Zenodo link to the training data will be inserted here when ready (June). 

For a given patient case, the following training data will be provided in .nii.gz format:

  • Original pre-RT T2w MRI volume with original pre-RT segmentation mask.
  • Registered pre-RT T2w MRI volume with registered pre-RT segmentation mask. See Task 2 (Mid-RT Segmentation) Specific Details below for why these files are provided. 
  • Original mid-RT T2w MRI volume with original mid-RT segmentation. 

The test data, however, will be different for the 2 tasks. Participants must be cognizant that only certain files will be provided to their Docker containers depending on which task they are submitting for. More details on what test data will be provided for each task are given in the sections below. 


Task 1 (Pre-RT Segmentation) Specific Details 

Test data for Task 1 will be composed of an unseen pre-RT scan and will not contain any annotations. The goal is to successfully predict the GTVp and GTVn tumor segmentations on new unseen pre-RT images. This is a task analogous to previous conventional segmentation challenges, such as Task 1 of the 2022 HECKTOR Challenge and the 2023 SegRap Challenge.

When training their algorithms for Task 1, participants can choose to use only pre-RT data or add in mid-RT data as well. Initially, our plan was to limit participants to utilizing only pre-RT data for training their algorithms in Task 1. However, upon reflection, we recognized that in a practical setting, individuals aiming to develop auto-segmentation algorithms could theoretically train using any accessible data at their disposal. Based on current literature, we actually don't know what the best solution would be! Would the incorporation of mid-RT data for training a pre-RT segmentation model actually be helpful, or would it merely introduce harmful noise? The answer remains unclear. Therefore, we leave this choice to the participants. Remember, though, during testing, you will ONLY have the pre-RT image as an input to your model (naturally, since this is a pre-RT segmentation task and you won't know what mid-RT data for a patient will look like). 


Task 2 (Mid-RT Segmentation) Specific Details 

Test data for Task 2 will be composed of an unseen pre-RT image with its segmentation, the registered (deformed) pre-RT image with its registered segmentation, and the mid-RT image. In other words, you will only have annotations for the pre-RT image, which mimics a real-world scenario for adaptive RT. The goal is to successfully predict the GTVp and GTVn segmentations on new unseen mid-RT images. 

For training, in addition to the original images, we have also provided a registered pre-RT MRI volume (deformably registered where the mid-RT scan serves as the fixed image and pre-RT scan serves as the moving image) and the corresponding registered pre-RT segmentation for each patient. We offer this data for participants who opt not to integrate any image registration techniques into their algorithms but still wish to use the two images as a joint input to their model. Moreover, in a real-world adaptive RT context, such registered scans are typically readily accessible. Naturally, participants are also free to incorporate their own image registration processes into their pipelines if they wish, as they will have access to the original images. 

Participants will be free to use any combination of input images/masks to develop their mid-RT auto-segmentation algorithms. In practice, what this means is you could do any of the following:

  • Train using only pre-RT images as input (kind of odd).
  • Train using only mid-RT images as input.
  • Train using pre-RT images and mid-RT images as individual separate inputs.
  • Train using registered pre-RT and mid-RT images as joint input.
  • Train using registered pre-RT images with pre-RT segmentation and mid-RT images as joint input.
  • Or anything you can think of that leverages the data! 

Again, during testing, you will have the original pre-RT image, original pre-RT segmentation, registered pre-RT image, registered pre-RT segmentation, and mid-RT image as possible inputs to your model. You can ignore any of these provided pieces of data (except the mid-RT image of course) if so desired. 
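
To summarize which inputs a container can rely on at test time, here is a small sketch in pure Python. The keys such as "pre_rt_image" are hypothetical labels chosen for this example, not the actual file names in the release, and select_inputs is an illustrative helper rather than part of any challenge tooling.

```python
# Inputs available at test time for each task, as described in the text.
# The key names are hypothetical labels, not challenge file names.
TEST_INPUTS = {
    "task1": ("pre_rt_image",),
    "task2": (
        "pre_rt_image",
        "pre_rt_mask",
        "registered_pre_rt_image",
        "registered_pre_rt_mask",
        "mid_rt_image",
    ),
}

# The image being segmented cannot be ignored.
REQUIRED = {"task1": {"pre_rt_image"}, "task2": {"mid_rt_image"}}


def select_inputs(task, wanted):
    """Return the requested inputs in a fixed order after validating them."""
    available = TEST_INPUTS[task]
    wanted = set(wanted)
    unavailable = wanted - set(available)
    if unavailable:
        raise ValueError(f"not provided for {task}: {sorted(unavailable)}")
    missing = REQUIRED[task] - wanted
    if missing:
        raise ValueError(f"{task} needs: {sorted(missing)}")
    return [name for name in available if name in wanted]


# A Task 2 model that takes the registered pre-RT pair and the mid-RT image
# as a joint input.
channels = select_inputs(
    "task2",
    {"registered_pre_rt_image", "registered_pre_rt_mask", "mid_rt_image"},
)
```

Checking the requested inputs up front makes it harder to accidentally train a Task 1 model that expects mid-RT data it will never receive during testing.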


Misc.

  • Data from the training and test sets are representative of real-world cases from a large cancer institute treating HNC. Training and test sets will be partitioned so that they contain similar distributions of dataset characteristics such as image fat-suppression status, tumor response, TNM staging, etc. 
  • Only the challenge organizers (i.e., MDA Fuller Lab) will have access to the ground-truth segmentations (labels) for the test cases until final publication of data. 
  • Ethics approval was obtained from the University of Texas MD Anderson Cancer Center Institutional Review Board with protocol number RCR03-0800. This is a retrospective data collection protocol with a waiver of informed consent.
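
Participants who want to hold out their own validation set with similar distributions to the training data could use a stratified split along the lines of the sketch below. This is not the organizers' partitioning procedure; the function stratified_split, the case dictionaries, and the "fat_suppressed" field are hypothetical, and only one characteristic is used for stratification here.

```python
import random
from collections import defaultdict


def stratified_split(cases, key, holdout_fraction=0.2, seed=0):
    """Split cases so each stratum contributes the same holdout fraction."""
    strata = defaultdict(list)
    for case in cases:
        strata[key(case)].append(case)
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, holdout = [], []
    for group in strata.values():
        group = group[:]  # copy so the caller's list is not shuffled
        rng.shuffle(group)
        n_holdout = round(len(group) * holdout_fraction)
        holdout.extend(group[:n_holdout])
        train.extend(group[n_holdout:])
    return train, holdout


# Hypothetical cohort: 10 fat-suppressed and 20 non-fat-suppressed cases.
cases = [{"id": i, "fat_suppressed": i < 10} for i in range(30)]
train, holdout = stratified_split(cases, key=lambda c: c["fat_suppressed"])
```

To stratify on several characteristics at once, the key function can return a tuple, for example fat-suppression status together with a staging group.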