PET/CT-Based Image Analysis and Machine Learning of Hypermetabolic Pulmonary Lesions
- Conditions
- Lung Cancers, Pulmonary Lymphomas, Pulmonary Metastases, Benign Pulmonary Diseases
- Registration Number
- NCT06602674
- Lead Sponsor
- Ruijin Hospital
- Brief Summary
First, we analyse the types, imaging findings and treatment responses of pulmonary lymphomas based on PET/CT to build a more comprehensive picture of the disease.
Then, models based on radiomics features will be developed to test the feasibility of differentiating pulmonary lymphomas via machine learning and to develop a multi-class classification model.
The final objective of this study is to develop a set of deep learning models for preliminary lung lesion segmentation and multi-class classification. The models will classify FDG-avid lung lesions into four groups, each defined by their pathological origin, primary therapy and relevant clinical department.
- Detailed Description
1. The local image feature extraction software (LIFEx, v7.4.0, France) was employed for image review and measurement of relevant data. Three observers interpreted the images independently; in cases of disagreement, the opinion of a senior doctor with over a decade of experience was given precedence. Imaging findings were recorded from the baseline examinations, and lesion counts, locations and descriptive labels were logged systematically following the conventions of the imaging report. SPSS (v26.0) was used for data sorting and calculation. The chi-square test was employed to compare secondary pulmonary lymphoma (SPL) and primary pulmonary lymphoma (PPL) on categorical variables such as CT findings, while the t-test was used for continuous variables such as glycemia and SUV. Given the predominance of categorical variables, the chi-square test or Fisher's exact test (the latter when the sample size was <40 or when >20% of cells had expected counts <5) was used to assess treatment response and imaging performance. Spearman's correlation coefficient was used to analyse the relationship between categorical variables and SUV-based continuous variables. An illustrative analysis sketch is given after this list.
2. In this study, the metabolic tumor volume at a relative threshold of 40% (MTV40%) was selected as the volume of interest (VOI) for image analysis. For feature extraction, we employed the Python (v3.11.7)-based radiomics feature extraction toolkit PyRadiomics (v3.1.0), together with the medical image processing library SimpleITK (v2.3.1), the numerical computation library NumPy (v1.26.2) and the wavelet transform library PyWavelets (v1.5.0); a feature-extraction sketch is given after this list. Feature selection was conducted in RStudio (v2023.12.0+369) running R (v4.2.0). To ensure computational efficiency and avoid overfitting, the number of retained features was limited to at most 10% of the number of lesions in the training set. Model analysis and validation were also performed primarily in RStudio.
3. The deep learning study divides the task of identifying and classifying hypermetabolic lung lesions into two stages: segmentation and classification. In the segmentation stage, we first used the open-source 2D model Lungmask to automatically crop the lung region from whole-body PET/CT images, ensuring that subsequent processing focuses on the lung area. Next, we developed a 3D UNet model with residual modules specifically designed for segmenting hypermetabolic lung lesions. This model takes the cropped PET/CT images as input, efficiently extracts lesion information from the three-dimensional volumes and accurately segments the hypermetabolic lesion areas. The model was then applied to the internal test sets and external validation sets for inference, yielding lesion-containing ROIs. A pipeline sketch is given after this list.
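The following is a minimal Python approximation of the group comparisons described in item 1. The study itself used SPSS; this sketch substitutes scipy.stats, and the data frame column names (group, ct_pattern, suvmax, response) are hypothetical placeholders, not names from the study.

```python
# Approximate SPL vs PPL group comparisons with scipy.stats (SPSS was used in the study).
# Column names are illustrative assumptions.
import pandas as pd
from scipy import stats

def compare_groups(df: pd.DataFrame) -> dict:
    results = {}

    # Categorical variable (e.g. a CT finding) across SPL vs PPL: chi-square,
    # falling back to Fisher's exact test for small samples (total n < 40, or
    # >20% of cells with expected counts < 5), as stated in the protocol.
    table = pd.crosstab(df["group"], df["ct_pattern"])
    chi2, p, dof, expected = stats.chi2_contingency(table)
    small_sample = table.values.sum() < 40 or (expected < 5).mean() > 0.20
    if small_sample and table.shape == (2, 2):
        _, p = stats.fisher_exact(table.values)
    results["ct_pattern_p"] = p

    # Continuous variables (e.g. glycemia, SUVmax): independent-samples t-test.
    spl = df.loc[df["group"] == "SPL", "suvmax"]
    ppl = df.loc[df["group"] == "PPL", "suvmax"]
    results["suvmax_t_p"] = stats.ttest_ind(spl, ppl, equal_var=False).pvalue

    # Relationship between an ordinal-coded variable (e.g. treatment response)
    # and an SUV-based continuous variable: Spearman's rho.
    rho, p_rho = stats.spearmanr(df["response"], df["suvmax"])
    results["spearman_rho"], results["spearman_p"] = rho, p_rho
    return results
```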
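The next sketch illustrates the feature extraction of item 2. It assumes the PET series and a coarse lesion mask have already been exported as NIfTI files; the file names, and the way the 40% SUVmax threshold is derived, are illustrative assumptions rather than the study's exact implementation.

```python
# Derive the MTV40% VOI and extract radiomics features with PyRadiomics.
# File names and extractor settings are illustrative.
import numpy as np
import SimpleITK as sitk
from radiomics import featureextractor

def mtv40_mask(pet_path: str, lesion_mask_path: str, out_path: str) -> None:
    """Threshold the PET image at 40% of the lesion's SUVmax to obtain the VOI."""
    pet = sitk.ReadImage(pet_path)
    lesion = sitk.ReadImage(lesion_mask_path)          # coarse lesion region (0/1)
    pet_arr = sitk.GetArrayFromImage(pet)
    lesion_arr = sitk.GetArrayFromImage(lesion) > 0
    suv_max = pet_arr[lesion_arr].max()
    voi = (lesion_arr & (pet_arr >= 0.4 * suv_max)).astype(np.uint8)
    voi_img = sitk.GetImageFromArray(voi)
    voi_img.CopyInformation(pet)
    sitk.WriteImage(voi_img, out_path)

# Extract shape, first-order and texture features; wavelet-filtered image types
# (computed via PyWavelets) are enabled through the extractor settings.
extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableImageTypeByName("Wavelet")
mtv40_mask("pet.nii.gz", "lesion_rough.nii.gz", "mtv40.nii.gz")
features = extractor.execute("pet.nii.gz", "mtv40.nii.gz")
print({k: v for k, v in features.items() if not k.startswith("diagnostics")})
```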
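Finally, a schematic of the two-step segmentation stage in item 3. The study's own 3D UNet with residual modules is not public, so MONAI's residual UNet is used here as a stand-in; the lung crop relies on the open-source lungmask package (assuming a recent version exposing LMInferer). Channel sizes, patch handling and function names are illustrative assumptions.

```python
# Lung cropping with lungmask, then lesion segmentation with a residual 3D UNet.
# MONAI's UNet stands in for the study's in-house model; parameters are illustrative.
import numpy as np
import SimpleITK as sitk
import torch
from lungmask import LMInferer            # open-source 2D lung segmentation
from monai.networks.nets import UNet

def crop_lungs(ct_path: str):
    """Run lungmask on the CT and return the image plus a bounding box around both lungs."""
    ct = sitk.ReadImage(ct_path)
    lung_labels = LMInferer().apply(ct)    # label array, 0 = background
    zyx = np.argwhere(lung_labels > 0)
    lo, hi = zyx.min(axis=0), zyx.max(axis=0) + 1
    return ct, tuple(slice(a, b) for a, b in zip(lo, hi))

# 3D UNet with residual units, taking two input channels (co-registered PET and CT
# patches) and producing a lesion/background probability map.
lesion_net = UNet(
    spatial_dims=3,
    in_channels=2,
    out_channels=2,
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
)

def segment_lesions(pet_patch: np.ndarray, ct_patch: np.ndarray) -> np.ndarray:
    """Infer a binary hypermetabolic-lesion mask for a cropped PET/CT patch
    (patch dimensions assumed divisible by 16 for the four downsampling levels)."""
    x = torch.from_numpy(np.stack([pet_patch, ct_patch])[None]).float()
    with torch.no_grad():
        logits = lesion_net(x)
    return logits.argmax(dim=1).squeeze(0).numpy().astype(np.uint8)
```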
Recruitment & Eligibility
- Status
- COMPLETED
- Sex
- All
- Target Recruitment
- 647
Not provided
Not provided
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Primary Outcome Measures
Name: Imaging/radiomics/deep learning features of the 18F-FDG PET/CT image. Time: Baseline.
- Secondary Outcome Measures
Name: Efficiency of the segmentation model. Time: Immediately after the development and testing of the models. Method: The segmentation model is evaluated by the lesion detection rate and the Dice similarity coefficient (2(A∩B)/(A+B), where A is the segmented voxel volume and B the ground-truth volume), which together describe how accurately the lesion is separated from the background.
Name: Efficiency of the classification model. Time: Immediately after the development and testing of the models. Method: The classification model is evaluated by accuracy [(TP+TN)/(TP+FP+TN+FN)], precision [TP/(TP+FP)], recall [TP/(TP+FN)] and F1-score [2*precision*recall/(precision+recall)], which describe, from different perspectives, the proportion of correctly or incorrectly classified lesions. The receiver operating characteristic (ROC) curve illustrates this more visually, and the area under the curve (AUC), ranging from 0 to 1, summarises the overall classification performance for each group. A metric-computation sketch is given below.
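The following is a minimal sketch of the outcome metrics defined above: Dice for the segmentation model and accuracy, precision, recall, F1-score and AUC for the classifier. It uses NumPy and scikit-learn; the function names and input arrays are placeholders, not code from the study.

```python
# Compute the segmentation and classification metrics named in the outcome measures.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2(A∩B)/(A+B), with A the segmented voxel volume and B the ground truth."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def classification_metrics(y_true, y_pred, y_score) -> dict:
    """Macro-averaged metrics for the multi-class lesion classifier."""
    return {
        "accuracy":  accuracy_score(y_true, y_pred),                    # (TP+TN)/(TP+FP+TN+FN)
        "precision": precision_score(y_true, y_pred, average="macro"),  # TP/(TP+FP)
        "recall":    recall_score(y_true, y_pred, average="macro"),     # TP/(TP+FN)
        "f1":        f1_score(y_true, y_pred, average="macro"),         # 2PR/(P+R)
        "auc":       roc_auc_score(y_true, y_score, multi_class="ovr"), # area under ROC, one-vs-rest
    }
```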
Trial Locations
- Locations (1)
Ruijin Hospital affiliated to Shanghai Jiao Tong University School of Medicine
🇨🇳Shanghai, Shanghai, China