Prospective User Study and Multicenter Validation of Multimodal Medical Imaging Large Models in the Diagnosis of Common Systemic Diseases
Overview
- Phase
- Not Applicable
- Status
- Recruiting
- Sponsor
- The Third Affiliated Hospital of Southern Medical University
- Enrollment
- 1,000
- Locations
- 1
- Primary Endpoint
- Area Under the Receiver Operating Characteristic Curve (AUC)
Brief Summary
This study aims to evaluate the diagnostic performance and clinical utility of a multimodal medical imaging large model in identifying common systemic diseases. Through a retrospective reader study involving multiple centers, the research will compare the diagnostic accuracy, sensitivity, and specificity of radiologists with and without AI assistance. The goal is to validate the model's robustness and its impact on the diagnostic efficiency of clinicians across diverse healthcare settings.
Detailed Description
Background: Multimodal large models have shown significant potential in medical imaging. However, their performance and impact on clinical workflows across multiple centers require rigorous validation.
Objective: To assess the diagnostic performance of a multimodal large model and investigate whether AI assistance can improve the diagnostic accuracy and efficiency of radiologists with varying levels of experience.
Methodology: This research is designed as a multicenter, retrospective comparative reader study. A large-scale, diverse dataset of medical images (including CT and MRI) will be curated from the participating institutions. A group of licensed radiologists will perform diagnostic tasks in two separate sessions: a standalone session (without AI assistance) and an AI-assisted session, with a suitable washout period between sessions.
Data Analysis: The clinical "ground truth" will be established by expert consensus or histological results. The study will compare the Area Under the Receiver Operating Characteristic Curve (AUC), sensitivity, and specificity between the standalone and AI-assisted modes. Additionally, the reading time per case will be recorded to evaluate diagnostic efficiency.
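The endpoint comparison described above can be sketched in code. The snippet below is an illustrative sketch only, not the study's actual analysis pipeline: it computes AUC (via the Mann-Whitney formulation), sensitivity, and specificity from made-up placeholder labels and reader outputs.

```python
# Illustrative sketch of the study's endpoint metrics (AUC, sensitivity,
# specificity). All data below are hypothetical placeholder values, not
# study results.

def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic (ties count as 0.5).

    labels: ground-truth binary labels (1 = disease present).
    scores: reader or model confidence scores, higher = more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(labels, preds):
    """Sensitivity and specificity from binary predictions."""
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    return tp / (tp + fn), tn / (tn + fp)

if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0]               # hypothetical reference standard
    standalone = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]  # hypothetical reader scores
    print("AUC:", auc(truth, standalone))
    print("Sens/Spec:", sens_spec(truth, [s >= 0.5 for s in standalone]))
```

In practice, a reader study of this design would compare these metrics between the standalone and AI-assisted sessions (e.g., with a DeLong test for paired AUCs), but that comparison is beyond this sketch.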
Ethics: This study uses retrospective, anonymized data and does not alter the clinical management or treatment of patients.
The multimodal large model was developed and pre-trained on a massive dataset of approximately 1,000,000 medical imaging cases. This study focuses on multicenter clinical validation using an independent test cohort of 1,000 cases.
Study Design
- Study Type
- Observational
- Observational Model
- Cohort
- Time Perspective
- Retrospective
Eligibility Criteria
- Ages
- 18 Years and older (Adult, Older Adult)
- Sex
- All
- Accepts Healthy Volunteers
- Yes
Inclusion Criteria
- Patients who underwent systemic medical imaging examinations (e.g., CT or MRI) at participating centers for common systemic diseases.
- Imaging data must have a confirmed clinical reference standard, expert consensus, or pathological diagnosis.
- Availability of complete DICOM format images with standard acquisition protocols.
Exclusion Criteria
- Poor image quality (e.g., severe motion or metal artifacts) that precludes definitive diagnosis.
- Cases with incomplete clinical or pathological reference standards.
- Corrupted image files or duplicate cases.
Arms & Interventions
Validation Cohort
A retrospective dataset of medical imaging cases (including CT and MRI) collected from multiple centers, representing common systemic diseases, used to evaluate the diagnostic performance of the multimodal large model.
Intervention: Standalone Radiologist Interpretation (Other)
Validation Cohort
A retrospective dataset of medical imaging cases (including CT and MRI) collected from multiple centers, representing common systemic diseases, used to evaluate the diagnostic performance of the multimodal large model.
Intervention: AI-assisted Radiologist Interpretation (Other)
Outcomes
Primary Outcomes
Area Under the Receiver Operating Characteristic Curve (AUC)
Time Frame: Through study completion, approximately 12 months.
Evaluation of diagnostic accuracy using AUC to compare standalone radiologist performance versus AI-assisted performance.
Secondary Outcomes
- Mean Reading and Reporting Time per Case (Through study completion, approximately 12 months.)
- Clinical Report Quality and Semantic Accuracy Score (Through study completion, approximately 12 months.)
- Sensitivity and Specificity (Through study completion, approximately 12 months.)
Investigators
Junyan Li
Scientific Research Administrator
The Third Affiliated Hospital of Southern Medical University