AI Assisted Reader Evaluation in Acute Computed Tomography (CT) Head Interpretation
- Conditions
- Intracranial Hemorrhages, Cerebral Edema, Hydrocephalus, Cerebral Injury, Acute Ischemic Stroke, Cerebral Infarction
- Interventions
- Other: Ground truthing, Other: Reading
- Registration Number
- NCT06018545
- Lead Sponsor
- Oxford University Hospitals NHS Trust
- Brief Summary
This study has been added as a sub study to the Simulation Training for Emergency Department Imaging 2 study (ClinicalTrials.gov ID NCT05427838).
The purpose of the study is to assess the impact of an Artificial Intelligence (AI) tool called qER 2.0 EU on the performance of readers, including general radiologists, emergency medicine clinicians, and radiographers, in interpreting non-contrast CT head scans. The study aims to evaluate the changes in accuracy, review time, and diagnostic confidence when using the AI tool. It also seeks to provide evidence on the diagnostic performance of the AI tool and its potential to improve efficiency and patient care in the context of the National Health Service (NHS). The study will use a dataset of 150 CT head scans, including both control cases and abnormal cases with specific abnormalities. The results of this study will inform larger follow-up studies in real-life Emergency Department (ED) settings.
- Detailed Description
Not available
Recruitment & Eligibility
- Status
- ACTIVE_NOT_RECRUITING
- Sex
- All
- Target Recruitment
- 33
- Radiologists/Radiographers/ED clinicians who review CT head scans as part of their clinical practice
- Neuroradiologists.
- Non-radiologist groups: Clinicians with previous formal postgraduate CT reporting training
- Emergency Medicine group: Clinicians with previous career in radiology/neurosurgery to registrar level
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Arm & Interventions
Group: Ground truthers
Intervention: Ground truthing
Description: Two Consultant neuroradiologists will independently review the images to establish the 'ground truth' findings on the CT scans which will be used as the reference standard. In the case of disagreement, a third senior neuroradiologist's opinion will be sought for arbitration. A difficulty score will be assigned to each scan by the ground truthers using a 5-point Likert scale.

Group: Readers
Intervention: Reading
Description: 30 readers will be recruited across four NHS trusts including ten general radiologists, fifteen emergency medicine clinicians, and five CT radiographers of varying seniority. Readers will interpret each scan first without, then with, the assistance of the AI tool, with an intervening 4-week washout period. Using a panel of neuroradiologists as ground truth, the stand-alone performance of qER will be assessed, and its impact on the readers' performance will be analysed as change in accuracy, mean review time per scan, and self-reported diagnostic confidence. Subgroup analyses will be performed by reader professional group, reader seniority, pathological finding, and neuroradiologist-rated difficulty.
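The arbitration rule described for the ground truthers can be sketched as follows. This is a minimal illustration, assuming each finding on a scan is recorded as a simple present/absent label; the function name `resolve_ground_truth` and its parameters are hypothetical and are not taken from the study protocol.

```python
from typing import Optional


def resolve_ground_truth(
    reader_a: bool,
    reader_b: bool,
    arbiter: Optional[bool] = None,
) -> bool:
    """Return the reference label for one finding on one scan.

    reader_a, reader_b: independent opinions of the two neuroradiologists
    (True = finding present). arbiter: opinion of the third, senior
    neuroradiologist, only consulted when the first two disagree.
    All names here are hypothetical, chosen for illustration.
    """
    if reader_a == reader_b:
        # Agreement: the shared opinion becomes the reference standard.
        return reader_a
    if arbiter is None:
        # Disagreement cannot be resolved without the third opinion.
        raise ValueError("Readers disagree; an arbiter opinion is required.")
    return arbiter
```

For example, `resolve_ground_truth(True, False, arbiter=True)` returns `True`, while two agreeing readers never need the arbiter.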
- Primary Outcome Measures
Name: Reader performance: Positive and negative predictive value, compared with and without AI assistance.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: Reader performance will be evaluated as positive predictive value (PPV) and negative predictive value (NPV), with and without AI assistance.

Name: qER (AI algorithm) performance: Positive and negative predictive value.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: qER performance will be evaluated as positive predictive value (PPV) and negative predictive value (NPV).

Name: Reader performance: Sensitivity and specificity, compared with and without AI assistance.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: Reader performance will be evaluated as sensitivity and specificity, with and without AI assistance.

Name: Reader speed: Mean time taken to review a scan, with versus without AI assistance.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: Reader speed will be evaluated as the mean time taken to review a scan, measured in seconds.

Name: Reader confidence: Self-reported diagnostic confidence on a 10-point visual analogue scale, with versus without AI assistance.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: On the reading platform (RAIQC), one of the questions asks the level of confidence that the participant has in their diagnostic opinion. The question offers a scale of 1 to 10, where 1 is not confident and 10 is highly confident.

Name: qER (AI algorithm) performance: Sensitivity and specificity.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: qER performance will be evaluated as sensitivity and specificity.

Name: Reader performance: Area Under Receiver Operating Characteristic Curve (AUROC), compared with and without AI assistance.
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: Reader performance will be evaluated as Area Under Receiver Operating Characteristic Curve (AUROC), with and without AI assistance.

Name: qER (AI algorithm) performance: Area Under Receiver Operating Characteristic Curve (AUROC).
Time: During 6 weeks, which is the period for reading or reviewing the cases/scans.
Method: qER performance will be evaluated as Area Under Receiver Operating Characteristic Curve (AUROC).
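The accuracy measures named in the outcomes above (sensitivity, specificity, PPV, NPV, and AUROC) have standard definitions, which the sketch below illustrates. It assumes a binary ground-truth label per scan and, for AUROC, a numeric score such as a confidence rating; the function names and this data layout are assumptions made for illustration and do not describe the study's actual analysis pipeline.

```python
import math
from typing import Dict, Sequence, Tuple


def confusion_counts(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t and p:
            tp += 1
        elif not t and p:
            fp += 1
        elif not t and not p:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV against the reference standard."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # abnormal scans correctly flagged
        "specificity": ratio(tn, tn + fp),  # normal scans correctly cleared
        "ppv": ratio(tp, tp + fp),          # positive calls that are correct
        "npv": ratio(tn, tn + fn),          # negative calls that are correct
    }


def auroc(truth: Sequence[bool], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random abnormal scan receives a
    higher score than a random normal scan (ties count as one half)."""
    positives = [s for t, s in zip(truth, scores) if t]
    negatives = [s for t, s in zip(truth, scores) if not t]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one abnormal and one normal case.")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Running `diagnostic_metrics` once on reads made without AI assistance and once on reads made with it gives the paired values that a with-versus-without comparison would contrast. PPV and NPV depend on the proportion of abnormal cases in the dataset, so values obtained on a curated case set may not carry over directly to routine practice.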
- Secondary Outcome Measures
Name Time Method
Trial Locations
- Locations (4)
NHS Greater Glasgow and Clyde
🇬🇧Glasgow, United Kingdom
Oxford University Hospitals NHS Foundation Trust
🇬🇧Oxford, Oxfordshire, United Kingdom
Guy's & St Thomas NHS Foundation Trust
🇬🇧London, United Kingdom
Northumbria Healthcare NHS Foundation Trust
🇬🇧Newcastle Upon Tyne, United Kingdom