Hill-grade Knowledge Via Integrated Neural-network for Gastroscopy
- Conditions
- Gastroesophageal Hernia
- Interventions
- Device: EndoMind
- Registration Number
- NCT06040723
- Lead Sponsor
- Wuerzburg University Hospital
- Brief Summary
The Hill classification, also known as the Hill grade, is a system used to classify the severity of gastroesophageal valve incompetence, specifically related to gastroesophageal reflux disease (GERD) and hiatal hernia. This study aims to compare the ability of physicians versus an AI model to assess the Hill grade during gastroscopy.
- Detailed Description
Objective:
The primary goal of this study is to compare the accuracy in determining the Hill classification during gastroscopy between an artificial intelligence (AI)-based system and the physicians performing the examination. Secondary outcomes include the per-class accuracy and other statistical measures such as precision, recall and F1 score.
Study Design:
Single-center, endoscopist-blinded study. In a previous study, the model achieved a mean accuracy of 88%. Participating physicians first attended a lecture serving as a refresher on the Hill classification and were then asked to provide the Hill classification for expert-annotated test images depicting different Hill grades, achieving a mean accuracy of 72%. Based on these two accuracies, 127 paired measurements are required. Taking patient drop-out into consideration, at least 159 patients need to be recruited. Upon examination of the flap-valve during endoscopy, the physician is required to store an image of the flap-valve during retroflexion, which is part of the standard procedure, and determines the Hill classification based on this image. The prediction of the AI model on this image is considered the model's output. A group of three expert endoscopists determines the Hill classification for each image by majority vote; this classification is treated as the gold standard.
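The step from 127 paired measurements to 159 recruited patients can be illustrated with a short sketch. The record does not state the drop-out rate that was used; the 20% below is an assumption, chosen only because it reproduces the stated figure. The function name `inflate_for_dropout` is hypothetical.

```python
import math


def inflate_for_dropout(required_pairs: int, dropout_rate: float) -> int:
    """Patients to recruit so that, after the expected drop-out,
    at least `required_pairs` complete paired measurements remain."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(required_pairs / (1 - dropout_rate))


# 127 is taken from the record; 0.20 is an assumed rate, not stated there.
recruitment_target = inflate_for_dropout(127, 0.20)
print(recruitment_target)  # 159
```

The derivation of the 127 pairs themselves (test, significance level, power) is not given in the record and is therefore not reproduced here.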
AI setup and limitations:
There are no limitations caused by the AI. The method performs a frame-by-frame analysis of the recording. These frames are passed to the AI-based system in order to obtain predictions. The only interactions required with the method are a button press that starts the examination recording and a second button press that stops it, at the beginning and end of the examination respectively. The model used in this study is an updated version of the model reported in a preliminary study; it has been trained with more data and has an auxiliary output that predicts whether the Hill classification is relevant to the shown image.
Study population:
All adult patients scheduled for gastroscopy who do not meet any exclusion criterion will be asked for informed consent. Exclusion criteria include previous surgical interventions or altered anatomy that prevents the proper examination of the flap-valve, examinations where the flap-valve is not inspected, and examinations where the expert committee does not produce a majority vote.
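The gold-standard rule and the related exclusion criterion can be sketched as follows. This is a minimal illustration, assuming the four Hill grades are encoded as the integers 1 to 4; the function name `majority_vote` is hypothetical and not part of the study software.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(grades: Sequence[int]) -> Optional[int]:
    """Return the grade chosen by a strict majority of raters,
    or None when no grade reaches a majority."""
    if not grades:
        return None
    grade, count = Counter(grades).most_common(1)[0]
    return grade if count > len(grades) / 2 else None


# Two of three experts agree: the image gets gold-standard grade 2.
print(majority_vote([2, 2, 3]))  # 2
# All three disagree: no majority, so the examination would be excluded.
print(majority_vote([1, 2, 3]))  # None
```

With three raters, a strict majority means at least two identical votes, so the only excluded case is three different grades.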
Intervention:
The physician performs the examination as usual. Upon inspection of the flap valve, the physician captures an image of the examination, as usual, and gives their assessment of the Hill grade. The output of the model for the same image is considered the model prediction. The physician is blinded to the model's prediction.
Recruitment & Eligibility
- Status
- COMPLETED
- Sex
- All
- Target Recruitment
- 195
- Inclusion Criteria
- Adult patients (>18 years)
- Scheduled gastroscopy
- Exclusion Criteria
Examination level:
- Previous surgical interventions or altered anatomy that prevents the proper examination of the flap-valve
- Flap-valve not inspected
Data level:
- Image during flap-valve inspection not stored
- Expert committee did not reach a majority vote
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Arm & Interventions
Group: Experimental: Intervention arm
Intervention: EndoMind
Description: All patients within the study are included in the intervention arm: the Hill classification is determined by the physician and the AI method.
- Primary Outcome Measures
Name: Accuracy of assessments for the Hill classification for physicians and AI method.
Time: Through study completion, an average of 5 months
Method: Binary assessment of physician vs AI correct and erroneous predictions.
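The binary, paired comparison described for the primary outcome could be tabulated along the following lines. This is a sketch under assumptions: grades are encoded as integers 1 to 4, the example labels are made up for illustration, and the exact McNemar test is shown only as one common choice for paired binary outcomes. The record does not name the statistical test used.

```python
from math import comb
from typing import Dict, Sequence


def paired_correctness(gold: Sequence[int], physician: Sequence[int],
                       ai: Sequence[int]) -> Dict[str, int]:
    """Cross-tabulate, per image, whether physician and AI matched the gold standard."""
    if not (len(gold) == len(physician) == len(ai)):
        raise ValueError("all label sequences must have the same length")
    table = {"both_correct": 0, "physician_only": 0, "ai_only": 0, "both_wrong": 0}
    for g, p, a in zip(gold, physician, ai):
        if p == g and a == g:
            table["both_correct"] += 1
        elif p == g:
            table["physician_only"] += 1
        elif a == g:
            table["ai_only"] += 1
        else:
            table["both_wrong"] += 1
    return table


def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts
    (an assumed choice of test, not taken from the record)."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up labels for six images, for illustration only.
gold = [1, 2, 3, 4, 2, 3]
physician = [1, 2, 2, 4, 3, 3]
ai = [1, 2, 3, 4, 2, 2]

table = paired_correctness(gold, physician, ai)
print(table)
print(exact_mcnemar_p(table["physician_only"], table["ai_only"]))
```

Only the discordant pairs (one rater correct, the other wrong) carry information about a difference in accuracy, which is why the test uses just those two counts.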
- Secondary Outcome Measures
Name: Distance for label assessment from gold standard label.
Time: Through study completion, an average of 5 months
Method: Comparison of the distance between the gold standard label and the label assigned by the physician and AI method.

Name: Accuracy of assessments for each Hill grade for physicians and AI method.
Time: Through study completion, an average of 5 months
Method: The correct and erroneous predictions for each specific Hill class.

Name: Precision and recall of the assessments for each Hill grade from endoscopists and AI method.
Time: Through study completion, an average of 5 months
Method: The precision and recall statistics for each class (1v0) over the four different Hill classes.
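The secondary measures can be illustrated with a small sketch. Two assumptions are made here that the record does not confirm: "(1v0)" is read as a one-vs-rest comparison per class, and the label distance is taken as the absolute difference between grades encoded as integers 1 to 4. The function names and example labels are hypothetical.

```python
from typing import Dict, Sequence, Tuple

HILL_GRADES = (1, 2, 3, 4)  # the four Hill classes, encoded as integers


def per_class_precision_recall(gold: Sequence[int],
                               pred: Sequence[int]) -> Dict[int, Tuple[float, float]]:
    """One-vs-rest precision and recall for each Hill grade."""
    stats = {}
    for grade in HILL_GRADES:
        tp = sum(g == grade and p == grade for g, p in zip(gold, pred))
        fp = sum(g != grade and p == grade for g, p in zip(gold, pred))
        fn = sum(g == grade and p != grade for g, p in zip(gold, pred))
        # Report 0.0 when a ratio is undefined (no predictions or no cases).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        stats[grade] = (precision, recall)
    return stats


def mean_grade_distance(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Mean absolute difference between assigned and gold-standard grade
    (an assumed definition of 'distance')."""
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


# Made-up labels for six images, for illustration only.
gold = [1, 2, 3, 4, 2, 3]
pred = [1, 2, 2, 4, 3, 3]

print(per_class_precision_recall(gold, pred))
print(mean_grade_distance(gold, pred))
```

A distance measure complements accuracy because the Hill grades are ordered: confusing grade 2 with grade 3 is a smaller error than confusing grade 1 with grade 4, although both count as one erroneous prediction.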
Trial Locations
- Locations (1)
Universitätsklinikum Würzburg
Würzburg, Bayern, Germany