
Physician Response Evaluation With Contextual Insights vs. Standard Engines - Artificial Intelligence RAG vs LLM Clinical Decision Support

Phase
Not Applicable
Status
Recruiting
Conditions
Large Language Models
Registration Number
NCT07037940
Lead Sponsor
Montefiore Medical Center
Brief Summary

Clinical decision support tools powered by artificial intelligence are being rapidly integrated into medical practice. Two leading systems currently available to clinicians are OpenEvidence, which uses retrieval-augmented generation to access medical literature, and GPT-4, a large language model. While both tools show promise, their relative effectiveness in supporting clinical decision-making has not been directly compared. This study aims to evaluate how these tools influence diagnostic reasoning and management decisions among internal medicine physicians.

Detailed Description

Internal medicine attendings and residents are invited to participate in a study investigating how physicians using a RAG-based LLM (OpenEvidence) perform compared to those using a standard general-purpose LLM (ChatGPT) on both diagnostic reasoning and complex management decisions. As AI tools increasingly enter clinical practice, evidence is needed about which approaches best support physician decision-making. This study will help determine if specialized medical knowledge retrieval systems (OpenEvidence) provide advantages over general AI assistants (ChatGPT) when solving real clinical cases.
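To illustrate the distinction between the two study arms, the toy Python sketch below contrasts a retrieval-augmented workflow (retrieve supporting literature, then answer) with a direct general-purpose LLM query. It is purely illustrative and not part of the trial protocol; the LITERATURE corpus and the retrieve and call_llm functions are hypothetical stand-ins, not the interfaces of OpenEvidence or ChatGPT.

```python
# Illustrative sketch only: a toy contrast between the two workflows compared
# in this study. All names here are hypothetical stand-ins.

from typing import List

# Toy "literature" store standing in for the indexed evidence a RAG system
# would search before answering.
LITERATURE = [
    "Guideline excerpt A: first-line management of condition X ...",
    "Trial summary B: outcomes of therapy Y in population Z ...",
]

def retrieve(question: str, corpus: List[str], k: int = 2) -> List[str]:
    """Naive keyword-overlap retrieval standing in for a real retriever."""
    terms = set(question.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(terms & set(doc.lower().split())))
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a model API call; echoes the prompt for demonstration."""
    return f"[model answer conditioned on]: {prompt[:80]}..."

def answer_with_rag(question: str) -> str:
    """RAG workflow: retrieve evidence, then condition the answer on it."""
    evidence = retrieve(question, LITERATURE)
    prompt = "Evidence:\n" + "\n".join(evidence) + f"\n\nQuestion: {question}"
    return call_llm(prompt)

def answer_with_llm_only(question: str) -> str:
    """General-purpose LLM workflow: answer from parametric knowledge alone."""
    return call_llm(question)

if __name__ == "__main__":
    q = "What is first-line management of condition X?"
    print(answer_with_rag(q))
    print(answer_with_llm_only(q))
```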

Participants will complete one 90-minute Zoom session in which they will solve clinical cases derived from real, de-identified patient encounters. Participants will be randomly assigned to use either OpenEvidence or ChatGPT, and all responses will be evaluated by blinded scorers using a validated rubric.

Recruitment & Eligibility

Status
RECRUITING
Sex
All
Target Recruitment
56
Inclusion Criteria
  • Internal medicine residents
  • Internal medicine attending physicians
Exclusion Criteria
  • Not meeting Inclusion Criteria

Study & Design

Study Type
INTERVENTIONAL
Study Design
PARALLEL
Primary Outcome Measures
Name
Clinical Reasoning Performance as determined by Rater Scores
Time
15 minutes upon completion of cases, up to approximately 90 minutes total
Method

Clinical reasoning performance will be evaluated based on the rater scores assigned to participants' responses to the administered surveys. Six blinded, trained raters will independently score each participant's responses using a validated scoring rubric. Scores can range from 0 to 100%, with higher scores indicating better clinical reasoning performance. Results for each assessment will be summarized by study arm using basic descriptive statistics and analyzed using mixed-effects models to account for within-subject correlation and between-subject factors.
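As a rough illustration of this analysis plan, the minimal sketch below summarizes scores by arm and fits a mixed-effects model with a fixed effect for study arm and a random intercept per participant, using statsmodels on synthetic data. It is not the study's actual analysis code; the column names (score, arm, participant_id) and the model specification are assumptions.

```python
# Minimal sketch, not the trial's analysis code. Assumes a long-format table
# with one row per participant-case; column names are illustrative.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic stand-in data: 56 participants x 4 cases, scores on a 0-100% scale.
n_participants, n_cases = 56, 4
df = pd.DataFrame({
    "participant_id": np.repeat(np.arange(n_participants), n_cases),
    "arm": np.repeat(rng.choice(["OpenEvidence", "ChatGPT"], n_participants), n_cases),
    "case": np.tile(np.arange(n_cases), n_participants),
})
df["score"] = np.clip(rng.normal(70, 12, len(df)), 0, 100)

# Basic descriptive statistics by study arm.
print(df.groupby("arm")["score"].agg(["mean", "std", "count"]))

# Mixed-effects model: fixed effect of arm, random intercept per participant
# to account for within-subject correlation across cases.
model = smf.mixedlm("score ~ arm", data=df, groups=df["participant_id"])
result = model.fit()
print(result.summary())
```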

Secondary Outcome Measures
Name
Time efficiency
Time
Up to approximately 75 minutes
Method

Time efficiency will be assessed based on the amount of time participants take to complete the surveys. Each survey will be automatically time-stamped to record how long each participant needs to answer each case. Results for the virtual session will be summarized and analyzed by study arm using basic descriptive statistics.
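A minimal sketch of how per-case completion times could be derived from survey timestamps and summarized by arm is shown below; the column names and example timestamps are illustrative assumptions, not drawn from the study instruments.

```python
# Minimal sketch, assuming each survey logs a start and a submission timestamp
# per case; all column names and values here are illustrative.

import pandas as pd

logs = pd.DataFrame({
    "participant_id": [1, 1, 2, 2],
    "arm": ["OpenEvidence", "OpenEvidence", "ChatGPT", "ChatGPT"],
    "case": [1, 2, 1, 2],
    "started_at": pd.to_datetime([
        "2025-01-01 10:00:00", "2025-01-01 10:14:00",
        "2025-01-01 10:00:00", "2025-01-01 10:18:00",
    ]),
    "submitted_at": pd.to_datetime([
        "2025-01-01 10:12:30", "2025-01-01 10:27:00",
        "2025-01-01 10:16:00", "2025-01-01 10:31:00",
    ]),
})

# Time to complete each case, in minutes, derived from the survey timestamps.
logs["minutes"] = (logs["submitted_at"] - logs["started_at"]).dt.total_seconds() / 60

# Summarize time efficiency by study arm.
print(logs.groupby("arm")["minutes"].agg(["mean", "median", "std"]))
```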

Name
Decision confidence
Time
15 minutes upon completion of cases, up to approximately 90 minutes total
Method

Decision confidence will be assessed by asking participants to rate their confidence in each survey answer on a scale of 1 to 5 (1 = least confident, 5 = most confident), with higher scores indicating greater confidence in responses. Scores will be summarized by study arm using basic descriptive statistics.

Trial Locations

Locations (1)

MontefioreMC
🇺🇸 Bronx, New York, United States
Contact
Soaptarshi Paul
732-609-5130
paulsoaptarshi@gmail.com