The Big Unknown: A Journey Into Generative AI's Transformative Effect on Medical Professions
Overview
- Phase
- Not applicable
- Status
- Completed
- Enrollment
- 249
- Trial sites
- 3
- Primary endpoint
- Percentage Correct Score
Brief Summary
A parallel group randomized controlled trial using a superiority framework. Clinical vignettes will be used to assess the impact of a large language model on the clinical reasoning of physicians. Quantitative analyses will be performed on graded vignette responses.
Detailed Description
This study is a multi-country, parallel-group randomized controlled trial designed to evaluate whether access to a large language model (LLM) improves physician clinical decision-making. The trial uses a superiority framework and compares physicians randomized to complete standardized clinical vignettes either with access to GPT-4o or without any AI assistance.
Clinical vignettes simulate common primary care conditions such as cardiovascular, respiratory, musculoskeletal, fatigue-related, and infectious diseases. Each vignette includes multiple steps in the clinical reasoning process, from initial history-taking to diagnosis, treatment, and follow-up. Physician responses are graded using rubrics developed from evidence-based, context-specific best-practice guidelines.
The study is conducted across three countries (Indonesia, Kenya, and the Netherlands) that represent different income levels and health system contexts. The primary outcome is performance on clinical vignettes, defined as adherence to best-practice guidelines. Secondary objectives include examining cross-country variation in physician performance, variation in performance distributions, and the role of engagement with the LLM in shaping outcomes.
Study Design
- Study type
- Interventional
- Allocation
- Randomized
- Intervention model
- Parallel
- Primary purpose
- Diagnostic
- Masking
- Single (Outcomes Assessor)
Eligibility Criteria
- Sex
- All
- Accepts healthy volunteers
- Yes
Inclusion Criteria
- Registered medical physicians
- Training in internal or family medicine
Exclusion Criteria
- Not currently practicing clinically
Study Arms & Interventions
Own Knowledge
Group will not be given access to GPT-4 or other online resources
GPT-4o
Group given GPT-4o access
Intervention: GPT-4o (Other)
Outcome Measures
Primary Outcome
Percentage Correct Score
Time frame: During Evaluation
Following Peabody et al. (2000), the primary outcome is a percentage correct score across all steps in a vignette. This is generated by dividing the weighted total sum of rubric items assessed as present by the total number of rubric items possible in a vignette. Our expert panel will weight rubric items according to their relevance.
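The scoring rule described above can be illustrated with a minimal sketch. It is not the trial's actual scoring code. The `RubricItem` type, the field names, and the example weights are hypothetical. The record states that the weighted sum is divided by the total number of rubric items possible; the sketch assumes the denominator is the sum of the weights of all possible items, so that the score stays between 0 and 100. With every weight equal to 1, the two readings coincide.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RubricItem:
    """One rubric item for a vignette (hypothetical structure)."""
    name: str        # illustrative label, not taken from the trial rubrics
    weight: float    # relevance weight assigned by the expert panel
    present: bool    # True if the grader found the item in the response


def percentage_correct(items: Iterable[RubricItem]) -> float:
    """Return the weighted percentage correct score for one vignette.

    Assumption: the denominator is the total weight of all possible
    items, not a raw item count.
    """
    items = list(items)
    total_weight = sum(item.weight for item in items)
    if total_weight <= 0:
        raise ValueError("rubric must contain items with positive total weight")
    earned_weight = sum(item.weight for item in items if item.present)
    return 100.0 * earned_weight / total_weight


# Illustrative rubric: three items, one of them weighted as more relevant.
example = [
    RubricItem("asks about chest pain onset", weight=2.0, present=True),
    RubricItem("orders ECG", weight=1.0, present=False),
    RubricItem("schedules follow-up", weight=1.0, present=True),
]
score = percentage_correct(example)  # (2 + 1) / (2 + 1 + 1) * 100
print(score)
```

Normalizing by total weight rather than item count keeps scores comparable across vignettes whose rubrics differ in length or weighting; whether the trial normalizes this way is not stated in the record.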
Secondary Outcomes
- Quality Per Answer (During Evaluation)
- Number of Answers (During Evaluation)
- Less Obvious Answers (During Evaluation)