Can Feedback From a Large Language Model Improve Health Care Quality?
- Conditions
- All Conditions
- Registration Number
- NCT06823765
- Lead Sponsor
- Yale University
- Brief Summary
The goal of this study is to learn if computer-assisted advice can help improve patient care in Nigerian health clinics. The main question it aims to answer is: does giving healthcare workers instant computer feedback help them make better decisions about patient care?
Researchers will compare patient care notes written by healthcare workers before and after they receive computer feedback to see if the feedback improves care quality. A doctor who doesn't know if feedback was given will review these notes.
Participants will:
* Be seen by a community healthcare worker who uses the computer feedback system
* Be treated by a fully trained medical doctor
* Get tested for malaria, anemia, or urinary tract infections if they have certain symptoms
- Detailed Description
This project tests whether Large Language Models (LLMs) can improve patient care in Nigerian primary care clinics by giving customized and instant feedback to the provider in natural language. An LLM-based tool integrated into an electronic patient record management system provides "second opinions" to community health extension workers (CHEWs) at two clinics in Nigeria. These second opinions are intended to mirror what a reviewing physician might advise the CHEWs after seeing or hearing their initial report on a patient.
For the main analysis, this study employs a within-patient comparison of two patient notes created by the CHEW: one during the initial patient consultation, and one after the LLM feedback is received. The patient is also seen by a fully trained medical officer (MO) who is in charge of patient care. The MO conducts a blinded review of the CHEW's patient notes to measure changes in the CHEW's care as a result of the LLM feedback. The data come from the information captured in the electronic medical record (EMR) of the patient and from survey data collected from CHEWs, reviewing MOs, and a panel of reviewing medical doctors (MDs).
Recruitment & Eligibility
- Status
- RECRUITING
- Sex
- All
- Target Recruitment
- 500
- Patient is at the clinic for outpatient consultation
- Parent/guardian consent is required for individuals under 18
- Patient does not require emergency care
- Patient is not at the clinic for a checkup (e.g. weight, blood pressure, follow up after recovery)
- Patient is not a trauma patient (visit is not for an accident, wound or injury)
- Patient is not at the clinic for a scheduled procedure or a birth
Study & Design
- Study Type
- INTERVENTIONAL
- Study Design
- SINGLE_GROUP
- Primary Outcome Measures
Name: Indicator for an Error in the Treatment Plan (with the Potential for Harm)
Time: Through study completion, an average of six months
Method: During SOAP note evaluation, the MO is asked to indicate whether the treatment plan for the patient contains any errors, conditional on the MO's own diagnosis. This is coded as 1 if the MO indicates there is an error and 0 otherwise.
The introductory text (here for SOAP Note A) is: Please evaluate whether the treatment in SOAP Note A is appropriate for this patient's condition. Please base this on your own diagnosis, not the CHEW's diagnosis in SOAP Note A.
This is followed by the question: Is the treatment plan for the patient in SOAP Note A completely appropriate given your own diagnosis (accounting for conditional treatments based on medical tests)? Answer "No" if the patient should receive different medical care given your diagnosis. This can include both minor differences (for example, the patient should be advised to rest) and major errors (for example, the patient should receive a completely different set of medications). (Answer options: yes/no/unsure)
Name: Indicator for an Error in the Treatment Plan that Causes a Loss of at least X Quality-Adjusted Life Days
Time: Through study completion, an average of six months
Method: This variable is coded as 1 if the MO indicates there is such an error and 0 otherwise. X is defined to be the highest benchmark on the appropriate DALY scale so that at least 5% of patients have an error that large in the unassisted SOAP note. In other words, severe errors are any errors that generate a harm rating at or above the 95th percentile of harm on the unassisted scale (pooling child and adult scales).
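The threshold rule for severe errors stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the benchmark values, and the use of a single pooled list of ratings are hypothetical and are not taken from the study protocol.

```python
from typing import Optional, Sequence


def severe_error_threshold(
    unassisted_losses: Sequence[float],
    benchmarks: Sequence[float],
    min_share: float = 0.05,
) -> Optional[float]:
    """Return the highest benchmark b such that at least `min_share` of the
    unassisted notes have a rated loss >= b; None if no benchmark qualifies."""
    n = len(unassisted_losses)
    if n == 0:
        return None
    # Walk the benchmarks from largest to smallest and stop at the first
    # one that is reached by a large enough share of unassisted notes.
    for b in sorted(benchmarks, reverse=True):
        share = sum(loss >= b for loss in unassisted_losses) / n
        if share >= min_share:
            return b
    return None


def severe_error_indicator(loss: float, threshold: Optional[float]) -> int:
    """Code a note as 1 if its rated loss is at or above the threshold."""
    return int(threshold is not None and loss >= threshold)


# Hypothetical benchmark scale and ratings (errors-free notes count as 0).
BENCHMARKS = [1.0, 7.0, 30.0, 365.0]
unassisted = [0.0] * 15 + [1.0] * 3 + [30.0] * 2
x = severe_error_threshold(unassisted, BENCHMARKS)
flags = [severe_error_indicator(loss, x) for loss in unassisted]
```

Because the threshold is fixed from the unassisted ratings, the same value of X would then be applied to both the unassisted and the assisted note of each patient.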
Name: Indicator for the Better Treatment Plan (as Determined by the MOs)
Time: Through study completion, an average of six months
Method: Based on the DALY rating of SOAP Note A vs. B (counting instances with no errors as 0 DALY loss), the indicator is coded as 1 if the SOAP note has the better treatment plan (lower DALY loss) and 0 if MOs judge both notes to be the same in response to the following question: Are there any meaningful differences in the treatment plans of SOAP Note A and B?
Name: Indicator for whether Treatment is Consistent with a Predetermined "Standard of Care"
Time: Through study completion, an average of six months
Method: At-risk patients receive malaria, anemia and UTI screening in accordance with certain demographic criteria. A dataset is then constructed with one observation for each (patient, screening test, note), up to six per patient.
The indicator of treatment misallocation records whether a patient was incorrectly treated for a condition based on the test result or lack of symptoms. The variable is coded as 1 if the patient tested positive and received either inappropriate treatment or no treatment. It is also coded as 1 if the patient tested negative, or was not tested based on the symptom screen, but received treatment for the condition. The variable is coded as 0 only if the patient tested negative and was correctly not treated for the corresponding condition, or if they tested positive and received the correct treatment.
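The coding rule for treatment misallocation can be written out as a small function, applied once per (patient, screening test, note) observation. This is a minimal sketch, not the study's own code: the enum names and the function name are hypothetical. The description does not say how to code a patient who was not tested and not treated, so the sketch returns `None` for that case as an explicit assumption.

```python
from enum import Enum
from typing import Optional


class ScreenResult(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NOT_TESTED = "not_tested"  # screened out by the symptom screen


class Treatment(Enum):
    NONE = "none"
    APPROPRIATE = "appropriate"
    INAPPROPRIATE = "inappropriate"


def misallocation(result: ScreenResult, treatment: Treatment) -> Optional[int]:
    """Code one (patient, screening test, note) observation."""
    if result is ScreenResult.POSITIVE:
        # Positive test: only the correct treatment counts as 0.
        return 0 if treatment is Treatment.APPROPRIATE else 1
    if treatment is not Treatment.NONE:
        # Negative or untested, yet treated for the condition.
        return 1
    if result is ScreenResult.NEGATIVE:
        # Negative test and correctly left untreated.
        return 0
    # Not tested and not treated: not covered by the stated rule (assumption).
    return None
```

For example, `misallocation(ScreenResult.NEGATIVE, Treatment.APPROPRIATE)` is coded as 1, because any treatment for a condition the patient tested negative for counts as misallocation.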
- Secondary Outcome Measures
Name: Indicators Denoting Diagnosis and Treatment Alignment Between CHEWs and MOs
Time: Through study completion, an average of six months
Method: For each medication in the CHEW's treatment plan, there is a "clinical indication" (the diagnosis associated with the drug) along with an indicator that specifies if a given prescription is conditional on a medical test result. The research team will consider three indicators of a match:
* any match of the contents of the "clinical indication" field across medications;
* any match of the contents of the "medication" field across indications, including whether the medication is conditional on a test or not;
* a match of both medication and indication (and test conditionality).
Name: Alternative Indicators for Treatment Misallocation
Time: Through study completion, an average of six months
Method: The research team will construct the following indicators of treatment misallocation:
* Misallocation due to overprescription: a condition is treated that the patient is confirmed not to have
* Misallocation due to underprescription: a condition the patient is confirmed to have is not treated
* Misallocation due to incorrect dosing or drug choice: a condition the patient has is treated but the dosing or medication chosen is inappropriate.
Name: Relationship of QALY Loss to Severity of Patient Condition
Time: Through study completion, an average of six months
Method: In patients with only mild illnesses, the scope for QALY loss from mistakes may be limited relative to patients with more severe illnesses.
With this in mind, QALY loss is regressed on indicators for mild, moderate, and severe illnesses (as assessed by the MO), each interacted with the assisted note indicator, controlling for patient fixed effects. Results will be shown graphically.
Name: Indicators for the Appropriateness of Medical Testing Decisions
Time: Through study completion, an average of six months
Method: The potential misallocation of medical testing is operationalized in two ways:
1. For each test type, the research team will construct an indicator that is coded as 1 if the CHEW recommends conducting a test that turns out to be negative, and a second indicator that is 1 if the CHEW neglects to request a test that turns out to be positive.
2. The research team will construct an indicator at the level of (patient, test, note) that measures whether the CHEW and MO requested the same or a comparable medical test (e.g. the CHEW requested a malaria RDT whereas the MO requested a malaria blood smear).
Combining these indicators, a mismatch occurs if and only if either: i) a test was not requested by the CHEW but was positive, or ii) the test was requested by the CHEW but the result was negative and no equivalent test was ordered by the MO.
Name: Average and Distribution of DALY Lost
Time: Through study completion, an average of six months
Method: The effect of LLM assistance on DALY lost is measured directly rather than indirectly (as with the probability of an error, the probability of a severe error, and which note is the better note). The full distribution of DALY ratings for the assisted and unassisted notes will also be shown in the results.
Name: MO Evaluation of SOAP Notes: Deviations from the MO's SOAP
Time: Through study completion, an average of six months
Method: The MO is asked to assess for each SOAP note whether medical tests ordered were necessary or clinically useful, whether there are missing or incorrect/unnecessary diagnoses, and whether there are missing or incorrect/unnecessary treatment plan elements.
Name: MO Evaluation of SOAP Notes: Types of Harm Incurred
Time: Through study completion, an average of six months
Method: The MO is asked to assess any short-term harm (additional symptoms or discomfort for some period), and any long-term serious harm (risk of impairment, death etc.) from the treatment plan in the SOAP note.
Name: MO Evaluation of SOAP Notes: Measuring Healthy Time Lost in DALY
Time: Through study completion, an average of six months
Method: The MO also provides an overall rating that is intended to reflect the "healthy time lost" from any errors in treatment in the SOAP note. For each assessment and plan constructed by a CHEW (with or without LLM advice), an MO will assess the expected magnitude of healthy life that would be lost if the CHEW plan were implemented instead of the MO's plan.
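Because each patient has one unassisted and one assisted note rated for healthy time lost, the within-patient design lends itself to a paired comparison. The sketch below shows one simple summary, the mean within-patient change in rated loss. It is an assumption for illustration, not the study's stated estimator, and the function name and example ratings are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def mean_within_patient_change(pairs: Iterable[Tuple[float, float]]) -> float:
    """Mean of (assisted loss - unassisted loss) across patients.

    Each pair is (unassisted_loss, assisted_loss) for one patient, with
    notes that contain no error counted as a loss of 0. A negative value
    means the assisted notes were rated as losing less healthy time.
    """
    diffs = [assisted - unassisted for unassisted, assisted in pairs]
    if not diffs:
        raise ValueError("at least one patient pair is required")
    return mean(diffs)


# Hypothetical ratings for four patients: (unassisted, assisted).
ratings = [(2.0, 0.0), (0.0, 0.0), (5.0, 1.0), (1.0, 3.0)]
change = mean_within_patient_change(ratings)
```

Differencing within patient removes anything that is constant for a patient across the two notes, such as the severity of the underlying condition.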
Name: MD Evaluation of CHEW and MO Notes: Flagging MO Error
Time: Through study completion, an average of six months
Method: In a first step, the MDs will review the MO notes only and record whether there is any error in the diagnosis or treatment proposed in the conditional note or in the final note. If an error is identified, the MDs will rate the error by severity to distinguish medical mistakes from differences in opinion about a patient who is not present.
Name: MD Evaluation of CHEW and MO Notes: SOAP Note Rating
Time: Through study completion, an average of six months
Method: The MD is asked to assess any short-term harm (additional symptoms or discomfort for some period), and any long-term serious harm (risk of impairment, death etc.) from the treatment plan in the SOAP note.
The MD also provides an overall rating that is intended to reflect the "healthy time lost" from any errors in treatment in the SOAP note. For each assessment and plan constructed by a CHEW (with or without LLM advice), an MO will assess the expected magnitude of healthy life that would be lost if the CHEW plan were implemented instead of the MO's plan. Healthy time is measured in units of disability-adjusted life years (DALYs), which reflect both length and quality of life.
Name: MD Evaluation of CHEW and MO Notes: LLM Review
Time: Through study completion, an average of six months
Method: The MDs will also review the LLM feedback and answer the following questions:
* "Did the CHEW follow all, some, or none of the LLM recommendations?" If some or none: "Imagine the CHEW had followed all the recommendations of the LLM. Would the resulting treatment plan be an improvement over their assisted note?" (Yes/no) If yes: "Please explain."
* "Did the LLM make any mistakes?" (Yes/no) If yes: "Was any aspect of the CHEW's assisted treatment plan worse than the unassisted plan because the CHEW followed the LLM's erroneous recommendation?" If yes: "Please explain."
Name: Indicator for the Appropriateness of Triage Decisions
Time: Through study completion, an average of six months
Method: For each (patient, note), an indicator records whether the CHEW triage decision (an intent to triage indicated in the SOAP note) and the MO suggested triage decision align.
Trial Locations
- Locations (2)
EHA Clinics REACH Community Clinic, Gyadi Gyadi
🇳🇬Kano City, Kano State, Nigeria
EHA Clinics, 33 Lamido Crescent
🇳🇬Kano City, Kano State, Nigeria