Observational Study on AI Accuracy in Diagnosing and Treating Failed or Painful Hip Arthroplasty

Recruiting

Conditions: Total Hip Arthroplasty (THA)

Registration Number: NCT07012577

Lead Sponsor: Istituto Ortopedico Rizzoli

Brief Summary:

Primary Goal:

This study aims to evaluate the diagnostic and therapeutic accuracy of GPT-4 (an advanced AI language model) compared to three orthopedic surgeons with varying experience levels in cases of failed or painful total hip arthroplasty.

Key Research Questions:

Diagnostic Accuracy:

Does GPT-4 provide correct, partially correct, or incorrect diagnoses compared to human orthopedic surgeons?

Diagnostic Completeness:

Are GPT-4's diagnostic suggestions complete, partially complete, or incomplete compared to those of orthopedic surgeons?

Treatment Accuracy:

Does GPT-4 recommend correct, partially correct, or incorrect treatments for failed hip arthroplasty?

Treatment Completeness:

Are GPT-4's treatment recommendations complete, partially complete, or incomplete compared to those of orthopedic surgeons?

Study Design:

Participants:

20 anonymized patient cases (ages 18-80) with failed or painful hip arthroplasties, treated at IRCCS Istituto Ortopedico Rizzoli (Bologna, Italy) between 2004 and 2024.

Cases were selected based on clear diagnostic and treatment records (no ambiguous or incomplete data).

Comparison Groups:

GPT-4 (via ChatGPT interface)

Three orthopedic doctors (with different experience levels: resident, specialist, senior surgeon)

Method:

Each case (clinical summary + X-ray image) is presented to GPT-4 and the three doctors.

They must provide a diagnosis and treatment recommendations.

Two independent evaluators (principal investigator + department head) blindly assess responses for correctness and completeness using a 3-point scale (0=incorrect/incomplete, 1=imprecise/partially complete, 2=correct/complete).

Statistical analysis compares GPT-4 vs. human performance.

Expected Outcomes:

Determine if AI can match or outperform doctors in diagnosing and treating hip arthroplasty failures.

Assess whether GPT-4 could serve as a supplementary tool in orthopedic decision-making.

Ethical & Privacy Considerations:

No real-time patient data is used, only anonymized past cases.

No personal/sensitive data is shared with OpenAI (GPT-4 is used via a standard web interface).

Study complies with GDPR, HIPAA, and ethical AI guidelines.

Timeline:

Study duration: ~8 months (from ethics approval to final analysis).

Results will be published regardless of outcome.

Why This Study Matters:

First study evaluating GPT-4's role in complex orthopedic diagnostics.

Could influence future AI-assisted clinical decision-making in joint replacement surgeries.

Detailed Description: Not available

Recruitment & Eligibility

Status: RECRUITING

Sex: All

Target Recruitment: 20

Inclusion Criteria

Adults (≥18 and ≤80 years old).
Documented painful or failed total hip arthroplasty requiring clinical/radiological evaluation (2004-2024).
Complete pre-operative clinical history, imaging (X-ray/tomography), and surgical reports.
Clear diagnosis of failure mode (e.g., aseptic loosening, infection, fracture, wear).
Treatment and outcomes fully documented in the institutional database.
"Exemplary" cases with minimal diagnostic ambiguity (per Engh/Musculoskeletal Infection Society criteria, etc.).

Exclusion Criteria

Total hip arthroplasty with no documented failure/pain (well-functioning implants).
Incomplete clinical/radiological records (e.g., missing pre-operative imaging or surgical notes).
Complex/multifactorial failures (e.g., concurrent infection + loosening + fracture).
Radiographs/images non-interpretable (poor quality, missing views).
Cases with conflicting diagnoses/treatments in original records.

Study & Design

Study Type: OBSERVATIONAL

Study Design: Not specified

Primary Outcome Measures

Name	Time	Method
Diagnostic correctness	Immediate (post-case evaluation)	Proportion of fully correct diagnoses (score=2) by each rater. Scale 0 (worst outcome) - 2 (best outcome). 0: incorrect, 1: imprecise, 2: correct
Diagnostic completeness	Immediate (post-case evaluation)	Proportion of fully complete diagnoses (score=2). Scale 0 (worst outcome) - 2 (best outcome). 0: incomplete, 1: partially complete, 2: complete
Treatment recommendation correctness	Immediate (post-case evaluation)	Proportion of fully correct treatments (score=2) by each rater. Scale 0 (worst outcome) - 2 (best outcome). 0: incorrect, 1: imprecise, 2: correct
Treatment recommendation completeness	Immediate (post-case evaluation)	Proportion of fully complete treatments (score=2). Scale 0 (worst outcome) - 2 (best outcome). 0: incomplete, 1: partially complete, 2: complete
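
Illustrative sketch (not part of the registry record): each primary outcome above is a proportion of responses scored 2 on the 0-2 scale. The Python below shows one way such a proportion could be computed per respondent. The example scores, the respondent labels, and the function name `proportion_fully_correct` are hypothetical. The record does not say how the two evaluators' scores are combined, so the sketch assumes a single agreed score per case and respondent.

```python
from collections import defaultdict

# Hypothetical data: (case_id, respondent, score) with score on the 0-2 scale.
# Assumes one agreed score per case and respondent; the registry record does
# not describe how the two evaluators' ratings are reconciled.
EXAMPLE_SCORES = [
    (1, "GPT-4", 2), (1, "resident", 2), (1, "senior surgeon", 2),
    (2, "GPT-4", 2), (2, "resident", 1), (2, "senior surgeon", 2),
    (3, "GPT-4", 1), (3, "resident", 1), (3, "senior surgeon", 2),
    (4, "GPT-4", 0), (4, "resident", 1), (4, "senior surgeon", 1),
]


def proportion_fully_correct(scores):
    """Return, per respondent, the share of cases that received a score of 2."""
    totals = defaultdict(int)  # cases scored per respondent
    full = defaultdict(int)    # cases with score == 2 per respondent
    for _case_id, respondent, score in scores:
        if score not in (0, 1, 2):
            raise ValueError(f"score must be 0, 1 or 2, got {score!r}")
        totals[respondent] += 1
        if score == 2:
            full[respondent] += 1
    return {respondent: full[respondent] / totals[respondent] for respondent in totals}


if __name__ == "__main__":
    for respondent, share in proportion_fully_correct(EXAMPLE_SCORES).items():
        print(f"{respondent}: {share:.2f}")
```

The same function would apply unchanged to the completeness and treatment outcomes, since all four measures use the same 0-2 scale and the same "score=2" criterion.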

Secondary Outcome Measures

Not specified

Trial Locations

Locations (1): SC Ortopedia e Traumatologia e Chirurgia Protesica e dei Reimpianti di Anca e Ginocchio, IRCCS Istituto Ortopedico Rizzoli
Bologna, Italy
Contact: Francesco Castagnini, MD
Phone: +390516366418
Email: francescocastagnini@hotmail.it

