Observational Study on AI Accuracy in Diagnosing and Treating Failed or Painful Hip Arthroplasty
NCT ID: NCT07012577
Last Updated: 2025-06-18
Study Results
The study team has not yet published outcome measurements, participant flow, or safety data for this trial.
Basic Information
RECRUITING
20 participants
OBSERVATIONAL
2025-05-31
2025-07-01
Brief Summary
This study aims to evaluate the diagnostic and therapeutic accuracy of GPT-4 (an advanced AI language model) compared to three orthopedic surgeons with varying experience levels in cases of failed or painful total hip arthroplasty.
Key Research Questions:
Diagnostic Accuracy:
Does GPT-4 provide correct, partially correct, or incorrect diagnoses compared to human orthopedic surgeons?
Diagnostic Completeness:
Are GPT-4's diagnostic suggestions complete, partially complete, or incomplete compared to those of orthopedic surgeons?
Treatment Accuracy:
Does GPT-4 recommend correct, partially correct, or incorrect treatments for failed hip arthroplasty?
Treatment Completeness:
Are GPT-4's treatment recommendations fully comprehensive, partially complete, or incomplete compared to those of orthopedic surgeons?
Study Design:
Participants:
20 anonymized patient cases (ages 18-80) with failed or painful hip arthroplasties, treated at IRCCS Istituto Ortopedico Rizzoli (Bologna, Italy) between 2004 and 2024.
Cases were selected based on clear diagnostic and treatment records (no ambiguous or incomplete data).
Comparison Groups:
GPT-4 (via ChatGPT interface)
Three orthopedic doctors (with different experience levels: resident, specialist, senior surgeon)
Method:
Each case (clinical summary + X-ray image) is presented to GPT-4 and the three doctors.
They must provide a diagnosis and treatment recommendations.
Two independent evaluators (principal investigator + department head) blindly assess responses for correctness and completeness using a 3-point scale (0=wrong/incomplete, 1=partially correct/partially complete, 2=correct/complete).
Statistical analysis compares GPT-4 vs. human performance.
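The scoring step described above can be sketched in code. This is a minimal illustration only: the record does not name the statistical tests used, so the choice of a mean score per respondent and an unweighted Cohen's kappa for agreement between the two evaluators is an assumption, as are the function names `mean_score` and `cohen_kappa`.

```python
from statistics import mean

# 3-point scale from the study summary. Reading 1 as "partially
# correct/partially complete" is an assumption based on the research questions.
VALID_SCORES = (0, 1, 2)


def mean_score(scores):
    """Average a respondent's 0-2 ratings after validating each value."""
    for s in scores:
        if s not in VALID_SCORES:
            raise ValueError(f"score {s!r} is outside the 0-2 scale")
    return mean(scores)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same responses.

    Hypothetical choice of agreement statistic; the registry entry does not
    say how agreement between the two evaluators is measured.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must score the same non-empty set")
    # Proportion of responses on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(
        (rater_a.count(k) / n) * (rater_b.count(k) / n) for k in VALID_SCORES
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)
```

Each respondent (GPT-4 or one of the three doctors) would have one list of ratings per evaluator and per criterion, so `mean_score` can be compared across respondents and `cohen_kappa` checked per criterion.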
Expected Outcomes:
Determine if AI can match or outperform doctors in diagnosing and treating hip arthroplasty failures.
Assess whether GPT-4 could serve as a supplementary tool in orthopedic decision-making.
Ethical & Privacy Considerations:
No real-time patient data is used, only anonymized past cases.
No personal/sensitive data is shared with OpenAI (GPT-4 is used via a standard web interface).
Study complies with GDPR, HIPAA, and ethical AI guidelines.
Timeline:
Study duration: ~8 months (from ethics approval to final analysis).
Results will be published regardless of outcome.
Why This Study Matters:
First study evaluating GPT-4's role in complex orthopedic diagnostics.
Could influence future AI-assisted clinical decision-making in joint replacement surgeries.
Related Clinical Trials
Pending Failure in Hard-hard Total Hip Arthroplasty
NCT02427984
A Short Metaphyseal Fitting Total Hip Arthroplasty in Young and Elderly Patients
NCT01345097
AI-assisted Preoperative Planning Technology for THA for DDH
NCT05929105
Kinematics and Muscle Strength in Two, Five or 10 Years After Total Hip Arthroplasty
NCT04214171
Patient Scores and Functional Tests After Hip Surgery
NCT07048041
Study Design
Observational Model: COHORT
Time Perspective: RETROSPECTIVE
Study Groups
Failed or Painful Total Hip Arthroplasty Patients
Patients with documented failed/painful THA (aseptic loosening, infection, fracture, etc.) selected from a tertiary center database (2004-2024).
GPT-4 Assessment
Diagnostic/Prognostic evaluation of any single case provided by AI (GPT-4). GPT-4 provides diagnosis/treatment recommendations via standardized prompts.
Arthroplasty Fellow Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Specializing Resident (4th year) Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Junior Resident (3rd year) Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Interventions
GPT-4 Assessment
Diagnostic/Prognostic evaluation of any single case provided by AI (GPT-4). GPT-4 provides diagnosis/treatment recommendations via standardized prompts.
Arthroplasty Fellow Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Specializing Resident (4th year) Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Junior Resident (3rd year) Assessment
Diagnostic/Prognostic evaluation of any single case provided by a human expert
Eligibility Criteria
Inclusion Criteria
* Documented painful or failed total hip arthroplasty requiring clinical/radiological evaluation (2004-2024).
* Complete pre-operative clinical history, imaging (X-ray/tomography), and surgical reports.
* Clear diagnosis of failure mode (e.g., aseptic loosening, infection, fracture, wear).
* Treatment and outcomes fully documented in the institutional database.
* "Exemplary" cases with minimal diagnostic ambiguity (per Engh/Musculoskeletal Infection Society criteria, etc.).
Exclusion Criteria
* Incomplete clinical/radiological records (e.g., missing pre-operative imaging or surgical notes).
* Complex/multifactorial failures (e.g., concurrent infection + loosening + fracture).
* Radiographs/images non-interpretable (poor quality, missing views).
* Cases with conflicting diagnoses/treatments in original records.
Minimum Age: 18 Years
Maximum Age: 80 Years
Sex: ALL
Healthy Volunteers: No
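The inclusion and exclusion rules above amount to a screening predicate over each archived case. The sketch below is illustrative only: the `CaseRecord` layout and every field name are hypothetical (the registry entry does not describe the institutional database), and treating "clear diagnosis of failure mode" as exactly one recorded failure mode is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CaseRecord:
    """Hypothetical case layout; field names are not taken from the protocol."""
    age: int
    evaluation_year: int
    has_clinical_history: bool
    has_imaging: bool
    has_surgical_report: bool
    outcomes_documented: bool
    images_interpretable: bool
    conflicting_records: bool
    failure_modes: list[str] = field(default_factory=list)


def is_eligible(case: CaseRecord) -> bool:
    """Return True when a case passes every listed criterion."""
    # Age limits and evaluation window stated in the record.
    in_age_range = 18 <= case.age <= 80
    in_period = 2004 <= case.evaluation_year <= 2024
    # Inclusion: complete history, imaging, surgical report, and outcomes.
    records_complete = (
        case.has_clinical_history
        and case.has_imaging
        and case.has_surgical_report
        and case.outcomes_documented
    )
    # Assumption: a single failure mode stands in for "clear diagnosis" and
    # rules out complex/multifactorial failures.
    single_failure_mode = len(case.failure_modes) == 1
    return (
        in_age_range
        and in_period
        and records_complete
        and single_failure_mode
        and case.images_interpretable
        and not case.conflicting_records
    )
```

A real screening would still need clinical judgment for the "exemplary case" criterion, which a boolean check like this cannot capture.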
Sponsors
Istituto Ortopedico Rizzoli
OTHER
Responsible Party
Francesco Castagnini
Principal investigator
Principal Investigators
Francesco Castagnini, MD
Role: PRINCIPAL_INVESTIGATOR
IRCCS Istituto Ortopedico Rizzoli
Locations
SC Ortopedia e Traumatologia e Chirurgia Protesica e dei Reimpianti di Anca e Ginocchio, IRCCS Istituto Ortopedico Rizzoli
Bologna, Italy
References
Knee CJ, Campbell RJ, Graham DJ, Handford C, Symes M, Sivakumar BS. Examining the role of ChatGPT in the management of distal radius fractures: insights into its accuracy and consistency. ANZ J Surg. 2024 Jul-Aug;94(7-8):1391-1396. doi: 10.1111/ans.19143. Epub 2024 Jul 5.
Dagher T, Dwyer EP, Baker HP, Kalidoss S, Strelzow JA. "Dr. AI Will See You Now": How Do ChatGPT-4 Treatment Recommendations Align With Orthopaedic Clinical Practice Guidelines? Clin Orthop Relat Res. 2024 Dec 1;482(12):2098-2106. doi: 10.1097/CORR.0000000000003234. Epub 2024 Sep 6.
Artioli E, Veronesi F, Mazzotti A, Brogini S, Zielli SO, Giavaresi G, Faldini C. Assessing ChatGPT responses to common patient questions regarding total ankle arthroplasty. J Exp Orthop. 2024 Dec 31;12(1):e70138. doi: 10.1002/jeo2.70138. eCollection 2025 Jan.
Pagano S, Strumolo L, Michalk K, Schiegl J, Pulido LC, Reinhard J, Maderbacher G, Renkawitz T, Schuster M. Evaluating ChatGPT, Gemini and other Large Language Models (LLMs) in orthopaedic diagnostics: A prospective clinical study. Comput Struct Biotechnol J. 2024 Dec 26;28:9-15. doi: 10.1016/j.csbj.2024.12.013. eCollection 2025.
Provided Documents
Document Type: Study Protocol and Statistical Analysis Plan
Other Identifiers
203/2025/Oss/IOR
Identifier Type: -
Identifier Source: org_study_id