Observational Study on AI Accuracy in Diagnosing and Treating Failed or Painful Hip Arthroplasty

NCT ID: NCT07012577

Last Updated: 2025-06-18

Study Results

Results pending

The study team has not yet published outcome measurements, participant flow, or safety data for this trial.

Basic Information

Get a concise snapshot of the trial, including recruitment status, study phase, enrollment targets, and key timeline milestones.

Recruitment Status

RECRUITING

Total Enrollment

20 participants

Study Classification

OBSERVATIONAL

Study Start Date

2025-05-31

Study Completion Date

2025-07-01

Brief Summary

Review the sponsor-provided synopsis that highlights what the study is about and why it is being conducted.

Primary Goal:

This study aims to evaluate the diagnostic and therapeutic accuracy of GPT-4 (an advanced AI language model) compared to three orthopedic surgeons with varying experience levels in cases of failed or painful total hip arthroplasty.

Key Research Questions:

Diagnostic Accuracy:

Does GPT-4 provide correct, partially correct, or incorrect diagnoses compared to human orthopedic surgeons?

Diagnostic Completeness:

Are GPT-4's diagnostic suggestions complete, partially complete, or incomplete compared to those of orthopedic surgeons?

Treatment Accuracy:

Does GPT-4 recommend correct, partially correct, or incorrect treatments for failed hip arthroplasty?

Treatment Completeness:

Are GPT-4's treatment recommendations fully comprehensive, partially complete, or incomplete compared to those of orthopedic surgeons?

Study Design:

Participants:

20 anonymized patient cases (ages 18-80) with failed or painful hip arthroplasties, treated at IRCCS Istituto Ortopedico Rizzoli (Bologna, Italy) between 2004 and 2024.

Cases were selected based on clear diagnostic and treatment records (no ambiguous or incomplete data).

Comparison Groups:

GPT-4 (via ChatGPT interface)

Three orthopedic doctors (with different experience levels: resident, specialist, senior surgeon)

Method:

Each case (clinical summary + X-ray image) is presented to GPT-4 and the three doctors.

They must provide a diagnosis and treatment recommendations.

Two independent evaluators (principal investigator + department head) blindly assess responses for correctness and completeness using a 3-point scale (0 = wrong/incomplete, 1 = partially correct/partially complete, 2 = correct/complete).

Statistical analysis compares GPT-4 vs. human performance.
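The scoring step above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the tuple layout, the dimension names, and the responder labels are hypothetical, and the middle rubric value (1 = partial) is inferred from the "partially correct / partially complete" wording of the research questions. The record does not name a statistical test, so the sketch stops at per-responder mean scores.

```python
from collections import defaultdict
from statistics import mean

# Rubric values; 0 and 2 come from the record, 1 (partial) is an assumption
# based on the wording of the research questions.
RUBRIC = {0: "wrong/incomplete", 1: "partial (assumed)", 2: "correct/complete"}

# Hypothetical names for the four rated dimensions.
DIMENSIONS = (
    "diagnostic_accuracy",
    "diagnostic_completeness",
    "treatment_accuracy",
    "treatment_completeness",
)


def mean_scores(ratings):
    """Average rubric scores per (responder, dimension).

    `ratings` is an iterable of
    (case_id, responder, evaluator, dimension, score) tuples.
    This layout is hypothetical, not taken from the study protocol.
    """
    buckets = defaultdict(list)
    for case_id, responder, evaluator, dimension, score in ratings:
        if score not in RUBRIC:
            raise ValueError(f"score {score!r} is outside the 0-2 rubric")
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension {dimension!r}")
        buckets[(responder, dimension)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up ratings for illustration only; these are not study data.
example = [
    ("case01", "GPT-4", "evaluator_A", "diagnostic_accuracy", 2),
    ("case01", "GPT-4", "evaluator_B", "diagnostic_accuracy", 1),
    ("case01", "fellow", "evaluator_A", "diagnostic_accuracy", 2),
    ("case01", "fellow", "evaluator_B", "diagnostic_accuracy", 2),
]
summary = mean_scores(example)
print(summary)
```

Averaging the two evaluators' scores is only one possible way to combine them; the protocol may handle disagreements differently.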

Expected Outcomes:

Determine if AI can match or outperform doctors in diagnosing and treating hip arthroplasty failures.

Assess whether GPT-4 could serve as a supplementary tool in orthopedic decision-making.

Ethical & Privacy Considerations:

No real-time patient data is used, only anonymized past cases.

No personal/sensitive data is shared with OpenAI (GPT-4 is used via a standard web interface).

Study complies with GDPR, HIPAA, and ethical AI guidelines.

Timeline:

Study duration: ~8 months (from ethics approval to final analysis).

Results will be published regardless of outcome.

Why This Study Matters:

First study evaluating GPT-4's role in complex orthopedic diagnostics.

Could influence future AI-assisted clinical decision-making in joint replacement surgeries.

Conditions

See the medical conditions and disease areas that this research is targeting or investigating.

Total Hip Arthroplasty (THA)

Study Design

Understand how the trial is structured, including allocation methods, masking strategies, primary purpose, and other design elements.

Observational Model Type

COHORT

Study Time Perspective

RETROSPECTIVE

Study Groups

Review each arm or cohort in the study, along with the interventions and objectives associated with them.

Failed or Painful Total Hip Arthroplasty Patients

Patients with documented failed/painful THA (aseptic loosening, infection, fracture, etc.) selected from a tertiary center database (2004-2024).

GPT-4 Assessment

Intervention Type OTHER

Diagnostic/Prognostic evaluation of any single case provided by AI (GPT-4). GPT-4 provides diagnosis/treatment recommendations via standardized prompts.

Arthroplasty Fellow Assessment

Intervention Type OTHER

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Specializing Resident (4th year) Assessment

Intervention Type OTHER

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Junior Resident (3rd year) Assessment

Intervention Type OTHER

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Interventions

Learn about the drugs, procedures, or behavioral strategies being tested and how they are applied within this trial.

GPT-4 Assessment

Diagnostic/Prognostic evaluation of any single case provided by AI (GPT-4). GPT-4 provides diagnosis/treatment recommendations via standardized prompts.

Intervention Type OTHER

Arthroplasty Fellow Assessment

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Intervention Type OTHER

Specializing Resident (4th year) Assessment

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Intervention Type OTHER

Junior Resident (3rd year) Assessment

Diagnostic/Prognostic evaluation of any single case provided by a human expert.

Intervention Type OTHER

Eligibility Criteria

Check the participation requirements, including inclusion and exclusion rules, age limits, and whether healthy volunteers are accepted.

Inclusion Criteria

* Adults (≥18 and ≤80 years old).
* Documented painful or failed total hip arthroplasty requiring clinical/radiological evaluation (2004-2024).
* Complete pre-operative clinical history, imaging (X-ray/tomography), and surgical reports.
* Clear diagnosis of failure mode (e.g., aseptic loosening, infection, fracture, wear).
* Treatment and outcomes fully documented in the institutional database.
* "Exemplary" cases with minimal diagnostic ambiguity (per Engh/Musculoskeletal Infection Society criteria, etc.).

Exclusion Criteria

* Total hip arthroplasty with no documented failure/pain (well-functioning implants).
* Incomplete clinical/radiological records (e.g., missing pre-operative imaging or surgical notes).
* Complex/multifactorial failures (e.g., concurrent infection + loosening + fracture).
* Radiographs/images non-interpretable (poor quality, missing views).
* Cases with conflicting diagnoses/treatments in original records.

Minimum Eligible Age

18 Years

Maximum Eligible Age

80 Years

Eligible Sex

ALL

Accepts Healthy Volunteers

No
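
The inclusion and exclusion criteria above can be read as a screening checklist applied to each archived case. The sketch below shows one way to encode that checklist; it is an assumption-laden illustration, not the investigators' selection procedure. Every field name in `CaseRecord` is hypothetical, and judgments such as "exemplary case" or "clear failure mode" are reduced to booleans that a clinician would have to set.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    # All field names are hypothetical; they mirror the listed criteria.
    age: int
    year_of_evaluation: int
    documented_failure_or_pain: bool
    records_complete: bool
    single_clear_failure_mode: bool
    images_interpretable: bool
    conflicting_records: bool


def exclusion_reasons(case):
    """Return the reasons a case fails screening (empty list means eligible)."""
    reasons = []
    if not 18 <= case.age <= 80:
        reasons.append("age outside 18-80")
    if not 2004 <= case.year_of_evaluation <= 2024:
        reasons.append("outside 2004-2024")
    if not case.documented_failure_or_pain:
        reasons.append("no documented failure or pain")
    if not case.records_complete:
        reasons.append("incomplete clinical/radiological records")
    if not case.single_clear_failure_mode:
        reasons.append("complex or ambiguous failure mode")
    if not case.images_interpretable:
        reasons.append("non-interpretable images")
    if case.conflicting_records:
        reasons.append("conflicting diagnoses/treatments")
    return reasons


# Made-up records for illustration only; these are not study data.
eligible = CaseRecord(64, 2015, True, True, True, True, False)
too_old = CaseRecord(85, 2015, True, True, True, True, False)
print(exclusion_reasons(eligible))  # prints []
print(exclusion_reasons(too_old))   # prints ['age outside 18-80']
```

Returning the list of reasons, rather than a single yes/no flag, keeps a record of why each case was screened out.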

Sponsors

Meet the organizations funding or collaborating on the study and learn about their roles.

Istituto Ortopedico Rizzoli

OTHER

Sponsor Role lead

Responsible Party

Identify the individual or organization who holds primary responsibility for the study information submitted to regulators.

Francesco Castagnini

Principal investigator

Responsibility Role PRINCIPAL_INVESTIGATOR

Principal Investigators

Learn about the lead researchers overseeing the trial and their institutional affiliations.

Francesco Castagnini, MD

Role: PRINCIPAL_INVESTIGATOR

IRCCS Istituto Ortopedico Rizzoli

Locations

Explore where the study is taking place and check the recruitment status at each participating site.

SC Ortopedia e Traumatologia e Chirurgia Protesica e dei Reimpianti di Anca e Ginocchio, IRCCS Istituto Ortopedico Rizzoli

Bologna, Italy

Site Status RECRUITING

Countries

Review the countries where the study has at least one active or historical site.

Italy

Central Contacts

Reach out to these primary contacts for questions about participation or study logistics.

Francesco Castagnini, MD

Role: CONTACT

+390516366418

Facility Contacts

Find local site contact details for specific facilities participating in the trial.

Francesco Castagnini, MD

Role: primary

+390516366418

References

Explore related publications, articles, or registry entries linked to this study.

Knee CJ, Campbell RJ, Graham DJ, Handford C, Symes M, Sivakumar BS. Examining the role of ChatGPT in the management of distal radius fractures: insights into its accuracy and consistency. ANZ J Surg. 2024 Jul-Aug;94(7-8):1391-1396. doi: 10.1111/ans.19143. Epub 2024 Jul 5.

Reference Type BACKGROUND
PMID: 38967407

Dagher T, Dwyer EP, Baker HP, Kalidoss S, Strelzow JA. "Dr. AI Will See You Now": How Do ChatGPT-4 Treatment Recommendations Align With Orthopaedic Clinical Practice Guidelines? Clin Orthop Relat Res. 2024 Dec 1;482(12):2098-2106. doi: 10.1097/CORR.0000000000003234. Epub 2024 Sep 6.

Reference Type BACKGROUND
PMID: 39246048

Artioli E, Veronesi F, Mazzotti A, Brogini S, Zielli SO, Giavaresi G, Faldini C. Assessing ChatGPT responses to common patient questions regarding total ankle arthroplasty. J Exp Orthop. 2024 Dec 31;12(1):e70138. doi: 10.1002/jeo2.70138. eCollection 2025 Jan.

Reference Type BACKGROUND
PMID: 39741912

Pagano S, Strumolo L, Michalk K, Schiegl J, Pulido LC, Reinhard J, Maderbacher G, Renkawitz T, Schuster M. Evaluating ChatGPT, Gemini and other Large Language Models (LLMs) in orthopaedic diagnostics: A prospective clinical study. Comput Struct Biotechnol J. 2024 Dec 26;28:9-15. doi: 10.1016/j.csbj.2024.12.013. eCollection 2025.

Reference Type BACKGROUND
PMID: 39850460

Provided Documents

Download supplemental materials such as informed consent forms, study protocols, or participant manuals.

Document Type: Study Protocol and Statistical Analysis Plan

Other Identifiers

Review additional registry numbers or institutional identifiers associated with this trial.

203/2025/Oss/IOR

Identifier Type: -

Identifier Source: org_study_id
