Diagnostic Accuracy of Carebot AI MMG in Mammography Screening: Multicenter MRMC Study

Study Results

Results pending

The study team has not yet published outcome measurements, participant flow, or safety data for this trial.

Basic Information

Recruitment Status

COMPLETED

Total Enrollment

222 participants

Study Classification

OBSERVATIONAL

Study Start Date

2025-01-01

Study Completion Date

2025-11-03

Brief Summary

This study evaluates the diagnostic performance of Carebot AI MMG, an artificial intelligence (AI)-enabled medical device for evaluating mammograms. The software analyzes standard full-field digital mammography (FFDM) images and classifies each examination as having no suspicious finding ("Low Risk"), a probably benign mass ("Medium Risk"), or a suspicious malignant mass ("High Risk").

The study is retrospective and observational. It uses anonymized mammography examinations from four screening centers, without any additional imaging or contact with patients. Three experienced breast radiologists independently read the same set of cases, and their assessments are used as the human benchmark. A histopathology-based reference standard, supplemented by radiologist consensus and follow-up information for negative cases, is used to determine whether cancer is present.

The main goal is to compare the AI system with human radiologists in terms of sensitivity and specificity for detecting breast cancer, and to assess whether the AI can achieve non-inferior performance at two predefined operating points: one favoring higher sensitivity and negative predictive value (rule-out) and one favoring higher specificity and positive predictive value (rule-in).

Detailed Description

Design and setting: This is a retrospective, multicenter, multi-reader, multi-case (MRMC) diagnostic accuracy study of Carebot AI MMG, conducted on anonymized 2D full-field digital mammography (FFDM) examinations acquired as part of routine breast cancer screening. Mammograms were collected from four screening centers over a defined time period. No additional imaging was performed for the purpose of this study, and no subjects were contacted.

Data source and population: The source dataset consists of 4,729 screening mammography examinations from women aged 32 to 88 years (mean approximately 57 years). Only 2D FFDM studies with a complete set of standard projections (LCC, RCC, LMLO, RMLO) were included. Examinations with incomplete series, unreadable or corrupted DICOM files, or missing/inconsistent key metadata were excluded, as were tomosynthesis (DBT) studies and examinations of men or of women under 18 years of age. To ensure sufficient precision of performance estimates, the dataset was enriched with additional biopsy-proven cancers. The final analytical subset comprises 222 examinations, including 48 malignant and 174 non-malignant studies, with representation across three mammography devices (Hologic Selenia Dimensions, Hologic Lorad Selenia, Fujifilm FDR-3000AWS).

Investigational device and comparator: The investigational device is Carebot AI MMG (software version 2.9, deep-learning models v2.3), a stand-alone AI system that analyzes 2D FFDM exams and outputs a case-level classification into three categories, together with an internal risk score. Two predefined operating points are evaluated: a high-sensitivity (HSe) threshold, where both benign and malignant masses are treated as "positive" (rule-out setting), and a high-specificity (HSp) threshold, where only malignant masses are counted as "positive" (rule-in setting).
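The two operating points can be read as two ways of collapsing the three-category output into a binary decision. The following is a minimal sketch of that reading, not the device's implementation: it assumes the thresholds act on the category label rather than on the internal risk score, and the names `Category` and `is_positive` are hypothetical.

```python
from enum import Enum


class Category(Enum):
    """Three case-level output categories named in the Brief Summary."""
    LOW = "Low Risk"        # no suspicious finding
    MEDIUM = "Medium Risk"  # probably benign mass
    HIGH = "High Risk"      # suspicious malignant mass


def is_positive(category: Category, operating_point: str) -> bool:
    """Collapse a three-category result into a binary decision.

    HSe (rule-out): benign and malignant masses both count as positive.
    HSp (rule-in): only malignant masses count as positive.
    """
    if operating_point == "HSe":
        return category in (Category.MEDIUM, Category.HIGH)
    if operating_point == "HSp":
        return category is Category.HIGH
    raise ValueError(f"unknown operating point: {operating_point!r}")
```

Under this reading, a "Medium Risk" case is the only one whose binary decision differs between the two operating points.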

As a human comparator, three experienced radiologists (RAD 1-3) independently read the same anonymized studies using a dedicated DICOM viewer integrated with a labeling application. Radiologists were blinded to AI outputs, clinical information, and outcomes and recorded a case-level classification into the same three categories (Negative/Benign/Malignant). For primary analyses, their binary decisions are derived using the same HSe and HSp rules. In addition, a "random reader" benchmark is constructed for balanced accuracy by repeatedly sampling one radiologist's decision per case in a bootstrap framework (20,000 iterations).
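The "random reader" construction described above might look roughly like the sketch below. It is an illustration under stated assumptions: the record does not say whether cases are also resampled, so this version only resamples which radiologist's decision is used for each case, and the function names `balanced_accuracy` and `random_reader_ba` are hypothetical.

```python
import random


def balanced_accuracy(truth, decisions):
    """Mean of sensitivity and specificity for binary truth/decision lists."""
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    tp = sum(1 for t, d in zip(truth, decisions) if t and d)
    tn = sum(1 for t, d in zip(truth, decisions) if not t and not d)
    return 0.5 * (tp / n_pos + tn / n_neg)


def random_reader_ba(truth, reader_decisions, n_iter=20_000, seed=0):
    """Distribution of balanced accuracy for a 'random reader'.

    reader_decisions holds one list of binary decisions per radiologist,
    all in the same case order as truth. In every iteration, one
    radiologist is drawn independently for each case.
    """
    rng = random.Random(seed)
    n_cases = len(truth)
    samples = []
    for _ in range(n_iter):
        drawn = [rng.choice(reader_decisions)[i] for i in range(n_cases)]
        samples.append(balanced_accuracy(truth, drawn))
    return samples
```

The returned list can then be summarized (for example by its mean and percentiles) to serve as the human benchmark for balanced accuracy.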

Reference standard: The reference standard is established at the study level. A case is labeled "Malignant" if there is histopathological confirmation of breast cancer from biopsy performed in temporal association with the index mammogram. A case is labeled "Non-malignant" if there is consensus between two local radiologists that the finding is negative or stably benign, typically corroborated by at least 2 years of imaging follow-up. Tumor staging (e.g., TNM) is not used in the present analysis. All 48 malignant cases from the participating centers are included in the analytical subset; no cancer-positive examinations were excluded.
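The labeling rule above can be expressed as a small decision function. This is a sketch only: the function name and its three boolean inputs are hypothetical, and the follow-up requirement is left out because the record describes it as typical rather than mandatory.

```python
from typing import Optional


def reference_label(
    biopsy_confirmed_cancer: bool,
    reader_a_nonmalignant: bool,
    reader_b_nonmalignant: bool,
) -> Optional[str]:
    """Return the reference-standard label for one examination.

    Histopathological confirmation takes precedence. Without it, both
    local radiologists must agree that the finding is negative or stably
    benign. None means the case has no usable reference label.
    """
    if biopsy_confirmed_cancer:
        return "Malignant"
    if reader_a_nonmalignant and reader_b_nonmalignant:
        return "Non-malignant"
    return None
```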

Objectives and endpoints: The primary objectives are: (1) to demonstrate that the balanced accuracy (BA) of Carebot AI MMG is at least 0.80 at both the HSe and HSp operating points; (2) to demonstrate non-inferiority of the AI's balanced accuracy compared with the MRMC "random reader" benchmark with a non-inferiority margin of 0.05; and (3) to demonstrate non-inferiority of sensitivity (Se) of AI versus each of the three radiologists at both HSe and HSp, with a non-inferiority margin of 0.07. Secondary objectives are to describe specificity (Sp), positive predictive value (PPV), negative predictive value (NPV), and to characterize patterns of false-negative and false-positive decisions and their potential implications for clinical risk management.
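Written out, the three primary objectives correspond to the criteria below. The record gives the target value and the margins but not the formal hypotheses, so the one-sided formulation and the sign convention here are the conventional ones and should be read as an assumption. The subscripts follow the record's own notation (BA_AI, BA_random, RAD 1-3).

```latex
\begin{align*}
\mathrm{BA} &= \tfrac{1}{2}\,(\mathrm{Se} + \mathrm{Sp}) \\[4pt]
\text{(1)}\quad & \mathrm{BA}_{\mathrm{AI}} \ge 0.80
  \quad \text{at both HSe and HSp} \\[4pt]
\text{(2)}\quad & H_0:\ \mathrm{BA}_{\mathrm{AI}} - \mathrm{BA}_{\mathrm{random}} \le -0.05
  \quad \text{vs.} \quad
  H_1:\ \mathrm{BA}_{\mathrm{AI}} - \mathrm{BA}_{\mathrm{random}} > -0.05 \\[4pt]
\text{(3)}\quad & H_0:\ \mathrm{Se}_{\mathrm{AI}} - \mathrm{Se}_{\mathrm{RAD}\,j} \le -0.07
  \quad \text{vs.} \quad
  H_1:\ \mathrm{Se}_{\mathrm{AI}} - \mathrm{Se}_{\mathrm{RAD}\,j} > -0.07,
  \qquad j = 1, 2, 3
\end{align*}
```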

Statistical analysis: Diagnostic performance metrics (Se, Sp, PPV, NPV, BA) are calculated at the case level for Carebot AI MMG and each radiologist at both HSe and HSp. Wilson 95% confidence intervals are used for proportions. Paired McNemar tests are used to compare AI and individual readers in terms of Se and Sp. For balanced accuracy, an MRMC bootstrap procedure with 20,000 iterations is used to construct the distribution of the random reader and to estimate the probability that BA_AI is greater than or equal to BA_random minus the pre-specified margin. Non-inferiority in sensitivity is assessed using a Nam-Blackwelder-type framework on discordant pairs. False-negative and false-positive cases are reviewed qualitatively with emphasis on lesion conspicuity, breast density, and typical error patterns (e.g., dense parenchyma, lesions near the pectoral muscle, benign vascular structures, asymmetries).
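Three of the calculations named above can be sketched with the Python standard library alone. This is an illustration under assumptions, not the study's analysis code: the record does not say which McNemar variant is used, so the exact binomial form is shown; the Nam-Blackwelder-type test is not reproduced; and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_ci(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from discordant-pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, nothing to distinguish
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def prob_noninferior(ba_ai, ba_random_samples, margin=0.05):
    """Share of bootstrap draws in which BA_AI >= BA_random - margin."""
    hits = sum(1 for ba in ba_random_samples if ba_ai >= ba - margin)
    return hits / len(ba_random_samples)
```

For sensitivity, `wilson_ci` would be called with the number of detected cancers and the number of malignant cases; for specificity, with the number of correctly cleared cases and the number of non-malignant cases.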

Risk, ethics, and data protection: The study is non-interventional and entirely retrospective. All mammography examinations were acquired as part of routine care before the study, and all DICOM data were irreversibly anonymized at the site level in compliance with GDPR and applicable national law before transfer to the sponsor. No additional radiation exposure or patient contact occurs, and no adverse events are expected. Given this design, the study does not meet the MDR definition of a clinical investigation under Article 62 and is not subject to prior notification under Article 74(1); individual informed consent is not required. Results are intended to support the clinical evaluation of Carebot AI MMG as a decision-support tool in organized mammography screening.

Conditions

* Breast Neoplasms
* Breast Cancer Detection
* Breast Cancer - Female
* Breast Cancer Screening

Keywords

* Breast cancer screening
* Mammography
* Full-field digital mammography
* Artificial intelligence
* Deep learning
* Diagnostic accuracy
* Carebot AI MMG

Study Design

Observational Model Type

CASE_CONTROL

Study Time Perspective

RETROSPECTIVE

Study Groups

Malignant cases

Women with biopsy-proven breast cancer included in the analytical subset (n = 48). Each case corresponds to a screening full-field digital mammography (FFDM) examination with all four standard views (LCC, RCC, LMLO, RMLO), retrospectively identified from participating screening centers.

Carebot AI MMG software analysis

Intervention Type: DEVICE

Retrospective stand-alone AI analysis of anonymized 2D full-field digital mammography (FFDM) examinations. The AI system (Carebot AI MMG, version 2.9) processes existing images and outputs case-level risk classifications; no additional imaging, randomization, or changes to patient management occur as part of this study.

Non-malignant cases

Women without histopathological evidence of breast cancer, classified as negative or stably benign by two independent local radiologists with at least 2 years of imaging follow-up (n = 174). Each case corresponds to a screening FFDM examination with all four standard views (LCC, RCC, LMLO, RMLO), retrospectively selected from the same screening population.

Carebot AI MMG software analysis

Intervention Type: DEVICE

Retrospective stand-alone AI analysis of anonymized 2D full-field digital mammography (FFDM) examinations. The AI system (Carebot AI MMG, version 2.9) processes existing images and outputs case-level risk classifications; no additional imaging, randomization, or changes to patient management occur as part of this study.

Interventions

Carebot AI MMG software analysis

Retrospective stand-alone AI analysis of anonymized 2D full-field digital mammography (FFDM) examinations. The AI system (Carebot AI MMG, version 2.9) processes existing images and outputs case-level risk classifications; no additional imaging, randomization, or changes to patient management occur as part of this study.

Intervention Type: DEVICE

Eligibility Criteria

Inclusion Criteria

* Female sex
* Age ≥ 18 years at the time of the screening mammogram
* Screening full-field digital mammography (FFDM) examination with all four standard views (LCC, RCC, LMLO, RMLO) available
* Sufficient image quality and complete DICOM metadata to allow retrospective analysis

Exclusion Criteria

* Male sex
* Age < 18 years
* Digital breast tomosynthesis (DBT/3D) examinations without a corresponding full 2D FFDM four-view series
* Incomplete mammography series (missing one or more of LCC, RCC, LMLO, RMLO)
* Corrupted or unreadable DICOM files
* Missing or inconsistent key metadata (e.g., laterality, view, acquisition date)
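The inclusion and exclusion rules above translate into a simple per-examination check. The sketch below assumes a hypothetical `Exam` record whose fields have already been extracted from the DICOM headers; the field names and the reason strings are illustrative and do not describe the sponsor's pipeline.

```python
from dataclasses import dataclass, field

# The four standard projections required by the inclusion criteria.
REQUIRED_VIEWS = frozenset({"LCC", "RCC", "LMLO", "RMLO"})


@dataclass
class Exam:
    sex: str                      # "F" or "M"
    age_years: int                # age at the time of the screening mammogram
    ffdm_views: set = field(default_factory=set)  # available 2D FFDM views
    dicom_readable: bool = True
    metadata_complete: bool = True


def exclusion_reasons(exam: Exam) -> list:
    """Return every exclusion rule the examination triggers."""
    reasons = []
    if exam.sex != "F":
        reasons.append("male sex")
    if exam.age_years < 18:
        reasons.append("age < 18 years")
    # A DBT-only exam has no 2D FFDM views, so this check covers it too.
    if not REQUIRED_VIEWS <= set(exam.ffdm_views):
        reasons.append("incomplete 2D FFDM four-view series")
    if not exam.dicom_readable:
        reasons.append("corrupted or unreadable DICOM files")
    if not exam.metadata_complete:
        reasons.append("missing or inconsistent key metadata")
    return reasons


def is_eligible(exam: Exam) -> bool:
    return not exclusion_reasons(exam)
```

Returning the list of reasons rather than a bare boolean makes it straightforward to tabulate why examinations dropped out of the source dataset.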

Minimum Eligible Age

18 Years

Eligible Sex

FEMALE

Accepts Healthy Volunteers

Yes

Responsible Party

Responsibility Role: SPONSOR

Locations

Poliklinika MEDICON Budějovická

Prague, Czechia

Dolnooravská nemocnica s poliklinikou MUDr. L. N. Jégého

Dolný Kubín, Slovakia

Nemocnica s poliklinikou Považská Bystrica

Považská Bystrica, Slovakia

Ľubovnianska nemocnica

Stará Ľubovňa, Slovakia

Countries

* Czechia
* Slovakia

Other Identifiers

00005

Identifier Type: -

Identifier Source: org_study_id

CB-MMG-02-MC

Identifier Type: OTHER

Identifier Source: secondary_id

Sponsors

Carebot s.r.o.
