Machine Learning and Pregnancy Success Prediction in Fertility Treatments

NCT ID: NCT06884930

Last Updated: 2025-10-07

Study Results

Results pending

The study team has not yet published outcome measurements, participant flow, or safety data for this trial.

Basic Information

Recruitment Status

ACTIVE_NOT_RECRUITING

Total Enrollment

5000 participants

Study Classification

OBSERVATIONAL

Study Start Date

2025-04-16

Study Completion Date

2026-03-31

Brief Summary

Infertility, as defined by the World Health Organization (WHO), is a disorder of the male or female reproductive system characterized by the inability to achieve a clinical pregnancy after 12 months or more of regular, unprotected sexual intercourse. In modern fertility treatment, assisted reproductive technologies (ART), including in vitro fertilization (IVF), have become a standard approach for addressing complex fertility issues and sterility. In Italy, infertility affects approximately 16.5% of couples.

Despite advancements in ART, outcomes of pregnancies achieved through ART in Italy differ significantly from those of spontaneous pregnancies, particularly in terms of success rates, miscarriage rates, and embryo implantation outcomes.

In this context, AI-based models have shown promising potential in predicting IVF success by analyzing complex datasets that include patient demographics, hormonal levels, and embryo morphology. Research indicates that AI can enhance embryo selection, predict the optimal timing for embryo transfer, and advance personalized medicine approaches in reproductive health.

This study aims to use machine learning to identify patterns and factors associated with successful pregnancy outcomes by analyzing large-scale, anonymized ART data. The resulting predictive model could enable clinicians to better personalize treatment protocols for each patient, optimizing medication dosages, timing, and embryo selection. It could also improve pregnancy success rates while reducing the emotional and financial burden on patients, thus advancing the standard of care in ART.

Detailed Description

This multicenter, observational, retrospective, non-profit study, coordinated by the IRCCS San Raffaele Hospital, aims to analyze anonymized data collected between 2019 and 2024 from approximately 5,000 couples undergoing Assisted Reproductive Technology (ART) procedures across three participating centers. The study will examine key variables, including age, medical history, treatment protocols, ART techniques (such as In Vitro Fertilization [IVF] and Intracytoplasmic Sperm Injection [ICSI]), embryo quality, and pregnancy outcomes, to develop a machine learning-based predictive model for pregnancy outcomes. The selected timeframe ensures a sufficiently large dataset to facilitate robust development and validation of the predictive model.

By leveraging machine learning techniques, this study aims to enhance the accuracy of pregnancy outcome predictions, thereby improving patient counseling and treatment planning in ART procedures. The comprehensive dataset, encompassing a diverse range of variables and a substantial number of cases, will provide a robust foundation for developing a predictive model with high clinical applicability.

The primary objective of this study is to develop a Machine Learning-based predictive model for pregnancy outcomes in assisted reproductive technologies (ART), by analyzing large-scale, anonymized data, for scientific research purposes. The model aims to identify key patterns and factors that correlate with successful pregnancy outcomes to optimize individualized treatment protocols for patients undergoing ART.

SAMPLE SIZE:

The sample size will be approximately 5,000 pairs of subjects (women + men), based on the total number of ART cycles recorded at the participating centers during this period and the number of patients with complete data records that provide sufficient information for analysis. We expect approximately 1,650 pairs in the "success" class and 3,350 in the "unsuccess" class of IVF treatment. The machine learning-based predictive model could thus be trained with a multi-parametric approach on a balanced set of 1,350 pairs of subjects, with the remaining pairs used to test the performance of the model.
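The split described above can be checked with simple arithmetic. The record does not say whether the balanced set of 1,350 pairs means 1,350 pairs per class or 1,350 pairs in total, so the sketch below computes both readings; both are assumptions, and the function name `balanced_split` is hypothetical.

```python
def balanced_split(n_success, n_unsuccess, n_train_per_class):
    """Return training and test counts for a class-balanced training set."""
    if n_train_per_class > min(n_success, n_unsuccess):
        raise ValueError("not enough pairs in the smaller class")
    test_success = n_success - n_train_per_class
    test_unsuccess = n_unsuccess - n_train_per_class
    return {
        "train_total": 2 * n_train_per_class,
        "test_success": test_success,
        "test_unsuccess": test_unsuccess,
        "test_total": test_success + test_unsuccess,
    }


# Reading A (assumption): 1,350 pairs per class in the training set.
reading_a = balanced_split(1650, 3350, 1350)

# Reading B (assumption): 1,350 pairs in total, i.e. 675 per class.
reading_b = balanced_split(1650, 3350, 675)

for name, split in (("A", reading_a), ("B", reading_b)):
    print(name, split)
```

Under either reading the held-out set keeps more "unsuccess" than "success" pairs, so test metrics would be computed on imbalanced data even though training is balanced.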

The minimum sample size for the retrospective study is 295 pairs of subjects, calculated to yield a 95% confidence interval of ±2.5% around an expected sensitivity of 94% and an expected specificity of 15% for the prediction model, assuming a 30% prevalence of IVF treatment success and a 2% dropout rate. The success prevalence is based on the clinical site's experience with IVF treatments; the dropout rate is set low, at 2%, because the study is retrospective and software-based, leaving only a residual possibility that complete records turn out to be invalid. Sensitivity, specificity, and positive and negative predictive values will be calculated with their 95% confidence intervals.
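The record does not name the formula behind the minimum sample size. As an illustration only, the sketch below assumes Buderer's formula for diagnostic accuracy studies, inflated for dropout; the function names are hypothetical. Under that assumption, a confidence-interval half-width of 0.05 reproduces the stated minimum of 295 pairs, while a half-width of 0.025 yields a larger figure, so the exact inputs used by the study team remain uncertain.

```python
import math
from statistics import NormalDist


def buderer_n(expected, class_fraction, half_width, confidence=0.95):
    """Subjects needed so the CI around `expected` has the given half-width.

    `class_fraction` is the share of the sample in the relevant class:
    prevalence for sensitivity, 1 - prevalence for specificity.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = z ** 2 * expected * (1 - expected) / (half_width ** 2 * class_fraction)
    return math.ceil(n)


def min_sample_size(sens, spec, prevalence, half_width, dropout):
    """Larger of the two requirements, inflated for the dropout rate."""
    n_sens = buderer_n(sens, prevalence, half_width)
    n_spec = buderer_n(spec, 1 - prevalence, half_width)
    return math.ceil(max(n_sens, n_spec) / (1 - dropout))


# Values from the record; the half-widths tried here are assumptions.
n_half_width_5 = min_sample_size(0.94, 0.15, 0.30, 0.05, 0.02)
n_half_width_2_5 = min_sample_size(0.94, 0.15, 0.30, 0.025, 0.02)
print(n_half_width_5, n_half_width_2_5)
```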

Considering the minimum sample size and the available number of samples, we expect to achieve statistical significance from the hypothesis testing and to obtain a multi-modal signature of predictors of IVF success.

STATISTICAL DESIGN:

A structured methodology will be employed to develop and assess machine learning models for binary classification tasks, specifically distinguishing between "Success" and "Not Success." The process will commence with the selection of informative and non-redundant features, eliminating those with low variance and high correlation. Subsequently, three distinct classifiers will be trained and evaluated using k-fold cross-validation to ensure robust performance assessment: Random Forest, Support Vector Machines (SVM), and K-Nearest Neighbors (KNN). To address potential data imbalances, the ADASYN technique will be applied, generating synthetic samples for the minority class. Model performance will be quantified using various metrics, including accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (ROC-AUC), to identify the most effective model. Finally, a statistical analysis of the most pertinent features will be conducted using non-parametric tests and corrections for multiple comparisons, aiming to elucidate class differences and ensure result reliability.
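To make the performance metrics named above concrete, here is a minimal standard-library sketch of accuracy, sensitivity, specificity, and ROC-AUC for a binary classifier. The labels, scores, and 0.5 decision threshold are invented for illustration and are not study data; the function names are hypothetical, and the study's actual tooling is not stated in the record.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = success)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outscores a
    random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 3 "success" and 5 "not success" cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

ROC-AUC is computed from the scores rather than the thresholded predictions, which is why it can rank models differently from accuracy on imbalanced data.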

This structured approach will ensure that the models are meticulously tuned and validated through rigorous testing and analysis, leading to accurate and reliable machine learning models for binary classification tasks.
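The final step of the statistical design, non-parametric testing with correction for multiple comparisons, can be sketched as follows. The record names neither the test nor the correction, so the Mann-Whitney U test (normal approximation, without tie or continuity correction) and the Holm procedure are assumptions chosen for illustration; the feature names and values are invented.

```python
import math
from statistics import NormalDist


def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U p-value via the normal approximation."""
    n1, n2 = len(a), len(b)
    # U counts pairs where the value from `a` exceeds the value from `b`.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # keep monotone
        adjusted[i] = running_max
    return adjusted


# Invented feature values for the two outcome classes.
features = {
    "feature_a": ([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]),
    "feature_b": ([3, 5, 4, 6, 2], [4, 3, 6, 5, 2]),
}
raw_p = [mann_whitney_p(s, u) for s, u in features.values()]
adj_p = holm_adjust(raw_p)
for name, p in zip(features, adj_p):
    print(name, round(p, 4))
```

Holm controls the family-wise error rate and is never less powerful than plain Bonferroni; a false discovery rate procedure would be a reasonable alternative if many features are tested.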

INFORMED CONSENT AND DATA PROTECTION:

In accordance with data protection regulations, the study will utilize anonymized data previously collected through routine clinical practice and stored in MedITEX IVF, a management software used at the participating assisted reproduction centers. No direct patient interaction or intervention will occur as part of the study. All data will be anonymized following best practice guidelines to ensure patient confidentiality, adhering to ethical standards and applicable data privacy regulations.

The Investigator (or the Center receiving the data) commits to processing the data solely for the purposes of the study, storing it in a secure network system, and restricting access to authorized personnel who have signed confidentiality agreements. If external suppliers are involved, they will be appointed as Data Processors with appropriate agreements in place. The Investigator will also facilitate the exercise of data subject rights, including access, rectification, erasure, restriction, objection, and portability, within 30 days of receiving the relevant request. If data are communicated outside the institution in pseudonymized form, efforts will be made to prevent the identification of data subjects. Within 30 days following the end of the study, the Investigator will ensure the deletion or irreversible anonymization of the communicated data and promptly confirm this in writing.

A study-specific Data Protection Impact Assessment (DPIA), reviewed by the Data Protection Officer (DPO) of the coordinating institution, has been conducted in accordance with applicable data protection laws.

Conditions

Infertility (IVF Patients)

Study Design

Observational Model Type

OTHER

Study Time Perspective

RETROSPECTIVE

Study Groups

IVF patients

No interventions assigned to this group

Eligibility Criteria

Inclusion Criteria

* Patients who underwent ART procedures, including IVF and ICSI, between 2019 and 2024.
* Women aged between 18 and 43 years.

Exclusion Criteria

* Patients with incomplete or missing data records that do not provide sufficient information for analysis.
* Women outside the 18 to 43 age range.

Minimum Eligible Age

18 Years

Maximum Eligible Age

43 Years

Eligible Sex

ALL

Accepts Healthy Volunteers

No

Sponsors

Fondazione IRCCS Ca' Granda, Ospedale Maggiore Policlinico

OTHER

Sponsor Role collaborator

ASST Grande Ospedale Metropolitano Niguarda

OTHER

Sponsor Role collaborator

IRCCS San Raffaele

OTHER

Sponsor Role lead

Responsible Party

Enrico Papaleo

MD

Responsibility Role PRINCIPAL_INVESTIGATOR

Locations

IRCCS San Raffaele Hospital

Milan, Milano, Italy

Countries

Italy

References

Zhang Q, Liang X, Chen Z. A review of artificial intelligence applications in in vitro fertilization. J Assist Reprod Genet. 2025 Jan;42(1):3-14. doi: 10.1007/s10815-024-03284-6. Epub 2024 Oct 14.

Reference Type RESULT
PMID: 39400647

Jiang VS, Bormann CL. Artificial intelligence in the in vitro fertilization laboratory: a review of advancements over the last decade. Fertil Steril. 2023 Jul;120(1):17-23. doi: 10.1016/j.fertnstert.2023.05.149. Epub 2023 May 19.

Reference Type RESULT
PMID: 37211062

Attività del Registro Nazionale Italiano della Procreazione Medicalmente Assistita - 17° Report 2021

Reference Type RESULT

European IVF Monitoring Consortium (EIM) for the European Society of Human Reproduction and Embryology (ESHRE); Smeenk J, Wyns C, De Geyter C, Kupka M, Bergh C, Cuevas Saiz I, De Neubourg D, Rezabek K, Tandler-Schneider A, Rugescu I, Goossens V. ART in Europe, 2019: results generated from European registries by ESHRE†. Hum Reprod. 2023 Dec 4;38(12):2321-2338. doi: 10.1093/humrep/dead197.

Reference Type RESULT
PMID: 37847771

Infertility prevalence estimates, 1990-2021. Geneva: World Health Organization; 2023. Licence: CC BY-NC-SA 3.0 IGO.

Reference Type RESULT

Other Identifiers

MaLIV-PMA

Identifier Type: -

Identifier Source: org_study_id
