Data Clustering Study With Artificial Intelligence and Phenotyping of Patients With Acute Pulmonary Embolism

NCT ID: NCT06183944

Last Updated: 2025-11-20

Study Results

Results pending

The study team has not yet published outcome measurements, participant flow, or safety data for this trial.

Basic Information

Recruitment Status

RECRUITING

Total Enrollment

2500 participants

Study Classification

OBSERVATIONAL

Study Start Date

2023-12-11

Study Completion Date

2026-07-01

Brief Summary

The aim is to identify clinically relevant phenotypes in patients with acute pulmonary embolism. Hierarchical clustering methods combined with unsupervised learning (machine learning) will be used to obtain groups of patients who are homogeneous at diagnosis. Evaluating their prognosis at 6 months (recurrence or chronic thromboembolic pulmonary hypertension), taking into account the first 3 months of anticoagulant treatment, would provide an aid to medical decision-making.

This research will include a retrospective part and a prospective part. The retrospective part will include patients who have been admitted to CHITS (Centre Hospitalier Intercommunal Toulon La Seyne sur Mer) for acute pulmonary embolism since 2019. For the prospective part, it is planned to include patients with the same characteristics over the years 2024 and 2025. More than 2,500 patients are expected to be included.

This research will have no impact on current patient care. Data from consultations and various examinations carried out as part of care will be collected for six months post-diagnosis in order to meet the research objectives.

Detailed Description

Context:

Artificial Intelligence: clustering and unsupervised learning:

Artificial Intelligence (AI) is a field that combines computer science with data sets, with the aim of enabling a machine to imitate the cognitive abilities of a human being. Machine learning (ML) and its sub-domain deep learning, which uses layers of artificial neurons, are two major sub-domains of AI. ML algorithms differ in how they are trained: supervised learning involves training a model on known input and output data to predict future outputs, whereas unsupervised learning involves discovering hidden patterns and intrinsic underlying structures in the input data.

The aim of clustering methods is to group a set of individuals into homogeneous classes. Non-hierarchical methods can be used to classify massive data sets but require the number of classes to be fixed in advance. Hierarchical methods, which are more time-consuming to compute, produce a series of nested partitions represented by a clustering tree; the optimal number of classes can be determined a posteriori by reading the tree. With a large number of individuals, it is common to combine non-hierarchical and hierarchical techniques. When classes are not clearly known in advance, clustering methods are used with unsupervised learning [1]. Datasets are generally divided into three disjoint subsets: training data, used to train the chosen algorithm(s); validation data, used to check the performance of the result; and test data, used only at the end of the process.
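The three-way division of a dataset mentioned above can be sketched as follows. This is only an illustration: the function name `three_way_split`, the 60/20/20 proportions and the fixed seed are assumptions, since the study record does not state how the data will be partitioned.

```python
import random


def three_way_split(n_samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle sample indices and cut them into disjoint train/validation/test lists.

    The fractions are illustrative defaults, not values taken from the study.
    """
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")

    # Shuffle with a seeded generator so the split is reproducible.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_test = round(n_samples * test_fraction)
    n_val = round(n_samples * val_fraction)

    # Consecutive slices of one shuffled list are disjoint by construction.
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = three_way_split(100)
    print(len(train), len(val), len(test))
```

Because the three lists are slices of a single shuffled index list, no individual can appear in more than one subset, which is the property the paragraph above relies on.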

Venous thromboembolic disease:

Venous thromboembolic disease (VTE) is a common pathology whose incidence is imperfectly known but increases with age, reaching 1% in subjects over 75 years old. In France, it is estimated that every year over 100,000 people develop VTE, which is responsible for between 5,000 and 10,000 deaths. Deep vein thrombosis (DVT) and pulmonary embolism (PE) are the two main types of VTE. DVT corresponds to partial or total occlusion of a deep vein by a thrombus, most often localized in the lower limbs. PE is defined as partial or total occlusion of the pulmonary arteries or their branches. The main risk of DVT is the occurrence of PE, which can be life-threatening. Other VTE-specific complications and possible adverse outcomes include thromboembolic recurrence (either DVT or PE), chronic thromboembolic pulmonary hypertension and post-thrombotic syndrome in DVT.

Current management of VTE is mainly based on anticoagulant therapy. The duration of treatment varies according to the estimated risk of recurrence if treatment is withdrawn, essentially depending on whether or not there is a prior major risk factor [2]. In the subgroup of PE patients without major risk factors, the risk of recurrence is considered intermediate and varies according to whether the event is a first episode or a recurrence, and whether or not there are obstructive pulmonary sequelae [3]. More recently, the therapeutic strategy has become more complex, with the inclusion of minor risk factors that modulate the duration of treatment without relevant supporting evidence. Moreover, regardless of the duration of treatment, the dosage of anticoagulation beyond the sixth month is uncertain for direct oral anticoagulants.

Hypotheses:

The aim is to use the database to identify clinically relevant phenotypes in patients with acute pulmonary embolism. Hierarchical clustering methods combined with unsupervised learning (machine learning) will be used to obtain groups of patients who are homogeneous at diagnosis. Evaluating their prognosis at 6 months (recurrence or chronic thromboembolic pulmonary hypertension), taking into account the first 3 months of anticoagulant treatment, would provide an aid to medical decision-making.

An analysis of the six-month evolution of homogeneous groups of patients with acute pulmonary embolism, constructed using clustering methods with unsupervised learning, has never been conducted before. This innovative project within a large-scale hospital infrastructure is likely to offer doctors a decision-making aid, and patients a scientifically validated form of therapeutic management.

Material and Methods:

This research will include a retrospective part and a prospective part. The retrospective part will include patients who have been admitted to CHITS for acute pulmonary embolism since 2019 (around 1,900 patients). For the prospective part, it is planned to include patients with the same characteristics over the years 2024 and 2025 (approximately 765 patients). If, for 25% of the patients, individual information is not available or the patients object to the processing of their data, a large volume of data on over 2,500 patients could potentially be analysed in this trial. This research will have no impact on current patient care. Data from consultations and various examinations carried out as part of care will be collected for six months post-diagnosis to meet the research objectives.

The unsupervised clustering methods used in this study combine hierarchical and non-hierarchical methods. Following hierarchical ascending clustering, Ward's index is used to determine the number of groups of interest. The centroids of these groups are then used to initialize a partitioning algorithm, such as the k-means algorithm. Once the most medically relevant groups have been determined, their six-month evolutions (stable, aggravation or progress) are compared. Factors influencing progression during the first three months of treatment can also be included in a statistical model, depending on their ability to predict aggravation. All these explorations should provide a basis for medical decision-making.
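As a rough illustration of the hybrid procedure described above, the sketch below runs Ward hierarchical clustering, picks a number of groups from the tree, and then starts k-means from the centroids of those groups. It is not the study's implementation. The data are synthetic, the names `hybrid_clustering`, `make_demo_data` and `max_k` are hypothetical, and the rule used to read the tree (cut at the largest jump between successive Ward merge heights) is an assumption standing in for the "Ward's index" criterion, whose exact definition the record does not give. The clustering calls come from SciPy (`linkage`, `fcluster`, `kmeans2`).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2


def hybrid_clustering(X, max_k=10):
    """Ward hierarchical clustering, then k-means started from the Ward centroids.

    Returns (k, labels), where labels are k-means cluster indices in 0..k-1.
    """
    X = np.asarray(X, dtype=float)

    # Step 1: hierarchical ascending clustering with Ward's criterion.
    Z = linkage(X, method="ward")

    # Step 2: choose the number of groups from the tree.
    # ASSUMPTION: cut at the largest jump between successive merge heights,
    # looking only at the last `max_k` merges. The study's rule is not specified.
    heights = Z[-max_k:, 2]
    gaps = np.diff(heights)
    k = len(heights) - int(np.argmax(gaps))

    # Step 3: centroids of the k groups obtained by cutting the tree.
    hc_labels = fcluster(Z, t=k, criterion="maxclust")
    centroids = np.vstack(
        [X[hc_labels == c].mean(axis=0) for c in np.unique(hc_labels)]
    )

    # Step 4: k-means initialised with those centroids (minit="matrix").
    _, labels = kmeans2(X, centroids, minit="matrix")
    return k, labels


def make_demo_data(seed=0):
    """Synthetic stand-in for patient features: three well-separated 2-D blobs."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    return np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in centers])


if __name__ == "__main__":
    k, labels = hybrid_clustering(make_demo_data())
    print("groups found:", k)
    print("group sizes:", np.bincount(labels))
```

Starting k-means from the hierarchical centroids, rather than from random points, makes the final partition deterministic for a given tree and lets the partitioning step refine the hierarchical groups instead of rediscovering them.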

Conditions

Pulmonary Embolism

Study Design

Observational Model Type

COHORT

Study Time Perspective

OTHER

Study Groups

Patient with acute pulmonary embolism

Patients with acute pulmonary embolism seen at Centre Hospitalier Intercommunal Toulon La Seyne sur Mer since 2019, whether hospitalised or not

Hierarchical clustering methods

Intervention Type OTHER

Hierarchical clustering methods will be used to form homogeneous groups of patients based on their data at diagnosis: presence or absence of symptoms, clinical and biological data, and presence or absence of favouring factors. Patient evolution at 6 months falls into one of three categories (stable, aggravation or progress), which are determined by events such as recurrence, hemorrhage, functional sequelae or death.

Interventions

Hierarchical clustering methods

Hierarchical clustering methods will be used to form homogeneous groups of patients based on their data at diagnosis: presence or absence of symptoms, clinical and biological data, and presence or absence of favouring factors. Patient evolution at 6 months falls into one of three categories (stable, aggravation or progress), which are determined by events such as recurrence, hemorrhage, functional sequelae or death.

Intervention Type OTHER

Eligibility Criteria

Inclusion Criteria

* Age ≥ 18 years;
* Patient with acute pulmonary embolism in CHITS (hospitalised or not).

Exclusion Criteria

* Sub-segmental pulmonary embolisms;
* Patient opposition.

Minimum Eligible Age

18 Years

Eligible Sex

ALL

Accepts Healthy Volunteers

No

Sponsors

Centre Hospitalier Intercommunal de Toulon La Seyne sur Mer

OTHER

Sponsor Role lead

Responsible Party

Responsibility Role SPONSOR

Principal Investigators

Jean-Noël POGGI, MD

Role: STUDY_DIRECTOR

Centre Hospitalier Intercommunal Toulon La Seyne sur Mer

Locations

Centre Hospitalier Intercommunal Toulon La Seyne sur Mer - Internal and vascular medicine

Toulon, France

Site Status RECRUITING

Countries

France

Central Contacts

Jean-Philippe Suppini

Role: CONTACT

+33 4 94 14 55 25

Sophie Lafond

Role: CONTACT

+33 4 83 77 20 62

Facility Contacts

Jean-Noël POGGI, MD

Role: primary

+33 4 94 14 57 87

References

Gal J, Bailleux C, Chardin D, Pourcher T, Gilhodes J, Jing L, Guigonis JM, Ferrero JM, Milano G, Mograbi B, Brest P, Chateau Y, Humbert O, Chamorey E. Comparison of unsupervised machine-learning methods to identify metabolomic signatures in patients with localized breast cancer. Comput Struct Biotechnol J. 2020 Jun 3;18:1509-1524. doi: 10.1016/j.csbj.2020.05.021. eCollection 2020.

Reference Type BACKGROUND
PMID: 32637048

Gallo A, Valerio L, Barco S. The 2019 European guidelines on pulmonary embolism illustrated with the aid of an exemplary case report. Eur Heart J Case Rep. 2021 Jan 4;5(2):ytaa542. doi: 10.1093/ehjcr/ytaa542. eCollection 2021 Feb.

Reference Type BACKGROUND
PMID: 33598618

Duffett L, Castellucci LA, Forgie MA. Pulmonary embolism: update on management and controversies. BMJ. 2020 Aug 5;370:m2177. doi: 10.1136/bmj.m2177.

Reference Type BACKGROUND
PMID: 32759284

Yu T, Shen R, You G, Lv L, Kang S, Wang X, Xu J, Zhu D, Xia Z, Zheng J, Huang K. Machine learning-based prediction of the post-thrombotic syndrome: Model development and validation study. Front Cardiovasc Med. 2022 Sep 16;9:990788. doi: 10.3389/fcvm.2022.990788. eCollection 2022.

Reference Type BACKGROUND
PMID: 36186967

Other Identifiers

2023-CHITS-016

Identifier Type: -

Identifier Source: org_study_id
