AI Models Advance Molecular Property Prediction for Drug Discovery
Three new computational approaches enhance molecular property prediction and drug design through advanced neural networks, multimodal learning, and thermodynamic analysis of receptor-ligand interactions.
Researchers have developed three distinct computational approaches to improve molecular property prediction and drug design, addressing a central challenge in pharmaceutical development: discovering molecules with desirable properties.
A multimodal pre-training framework for molecular representation learning, called M2UMol, separately matches the 2D modality to multiple other modalities and is pre-trained jointly with a modality classifier. The framework transfers multimodal knowledge into the 2D modal encoder and accepts incomplete modalities during the pre-training stage. In downstream tasks where only the 2D modality is given, M2UMol uses the pre-trained 2D modal encoder to precisely simulate molecular multimodal information. Comprehensive experimental results show superior performance for M2UMol across a wide range of molecular tasks, with more efficient pre-training than earlier models. The raw data for the pre-training dataset were sourced from the public DrugBank dataset, and a user-friendly package based on M2UMol integrates molecular representation learning, key functional group analysis, and molecular multimodal retrieval. The code, the pre-trained weights of M2UMol, and the package are available at https://github.com/Zhankun-Xiong/M2UMol.
A separate approach, the Hierarchical Interaction Message Net (HimNet), employs a Hierarchical Interaction Message Passing Mechanism to enable interaction-aware representation learning across the atomic, motif, and molecular levels via hierarchical attention-guided message passing. This design allows HimNet to balance global and local information, yielding rich, task-relevant features for downstream property prediction tasks. The system was evaluated on eleven datasets: eight widely used MoleculeNet benchmarks and three challenging, high-value datasets for metabolic stability, malaria activity, and liver microsomal clearance, together covering a broad range of pharmacologically relevant properties. Extensive experiments demonstrate that HimNet achieves the best or near-best performance in most molecular property prediction tasks. The custom code for HimNet is deposited in the Zenodo repository at https://doi.org/10.5281/zenodo.18030100 and is also available on GitHub at https://github.com/Hugh415/HimNet under an MIT license.
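To make the idea of attention-guided message passing across atomic, motif, and molecular levels more concrete, here is a minimal toy sketch in pure Python. It is not HimNet's implementation: the function names (`attention_pool`, `hierarchical_readout`), the dot-product attention scores, the mean-vector queries, and the four-atom example molecule are all assumptions made for illustration, and the sketch has no learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(vectors):
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def attention_pool(query, vectors):
    """Attention-weighted average of `vectors`, scored by dot product with `query`."""
    weights = softmax([dot(query, v) for v in vectors])
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(query))]

def hierarchical_readout(atom_feats, bonds, motifs):
    """Toy three-level readout: atoms -> motifs -> molecule (illustrative only)."""
    # Level 1 (atoms): each atom attends over itself and its bonded neighbours.
    neighbours = {i: [i] for i in range(len(atom_feats))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    atoms = [attention_pool(atom_feats[i], [atom_feats[j] for j in neighbours[i]])
             for i in range(len(atom_feats))]
    # Level 2 (motifs): pool the updated atoms in each motif, using their mean as query.
    motif_vecs = []
    for members in motifs:
        vecs = [atoms[i] for i in members]
        motif_vecs.append(attention_pool(mean(vecs), vecs))
    # Level 3 (molecule): pool the motif vectors into a single molecular vector.
    return attention_pool(mean(motif_vecs), motif_vecs)

# Hypothetical 4-atom chain split into two 2-atom motifs.
atom_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
bonds = [(0, 1), (1, 2), (2, 3)]
motifs = [[0, 1], [2, 3]]
mol_vec = hierarchical_readout(atom_feats, bonds, motifs)
print(mol_vec)
```

In a trained model the attention scores and feature transforms would be learned; the sketch only shows how local (atom-level) and global (motif- and molecule-level) information can be combined step by step.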
In a complementary approach focusing on thermodynamic analysis, a research team led by a professor from the Department of Life System Engineering at Tokyo University of Science systematically investigated the binding thermodynamics of the histamine H1 receptor (H1R). H1R is a G-protein-coupled receptor (GPCR) subtype that plays a key role in mediating allergic reactions, inflammation, vascular permeability, airway constriction, wakefulness, and cognitive functions in the human body. GPCRs are one of the largest families of cell surface proteins in the human body; they recognize hormones, neurotransmitters, and drugs, regulate a wide range of physiological processes, and are the targets of more than 30% of currently marketed drugs.
Using isothermal titration calorimetry and molecular dynamics simulations, the team characterized the thermodynamic signatures of the binding of doxepin's geometric isomers (the E- and Z-isomers) to the H1R, which was prepared via a budding yeast expression system. Doxepin, a tricyclic antidepressant, is also a potent antihistamine targeting H1R and exists as a mixture of E- and Z-isomers. The Z-isomer binds H1R with approximately five times higher affinity than the E-isomer. The researchers identified a key threonine residue, Thr112^3.37 (residue 112; the superscript gives its Ballesteros-Weinstein position), that contributes to this isomer-dependent selectivity.
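For a sense of scale, the roughly five-fold affinity difference can be converted into a free-energy gap using the standard relation ΔΔG = -RT ln(ratio). The short sketch below does this arithmetic; the temperature of 298.15 K and the function name `ddg_from_affinity_ratio` are assumptions for illustration, since the article does not state the measurement temperature.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_affinity_ratio(ratio, temperature_k=298.15):
    """Free-energy gap (kcal/mol) implied by a fold difference in binding affinity.

    A negative value means the higher-affinity ligand binds more favourably.
    The default temperature is an assumption, not a value from the study.
    """
    return -R_KCAL * temperature_k * math.log(ratio)

ddg = ddg_from_affinity_ratio(5.0)
print(f"{ddg:.2f} kcal/mol")  # prints "-0.95 kcal/mol"
```

Under these assumptions, a five-fold preference corresponds to a little under 1 kcal/mol, a small gap that can still be resolved into distinct enthalpic and entropic parts.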
The researchers produced two variants of H1R: the wild type (H1R_WT) and a T112^3.37V mutant, in which the threonine at position 112 (Thr112^3.37) is replaced with valine. The results showed no difference in the binding energy of doxepin between H1R_WT and the T112^3.37V mutant; however, the enthalpic and entropic contributions differed. Binding to H1R_WT was predominantly enthalpy-driven, whereas binding to the mutant receptor showed a reduced enthalpic contribution accompanied by a relatively larger entropic contribution.
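The pattern described above, equal binding energy with a different enthalpy-entropy split, follows from the relation ΔG = ΔH - TΔS. The sketch below illustrates it with hypothetical numbers chosen only to show the bookkeeping; they are not values reported by the study, and the names `entropic_term` and `receptors` are likewise illustrative.

```python
def entropic_term(delta_g, delta_h):
    """Return -T*dS (kcal/mol), rearranged from dG = dH - T*dS."""
    return delta_g - delta_h

# HYPOTHETICAL values in kcal/mol, for illustration only (not from the study).
receptors = {
    "wild type": {"dG": -10.0, "dH": -12.0},  # enthalpy-driven, entropic penalty
    "mutant": {"dG": -10.0, "dH": -7.0},      # less enthalpy, favourable entropy
}

results = {}
for name, values in receptors.items():
    results[name] = entropic_term(values["dG"], values["dH"])
    print(f"{name}: dG={values['dG']}, dH={values['dH']}, -TdS={results[name]}")
```

With these made-up inputs, the wild type pays an entropic penalty (-TΔS = +2.0) that is offset by a large enthalpic gain, while the mutant reaches the same ΔG with a smaller enthalpic gain and a favourable entropic term (-TΔS = -3.0).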
Binding of the Z-isomer to H1R_WT was associated with a larger enthalpic gain and a greater entropic penalty than binding of the E-isomer. These differences were absent in the T112^3.37V mutant. The binding energy of the Z-isomer was higher than that of the E-isomer for H1R_WT, whereas for the mutant receptor the binding energies of the two isomers were comparable. These observations underscore the role of Thr112^3.37 in balancing enthalpic gains against entropic losses during ligand binding, an effect that is more pronounced for the Z-isomer.
Molecular dynamics simulations showed that the high-affinity binding of the Z-isomer arises from conformational restrictions, consistent with the large enthalpic gain and the entropic penalty observed for its binding. The study was published online in ACS Medicinal Chemistry Letters on January 26, 2026.