A1 Refereed original research article in a scientific journal
Enhancing the Predictive Power of Macrocyclic Drug Permeability by Knowledge Distillation from Analogous Pretraining Data
Authors: Zhang, Yu; Pentikäinen, Olli T.
Publisher: American Chemical Society (ACS)
Publication year: 2025
Journal: Journal of Medicinal Chemistry
Article number: acs.jmedchem.5c02620
ISSN: 0022-2623
eISSN: 1520-4804
DOI: https://doi.org/10.1021/acs.jmedchem.5c02620
Publication's open availability at the time of reporting: Open Access
Publication channel's open availability: Partially Open Access publication channel
Web address: https://pubs.acs.org/doi/10.1021/acs.jmedchem.5c02620
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/506315810
Macrocyclic drugs offer powerful opportunities for modulating protein-protein interactions, yet their development is limited by poor and unpredictable membrane permeability. Experimental testing is slow, and 3D modeling of macrocycles is computationally demanding due to their large conformational space. To address this, we present Multi_DDPP, a deep learning (DL) model that predicts macrocycle permeability directly from 2D structures. Multi_DDPP employs knowledge distillation to leverage permeability data from multiple cell lines, improving generalizability, and uses a task-specific swing-range strategy to reduce label noise. By integrating diverse molecular representations, including physicochemical descriptors, fingerprints, molecular graphs, and hybrid features, the model outperforms existing ML and DL approaches. Node masking highlights the substructures that contribute most to permeability, and regression extensions incorporating physiological parameters further refine these predictions. Early 2D-based permeability prediction with Multi_DDPP avoids the costly generation of 3D conformers and enables the efficient prioritization of macrocycles with favorable pharmacokinetic potential.
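The abstract names knowledge distillation but does not give the loss it uses. As a rough illustration only, the following is a minimal sketch of a generic distillation objective for a binary permeable/non-permeable label, written in plain Python. The function names, the `alpha` weighting, the `temperature` softening, and the binary-label framing are all assumptions made for this example and are not taken from the Multi_DDPP paper.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, target, eps=1e-12):
    """Binary cross-entropy between a predicted probability and a (possibly soft) target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def distillation_loss(student_logit, teacher_logit, label, alpha=0.5, temperature=2.0):
    """Generic distillation objective (hypothetical, not the authors' formulation).

    hard term: student prediction vs. the measured label
    soft term: temperature-softened student vs. temperature-softened teacher
    alpha:     assumed weight balancing the two terms
    """
    hard = bce(sigmoid(student_logit), label)
    soft_target = sigmoid(teacher_logit / temperature)
    soft = bce(sigmoid(student_logit / temperature), soft_target)
    return alpha * hard + (1.0 - alpha) * soft
```

In a setup like this, the teacher would be a model pretrained on the larger analogous dataset, and the student would be trained on the scarcer target data while also matching the teacher's softened outputs. With `alpha=1.0` the objective reduces to ordinary supervised training on the measured labels.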
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
We acknowledge the Finnish IT Center for Science (CSC) for providing computational resources (O.T.P.: Project Nos. jyy2516 and jyy2585). Funding was provided by the Novo Nordisk Foundation (O.T.P.; Pioneer Innovator Grant 0068926 and Distinguished Innovator Grant 0075825) and the Finnish Cultural Foundation (Varsinais-Suomi Regional Fund; Y.Z., 85251449).