An In Silico Explainable Multiparameter Optimization Approach for De Novo Drug Design against Proteins from the Central Nervous System

The aim of drug design and development is to produce a drug that can inhibit the target protein and possess a balanced physicochemical and toxicity profile. Traditionally, this is a multistep process where different parameters such as activity and physicochemical and pharmacokinetic properties are optimized sequentially, which often leads to a high attrition rate during later stages of drug design and development. The authors developed a deep learning-based de novo drug design method that can design novel small molecules by optimizing target specificity as well as multiple parameters (including late-stage parameters) in a single step. All possible combinations of parameters were optimized to understand the effect of each parameter on the others. An explainable predictive model was used to identify the molecular fragments responsible for the property being optimized. The proposed method was applied against the human 5-hydroxytryptamine receptor 1B (5-HT1B), a protein from the central nervous system (CNS). Various physicochemical properties specific to CNS drugs were considered along with the target specificity and blood–brain barrier permeability (BBBP), which poses an additional challenge for CNS drug delivery. The contribution of each parameter toward molecule design was identified by analyzing the properties of the small molecules generated from optimization of all possible parameter combinations. The final optimized generative model was able to design inhibitors similar to known inhibitors of 5-HT1B. In addition, the functional groups of the generated small molecules that guide the BBBP predictive model were identified through feature attribution techniques.

Ligand-based de novo small molecule design. (a) Pre-trained generative model on the ChEMBL database; (b) dataset curated from small molecules that modulate the activity of structurally related proteins; (c) transfer learning with the curated dataset; (d) MPO using reinforcement learning; (e) physicochemical properties and structural alerts (rule-based filters) were used to filter drug-like molecules specific to the target protein of interest.

Drug design and development is a long process with a low success rate. The majority of drugs fail during various stages of drug development because of undesirable biological profiles. For example, during hit identification, the activity of drug-like molecules against the target protein remains the main focus, while the other parameters are mostly optimized during later stages of drug development. Optimizing multiple parameters during the initial stages of drug design can improve the success rate and reduce development time. Although multiparameter optimization (MPO) in the initial stages is desirable, choosing the optimal combination of parameters to optimize is often challenging. The parameters of drug-like molecules often conflict, because improving one parameter of interest might adversely affect other related parameters. Consequently, appropriate selection of the parameters to be optimized can be an MPO problem in itself.
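To make the scale of the parameter-selection problem concrete, the following minimal sketch enumerates every non-empty combination of a small parameter set. The four parameter names are illustrative placeholders chosen for this example; the exact set and naming used in the study may differ.

```python
from itertools import combinations

# Illustrative parameter names (an assumption for this sketch,
# not a list taken verbatim from the study).
PARAMETERS = ["binding_affinity", "logP", "MW", "BBBP"]


def parameter_combinations(params):
    """Return every non-empty subset of params, smallest subsets first."""
    subsets = []
    for size in range(1, len(params) + 1):
        # combinations() preserves the input order within each subset.
        subsets.extend(combinations(params, size))
    return subsets


if __name__ == "__main__":
    combos = parameter_combinations(PARAMETERS)
    print(f"{len(combos)} combinations to optimize")
    for combo in combos:
        print(" + ".join(combo))
```

For n parameters there are 2^n - 1 non-empty combinations, so exhaustive comparison is practical only for a small number of parameters.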

Interactions of the de novo generated molecule with the 5-HT1B receptor. (a) Binding pocket of agonist ergotamine (blue sticks) and Mol_7422 (magenta sticks) molecules. The receptor is shown in green sticks. (b) Residues of the 5-HT1B receptor interacting with the Mol_7422 molecule (magenta) are shown in green sticks.

The pharmacological properties to be optimized also depend on the target tissue of interest. A classic example is drug candidates targeting the central nervous system, where multiple physicochemical properties influence absorption, distribution, metabolism, and excretion (ADME), binding efficiency, and safety. Apart from target specificity, these drugs additionally require effective blood–brain barrier permeability (BBBP). Multiple properties such as the octanol–water partition coefficient (log P), molecular weight (MW), polar surface area (PSA), and hydrogen bonding are important factors governing the BBBP of molecules targeting proteins of the central nervous system.
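A rule-based check on these properties could look like the sketch below. It assumes the properties have already been computed by some cheminformatics toolkit; the `MolProps` container, the `CNS_LIMITS` cutoffs, and the two example molecules are hypothetical placeholders for illustration, not values reported in the study.

```python
from dataclasses import dataclass


@dataclass
class MolProps:
    """Precomputed properties of one molecule (hypothetical container)."""
    name: str
    mw: float    # molecular weight, Da
    logp: float  # octanol-water partition coefficient
    psa: float   # polar surface area, square angstroms
    hbd: int     # number of hydrogen-bond donors


# Placeholder cutoffs for illustration only; not taken from the study.
CNS_LIMITS = {"mw": 450.0, "logp_min": 1.0, "logp_max": 5.0, "psa": 90.0, "hbd": 3}


def violations(mol, limits=CNS_LIMITS):
    """Return the names of the properties that fall outside the limits."""
    failed = []
    if mol.mw > limits["mw"]:
        failed.append("MW")
    if not limits["logp_min"] <= mol.logp <= limits["logp_max"]:
        failed.append("logP")
    if mol.psa > limits["psa"]:
        failed.append("PSA")
    if mol.hbd > limits["hbd"]:
        failed.append("HBD")
    return failed


if __name__ == "__main__":
    # Invented example values, used only to exercise the filter.
    for mol in (MolProps("mol_a", 320.0, 2.8, 45.0, 1),
                MolProps("mol_b", 610.0, 6.2, 140.0, 5)):
        print(mol.name, violations(mol) or "passes")
```

A hard filter like this discards molecules outright; an optimization setting would more typically turn each rule into a graded score so that near misses still receive partial credit.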

Recent advances in artificial intelligence, and in particular the success of reinforcement learning techniques in molecular optimization, are promising. Various methods have attempted MPO to optimize different properties of generated molecules. Winter et al. applied the particle swarm optimization algorithm during drug design, while another study used matched molecular pairs (MMP) to learn the chemical transformations involved in molecular optimization and tested the model’s ability to simultaneously optimize log D, solubility, and clearance properties. Recently, two other studies have used MPO to optimize the BBBP along with other properties. Although the study by Deng et al. does not consider the BBBP explicitly for optimization, it uses related basic properties as a proxy, which might not capture the complexity involved. The study by Pereira et al. used a BBBP prediction model and a binding affinity model to design molecules against a target protein. Moret et al. developed a beam search-based generative model to simultaneously generate and prioritize molecules in an automated fashion, without employing additional selection methods.

While most of these methods used a maximum of two parameters for optimization, a recent study performed simultaneous optimization of 11 physicochemical properties of drug-like molecules. However, there has been no attempt to understand the relative importance of each parameter during the optimization. Most deep learning-based de novo drug design methods are ligand-based and require a target-specific ligand dataset for initial training. This restricts the application of ligand-based deep neural network models against novel drug targets or cases where limited experimental data is available. In earlier work, the authors proposed a method that can overcome the insufficiency of a target-specific ligand dataset and design small molecules specific to novel target proteins. In this work, the de novo ligand-based drug design algorithm includes optimization of multiple physicochemical and late-stage pharmacological properties along with target specificity for CNS drug candidates. The reward function of the reinforcement learning framework used in the authors' previous study was modified to adapt the method to multiparameter optimization. The method helps confine the design and optimization process to a specific region of the chemical space with the desired property profile.
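The text does not spell out the modified reward function, so the following is only a sketch of one common way to build a multiparameter reward: map each property to a desirability score in [0, 1] and combine the scores with a weighted geometric mean. The function names, the linear decay, and the choice of geometric mean are assumptions made for this example, not the authors' formulation.

```python
import math


def range_desirability(value, low, high, tolerance):
    """Score 1.0 inside [low, high], decaying linearly to 0.0 over `tolerance` outside."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / tolerance)


def mpo_reward(scores, weights=None):
    """Weighted geometric mean of per-parameter scores, each in [0, 1].

    Any zero score gives a zero reward, so a molecule cannot compensate
    for a completely failed parameter by excelling at the others.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    if any(score <= 0.0 for score in scores.values()):
        return 0.0
    log_sum = sum(weights[name] * math.log(scores[name]) for name in scores)
    return math.exp(log_sum / total_weight)


if __name__ == "__main__":
    # Invented property values and target ranges, for illustration only.
    scores = {
        "logP": range_desirability(4.5, low=2.0, high=4.0, tolerance=1.0),
        "MW": range_desirability(380.0, low=200.0, high=450.0, tolerance=100.0),
        "BBBP": 0.9,  # stand-in for a predicted probability
    }
    print(scores, round(mpo_reward(scores), 3))
```

A geometric mean penalizes unbalanced profiles more strongly than an arithmetic mean, which suits the goal of confining generation to a region of chemical space where all properties are acceptable at once. Restricting `scores` to a subset of parameters gives the reward for one parameter combination.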

As a proof of concept, the method was used to design novel small molecules against the human 5-hydroxytryptamine receptor 1B (5-HT1B) protein, which is a major target for therapeutics in the central nervous system (CNS). 5-HT1B belongs to the G-protein coupled receptor family and is the target of serotonin (5-HT). It has been implicated in cancer proliferation and several CNS disorders including obsessive-compulsive disorder (OCD), depression, migraine, and Parkinson’s disease. While designing novel small molecules, parameters such as binding affinity, physicochemical properties such as log P and MW, and the probability of crossing the BBB were optimized. Further, the BBBP prediction model was interpreted using feature attribution methods to understand the key molecular features learned by the model, which were also cross-validated against known molecular features governing BBBP reported in the literature. The authors propose that incorporating late-stage pharmacological properties such as BBBP in the early-stage drug design process can reduce late-stage attrition and improve the success rate of overall drug design and development.
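The specific feature attribution methods are not named here, so the sketch below illustrates the general idea with occlusion (leave-one-out) attribution, a simple technique swapped in for illustration: remove one fragment at a time and record how much the predicted score changes. The scorer is a toy logistic function with made-up weights, not a trained BBBP model, and the fragment names are hypothetical.

```python
import math

# Made-up weights for a toy logistic scorer; NOT a trained BBBP model.
TOY_WEIGHTS = {
    "aromatic_ring": 0.8,
    "tertiary_amine": 0.6,
    "carboxylic_acid": -1.5,
    "hydroxyl": -0.4,
}


def toy_score(fragments):
    """Map fragment counts to a score in (0, 1) using the toy weights."""
    z = sum(TOY_WEIGHTS.get(name, 0.0) * count for name, count in fragments.items())
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_attribution(fragments, score_fn=toy_score):
    """Attribution of a fragment = score(all fragments) - score(without that fragment)."""
    base = score_fn(fragments)
    attributions = {}
    for name in fragments:
        reduced = {k: v for k, v in fragments.items() if k != name}
        attributions[name] = base - score_fn(reduced)
    return attributions


if __name__ == "__main__":
    # Hypothetical fragment counts for one molecule.
    molecule = {"aromatic_ring": 1, "tertiary_amine": 1, "carboxylic_acid": 1}
    for name, value in occlusion_attribution(molecule).items():
        print(f"{name}: {value:+.3f}")
```

Positive values mark fragments that raise the predicted score and negative values mark fragments that lower it; comparing such attributions with features reported in the literature is the kind of cross-validation described above.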

An In Silico Explainable Multiparameter Optimization Approach for De Novo Drug Design against Proteins from the Central Nervous System. Navneet Bung, Sowmya Ramaswamy Krishnan, and Arijit Roy. Journal of Chemical Information and Modeling 2022, 62 (11), 2685-2695. DOI: 10.1021/acs.jcim.2c00462