ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling
Background: Determination of acute toxicity, expressed as median lethal dose (LD50), is one of the most important steps in the drug discovery pipeline. Because in vivo assays for oral acute toxicity in mammals are time-consuming and costly, there is an urgent need to develop in silico prediction models of oral acute toxicity.
Results: In this study, based on a comprehensive data set containing 7314 diverse chemicals with rat oral LD50 values, the relevance vector machine (RVM) technique was employed to build regression models for the prediction of oral acute toxicity in rat, and these models were compared with those built using six other machine learning approaches: k-nearest-neighbor regression, random forest (RF), support vector machine, local approximate Gaussian process, multilayer perceptron ensemble, and extreme gradient boosting. A subset of the original molecular descriptors and structural fingerprints (PubChem or SubFP) was chosen using chi-squared statistics. The prediction capabilities of the individual QSAR models, measured by q2ext for the test set containing 2376 molecules, ranged from 0.572 to 0.659.
Conclusion: Considering the overall prediction accuracy for the test set, RVM with the Laplacian kernel and RF are recommended for building in silico models with better predictivity for rat oral acute toxicity. By combining the predictions from individual models, four consensus models were developed, yielding better prediction capabilities for the test set (q2ext = 0.669–0.689). Finally, some essential descriptors and substructures relevant to oral acute toxicity were identified and analyzed, and they may serve as property or substructure alerts to avoid toxicity. We believe that the best consensus model, with its high prediction accuracy, can be used as a reliable virtual screening tool to filter out compounds with high rat oral acute toxicity.
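The q2ext values quoted in the abstract summarize how well a model predicts the external test set. The abstract does not give the formula, so the sketch below assumes one commonly used definition, in which the squared prediction error on the test set is compared with the squared deviation of the test values from the training-set mean. The function name q2_ext and its arguments are illustrative only.

```python
from typing import Sequence


def q2_ext(y_test: Sequence[float],
           y_pred: Sequence[float],
           y_train_mean: float) -> float:
    """External predictive q2 (assumed definition, not taken from the paper):

    q2_ext = 1 - sum((y_i - yhat_i)^2) / sum((y_i - mean(y_train))^2)

    where the sums run over the external test set.
    """
    if len(y_test) != len(y_pred) or len(y_test) == 0:
        raise ValueError("y_test and y_pred must be non-empty and equal in length")
    # Squared error of the model on the test set.
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    # Squared error of the trivial model that always predicts the training mean.
    ss_ref = sum((y - y_train_mean) ** 2 for y in y_test)
    if ss_ref == 0.0:
        raise ValueError("test values are all equal to the training mean")
    return 1.0 - press / ss_ref
```

Under this definition, a value of 1 corresponds to perfect prediction, and a value of 0 means the model does no better than always predicting the training-set mean.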
Determination of acute toxicity in mammals (e.g. rats or mice) is one of the most important tasks in the safety evaluation of drug candidates. Acute toxicity is usually expressed as the median lethal dose (LD50), which is the dose of a tested molecule that kills 50% of the treated animals within a given period. According to the regulations and guidelines for the toxicity testing of pharmaceutical substances established by the Organization for Economic Co-operation and Development (OECD), the U.S. Food and Drug Administration (FDA), the National Institutes of Health (NIH), the European Agency for the Evaluation of Medicinal Products (EMEA), etc., the use of alternative in vitro or in silico toxicity assessment methods that avoid the use of animals is strongly recommended. Moreover, in vivo testing for acute toxicity is time-consuming and costly, and therefore extensive efforts have been devoted to the development of in silico methods for toxicity prediction.
Over the past decades, a number of quantitative structure-activity relationship (QSAR) models have been developed to predict rodent acute toxicity. It is well known that an acute toxic effect results from multiple potential modes of action (MOA), and it is quite difficult to develop a universal model with reliable prediction accuracy for an extensive data set. Therefore, most QSAR models were built from small data sets of congeneric compounds and thus had limited application domains. Recently, several theoretical models were developed based on relatively large-scale data sets with diverse compounds. For example, Zhu et al. developed five QSAR models for 7385 compounds with rat oral acute toxicity data, and the two models developed by kNN and RF achieved performance for the test set (r2 = 0.66 and 0.70, respectively) comparable to that of TOPKAT. However, in Zhu’s study, 997 molecules were identified as outliers and eliminated from the training set. Another study, reported by Raevsky and coworkers, proposed a so-called Arithmetic Mean Toxicity (AMT) modeling approach, which produced local models based on a k-nearest neighbors approach.
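Regression models of this kind are usually fitted to a log-transformed molar quantity rather than to the raw dose. As a minimal sketch, assuming that pLD50 is defined as the negative base-10 logarithm of LD50 expressed in mol/kg (this section does not state the units, so the definition is an assumption), a dose reported in mg/kg could be converted as follows. The function name and arguments are hypothetical.

```python
import math


def pld50_from_mg_per_kg(ld50_mg_per_kg: float, mol_weight_g_per_mol: float) -> float:
    """Convert LD50 in mg/kg to pLD50 = -log10(LD50 in mol/kg).

    The pLD50 definition used here is an assumption, not taken from the text.
    """
    if ld50_mg_per_kg <= 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("dose and molecular weight must be positive")
    # mg/kg -> g/kg -> mol/kg
    ld50_mol_per_kg = (ld50_mg_per_kg / 1000.0) / mol_weight_g_per_mol
    return -math.log10(ld50_mol_per_kg)
```

With this convention a larger pLD50 means a more toxic compound, because a smaller molar dose is lethal to half of the animals.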
Diversity distribution of the training set (n = 4938) and external test set (n = 2376). a, b Chemical space defined by PCA factorization; c chemical space defined by molecular weight (MW) as X-axis and SlogP as Y-axis; d comparison of toxicity value distribution in different data sets. Gray circle stands for the training set, and black rhombus stands for the test set
This approach gave correlation coefficients (r2) from 0.456 to 0.783 for 10,241 tested compounds, but the prediction accuracy for a molecule depended on the number and structural similarity of its neighbors with experimental data in the training set. Recently, Lu et al. employed the local lazy learning (LLL) method to develop LD50 prediction models, in which the rat acute toxicity of a molecule is predicted from the experimental data of its k nearest neighbors. A consensus model integrating the predictions of individual LLL models yielded a correlation coefficient r2 of 0.712 for the test set containing 2896 compounds. Similar to Raevsky’s approach, Lu’s approach relied on prior knowledge of the experimental data of a query’s neighbors, and therefore the actual prediction capability of this method depended on the chemical diversity and structural coverage of the training set.
Due to the complicated mechanisms involved in acute toxicity, it is difficult to build a single QSAR model with reliable prediction accuracy by using traditional statistical approaches, such as multiple linear regression (MLR), partial least squares (PLS), principal component regression (PCR), etc. However, machine learning methods have shown promising potential to establish complex QSARs for data sets with diverse ranges of molecular structures and mechanisms. Certainly, each machine learning method has its intrinsic advantages, shortcomings, and practical constraints. Moreover, the performance of different machine learning methods depends on the structural diversity and representativeness of the molecules in the data set. Therefore, it is quite important to choose the most suitable machine learning method to develop the prediction model for a specific toxicity data set.
Most existing machine learning methods share the problem of overtraining and overfitting when solving high-dimensional and complex nonlinear problems, because they usually need to estimate and optimize many hyperparameters. It is well known that the complexity of a model often grows linearly with the dimension of the data, and thus some form of post-processing is required to reduce the computational complexity. To address this problem, the relevance vector machine (RVM) method introduces Bayesian criteria into the learning process: it employs a sparse prior to remove irrelevant support vectors of the decision boundary in feature space and accordingly yields a sparser model. In contrast to the similar support vector machine (SVM) algorithm, in RVM regression the penalty parameter C and the insensitive-loss parameter ε are determined automatically, and error bars are obtained through the covariance function. Meanwhile, RVM has generalization ability comparable to that of SVM, and the samples it retains with non-zero weights are more prototypical of the data than the support vectors of SVM. Therefore, RVM may be a good choice for QSAR modelling.
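The neighbor-based idea shared by these approaches can be illustrated with a small sketch. It is not the AMT or LLL method itself: it assumes binary fingerprints stored as sets of on-bit indices, Tanimoto similarity as the neighbor metric, and a similarity-weighted mean of the k most similar training compounds as the prediction. All names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: Set[int],
                train_fps: Sequence[Set[int]],
                train_values: Sequence[float],
                k: int = 3) -> float:
    """Predict a toxicity value from the k most similar training compounds."""
    if k < 1 or len(train_fps) != len(train_values) or not train_fps:
        raise ValueError("need k >= 1 and matching, non-empty training data")
    # Rank training compounds by similarity to the query, most similar first.
    ranked: List[Tuple[float, float]] = sorted(
        ((tanimoto(query, fp), y) for fp, y in zip(train_fps, train_values)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total_sim = sum(sim for sim, _ in ranked)
    if total_sim == 0.0:
        # No structural overlap at all: fall back to a plain mean.
        return sum(y for _, y in ranked) / len(ranked)
    # Similarity-weighted mean of the neighbors' experimental values.
    return sum(sim * y for sim, y in ranked) / total_sim
```

The sketch makes the dependence noted above explicit: if none of the training compounds resembles the query, the weights carry little information and the prediction degrades to an average of unrelated molecules.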
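For orientation, the standard sparse Bayesian formulation of RVM regression can be written as follows. This is the textbook form rather than a description of the specific implementation used in this study; the symbols (weights w_i, per-weight precisions α_i, noise variance σ², design matrix Φ of kernel values, targets t) are introduced here for illustration, and the form of the Laplacian kernel, including the choice of norm and the width parameter γ, is an assumption.

```latex
\begin{align}
  % Kernel expansion over the N training molecules, with Gaussian noise
  y(\mathbf{x}) &= w_0 + \sum_{i=1}^{N} w_i \, K(\mathbf{x}, \mathbf{x}_i),
  \qquad t_n = y(\mathbf{x}_n) + \epsilon_n, \quad \epsilon_n \sim \mathcal{N}(0, \sigma^2) \\
  % Sparse prior: one precision hyperparameter per weight
  p(\mathbf{w} \mid \boldsymbol{\alpha}) &= \prod_{i=0}^{N} \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right) \\
  % Posterior over the weights, with A = diag(alpha_0, ..., alpha_N)
  \boldsymbol{\Sigma} &= \left(\sigma^{-2} \boldsymbol{\Phi}^{\top} \boldsymbol{\Phi} + \mathbf{A}\right)^{-1},
  \qquad \boldsymbol{\mu} = \sigma^{-2} \, \boldsymbol{\Sigma} \, \boldsymbol{\Phi}^{\top} \mathbf{t} \\
  % Predictive mean and error bar for a new molecule x_*
  y_* &= \boldsymbol{\mu}^{\top} \boldsymbol{\phi}(\mathbf{x}_*),
  \qquad \sigma_*^2 = \sigma^2 + \boldsymbol{\phi}(\mathbf{x}_*)^{\top} \boldsymbol{\Sigma} \, \boldsymbol{\phi}(\mathbf{x}_*) \\
  % Laplacian kernel (assumed form)
  K(\mathbf{x}, \mathbf{x}') &= \exp\!\left(-\gamma \, \lVert \mathbf{x} - \mathbf{x}' \rVert\right)
\end{align}
```

During training, many of the precisions α_i grow very large, which drives the corresponding weights to zero. The training molecules whose weights remain non-zero are the relevance vectors, and the predictive variance supplies the error bars.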
Scatter plot of the experimental pLD50 values versus the predicted values for the molecules in the (a) training and (b) test sets given by the consensus model without the MPLE predictions.
In this study, based on a large public data set containing 7385 compounds with rat oral acute toxicity data compiled in a previous study, RVM was employed to establish regression models for the prediction of oral acute toxicity in rat, and it was compared with six other machine learning methods: SVM, k-nearest-neighbor regression (kNN), random forest (RF), local approximate Gaussian process (laGP), multilayer perceptron ensemble (MPLE), and eXtreme gradient boosting (XGBoost). The performance of all seven machine learning methods was assessed and compared in terms of the predictive power and application domains of the models for the external test set. Moreover, the possibility of achieving better prediction of rat oral acute toxicity by combining the predictions from multiple QSAR models was explored.
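Combining the predictions of several models can be sketched in a few lines. This section does not say how the individual predictions are weighted, so the example below assumes a simple unweighted mean per molecule, with the option of leaving some models out (one of the figure captions refers to a consensus model without the MPLE predictions). The function name and the dictionary layout are hypothetical.

```python
from typing import List, Mapping, Sequence


def consensus_predict(predictions: Mapping[str, Sequence[float]],
                      exclude: Sequence[str] = ()) -> List[float]:
    """Average per-molecule predictions over the selected models.

    `predictions` maps a model name to its predicted values, all given in the
    same molecule order. Unweighted averaging is an assumption of this sketch.
    """
    kept = [vals for name, vals in predictions.items() if name not in exclude]
    if not kept:
        raise ValueError("no models left after exclusion")
    n_molecules = len(kept[0])
    if any(len(vals) != n_molecules for vals in kept):
        raise ValueError("all models must predict the same number of molecules")
    # zip(*kept) yields, for each molecule, the tuple of predictions across models.
    return [sum(per_molecule) / len(kept) for per_molecule in zip(*kept)]


# Illustrative values only, not results from the study.
example = {"RVM": [1.0, 2.0], "RF": [3.0, 4.0], "MPLE": [8.0, 9.0]}
without_mple = consensus_predict(example, exclude=("MPLE",))
```

A plain mean is the simplest choice. Weighting each model by its cross-validated accuracy would be an equally plausible variant.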
Lei, T., Li, Y., Song, Y. et al. ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J Cheminform 8, 6 (2016). https://doi.org/10.1186/s13321-016-0117-7