Prediction of biochemical reactions catalyzed by enzymes in humans


Many biologically important enzymes exist in the human body. Among them, the cytochrome P450 (CYP450) enzymes receive the most attention in drug discovery because they are involved in the majority (75%) of drug metabolism [1]. Various in silico methods have therefore been applied to predict the possible substrates of CYP450 enzymes [2-4]. Recently, an in silico model was developed to predict the potential chemical reactions mediated by human enzymes, including the CYP450 enzymes [5].

This method is based on descriptor calculation and supervised machine learning. The pipeline uses two databases, the Human Metabolome Database (HMDB) and the Braunschweig Enzyme Database (BRENDA), as training datasets, and DrugBank as the test dataset. PaDEL-Descriptor is used to calculate the chemical and physical properties of the substrates [6]. The physicochemical properties of a query molecule are matched against those of the substrates obtained from the databases, and the query molecule is then assumed to be catalyzed by the same enzyme as the matched substrate. All matched substrates are retrieved from the databases in this way, and the query molecule is scored with an integrated scoring method, which captures the distribution of individual scores by assigning a positive weight to scores higher than the average [5]. Four machine learning algorithms are applied in the pipeline: artificial neural network, naïve Bayes, multiple linear regression, and random forest. They are compared by cross-validation to obtain the best model. The performance is then validated on the test dataset, i.e., DrugBank.
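The matching and scoring idea can be illustrated with a minimal Python sketch. This is not the published implementation. The descriptor vectors are made-up toy values (in practice they would come from PaDEL-Descriptor), the distance-based `similarity` function is an assumption, and `rank_enzymes` is a hypothetical helper that uses one simple reading of "a positive weight to scores higher than the average": only the part of a score that exceeds the mean counts. The exact formula in [5] may differ.

```python
import math
from collections import defaultdict

# Toy, made-up descriptor vectors (scaled to 0-1) for known enzyme substrates.
# In a real pipeline these values would come from PaDEL-Descriptor output.
KNOWN_SUBSTRATES = [
    ("CYP3A4", (0.9, 0.8, 0.1)),
    ("CYP3A4", (0.8, 0.7, 0.2)),
    ("CYP2D6", (0.2, 0.3, 0.9)),
    ("CYP2D6", (0.1, 0.2, 0.8)),
]


def similarity(a, b):
    """Turn the Euclidean distance between two vectors into a score in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank_enzymes(query, known):
    """Rank enzymes for a query molecule by an integrated similarity score."""
    # One individual score per known substrate.
    scores = [(enzyme, similarity(query, vec)) for enzyme, vec in known]
    mean = sum(s for _, s in scores) / len(scores)

    integrated = defaultdict(float)
    for enzyme, s in scores:
        # Assumed weighting: only the excess above the average contributes;
        # scores at or below the average add nothing.
        integrated[enzyme] += max(0.0, s - mean)

    # Highest integrated score first.
    return sorted(integrated.items(), key=lambda kv: kv[1], reverse=True)


query = (0.85, 0.75, 0.15)  # hypothetical query molecule
ranked = rank_enzymes(query, KNOWN_SUBSTRATES)
for enzyme, score in ranked:
    print(f"{enzyme}: {score:.3f}")
```

Here the query sits close to both toy CYP3A4 substrates, so that enzyme is ranked first. In the actual pipeline, the candidate machine learning models would additionally be compared by cross-validation before testing on DrugBank.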

According to the authors' performance evaluation, the prediction model can successfully predict the enzymatic reactions for substrates (query molecules) and can also be applied to find other biologically relevant reactions catalyzed by human enzymes [5].


  1. Guengerich, F. P. (2007). Cytochrome P450 and chemical toxicology. Chemical Research in Toxicology, 21(1), 70-83.
  2. Yap, C. W., Xue, Y., & Chen, Y. Z. (2006). Application of support vector machines to in silico prediction of cytochrome P450 enzyme substrates and inhibitors. Current Topics in Medicinal Chemistry, 6(15), 1593-1607.
  3. Jensen, B. F., Vind, C., Padkjær, S. B., Brockhoff, P. B., & Refsgaard, H. H. (2007). In silico prediction of cytochrome P450 2D6 and 3A4 inhibition using Gaussian kernel weighted k-nearest neighbor and extended connectivity fingerprints, including structural fragment analysis of inhibitors versus noninhibitors. Journal of Medicinal Chemistry, 50(3), 501-511.
  4. Olsen, L., Oostenbrink, C., & Jørgensen, F. S. (2015). Prediction of cytochrome P450 mediated metabolism. Advanced Drug Delivery Reviews, 86, 61-71.
  5. Yu, M. S., Lee, H. M., Park, A., Park, C., Ceong, H., Rhee, K. H., & Na, D. (2018). In silico prediction of potential chemical reactions mediated by human enzymes. BMC Bioinformatics, 19(8), 207.
  6. Yap, C. W. (2011). PaDEL-Descriptor: An open source software to calculate molecular descriptors and fingerprints. Journal of Computational Chemistry, 32(7), 1466-1474.

Muniba is a bioinformatician based at the South China University of Technology. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
