Machine learning in prediction of ageing-related genes/proteins


Ageing has a great impact on human health. By the time people approach 80 years of age, approximately half of the proteins in the body have been damaged through oxidation. The body obtains energy by chemically breaking down consumed food through oxidation in the presence of oxygen. Some proteins have been found to be associated with ageing and with age-related diseases such as Alzheimer's disease [1], which makes their functions and characteristics worth exploring. The regulation and molecular basis of ageing are still poorly understood. Many studies have revealed that ageing has genetic components [2-5], and more than three hundred genes have been associated with human ageing so far.

Machine learning, which relies on human-designed algorithms that learn from data and use what they have learned to make predictions, is being rapidly adopted in computational biology. In addition, several databases have been developed for studying ageing-related genes/proteins. For example, GenAge is a well-maintained, manually curated benchmark database of ageing-related genes [6]. The genes in this database are related to longevity and/or ageing in humans and in model organisms such as yeast, mice, flies, and worms. The database lists 305 human ageing-related genes (version 18), some of which are directly linked to human ageing. Other databases of ageing-related genes/proteins include AGEMAP [7], NetAge [8], and LongevityMap [9].

Compared with other genes/proteins, ageing-related genes/proteins tend to have the following features [10]:

  • more protein-protein interaction partners.
  • more ageing-related protein-protein interaction partners.
  • higher co-expression coefficients with other genes.
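  • higher K-core (coreness) value. The K-core of a network is the maximal subgraph in which every vertex has a degree of at least K, and the coreness of a node is the largest K for which the node still belongs to the K-core [11,12].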
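
The network features listed above can be computed directly from a protein-protein interaction network. The following is only a minimal sketch in plain Python, not the procedure used in the cited studies: the function names, the toy edge list, and the set of "known" ageing-related proteins are all hypothetical, and the coreness is obtained with the standard approach of repeatedly removing the lowest-degree node.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return graph


def coreness(graph):
    """Return {node: coreness} by repeatedly peeling the lowest-degree node."""
    degree = {node: len(nbrs) for node, nbrs in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])  # coreness never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in graph[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def network_features(edges, ageing_related):
    """Per-protein partner count, ageing-related partner count, and coreness."""
    graph = build_graph(edges)
    core = coreness(graph)
    known = set(ageing_related)
    return {
        node: {
            "partners": len(nbrs),
            "ageing_partners": len(nbrs & known),
            "coreness": core[node],
        }
        for node, nbrs in graph.items()
    }


# Hypothetical toy network: A, B, C, D interact with each other,
# E interacts with D, and F interacts with E.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
toy_ageing = {"A", "D"}  # hypothetical "known" ageing-related proteins

features = network_features(toy_edges, toy_ageing)
for protein in sorted(features):
    print(protein, features[protein])
```

In this toy network the four mutually interacting proteins have a coreness of 3, while E and F sit on the periphery with a coreness of 1. For real interaction networks, a graph library such as igraph [12] would be the more practical choice.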

Supervised machine learning methods such as support vector machines (SVMs), k-nearest neighbors (KNN), and decision tree classifiers have been applied to identify and understand ageing-related genes and/or proteins in Caenorhabditis elegans [13], Drosophila melanogaster [14], and mice [15]. More recently, a simple classification model was introduced that is based on protein features such as response to oxidative stress and the number of ageing-related protein interaction partners [16]. Its authors applied three different algorithms (a scalable tree boosting system, regression analysis, and SVM) to identify ageing-related proteins, discover characteristic ageing-related features in humans, and quantify the relevance of the identified proteins to the ageing process.
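
To give a feel for how a supervised method such as KNN labels a protein, here is a minimal sketch in plain Python. It is an illustration only and does not reproduce the models, features, or parameters of the cited studies: the `knn_predict` function, the three-value feature vectors (interaction partners, ageing-related partners, coreness), and all numbers are hypothetical toy values.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data:
# (partners, ageing-related partners, coreness) -> class label
train = [
    ((12, 5, 4), "ageing"),
    ((10, 4, 4), "ageing"),
    ((9, 3, 3), "ageing"),
    ((2, 0, 1), "other"),
    ((3, 1, 1), "other"),
    ((1, 0, 1), "other"),
]

print(knn_predict(train, (11, 4, 4)))  # well connected, many ageing-related partners
print(knn_predict(train, (2, 1, 1)))   # sparsely connected
```

With real data, the features would normally be scaled to comparable ranges before computing distances, and the value of k would be chosen by cross-validation rather than fixed in advance.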

The approaches and methods of machine learning in computational biology will be discussed in detail in upcoming articles. Although machine learning is helping to identify new ageing-related genes and proteins, the metabolic basis of ageing is still not well understood. New and better approaches and models are needed to identify as many ageing-related proteins and their pathways as possible.


  2. de Magalhães, J. P. (2003). Is mammalian aging genetically controlled? Biogerontology, 4(2), 119-120.
  3. de Magalhães, J. P., Cabral, J. A., & Magalhães, D. (2005). The influence of genes on the aging process of mice: a statistical assessment of the genetics of aging. Genetics, 169(1), 265-274.
  4. Vellai, T., Takács-Vellai, K., Sass, M., & Klionsky, D. J. (2009). The regulation of aging: does autophagy underlie longevity? Trends in Cell Biology, 19(10), 487-494.
  5. Kenyon, C. J. (2010). The genetics of ageing. Nature, 464(7288), 504.
  6. Tacutu, R., Thornton, D., Johnson, E., Budovsky, A., Barardo, D., Craig, T., Diana, E., Lehmann, G., Toren, D., Wang, J., Fraifeld, V. E., & de Magalhães, J. P. (2018). Human Ageing Genomic Resources: new and updated databases. Nucleic Acids Research, 46(D1), D1083-D1090.
  7. Zahn, J. M., Poosala, S., Owen, A. B., Ingram, D. K., Lustig, A., Carter, A., … & Lakatta, E. G. (2007). AGEMAP: a gene expression database for aging in mice. PLoS Genetics, 3(11), e201.
  8. Tacutu, R., Budovsky, A., & Fraifeld, V. E. (2010). The NetAge database: a compendium of networks for longevity, age-related diseases and associated processes. Biogerontology, 11(4), 513-522.
  9. Budovsky, A., Craig, T., Wang, J., Tacutu, R., Csordas, A., Lourenço, J., … & de Magalhães, J. P. (2013). LongevityMap: a database of human genetic variants associated with longevity. Trends in Genetics, 29(10), 559-560.
  10. Li, Y. H., Zhang, G. G., & Guo, Z. (2010). Computational prediction of aging genes in human. In Biomedical Engineering and Computer Science (ICBECS), 2010 International Conference on (pp. 1-4). IEEE.
  11. Dorogovtsev, S. N., Goltsev, A. V., & Mendes, J. F. F. (2006). K-core organization of complex networks. Physical Review Letters, 96(4), 040601.
  12. Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695(5), 1-9.
  13. Li, Y. H., Dong, M. Q., & Guo, Z. (2010). Systematic analysis and prediction of longevity genes in Caenorhabditis elegans. Mechanisms of Ageing and Development, 131(11), 700-709.
  14. Song, X., Zhou, Y. C., Feng, K., Li, Y. H., & Li, J. H. (2012, December). Discovering aging-genes by topological features in Drosophila melanogaster protein-protein interaction network. In Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on (pp. 94-98). IEEE.
  15. Feng, K., Song, X., Tan, F., Li, Y. H., Zhou, Y. C., & Li, J. H. (2012, May). Topological analysis and prediction of aging genes in Mus musculus. In Systems and Informatics (ICSAI), 2012 International Conference on (pp. 2268-2271). IEEE.
  16. Kerepesi, C., Daróczy, B., Sturm, Á., Vellai, T., & Benczúr, A. (2018). Prediction and characterization of human ageing-related proteins by using machine learning. Scientific Reports, 8(1), 4094.

Muniba is a bioinformatician based at the South China University of Technology. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
