Bioinformatics and stem cell research: A mini review
Stem cells are cells that can differentiate into other cell types; those able to give rise to cells of all lineages are termed pluripotent. Cells found in the blastocyst of embryos are called embryonic stem cells (ESCs) [1, 2] and are considered the “gold standard” of pluripotency. There are also adult stem cells, found in several tissues for the purpose of repair, such as mesenchymal stem cells, which have been differentiated into various other tissues. These stem cells hold promise for studying embryo development and cell differentiation, and for regenerative medicine that aims at “personalized” medicine. Another class of stem cells, known as induced pluripotent stem cells (iPSCs), was created by expressing key transcription factors in adult cells. The field of regenerative medicine has produced a large body of research on stem cells. Some stem cell types have been differentiated into specific cell types with the aim of treating diseases such as neurodegenerative disorders, cancer, diabetes, and heart disease, while iPSCs have been differentiated into retinal cells, endothelial cells, and neurons.
Where does bioinformatics fit in?
Bioinformatics combines hardware, mathematics, networking, and databases to develop tools that life scientists can use to process and analyze data. Bioinformatics tools can help identify the possible function of a sequence; for example, KEGG can identify pathways, orthologs, and functions for submitted sequences. The use of bioinformatics in stem cell biology initially revolved around the self-renewal dynamics of adult stem cells and later expanded to molecular biology data and genome sequencing. With molecular profiling of single cells and systems biology approaches that aid in modeling stem cell patterns, bioinformatics can play a key role in stem cell biology.
A few tools:
Let’s discuss a few examples to gain a better understanding. The transcriptome of pluripotent stem cells was first studied using DNA microarrays, with classification algorithms that help distinguish among differentiated, multipotent, and pluripotent stem cells. For larger datasets, the classification of pluripotent stem cells can be facilitated by machine learning. One such tool is PluriTest, an algorithm that uses DNA microarray measurements to assess pluripotent cells using bioinformatics models. PluriNetWork can uncover mechanisms and molecules involved in the pluripotency of stem cells using a combination of links to literature, gene ontology, and automated analysis. Mechanisms in stem cells, such as regulation at the post-transcriptional level, have been studied using next-generation sequencing techniques. For example, the involvement of ZFP217, a chromatin-associated zinc finger protein, in the regulation of pluripotency in human embryonic stem cells was shown using the MeRIP-Seq method.
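To make the idea of classifying cells by their expression profile concrete, here is a minimal sketch of a nearest-centroid classifier. It is not the method used by PluriTest or by the microarray studies mentioned above; the three marker "genes", the expression values, and the function names are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical toy data: each sample is the expression of three
# hypothetical marker genes. Values are invented for illustration.
TRAINING = {
    "pluripotent": [[9.1, 8.7, 1.2], [8.8, 9.0, 1.5]],
    "multipotent": [[5.0, 4.2, 4.8], [4.6, 4.9, 5.1]],
    "differentiated": [[1.1, 0.9, 8.9], [1.4, 1.2, 9.3]],
}


def centroid(profiles):
    """Return the mean expression per gene across a list of samples."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, training=TRAINING):
    """Assign the sample to the class whose centroid is closest."""
    centroids = {label: centroid(p) for label, p in training.items()}
    return min(centroids, key=lambda label: distance(sample, centroids[label]))


print(classify([8.5, 8.9, 1.0]))
```

Real tools work with thousands of probes and carefully validated reference panels, so this sketch only conveys the general shape of the problem: learn a profile per class, then compare a query sample against those profiles.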
Taken together, these and many other genome-wide molecular profiling studies have contributed to our understanding of the multilayered regulation of pluripotency, and have further served as models for understanding the regulation of cell-type identity in other, less-investigated lineages. A commonly curated system used a combination of social networking software and a wiki to bring together research data, key genes, and protein circuits so that they can be easily used and analyzed with Cytoscape software. Such a network is a shared resource composed of literature and details of transcription factors and signals, and it can be tailored to a particular requirement.
Epigenetics comes into play in analyzing differences between ESCs and iPSCs, as well as in studying patterns seen in iPSCs, such as their bias towards the lineages of the donor cell [3, 10]. For instance, a study published in 2011 used a support vector machine learning algorithm trained on methylation data from ESCs and iPSCs; it could identify ESCs precisely but identified iPSCs with only 61% sensitivity. Regions of differential methylation were analyzed using “comprehensive high-throughput arrays for relative methylation” (CHARM) to uncover promoters of factors for distinct lineages [16, 17].
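For readers unfamiliar with the term, sensitivity is the fraction of true members of a class that a classifier correctly recognizes. The sketch below shows the calculation on hypothetical labels; the sample counts are invented solely so that the result matches the 61% figure quoted above and are not taken from the cited study.

```python
def sensitivity(y_true, y_pred, positive):
    """Fraction of samples truly in class `positive` that were predicted as such."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no samples of the positive class in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical example: 100 iPSC lines and 20 ESC lines.
# 61 iPSC lines are called correctly, 39 are mistaken for ESCs,
# and all 20 ESC lines are called correctly.
y_true = ["iPSC"] * 100 + ["ESC"] * 20
y_pred = ["iPSC"] * 61 + ["ESC"] * 39 + ["ESC"] * 20

print(f"iPSC sensitivity: {sensitivity(y_true, y_pred, 'iPSC'):.2f}")
print(f"ESC sensitivity:  {sensitivity(y_true, y_pred, 'ESC'):.2f}")
```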
Another application of bioinformatics in stem cell biology is to assess the differentiation ability of a stem cell using a “scorecard” approach. Bock et al. (2011) developed a deviation scorecard from the methylation patterns and gene expression of human ESCs, hypothesizing that any deviation from these references could prevent differentiation into particular lineages. Differences between iPSC lines and ESCs were tabulated. Several genes were listed as markers of germ layers that, when expressed at early stages, indicate differentiation potential; one example is the hypermethylation of GRM (glutamate receptor) in motor neurons [3, 10].
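The general idea of a deviation scorecard can be sketched as follows: build a reference distribution per gene from ESC lines, then flag genes in a test line that fall far outside it. This is only an illustration under simplifying assumptions (a plain z-score with an arbitrary threshold of 3); it is not the statistical procedure of Bock et al., and the gene names and values are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical reference measurements (e.g. methylation fractions)
# for hypothetical genes across five ESC lines.
REFERENCE = {
    "GENE_A": [0.10, 0.12, 0.11, 0.09, 0.13],
    "GENE_B": [0.80, 0.78, 0.82, 0.79, 0.81],
    "GENE_C": [0.50, 0.55, 0.45, 0.52, 0.48],
}


def deviation_scorecard(test_line, reference=REFERENCE, threshold=3.0):
    """Return genes whose value in the test line deviates from the
    reference by more than `threshold` standard deviations."""
    flagged = {}
    for gene, ref_values in reference.items():
        mu = mean(ref_values)
        sigma = stdev(ref_values)
        z = (test_line[gene] - mu) / sigma
        if abs(z) > threshold:
            flagged[gene] = round(z, 2)
    return flagged


# Hypothetical iPSC line in which only GENE_A is far from the reference.
ipsc_line = {"GENE_A": 0.60, "GENE_B": 0.80, "GENE_C": 0.51}
print(deviation_scorecard(ipsc_line))
```

A line with no flagged genes would be considered close to the ESC reference under this toy criterion, while flagged genes point to loci worth a closer look.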
The TeratoScore algorithm uses gene expression in teratomas to evaluate the differentiation ability of human pluripotent stem cells, since these cells can differentiate into all three germ layers. The tool can also classify the origin of a tumor as either pluripotent or tissue-specific cells. Another tool, CellNet, uses gene expression profiles to predict the specific cell type of a query sample along with the transcription factors involved. The efficiency of differentiation of pluripotent stem cells can be predicted using a platform called KeyGenes, which uses RNA-Seq or microarray data from human fetal tissues.
A data repository for stem cells called CellFinder aims to augment the human embryonic stem cell registry (hESCreg) into a tool that facilitates the design of projects and the analysis of the registry. Additionally, a web interface called StemBase contains SAGE (Serial Analysis of Gene Expression) data from mouse and human stem cells and allows specific genes or markers to be studied.
This short review has highlighted a few of the tools that find use in stem cell research. These tools show that bioinformatics holds much promise for analyzing stem cells through web interfaces and software. With further input from the various “omics” approaches that unravel the roles of molecules in a single cell, bioinformatics can aid in analyzing cell fates and in delving deeper into this exciting field of stem cells, which are being pitched as a panacea for several diseases. This would help us realize an important goal of stem cell biology: a detailed understanding of the nuances of the cells vital for the development and maintenance of life.
1. Evans M. Discovering pluripotency: 30 years of mouse embryonic stem cells. Nat Rev Mol Cell Biol. 2011;12(10):680–6.
2. Babu PBR, Krishnamoorthy P. Applications of bioinformatics tools in stem cell research: an update. J Pharm Res. 2012;5(9):4863–6.
3. Nestor MW, Noggle SA. Standardization of human stem cell pluripotency using bioinformatics. Stem Cell Res Ther. 2013;4:37.
4. Pacini S. Deterministic and stochastic approaches in the clinical application of mesenchymal stromal cells (MSCs). Front Cell Dev Biol. 2014;12:50.
5. Takahashi K, Yamanaka S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell. 2006;126:663–6.
6. Bilic J, Izpisua Belmonte JC. Concise review: Induced pluripotent stem cells versus embryonic stem cells: close enough or yet too far apart? Stem Cells. 2012;30:33–41.
7. Orozco A, Morera J, Jiménez S, Boza R. A review of bioinformatics training applied to research in molecular medicine, agriculture and biodiversity in Costa Rica and Central America. Brief Bioinform. 2013;14(5):661–70.
8. Kyoto Encyclopedia of Genes and Genomes (KEGG). Available at: https://www.genome.jp/kegg/
9. Till JE, et al. A stochastic model of stem cell proliferation, based on the growth of spleen colony-forming cells. Proc Natl Acad Sci USA. 1964;51:29–36.
10. Bian Q, Cahan P. Computational tools for stem cell biology. Trends Biotechnol. 2016;34(12).
11. Müller F-J, Laurent LC, Kostka D, Ulitsky I, Williams R, Lu C, et al. Regulatory networks define phenotypic classes of human stem cell lines. Nature. 2008;455:401–5.
12. Müller F-J, Schuldt BM, Williams R, Mason D, Altun G, Papapetrou EP, et al. A bioinformatic assay for pluripotency in human cells. Nat Methods. 2011;8:315.
13. Aguilo F, et al. Coordination of m6A mRNA methylation and gene transcription by ZFP217 regulates pluripotency and reprogramming. Cell Stem Cell. 2015;17:689–704.
14. Narad P, Upadhyaya K, Som A. Reconstruction, visualization and explorative analysis of human pluripotency network. Network Biology. 2017;7:57–75.
15. Bock C, Kiskinis E, Verstappen G, Gu H, Boulting G, Smith ZD, et al. Reference maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell. 2011;144:439–52.
16. Kim K, et al. Epigenetic memory in induced pluripotent stem cells. Nature. 2010;467:285–90.
17. Kim K, et al. Donor cell type can influence the epigenome and differentiation potential of human induced pluripotent stem cells. Nat Biotechnol. 2011;29:1117–9.
18. Avior Y, et al. TeratoScore: assessing the differentiation potential of human pluripotent stem cells by quantitative expression analysis of teratomas. Stem Cell Rep. 2015;4:967–74.
19. Cahan P, et al. CellNet: network biology applied to stem cell engineering. Cell. 2014;158:903–15.
20. Roost MS, et al. KeyGenes, a tool to probe tissue differentiation using a human fetal transcriptional atlas. Stem Cell Rep. 2015;4:1112–24.
21. Borstlap J, Luong MX, Rooke HM, et al. International stem cell registries. In Vitro Cell Dev Biol – Animal. 2010;46(3-4):242–6.
22. Sandie R, Palidwor GA, Huska MR, et al. Recent developments in StemBase: a tool to study gene expression in human and murine stem cells. BMC Res Notes. 2009;2:39.