In this article, we summarize the most widely used tools (online or standalone) for predicting transcription factor binding sites in DNA sequences.
1. PROMO
PROMO identifies putative transcription factor binding sites in DNA sequences [1]. It acts as a virtual laboratory, predicting binding sites with weight matrices constructed from the binding sites recorded in the TRANSFAC database [2]. It provides an easy-to-use graphical user interface and downloadable output files. PROMO currently uses version 8.3 of TRANSFAC [1]. It is available online at http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3
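The weight-matrix idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not PROMO's actual implementation: the count matrix, the pseudocount, the uniform background, and the score threshold are all assumptions made up for the example, and none of the numbers come from TRANSFAC.

```python
import math

# Hypothetical 4-position count matrix (one list per base, one entry per motif
# position). The numbers are invented for illustration only.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [1, 0, 1, 8],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    length = len(counts["A"])
    matrix = []
    for i in range(length):
        total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((counts[base][i] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for each window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


pwm = log_odds_matrix(COUNTS)
hits = scan("TTAGCTAAGCTGG", pwm, threshold=4.0)
for start, window, score in hits:
    print(start, window, score)
```

Real tools differ in how they choose the background model and the cutoff, so scores from a sketch like this are not comparable with PROMO's output.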
2. TRANSFAC
TRANSFAC is a database of eukaryotic transcription factors, their DNA binding profiles, and genomic binding sites [2]. It contains information about transcriptional regulation that can help in predicting potential transcription factor binding sites in DNA sequences. However, it is not freely accessible. You can buy it online at https://genexplain.com/transfac/#section0.
3. JASPAR CORE
JASPAR CORE is an online curated database of transcription factor binding sites [3,4]. This database consists of a non-redundant set of profiles and experimentally defined transcription factor binding sites of eukaryotes. It stores the profiles in the form of position frequency matrices (PFMs) and transcription factor flexible models (TFFMs). Users can find profiles for multiple species across six taxonomic groups. It is freely accessible at http://jaspar.genereg.net/.
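To make the notion of a position frequency matrix concrete, here is a minimal sketch that reads a profile written in the bracketed text layout used for JASPAR downloads and derives a consensus sequence from it. The profile shown (ID, name, and counts) is a made-up placeholder rather than a real JASPAR entry, and the helper names `parse_jaspar` and `consensus` are hypothetical.

```python
import re

# Invented profile in the bracketed JASPAR-style layout. The ID, name and
# counts are placeholders, not a real database entry.
EXAMPLE = """>MA0000.0 EXAMPLE
A  [ 8  0  0  1 ]
C  [ 0  0  9  1 ]
G  [ 1 10  0  0 ]
T  [ 1  0  1  8 ]
"""

ROW_PATTERN = re.compile(r"^([ACGT])\s*\[([^\]]*)\]")


def parse_jaspar(text):
    """Parse one profile into (header, {base: [count per position]})."""
    header = None
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            continue
        match = ROW_PATTERN.match(line)
        if match:
            counts[match.group(1)] = [float(x) for x in match.group(2).split()]
    if set(counts) != set("ACGT"):
        raise ValueError("expected one row for each of A, C, G and T")
    return header, counts


def consensus(counts):
    """Pick the most frequent base at each position of the matrix."""
    length = len(counts["A"])
    return "".join(
        max("ACGT", key=lambda base: counts[base][i]) for i in range(length)
    )


header, pfm = parse_jaspar(EXAMPLE)
print(header, consensus(pfm))
```

For anything beyond a quick look, an established parser (for example, the motif module in Biopython) is likely a safer choice than a hand-written regular expression.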
4. TFBIND
TFBIND is another online tool for predicting transcription factor binding sites in DNA sequences [5]. TFBIND searches for TATA boxes, GC boxes, CCAAT boxes, and transcription start sites (TSS). The tool uses weight matrices from TRANSFAC R.3.4. It is quite easy to use: users upload or enter a nucleic acid sequence in FASTA format and submit it. TFBIND also allows users to obtain compressed results. It is freely accessible at http://tfbind.hgc.jp/.
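Since the input has to be in FASTA format, it can help to clean and format a sequence before pasting it into the form. The sketch below is a generic helper, not part of TFBIND; the function name `to_fasta`, the 60-character line width, and the accepted alphabet (A, C, G, T, N) are assumptions chosen for the example.

```python
def to_fasta(name, sequence, width=60):
    """Format a nucleotide sequence as a single FASTA record.

    Whitespace is removed, letters are upper-cased, and anything outside
    A/C/G/T/N is rejected (an assumed alphabet for this example).
    """
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("sequence is empty")
    invalid = sorted(set(cleaned) - set("ACGTN"))
    if invalid:
        raise ValueError("unexpected characters: " + ", ".join(invalid))
    # Break the sequence into fixed-width lines.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


record = to_fasta("seq1", "acgt acgt\nAC", width=4)
print(record)
```

The record can then be saved to a file for upload or pasted directly into the submission box.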
5. Tfsitescan
Tfsitescan is an online predictor of transcription factor binding sites in DNA sequences [6]. It works best with sequences consisting of around 500 nucleotides. Users can search for mammalian, prokaryotic, amphibian, yeast, avian, Drosophila, and many other sites. It is freely available at http://www.ifti.org/cgi-bin/ifti/Tfsitescan.pl
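For sequences much longer than about 500 nucleotides, one possible workaround is to split the sequence into overlapping windows and submit each one separately. The helper below is a hypothetical sketch of that idea and is not something Tfsitescan provides; the 50-nucleotide overlap is an assumption intended to keep a site that straddles a window boundary intact in at least one window.

```python
def split_windows(sequence, size=500, overlap=50):
    """Split a sequence into (start, subsequence) windows of at most `size` bases.

    Consecutive windows share `overlap` bases. Start positions are 0-based
    offsets into the original sequence.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    if not sequence:
        return []
    step = size - overlap
    windows = []
    start = 0
    while True:
        windows.append((start, sequence[start:start + size]))
        if start + size >= len(sequence):
            break  # the current window already reaches the end
        start += step
    return windows


windows = split_windows("ACGT" * 300)  # a 1200-nucleotide toy sequence
for start, chunk in windows:
    print(start, len(chunk))
```

Keeping the start offset alongside each window makes it straightforward to map any reported site position back to coordinates in the full sequence.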
References
- Messeguer, X., Escudero, R., Farré, D., Nuñez, O., Martínez, J., & Albà, M. M. (2002). PROMO: detection of known transcription regulatory elements using species-tailored searches. Bioinformatics, 18(2), 333-334.
- Wingender, E., Dietze, P., Karas, H., & Knüppel, R. (1996). TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic acids research, 24(1), 238-241.
- Khan, A., Fornes, O., Stigliani, A., Gheorghe, M., Castro-Mondragon, J. A., Van Der Lee, R., … & Mathelier, A. (2018). JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic acids research, 46(D1), D260-D266.
- Fornes, O., Castro-Mondragon, J. A., Khan, A., Van der Lee, R., Zhang, X., Richmond, P. A., … & Mathelier, A. (2020). JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic acids research, 48(D1), D87-D92.
- Tsunoda, T., & Takagi, T. (1999). Estimating transcription factor bindability on DNA. Bioinformatics (Oxford, England), 15(7), 622-630.
- http://www.ifti.org/cgi-bin/ifti/Tfsitescan.pl