BLAST [1,2] is a local alignment tool widely used as a preliminary step in identifying gene or protein function. The NCBI-Blast command-line package offers several useful features, including building a BLAST database from a set of nucleotide or protein sequences, searching a query sequence against that database, and running an all-against-all search. This article explains these commands.
The NCBI-Blast+ package [3] is freely available and can be downloaded from the NCBI website. Packages are available for both Linux and Windows.
To BLAST one or more query sequences against your own sequences, you first need a BLAST database built from those local sequences. To make one, open a terminal and run the following command.
1. Making a BLAST database of local sequences
The input file must consist of sequences in FASTA format.
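Before building the database, it can help to check the FASTA file quickly. The sketch below counts the records and looks for repeated identifiers, since makeblastdb may reject duplicate ids when -parse_seqids is used. The demo file and its three short sequences are made up for illustration; point the fasta variable at your own file instead.

```shell
# Hypothetical demo file; replace with the path to your own input.fasta.
fasta=$(mktemp)
printf '>seq1\nMKTAYIAKQR\n>seq2\nMSDNELLKQA\n>seq3\nMAHHHHHHVG\n' > "$fasta"

# Every FASTA record starts with a '>' header line, so counting them counts sequences.
n_seqs=$(grep -c '^>' "$fasta")
echo "sequences: $n_seqs"

# The identifier is the first word of each header; list any that occur more than once.
dup_ids=$(grep '^>' "$fasta" | cut -d' ' -f1 | sort | uniq -d)
if [ -n "$dup_ids" ]; then
    echo "duplicate ids found:"
    echo "$dup_ids"
else
    echo "no duplicate ids"
fi
```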
$ makeblastdb -in input.fasta -parse_seqids -dbtype prot -out blastdb
Here, -in refers to the input file, -dbtype sets the type of database (prot for protein, nucl for nucleotide), and -out is the name of the BLAST database to be created. The -parse_seqids option is included because it parses the sequence ids in the FASTA headers, which may help later when retrieving or analysing specific sequences. If your input file is in another directory, provide its complete path.
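For nucleotide sequences the same command applies with -dbtype nucl. The following is a minimal sketch, assuming BLAST+ is installed and on the PATH (it skips itself otherwise). The file names and the two short sequences are hypothetical. It also runs blastdbcmd with -info, which prints a summary of the database and so gives a quick way to confirm it was built.

```shell
# Hypothetical nucleotide input written to a temporary working directory.
workdir=$(mktemp -d)
printf '>nt1\nATGGCGTACGTTAGCCTAGGCTAACGT\n>nt2\nATGCCGTTAGGCTAACCGGTTAACGTA\n' > "$workdir/demo_nt.fasta"

if command -v makeblastdb >/dev/null 2>&1; then
    # -dbtype nucl builds a nucleotide database instead of a protein one;
    # blastdbcmd -info then prints a summary of the new database.
    if makeblastdb -in "$workdir/demo_nt.fasta" -parse_seqids -dbtype nucl -out "$workdir/demo_ntdb" \
        && blastdbcmd -db "$workdir/demo_ntdb" -info; then
        db_status="built"
    else
        db_status="failed"
    fi
else
    echo "makeblastdb not found on PATH; skipping"
    db_status="skipped"
fi
echo "database status: $db_status"
```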
2. BLAST a single query sequence against the local database
$ blastp -db blastdb -query seq.fasta -outfmt 0 -out result.txt -num_threads 4
Here, -db is the BLAST database created in the previous step, -query is a file containing the query sequence in FASTA format, -outfmt sets the output format (0, the default, is the pairwise text report; several other formats are available), -out is the file the results are written to, and -num_threads is the number of threads (CPUs) to use during the search. For nucleotide sequences, use blastn or another appropriate BLAST executable.
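If the results are to be processed further, the tabular format (-outfmt 6) is often easier to work with than the pairwise report. As a rough sketch, the example below filters such a table with awk, keeping hits with an e-value of at most 1e-5 and at least 40% identity. The three rows are invented for illustration and both thresholds are arbitrary choices, not recommendations.

```shell
# Hypothetical rows such as "blastp ... -outfmt 6 -out result.tsv" would write.
# Default -outfmt 6 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
hits=$(mktemp)
printf 'q1\tsA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-110\t390\n'   > "$hits"
printf 'q1\tsB\t45.0\t180\t90\t4\t5\t184\t10\t189\t2e-20\t95.1\n' >> "$hits"
printf 'q1\tsC\t30.2\t60\t40\t2\t20\t79\t33\t92\t0.5\t28.9\n'     >> "$hits"

# Keep subject ids whose e-value (column 11) is <= 1e-5 and identity (column 3) is >= 40.
filtered=$(awk -F'\t' '$11 + 0 <= 0.00001 && $3 + 0 >= 40 {print $2}' "$hits")
echo "$filtered"
```

Adding 0 to each field makes awk compare the values as numbers rather than as strings.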
3. All-against-all BLAST
To BLAST the local sequences against the database built from them, use the same input FASTA file as the query.
$ blastp -db blastdb -query input.fasta -outfmt 0 -out result.txt -num_threads 4
In this command, the database is the local database created in the first step, and the query is the set of input sequences from which that database was built.
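In an all-against-all search every sequence also matches itself, and these self-hits are usually not of interest. The sketch below assumes the search was run with the tabular format (-outfmt 6) and that subject ids are reported exactly as the query ids. It drops rows where the two ids are equal. The four rows are made up for illustration.

```shell
# Hypothetical all-against-all hits in the default -outfmt 6 column order:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
all_hits=$(mktemp)
printf 'p1\tp1\t100.0\t150\t0\t0\t1\t150\t1\t150\t1e-100\t310\n'  > "$all_hits"
printf 'p1\tp2\t62.0\t140\t50\t1\t5\t144\t8\t147\t3e-50\t160\n'  >> "$all_hits"
printf 'p2\tp2\t100.0\t155\t0\t0\t1\t155\t1\t155\t1e-105\t320\n' >> "$all_hits"
printf 'p2\tp1\t62.0\t140\t50\t1\t8\t147\t5\t144\t3e-50\t160\n'  >> "$all_hits"

# Keep only rows where the query id (column 1) differs from the subject id (column 2).
pairs=$(awk -F'\t' '$1 != $2 {print $1 "-" $2}' "$all_hits")
echo "$pairs"
```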
If you want to use the Windows version, run the same commands, providing the path to the executables. Installation will be covered in an upcoming article.
References
- Altschul, S. F. (2001). BLAST algorithm. eLS.
- Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25(17), 3389-3402.
- Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: architecture and applications. BMC Bioinformatics, 10(1), 421.