How to blast against a particular set of local sequences (local database)?
BLAST [1,2] is a local alignment tool widely used as a preliminary step in identifying gene or protein function. The command-line NCBI BLAST+ package offers several useful features, including building a BLAST database from a set of nucleotide or protein sequences, searching a query sequence against that database, and running an all-against-all search. This article explains these commands.
The NCBI BLAST+ package [3] is freely available and can be downloaded from the NCBI website. Packages are available for both Linux and Windows.
To search a single query sequence or multiple sequences against your local sequences, you first need a BLAST database built from them. To make one, open a terminal and run the following commands.
1. Making a BLAST database of local sequences
The input file must consist of sequences in FASTA format.
$ makeblastdb -in input.fasta -parse_seqids -dbtype prot -out blastdb
Here, -in is the input file, -dbtype is the database type (prot for protein or nucl for nucleotide sequences), and -out is the name of the BLAST database to be created. The -parse_seqids option keeps the sequence IDs from the FASTA headers in the database, which helps later when you need to look up or retrieve sequences by ID in further analyses. If your input file is in another directory, provide its full path.
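Before building the database, it can help to confirm that the input really is FASTA and to count its records. The sketch below is only an illustration: the file name example.fasta, the database name exampledb, and the two short peptide sequences are made up, and the makeblastdb step is skipped when the program is not on the PATH.

```shell
# Hypothetical two-record protein FASTA file, for illustration only.
cat > example.fasta <<'EOF'
>seq1 made-up example peptide
MKTAYIAKQRQISFVKSHFSRQ
>seq2 made-up example peptide
MSDNELQKLRAEVEALKR
EOF

# Each FASTA record starts with a '>' header line, so counting those
# lines gives the number of sequences in the file.
n_seqs=$(grep -c '^>' example.fasta)
echo "sequences: $n_seqs"

# Build the database only if BLAST+ is installed.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in example.fasta -parse_seqids -dbtype prot -out exampledb
fi
```

If the count does not match the number of sequences you expect, check the input file for missing or malformed header lines before running makeblastdb.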
2. BLAST a single sequence against the local database
$ blastp -db blastdb -query seq.fasta -outfmt 0 -out result.txt -num_threads 4
where -db is the BLAST database created in the previous step, -query is a file containing the query sequence in FASTA format, -outfmt is the output format (0 is the default pairwise format; the other formats are listed by blastp -help), -out is the output file, and -num_threads is the number of CPU threads to use during the search. For nucleotide sequences, use blastn or another appropriate BLAST executable.
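For downstream parsing, the tabular format (-outfmt 6) is often easier to handle than the pairwise report. The following is a hedged sketch, not part of the BLAST+ package: the function name filter_hits, the file name hits.tsv, and the cutoff values are hypothetical. It relies on the default column order of -outfmt 6 (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore).

```shell
# Keep hits whose e-value (column 11 in the default -outfmt 6 layout)
# is at or below a cutoff. Usage: filter_hits FILE [CUTOFF]
# The default cutoff of 1e-5 is an arbitrary illustrative choice.
filter_hits() {
    awk -F'\t' -v max="${2:-1e-5}" '($11 + 0) <= (max + 0)' "$1"
}

# Producing the tabular file requires BLAST+ and the database from step 1:
#   blastp -db blastdb -query seq.fasta -outfmt 6 -out hits.tsv -num_threads 4
# Then, for example:
#   filter_hits hits.tsv 1e-10 > significant_hits.tsv
```

An appropriate cutoff depends on the database size and the question being asked, so treat the values here as placeholders.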
3. All-against-all BLAST
To BLAST the local sequences against the database created from those same sequences, use the input FASTA file as the query.
$ blastp -db blastdb -query input.fasta -outfmt 0 -out result.txt -num_threads 4
In the above command, the database is the local database created in the first step, and the query is the set of input sequences from which that database was built.
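In an all-against-all search, every sequence also matches itself, and these self-hits usually need to be discarded before further analysis. A minimal sketch, assuming the search was rerun with the tabular format (-outfmt 6) so that the query ID is in column 1 and the subject ID in column 2, and that both IDs are written identically in the output; the function and file names are hypothetical.

```shell
# Drop self-hits: rows where the query ID (column 1) equals the
# subject ID (column 2). Appending "" forces a string comparison,
# so purely numeric IDs are not compared as numbers.
# Usage: drop_self_hits FILE
drop_self_hits() {
    awk -F'\t' '($1 "") != ($2 "")' "$1"
}

# Requires BLAST+ and the database from step 1 to produce the input:
#   blastp -db blastdb -query input.fasta -outfmt 6 -out all_vs_all.tsv -num_threads 4
#   drop_self_hits all_vs_all.tsv > all_vs_all.noself.tsv
```

It is worth inspecting a few rows first to confirm that query and subject IDs really appear in the same form, since the filter compares them as plain strings.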
If you are using the Windows version, run the same commands, providing the full path to the executables. Installation will be covered in an upcoming article.
References
1. Altschul, S. F. (2001). BLAST algorithm. eLS.
2. Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25(17), 3389-3402.
3. Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: architecture and applications. BMC Bioinformatics, 10(1), 421.