Installing and executing ProtTest3 on Ubuntu

ProtTest3 is a program that selects the best-fit amino acid replacement model for a set of aligned protein sequences [1]. It picks the model with the smallest score under one of four criteria chosen by the user: the Akaike Information Criterion (AIC), the corrected Akaike Information Criterion (AICc), the Bayesian Information Criterion (BIC), or the Decision Theory criterion (DT). In this article, we will learn how to download and install the command-line version of ProtTest3 on Ubuntu.
Before installing the software, update and upgrade your system by typing the following commands in a terminal:
$ sudo apt-get update
$ sudo apt-get upgrade
Downloading ProtTest3
A ProtTest3 package compatible with your system can be downloaded from the project's Bitbucket downloads page. Here we download the tar file for Ubuntu. Open a terminal and change to the directory where you want to save the software, for example ‘Downloads’.
$ cd Downloads
$ wget https://bitbucket.org/diegodl/prottest3/downloads/prottest-3.2-20120316.tar.gz
$ tar xvzf prottest-3.2-20120316.tar.gz
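The name of the unpacked directory may differ between releases, so it can help to check what the archive contains before changing into it. A minimal sketch, where top_dir is a hypothetical helper name introduced here and not part of ProtTest3:

```shell
# Print the top-level directory name(s) stored in a gzipped tar archive.
# top_dir is a hypothetical helper, not something shipped with ProtTest3.
top_dir() {
    tar tzf "$1" | cut -d/ -f1 | sort -u
}

# Example (run inside the download directory):
#   top_dir prottest-3.2-20120316.tar.gz
```

Whatever name this prints is the directory to use in the cd command in the next section.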
If you wish to run ProtTest3 on a cluster, you also need to download MPJ Express.
ProtTest3 is distributed as a .jar file, so make sure a recent version of Java is installed on your system before you continue.
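A quick way to confirm that Java is available is to look for the java command on your PATH. The sketch below assumes a standard Ubuntu setup; check_java is a hypothetical helper name, and default-jre is only one of several packages that provide a Java runtime:

```shell
# Report whether a Java runtime is reachable on PATH.
# check_java is a hypothetical helper name used for illustration.
check_java() {
    if command -v java >/dev/null 2>&1; then
        echo "Java found: $(command -v java)"
    else
        echo "Java not found; install a runtime first, e.g. sudo apt-get install default-jre"
        return 1
    fi
}

check_java || true
```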
Installing ProtTest3
After extracting the archive, you will see a new directory named ‘prottest3’. Change to that directory and export its path as PROTTEST_HOME:
$ cd prottest3
$ export PROTTEST_HOME=/home/user/Downloads/prottest3
In this directory, you will find the .jar file of ProtTest3, which you can directly use with the command line arguments on a terminal.
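A variable set in a terminal lasts only for that session. One common way to keep it is to add the export line to a shell startup file. The sketch below assumes your shell is Bash and reads ~/.bashrc; persist_export is a hypothetical helper name, and the path is the example path used above:

```shell
# Append "export NAME=VALUE" to a startup file unless that exact line exists.
# persist_export is a hypothetical helper, not part of ProtTest3.
persist_export() {
    line="export $1=$2"
    file="$3"
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Example (assumes Bash and the example install path):
#   persist_export PROTTEST_HOME "$HOME/Downloads/prottest3" "$HOME/.bashrc"
```

Running the helper twice leaves a single line in the file, so it is safe to repeat.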
Installing MPJ Express
After downloading the tar package, type the following commands in a terminal:
$ cd Downloads
$ tar xvzf mpj.tar.gz
$ export MPJ_HOME=$PROTTEST_HOME/mpj
$ export PATH=$MPJ_HOME/bin:$PATH
Executing ProtTest3
To run ProtTest3, you need to run the .jar file located in the prottest3 directory.
$ java -jar prottest3.jar -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria>
There are many other command-line options that you can see in its manual.
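Because a mistyped file name is easy to miss, one option is to check the input files before launching Java. The sketch below only prints the command rather than running it; prottest_cmd is a hypothetical helper name, and it uses only the -i, -t and -o options shown above:

```shell
# Print a ProtTest3 command line after checking that both input files exist.
# prottest_cmd is a hypothetical helper; it does not run ProtTest3 itself.
prottest_cmd() {
    alignment="$1"
    tree="$2"
    output="$3"
    for f in "$alignment" "$tree"; do
        if [ ! -f "$f" ]; then
            echo "input file not found: $f" >&2
            return 1
        fi
    done
    echo "java -jar prottest3.jar -i $alignment -t $tree -o $output"
}

# Example (file names are placeholders):
#   prottest_cmd alignment.phy tree.nwk result.txt
```

The remaining options (matrices, models and selection criterion) can be appended to the printed command by hand.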
To use multiple threads, add the -threads option followed by the number of cores to use:
$ java -jar prottest3.jar -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria> -threads 4
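If you are unsure how many threads to request, the nproc utility reports the number of available cores. The sketch below leaves one core free for the rest of the system, which is a rule of thumb assumed here rather than a ProtTest3 recommendation:

```shell
# Suggest a thread count: all available cores but one, never fewer than one.
cores=$(nproc)
threads=$(( cores > 1 ? cores - 1 : 1 ))
echo "Suggested option: -threads $threads"
```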
Executing ProtTest3 on a cluster computer
If your dataset contains a large number of protein sequences, you can run ProtTest3 on a cluster instead.
First start MPJ, and then run the HPC shell script provided in the same directory.
$ cd Downloads/prottest3
$ mpjboot machines
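The machines argument is a plain-text file that lists the hosts on which MPJ Express should start, one host name per line. A minimal sketch for creating it, where write_machines is a hypothetical helper name and node01 and node02 are placeholder host names:

```shell
# Write the given host names to a file, one per line.
# write_machines is a hypothetical helper; host names are placeholders.
write_machines() {
    file="$1"
    shift
    printf '%s\n' "$@" > "$file"
}

# Example:
#   write_machines machines node01 node02
```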
For the HPC script, the basic syntax is as follows:
$ ./runProtTestHPC.sh <no_of_processors> <parameters>
$ ./runProtTestHPC.sh 2 -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria>
References
[1] Darriba, D., Taboada, G. L., Doallo, R., & Posada, D. (2011). ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics, 27(8), 1164-1165.