ProtTest3 is a program for selecting the best-fit amino acid replacement model for a set of protein sequences [1]. It picks the model with the smallest value of one of four criteria, chosen by the user: the Akaike Information Criterion (AIC), the corrected Akaike Information Criterion (AICc), the Bayesian Information Criterion (BIC), or the Decision Theory criterion (DT). In this article, we will learn how to download and install the command-line version of ProtTest3 on Ubuntu.
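For reference, three of these criteria have short closed forms. The standard textbook definitions are sketched below, where L is the maximized likelihood of a model, k is its number of free parameters, and n is the sample size (how n is counted for an alignment is a separate choice that these definitions do not settle). DT is left out because it has no comparably short formula.

```latex
\begin{align}
\mathrm{AIC}  &= -2\ln L + 2k \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \\
\mathrm{BIC}  &= -2\ln L + k\ln n
\end{align}
```

In each case a lower score indicates a better trade-off between fit and model complexity.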
Before installing the software, update and upgrade your system by typing the following commands in a terminal:
$ sudo apt-get update
$ sudo apt-get upgrade
Downloading ProtTest3
A ProtTest3 package compatible with your system can be downloaded from here. We are downloading the tar file for Ubuntu. Open a terminal and change to the directory where you want to download the software, for example ‘Downloads’, then fetch and extract the archive:
$ cd Downloads
$ wget https://bitbucket.org/diegodl/prottest3/downloads/prottest-3.2-20120316.tar.gz
$ tar xvzf prottest-3.2-20120316.tar.gz
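The name of the extracted directory depends on how the archive was packed, so it can be worth checking before changing into it. The sketch below lists the top-level entries of a gzipped tarball; the function name top_level_dirs is made up for illustration.

```shell
# List the distinct top-level entries inside a gzipped tarball.
# top_level_dirs is a hypothetical helper name, not part of ProtTest3.
top_level_dirs() {
    tar tzf "$1" | sed 's|^\./||' | cut -d/ -f1 | sort -u
}

# Example: only inspect the archive if it is present in the current directory.
archive=prottest-3.2-20120316.tar.gz
if [ -f "$archive" ]; then
    top_level_dirs "$archive"
fi
```

Whatever name this prints is the directory to change into in the next step.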
If you wish to run ProtTest3 on a cluster computer, you also need to download MPJ Express from here.
ProtTest3 runs on Java, so make sure a recent version is installed on your system; if it is not, you can download it from here.
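A quick way to confirm that the tools used in this article are available is to look them up on the PATH. This is a minimal sketch: check_cmd is a hypothetical helper, and the apt package named in the comment is one common way to get Java on Ubuntu rather than a requirement stated by ProtTest3.

```shell
# Report whether a command is available on PATH.
# check_cmd is an illustrative helper name.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Print the Java version line only when java is actually installed.
# If it is missing, one option on Ubuntu is: sudo apt-get install default-jre
if check_cmd java; then
    java -version 2>&1 | head -n 1
fi
check_cmd wget || true
check_cmd tar || true
```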
Installing ProtTest3
After extraction, you will see a new directory whose name starts with ‘prottest’ (‘prottest3’ in the commands below). Change to that directory and set PROTTEST_HOME to its full path:
$ cd prottest3
$ export PROTTEST_HOME=/home/user/Downloads/prottest3
In this directory, you will find the ProtTest3 .jar file, which you can run directly from a terminal with command-line arguments.
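Before running anything, it can help to confirm that PROTTEST_HOME really points at a directory containing a .jar file. The sketch below falls back to the example path from this article when the variable is unset; find_jar is a hypothetical helper.

```shell
# Print the first .jar file found directly inside the given directory.
# find_jar is a hypothetical helper name.
find_jar() {
    for f in "$1"/*.jar; do
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    echo "no .jar file in $1" >&2
    return 1
}

# Fall back to the example path from this article if the variable is unset.
PROTTEST_HOME="${PROTTEST_HOME:-/home/user/Downloads/prottest3}"
find_jar "$PROTTEST_HOME" || echo "check PROTTEST_HOME"
```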
Installing MPJ Express
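After downloading the tar package, type the following commands in a terminal. Note that MPJ_HOME below expects the extracted mpj directory to sit inside the ProtTest3 directory, so extract the archive there or adjust the path to match where it ends up: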
$ cd Downloads
$ tar xvzf mpj.tar.gz
$ export MPJ_HOME=$PROTTEST_HOME/mpj
$ export PATH=$MPJ_HOME/bin:$PATH
Executing ProtTest3
To run ProtTest3, you need to run the .jar file located in the prottest3 directory.
$ java -jar prottest3.jar -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria>
Many other command-line options are described in the manual.
To use multiple threads, add the -threads option followed by the number of cores to use:
$ java -jar prottest3.jar -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria> -threads 4
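Rather than hard-coding the thread count, it can be taken from the machine. The sketch below assumes GNU coreutils' nproc is available (it is on a standard Ubuntu install) and only prints the command it would run. build_cmd and the file names are placeholders, and the flags are limited to the ones shown above.

```shell
# Assemble a ProtTest3 command line from an alignment, output file and thread count.
# build_cmd is an illustrative helper; it prints the command instead of running it.
build_cmd() {
    printf 'java -jar prottest3.jar -i %s -o %s -threads %s\n' "$1" "$2" "$3"
}

# Use the number of available cores, or 1 if nproc is not installed.
threads=$(nproc 2>/dev/null || echo 1)
build_cmd alignment.phy results.txt "$threads"
```

Printing the command first makes it easy to review before pasting it into the terminal.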
Executing ProtTest3 on a cluster computer
If your dataset contains a large number of protein sequences, you can run ProtTest3 on a cluster instead.
First start MPJ, then run the HPC shell script provided in the same directory.
$ cd Downloads/prottest3
$ mpjboot machines
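The machines argument passed to mpjboot is a plain-text file naming the hosts on which to start MPJ daemons. MPJ Express is understood to expect one host name per line, but treat that layout as an assumption and check the MPJ Express documentation. write_machines is a hypothetical helper, and the example writes to a temporary file so that nothing existing is overwritten.

```shell
# Write one host name per line to a machines file.
# write_machines is a hypothetical helper; the one-host-per-line layout is an assumption.
write_machines() {
    out=$1
    shift
    printf '%s\n' "$@" > "$out"
}

# Single-node example: list only the local machine.
machines_file=$(mktemp)
write_machines "$machines_file" localhost
cat "$machines_file"
```

Once the contents look right, the file can be copied to ‘machines’ in the ProtTest3 directory.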
For the HPC script, the basic syntax is as follows:
$ ./runProtTestHPC.sh <no_of_processors> <parameters>
$ ./runProtTestHPC.sh 2 -i <alignment_file> -t <user_defined_tree_file> -o <output_file> -[matrix] -<models_to_evaluate> -<selection_criteria>
Hope this article helps!
References
[1] Darriba, D., Taboada, G. L., Doallo, R., & Posada, D. (2011). ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics, 27(8), 1164-1165.