Tutorial: Constructing a phylogenetic tree using MEGA7

Tariq Abdullah

MEGA (Molecular Evolutionary Genetics Analysis) is a bioinformatics tool used for phylogenetic tree construction. In this article, we will construct a maximum likelihood (ML) tree for a set of protein sequences using MEGA7 [1].

Preparing input

Save the protein sequences in FASTA format in a single file, here named ‘sequences.fasta’. We are using around 20 sequences of average length. If you plan to use a large number of protein sequences, construct the tree on a workstation; otherwise MEGA7 may crash and produce no output. A phylogenetic tree is constructed with MEGA7 in the following steps:
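Before loading the file into MEGA7, it can help to confirm that it parses cleanly. The following is a minimal sketch in plain Python, not part of MEGA7: the function names `read_fasta` and `check_records` and the set of accepted residue letters are assumptions made for illustration.

```python
from collections import Counter

# Assumption: standard amino-acid letters plus common ambiguity codes and '*'.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            # Use the first word of the header as the sequence name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_records(records):
    """Return a list of human-readable problems found in the records."""
    problems = []
    counts = Counter(name for name, _ in records)
    for name, n in counts.items():
        if n > 1:
            problems.append(f"duplicate name: {name}")
    for name, seq in records:
        if not seq:
            problems.append(f"empty sequence: {name}")
        bad = sorted(set(seq) - VALID_RESIDUES)
        if bad:
            problems.append(f"unexpected characters in {name}: {''.join(bad)}")
    return problems


# Small inline example; for a real file, pass an open file handle instead,
# e.g. read_fasta(open("sequences.fasta")).
example = ">seq1 example protein\nMKVLAT\nGGH\n>seq2\nMKV1AT\n"
records = read_fasta(example.splitlines())
for problem in check_records(records):
    print(problem)
```

An empty list from `check_records` means none of these particular problems were found; it does not guarantee that MEGA7 will accept the file.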

1. Aligning sequences

First, align all the input sequences as described below.

  • Go to Align (dropdown) --> Edit/Build Alignment --> Retrieve sequences from a file --> OK.
  • Select your input file, here sequences.fasta. A new window will open showing all the sequences.
  • Go to Edit --> Select All, or simply press Ctrl+A.
  • Go to Alignment --> Align by ClustalW --> Align Protein --> OK. At this step, you can change the alignment parameters, but we are aligning with the default values. To align with MUSCLE instead, select Align by MUSCLE rather than Align by ClustalW from the Alignment menu in the top menu bar.
  • After processing, it will show you the aligned sequences in the same window.
  • If you want to save the session, then go to Data --> Save Session. Select the appropriate folder and click Save.

2. Exporting into the MEGA format

We need the aligned sequences in the MEGA format for constructing the phylogenetic tree. To convert into the MEGA format, follow the steps mentioned below:

  • Go to Data --> Export Alignment --> MEGA Format. You can also export into other formats, such as FASTA or Phylip/Paup, at this step.
  • Select the appropriate folder and click Save.
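If you also export the alignment in FASTA format, you can sanity-check it outside MEGA7. In a valid alignment every sequence has the same length, with gaps written as '-'. The sketch below assumes such a FASTA export; the helper names `read_fasta` and `alignment_summary` are hypothetical and not part of MEGA7.

```python
def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def alignment_summary(records):
    """Return (alignment width, {name: gap fraction}).

    Raises ValueError if the sequences are not all the same length,
    which would mean the file is not a proper alignment.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences have different lengths: {sorted(lengths)}")
    width = lengths.pop()
    if width == 0:
        raise ValueError("alignment is empty")
    gap_fraction = {name: seq.count("-") / width for name, seq in records}
    return width, gap_fraction


# Inline example; for a real export, pass an open file handle to read_fasta.
aligned = ">seq1\nMK-VLA\n>seq2\nM--VLA\n"
width, gaps = alignment_summary(read_fasta(aligned.splitlines()))
print(width, gaps)
```

Sequences with a very high gap fraction are worth a second look before building the tree, since they may be fragments or unrelated proteins.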

3. Constructing the phylogenetic tree

You can construct different kinds of trees such as ML, Neighbor-Joining, Maximum Parsimony, and so on depending upon your data. We are constructing an ML tree in this tutorial.

  • Go to the main window of MEGA7. Click Phylogeny --> Construct/Test Maximum Likelihood Tree.
  • Select the converted file (.meg) and click Open. 
  • A new window titled ‘Analysis Parameters’ will appear. Here you can set different values such as the number of bootstrap replications, the substitution model, and so on. It is recommended to test the phylogeny with 500-1000 bootstrap replications. Additionally, select a substitution model that suits your data. You can use other software such as ProtTest3 to find an appropriate model.
  • After setting the parameters, click Compute. The run time depends on the number of sequences and the number of bootstrap replications.
  • Finally, it will show you the constructed tree. You can save the tree session and export it into Newick format.
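Once the tree is exported in Newick format, a quick check is to confirm that every input sequence appears as a leaf. The following is a rough sketch, assuming unquoted leaf labels without parentheses, commas, colons, or semicolons; `leaf_names` and `missing_taxa` are hypothetical helper names, and the example tree string is made up. Internal node labels, such as support values, are ignored.

```python
import re

# A leaf label is a token that directly follows '(' or ',' and is not itself
# an opening parenthesis. Branch lengths after ':' are skipped.
_LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;\s][^(),:;]*)")


def leaf_names(newick):
    """Return leaf labels from a Newick string, in order of appearance."""
    return [label.strip() for label in _LEAF_PATTERN.findall(newick)]


def missing_taxa(newick, expected_names):
    """Return the expected names that do not appear as leaves in the tree."""
    return sorted(set(expected_names) - set(leaf_names(newick)))


# Made-up example tree with branch lengths and one internal support value.
tree = "((seq1:0.12,seq2:0.08)95:0.05,seq3:0.30);"
print(leaf_names(tree))
print(missing_taxa(tree, ["seq1", "seq2", "seq3", "seq4"]))
```

Names may be written differently in the exported file than in the original FASTA headers, so a reported missing name is a prompt to compare the two by eye rather than proof of an error.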

References

  1. Kumar, S., Stecher, G., & Tamura, K. (2016). MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Molecular Biology and Evolution, 33(7), 1870-1874.

 
