How to use Clustal Omega and MUSCLE command-line tools for multiple sequence alignment?

Dr. Muniba Faiza

Clustal Omega [1,2] and MUSCLE [3] are bioinformatics tools used for multiple sequence alignment (MSA). In one of our previous articles, we explained how to use the ClustalW2 command-line tool for MSA and phylogenetic tree construction. In this article, we use Clustal Omega and MUSCLE for MSA and explore the arguments that select different output formats.

In the examples below, the file containing the FASTA sequences to be aligned is assumed to be named ‘input.fasta’.
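If you do not have a FASTA file at hand, a small test input can be created directly in the terminal. The following is only a sketch: the three sequences and their names (seq1, seq2, seq3) are made up for illustration and are not taken from any real dataset.

```shell
# Write three short, made-up protein sequences to input.fasta (illustrative only).
cat > input.fasta <<'EOF'
>seq1
MKTAYIAKQRQISFVKSHFSRQ
>seq2
MKTAYIAKQRNISFVKSHFARQ
>seq3
MKSAYIAKQRQLSFVKAHFSRQ
EOF

# Count the FASTA records: each record starts with a '>' header line.
grep -c '^>' input.fasta
```

Counting the header lines is a quick sanity check that the file contains as many sequences as you expect before starting an alignment.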

To align sequences with Clustal Omega, open a terminal in Ubuntu (Ctrl+Alt+T) and type one of the following commands:

1. to get output in Clustal format

$ /path/to/clustalo --in=input.fasta --out=output.aln --force --outfmt=clustal --wrap=80

2. to get output in Stockholm format

$ /path/to/clustalo --in=input.fasta --out=output.sto --force --outfmt=st --wrap=80

3. to get output in the default FASTA format (printed to the terminal, because no --out file is given)

$ /path/to/clustalo --in=input.fasta

Here, the --force argument tells Clustal Omega to overwrite an existing output file. The --wrap argument sets the number of residues written on a single line of the alignment. For example, to display 60 residues per line, write --wrap=60.
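The Clustal Omega commands above can be wrapped in a small shell function so that the output format and line width are easy to change. This is a minimal sketch, not part of Clustal Omega itself: the function name run_clustalo and the variables CLUSTALO and INPUT are hypothetical, and it assumes clustalo is on your PATH (otherwise set CLUSTALO to the full path). It always prints the command it builds, and only runs it when the program and the input file are actually present.

```shell
# Hypothetical helper: build, print and (if possible) run a Clustal Omega command.
CLUSTALO="${CLUSTALO:-clustalo}"   # assumed program name; override with the full path if needed
INPUT="${INPUT:-input.fasta}"      # FASTA file with the sequences to align

run_clustalo() {
  # $1 = output format (e.g. clustal, st), $2 = output file, $3 = residues per line
  set -- "$CLUSTALO" --in="$INPUT" --out="$2" --force --outfmt="$1" --wrap="$3"
  printf '%s\n' "$*"               # show the exact command line
  if command -v "$1" >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    "$@" || printf 'command failed: %s\n' "$*" >&2
  fi
}

# Clustal and Stockholm output, 60 residues per line.
run_clustalo clustal output.aln 60
run_clustalo st output.sto 60
```

Printing the command first makes it easy to check the arguments before any existing output file is overwritten by --force.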

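To align sequences with the MUSCLE command-line tool, type one of the following commands. The options shown here are those of MUSCLE v3.x; MUSCLE v5 uses a different command-line syntax.

1. to get output in FASTA format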

$ /path/to/muscle -in input.fasta -out output.fasta -fasta

2. to get output in ClustalW format

$ /path/to/muscle -in input.fasta -out output.aln -clw

3. to get output in HTML format

$ /path/to/muscle -in input.fasta -out output.html -html

4. to get output in MSF format

$ /path/to/muscle -in input.fasta -out output.msf -msf

5. to get output in PHYLIP sequential format

$ /path/to/muscle -in input.fasta -out output.phy -phys

6. to get output in PHYLIP interleaved format

$ /path/to/muscle -in input.fasta -out output.phy -phyi
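The six MUSCLE commands above differ only in the format flag and the output filename, so they can be generated in a loop. The following is a sketch under stated assumptions: the function name run_muscle and the variables MUSCLE and INPUT are hypothetical, the flags are the MUSCLE v3.x ones used in this article, and the two PHYLIP outputs are given distinct, made-up filenames (output_seq.phy and output_int.phy) so that one does not overwrite the other.

```shell
# Hypothetical helper: build, print and (if possible) run a MUSCLE v3.x command.
MUSCLE="${MUSCLE:-muscle}"         # assumed program name; override with the full path if needed
INPUT="${INPUT:-input.fasta}"      # FASTA file with the sequences to align

run_muscle() {
  # $1 = format flag without the dash (fasta, clw, html, msf, phys, phyi)
  # $2 = output file
  set -- "$MUSCLE" -in "$INPUT" -out "$2" "-$1"
  printf '%s\n' "$*"               # show the exact command line
  if command -v "$1" >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    "$@" || printf 'command failed: %s\n' "$*" >&2
  fi
}

# Each entry is "flag:output file".
for spec in fasta:output.fasta clw:output.aln html:output.html \
            msf:output.msf phys:output_seq.phy phyi:output_int.phy; do
  run_muscle "${spec%%:*}" "${spec#*:}"
done
```

Note that each call repeats the full alignment; for large inputs it may be preferable to align once and convert the result afterwards.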

MUSCLE can also perform a profile-profile alignment of two existing MSAs. To do so, use the -profile argument and specify the two input MSAs with -in1 input1.aln and -in2 input2.aln. Similarly, you can write the run parameters to a log file using the -log argument followed by the log filename (e.g., -log log.txt).
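Put together, a profile-profile run with logging might look like the sketch below. It is illustrative only: the function name profile_align, the variable MUSCLE, and all filenames are hypothetical, and it assumes MUSCLE v3.x with the two input alignments supplied as aligned FASTA files. As a precaution, the command is printed first and executed only if the program and both input files exist.

```shell
# Hypothetical helper: profile-profile alignment of two existing MSAs with MUSCLE v3.x.
MUSCLE="${MUSCLE:-muscle}"         # assumed program name; override with the full path if needed

profile_align() {
  # $1, $2 = existing alignments; $3 = combined output; $4 = log file
  in1="$1"; in2="$2"               # keep the input names for the existence check below
  set -- "$MUSCLE" -profile -in1 "$in1" -in2 "$in2" -out "$3" -log "$4"
  printf '%s\n' "$*"               # show the exact command line
  if command -v "$1" >/dev/null 2>&1 && [ -f "$in1" ] && [ -f "$in2" ]; then
    "$@" || printf 'command failed: %s\n' "$*" >&2
  fi
}

profile_align input1.afa input2.afa combined.afa log.txt
```

In a profile-profile alignment the columns of each input MSA are kept intact, so the two inputs should already be alignments you trust.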

References

  1. Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., … & Thompson, J. D. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology, 7(1), 539.
  2. Sievers, F., & Higgins, D. G. (2018). Clustal Omega for making accurate alignments of many protein sequences. Protein Science, 27(1), 135-145.
  3. Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792-1797.
  4. Higgins, D. G., & Sharp, P. M. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73(1), 237-244.

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
