Working with thousands of FASTA sequences by hand is tedious. A little bioinformatics scripting makes it easy to perform many small tasks on the sequences or their headers, such as removing, adding, or substituting characters in a header, or changing the sequence format. Bash shell commands offer a simple way to perform such tasks on FASTA sequences.
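As a rough illustration of this kind of header manipulation, here is a minimal sketch. It is written in Python rather than the bash commands the post refers to, and the function name `rewrite_headers` and the sample records are hypothetical, not taken from the post.

```python
import re


def rewrite_headers(lines, pattern, replacement):
    """Apply a regex substitution to FASTA header lines only.

    Hypothetical helper for illustration; sequence lines pass through unchanged.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep the leading '>' and substitute only within the header text.
            out.append(">" + re.sub(pattern, replacement, line[1:]))
        else:
            out.append(line)
    return out


# Made-up records: replace pipes and spaces in headers with underscores.
records = [">seq1|sample A", "ATGC", ">seq2|sample B", "GGCC"]
cleaned = rewrite_headers(records, r"[| ]", "_")
print("\n".join(cleaned))
```

Restricting the substitution to lines that start with `>` leaves the sequence data untouched, which is usually the point when cleaning headers in bulk.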
In a large FASTA file, some operations are nearly impossible to perform manually.
This simple Perl script finds duplicate sequences in a multi-FASTA file by comparing FASTA headers.
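The idea of detecting duplicates by header can be sketched as follows. This is not the Perl script from the post but a small Python approximation; the function name `find_duplicate_headers` and the example records are assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicate_headers(lines):
    """Return {header: [record numbers]} for headers that occur more than once.

    Hypothetical helper: records are numbered from 1 in file order, and two
    records count as duplicates when their full header text matches.
    """
    positions = defaultdict(list)
    index = 0
    for line in lines:
        if line.startswith(">"):
            index += 1
            header = line[1:].strip()
            positions[header].append(index)
    return {h: p for h, p in positions.items() if len(p) > 1}


# Made-up multi-FASTA content with one repeated header.
example = [">sp1", "ATGC", ">sp2", "GGTT", ">sp1", "ATGC"]
duplicates = find_duplicate_headers(example)
print(duplicates)
```

Because only headers are compared, two records with identical sequences but different headers would not be reported. Catching those would require keying on the sequence text instead.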
The Markov Cluster Algorithm (MCL) is a fast, scalable algorithm for clustering networks. One of its applications is clustering protein or peptide sequences. Previously, we showed protein/peptide sequence clustering using the CD-HIT software.