
A Perl script to convert multiline FASTA sequences into a single line

in Bioinformatics Programming by

Different software tools require different kinds of input, which matters especially when you are developing a pipeline or need to process multiple large files.

If you are dealing with a big FASTA file consisting of thousands of sequences, each wrapped at a fixed number of residues per line, and you want each sequence on a single line, then you can use this simple Perl program.

There are two ways to supply your multiline FASTA file: either define the filename in your Perl script or pass it on the command line.

1. Define input file within the script

The multifasta input file is “input.fasta”.

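#!/usr/bin/perl
use strict;
use warnings;

my $input_fasta = "input.fasta";
open(my $in, "<", $input_fasta) || die("Can't open $input_fasta: $!");

# Print the first header line as-is (it keeps its trailing newline).
my $line = <$in>;
print $line if defined $line;

while ($line = <$in>) {
    chomp $line;
    if ($line =~ m/^>/) {
        # A new header ends the previous sequence line.
        print "\n", $line, "\n";
    }
    else {
        # Sequence lines are printed without a newline, so they join up.
        print $line;
    }
}

# Terminate the last sequence line.
print "\n";
close($in);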

2. As a command-line argument

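#!/usr/bin/perl
use strict;
use warnings;

# The FASTA file name is taken from the first command-line argument.
die "Usage: $0 input.fasta\n" unless @ARGV;
my $input_fasta = $ARGV[0];
open(my $in, "<", $input_fasta) || die("Can't open $input_fasta: $!");

# Print the first header line as-is (it keeps its trailing newline).
my $line = <$in>;
print $line if defined $line;

while ($line = <$in>) {
    chomp $line;
    if ($line =~ m/^>/) {
        # A new header ends the previous sequence line.
        print "\n", $line, "\n";
    }
    else {
        # Sequence lines are printed without a newline, so they join up.
        print $line;
    }
}

# Terminate the last sequence line.
print "\n";
close($in);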
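
Both scripts share the same loop, so it can also be wrapped in a subroutine and tried on a small in-memory example before running it on a large file. The following is only a minimal sketch: the subroutine name unwrap_fasta and the example sequences are made up for illustration, and stripping Windows (CRLF) line endings and skipping blank lines are extra assumptions about the input that the scripts above do not make.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# unwrap_fasta is a hypothetical helper name, not part of the original scripts.
# It reads FASTA records from a filehandle and returns them as one string,
# with each sequence joined onto a single line.
sub unwrap_fasta {
    my ($fh) = @_;
    my $out = "";
    my $have_sequence = 0;    # true once residues were added for the current record

    while (my $line = <$fh>) {
        $line =~ s/[\r\n]+\z//;      # strip LF or CRLF line endings (assumption)
        next if $line =~ /^\s*$/;    # skip blank lines (assumption)

        if ($line =~ /^>/) {
            $out .= "\n" if $have_sequence;    # close the previous sequence line
            $out .= "$line\n";
            $have_sequence = 0;
        }
        else {
            $out .= $line;
            $have_sequence = 1;
        }
    }
    $out .= "\n" if $have_sequence;    # terminate the last sequence line
    return $out;
}

# Demonstration on an in-memory filehandle, so no file on disk is needed.
my $example = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\nAA\n";
open(my $fh, "<", \$example) || die("Can't open in-memory FASTA: $!");
print unwrap_fasta($fh);
close($fh);
```

For the made-up input above, this prints each header followed by its sequence on one line (ACGTTTGA for seq1 and GGCCAA for seq2). To use it on a real file, open that file instead of the in-memory string and pass the filehandle to the subroutine.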


Muniba is a bioinformatician based at the South China University of Technology. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
