A Perl script to convert multiline FASTA sequences into a single line

Dr. Muniba Faiza

Different software tools require different kinds of input, which matters especially when you are developing a pipeline or processing multiple large files.

If you are dealing with a large FASTA file containing thousands of sequences, each wrapped at a fixed number of residues per line, and you want each sequence on a single line, you can use this simple Perl program.
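To make the goal concrete, here is a minimal sketch of the transformation applied to a tiny in-memory example. The subroutine name linearize_fasta and the sample sequences are made up for illustration and are not part of the original scripts. Unlike those scripts, this sketch also strips Windows line endings and skips blank lines, which is an added assumption about the input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes multiline FASTA text and returns it with
# each sequence joined onto a single line.
sub linearize_fasta {
    my ($text) = @_;
    my $out    = '';
    my $in_seq = 0;    # true once residues were appended for the current record

    # Read the string line by line through an in-memory filehandle
    open(my $fh, '<', \$text) or die "Can't read in-memory FASTA: $!";
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings
        next if $line eq '';     # skip blank lines
        if ($line =~ /^>/) {
            $out .= "\n" if $in_seq;    # end the previous sequence line
            $out .= "$line\n";          # header stays on its own line
            $in_seq = 0;
        }
        else {
            $out .= $line;              # append residues without a newline
            $in_seq = 1;
        }
    }
    close($fh);
    $out .= "\n" if $in_seq;            # end the last sequence line
    return $out;
}

# Two made-up records, each wrapped over two lines
my $example = ">seq1\nACGT\nTTGA\n>seq2\nGGCC\nAA\n";
print linearize_fasta($example);
```

For this input, the sketch prints each header followed by its sequence on one line: ACGTTTGA under >seq1 and GGCCAA under >seq2.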

There are two ways to supply your multiline FASTA file: either define the filename in the Perl script or pass it on the command line.

1. Define input file within the script

In this version, the multi-FASTA input file is hard-coded as “input.fasta”.

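#!/usr/bin/perl
use strict;
use warnings;

# Name of the multiline FASTA file to convert
my $input_fasta = "input.fasta";
open(IN, "<", $input_fasta) || die("Can't open $input_fasta: $!");

# Print the first header line as-is (it keeps its newline)
my $line = <IN>;
die("$input_fasta is empty\n") unless defined $line;
print $line;

while ($line = <IN>)
{
    chomp $line;
    if ($line =~ m/^>/) {
        # New record: end the previous sequence line, then print the header
        print "\n", $line, "\n";
    }
    else {
        # Sequence line: append it without a newline
        print $line;
    }
}

# End the last sequence line
print "\n";
close(IN);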

2. Pass the input file as a command-line argument

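#!/usr/bin/perl
use strict;
use warnings;

# Take the multiline FASTA filename from the first command-line argument
my $input_fasta = $ARGV[0];
die("Usage: $0 input.fasta\n") unless defined $input_fasta;
open(IN, "<", $input_fasta) || die("Can't open $input_fasta: $!");

# Print the first header line as-is (it keeps its newline)
my $line = <IN>;
die("$input_fasta is empty\n") unless defined $line;
print $line;

while ($line = <IN>)
{
    chomp $line;
    if ($line =~ m/^>/) {
        # New record: end the previous sequence line, then print the header
        print "\n", $line, "\n";
    }
    else {
        # Sequence line: append it without a newline
        print $line;
    }
}

# End the last sequence line
print "\n";
close(IN);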
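Since one motivation is processing multiple large files, the command-line version could be extended to loop over several inputs. The following is only a sketch under assumptions that are not in the original: the subroutine name linearize_file and the ".single.fasta" output suffix are hypothetical choices, and the sketch writes to a file instead of standard output. It also strips Windows line endings and skips blank lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read multiline FASTA from $in_path and write the
# single-line version to $out_path, one line at a time.
sub linearize_file {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Can't open $in_path: $!";
    open(my $out, '>', $out_path) or die "Can't write $out_path: $!";

    my $in_seq = 0;    # true once residues were written for the current record
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings
        next if $line eq '';     # skip blank lines
        if ($line =~ /^>/) {
            print {$out} "\n" if $in_seq;    # end the previous sequence line
            print {$out} "$line\n";          # header on its own line
            $in_seq = 0;
        }
        else {
            print {$out} $line;              # residues, no newline yet
            $in_seq = 1;
        }
    }
    print {$out} "\n" if $in_seq;            # end the last sequence line

    close($in);
    close($out) or die "Can't close $out_path: $!";
    return;
}

# Assumed naming scheme: input.fasta -> input.fasta.single.fasta
for my $file (@ARGV) {
    linearize_file($file, "$file.single.fasta");
}
```

Saved under a name of your choice (for example linearize_many.pl, a made-up name), it could be run as: perl linearize_many.pl first.fasta second.fasta. Because each file is read line by line, memory use should stay small even for large inputs.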

Dr. Muniba is a Bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.