How to extract fasta sequences from a multi-fasta file based on matching headers in a separate file?

This simple Perl script extracts sequences from a large multi-FASTA file when their headers match the headers listed in a separate file.

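For example, suppose your FASTA sequences are in a file named “input.fa” and the headers you want are in another file called “headers.txt”. The headers file lists the sequence IDs to extract, without the leading “>”, separated by whitespace or newlines.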

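#!/usr/bin/perl
use strict;
use warnings;

my $headerfile = 'headers.txt';
my $input      = 'input.fa';
my $output     = 'output.fa';    # assumed name; it must differ from the input file

# Read the wanted headers, splitting each line on whitespace.
open( my $header_fh, '<', $headerfile ) or die "Cannot open $headerfile: $!";
my @headers = map { split ' ', $_ } <$header_fh>;
close $header_fh;
s/^>// for @headers;    # accept headers listed with or without the leading '>'

# Index every record in the FASTA file by the first word of its header line.
my %sequences;
open( my $input_fh, '<', $input ) or die "Cannot open $input: $!";
{
    local $/ = "\n>";    # read one FASTA record at a time
    while ( my $record = <$input_fh> ) {
        chomp $record;        # drop the trailing "\n>" separator
        $record =~ s/^>//;    # only the first record still starts with '>'
        my ( $header, $sequence ) = $record =~ m/^\s*(\S+)[^\n]*\n(.*)/s;
        next unless defined $header;    # skip records without a sequence
        $sequence =~ s/\s+\z//;         # trim the trailing newline
        $sequences{$header} = $sequence;
    }
}
close $input_fh;

# Write the matching records in the order given in the headers file.
open( my $seqs_fh, '>', $output ) or die "Cannot open $output: $!";
foreach my $header (@headers) {
    if ( exists $sequences{$header} ) {
        print {$seqs_fh} ">$header\n$sequences{$header}\n";
    }
}
close $seqs_fh;
exit;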
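
The script above keeps every sequence in a hash, which may use a lot of memory for a very large FASTA file. As a rough alternative, the sketch below streams the FASTA file line by line and copies only the wanted records. It is not part of the original script: the subroutine name extract_records and the in-memory filehandles in the usage lines are assumptions made for illustration. Note that this variant writes records in the order they appear in the FASTA file, not in the order of the headers file, and it keeps the full header line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): copies every record
# whose ID (first word after '>') is a key of %$wanted from $in to $out.
sub extract_records {
    my ( $in, $out, $wanted ) = @_;
    my $keep  = 0;    # true while we are inside a wanted record
    my $count = 0;    # number of records written
    while ( my $line = <$in> ) {
        if ( $line =~ /^>/ ) {
            my ($id) = $line =~ /^>\s*(\S+)/;
            $keep = ( defined $id && exists $wanted->{$id} ) ? 1 : 0;
            $count++ if $keep;
        }
        print {$out} $line if $keep;
    }
    return $count;
}

# Usage with in-memory filehandles, so the sketch runs without any files.
my $fasta  = ">seq1 first\nACGT\nGGCC\n>seq2\nTTTT\n>seq3\nCCCC\n";
my %wanted = map { $_ => 1 } qw(seq1 seq3);
open( my $in,  '<', \$fasta )     or die $!;
open( my $out, '>', \my $result ) or die $!;
my $n = extract_records( $in, $out, \%wanted );
close $in;
close $out;
print $result;
```

To use it on real files, open “input.fa” for reading and an output file for writing, build %wanted from the IDs in “headers.txt”, and pass the two filehandles to the subroutine.
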
Tariq is a professional Software Developer at IQL. His areas of expertise include algorithm design, phylogenetics, MicroArray, Plant Systematics, and genome data analysis. If you have questions, reach out to him via Researchgate.

HOW TO CITE THIS ARTICLE Tariq Abdullah (2019). How to extract fasta sequences from a multi-fasta file based on matching headers in a separate file?. Bioinformatics Review, 5 (04)