How to Compress and Decompress FASTQ, SAM/BAM & VCF Files using genozip?

Dr. Muniba Faiza

genozip is a tool for lossless compression of large genomic files, including VCF, FASTQ, and SAM/BAM files [1]. This article shows how to use genozip to compress and decompress these files.

To create a reference file

genozip can compress files with or without a reference file, but using a reference generally gives a better compression ratio. Create one from your reference sequence as follows:

$ genozip --make-reference input.fa

It will output input.ref.genozip.
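If the reference is built as part of a script, it can help to skip the step when the reference already exists. The following is a minimal sketch, not part of genozip itself: the function names and the `GENOZIP_BIN` override are hypothetical, and the output name is derived by assuming the `input.fa` to `input.ref.genozip` pattern shown above holds for other file names too.

```shell
#!/usr/bin/env bash
# Sketch only: ref_name_for, ensure_reference and GENOZIP_BIN are hypothetical helpers.

# Derive the expected reference name (input.fa -> input.ref.genozip).
# Assumption: genozip replaces the last extension with .ref.genozip.
ref_name_for() {
    local fasta="$1"
    printf '%s.ref.genozip\n' "${fasta%.*}"
}

# Build the reference only if it is not already present, then print its name.
ensure_reference() {
    local fasta="$1"
    local ref
    ref="$(ref_name_for "$fasta")"
    if [ ! -f "$ref" ]; then
        # genozip messages go to stderr so stdout carries only the reference name
        "${GENOZIP_BIN:-genozip}" --make-reference "$fasta" >&2 || return 1
    fi
    printf '%s\n' "$ref"
}
```

A script could then capture the name with `ref="$(ensure_reference input.fa)"` and pass it to `--reference` in the later steps.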

To compress FASTQ file using a reference file

For example, if you have three FASTQ files (file1.fq, file2.fq, and file3.fq), compress them using the reference file as shown below:

$ genozip --reference input.ref.genozip file1.fq file2.fq file3.fq
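When a directory holds many FASTQ files, listing them by hand becomes tedious. The sketch below loops over every `.fq` file in a directory and runs the same command on each one. The function name `compress_fastq_dir` and the `GENOZIP_BIN` variable are hypothetical; setting `GENOZIP_BIN=echo` gives a dry run that only prints the arguments that would be passed to genozip.

```shell
#!/usr/bin/env bash
# Sketch only: compress_fastq_dir and GENOZIP_BIN are hypothetical helpers.

# Compress every .fq file in a directory against one reference file.
# Usage: compress_fastq_dir <reference.ref.genozip> <directory>
compress_fastq_dir() {
    local ref="$1" dir="$2"
    local genozip_bin="${GENOZIP_BIN:-genozip}"
    local f
    for f in "$dir"/*.fq; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        "$genozip_bin" --reference "$ref" "$f" || return 1
    done
}
```

For example, `compress_fastq_dir input.ref.genozip ./reads` would compress each `.fq` file under `./reads` one at a time and stop at the first failure.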

To compress VCF file using a reference file

$ genozip --reference input.ref.genozip files.vcf.gz

To compress SAM/BAM file using a reference file

$ genozip --reference input.ref.genozip file.bam

To compress paired ends

$ genozip --reference input.ref.genozip --pair sample1.fastq.gz sample2.fastq.gz
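For several paired-end samples, the `--pair` command above can be repeated in a loop. The sketch below assumes a hypothetical naming scheme in which mates are called `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`; adjust the suffixes to match your own files. The function name `compress_pairs` and the `GENOZIP_BIN` override are also illustrative only.

```shell
#!/usr/bin/env bash
# Sketch only: compress_pairs and GENOZIP_BIN are hypothetical helpers.
# Assumed naming scheme: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.

# Usage: compress_pairs <reference.ref.genozip> <directory>
compress_pairs() {
    local ref="$1" dir="$2"
    local r1 r2
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue               # glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"    # expected name of the mate file
        if [ ! -e "$r2" ]; then
            echo "no mate found for $r1, skipping" >&2
            continue
        fi
        "${GENOZIP_BIN:-genozip}" --reference "$ref" --pair "$r1" "$r2" || return 1
    done
}
```

Samples whose second file is missing are reported on stderr and skipped rather than compressed as single-end data.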

To decompress paired ends

$ genounzip --reference input.ref.genozip --unbind sample1+2.fastq.genozip

To compress & test the compression

$ genozip --test inputfile.vcf
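To check several files in one go, genozip's `--test` option can be wrapped in a small loop that records which files failed. This is a sketch under one assumption: that genozip returns a non-zero exit status when compression or the test fails. The function name `test_compress_all` and the `GENOZIP_BIN` override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: test_compress_all and GENOZIP_BIN are hypothetical helpers.
# Assumption: genozip exits with a non-zero status when --test fails.

# Usage: test_compress_all file1.vcf file2.vcf ...
test_compress_all() {
    local failed=0 f
    for f in "$@"; do
        if ! "${GENOZIP_BIN:-genozip}" --test "$f"; then
            echo "FAILED: $f" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every file compressed and tested cleanly
    [ "$failed" -eq 0 ]
}
```

The function keeps going after a failure so that a single bad file does not hide problems in the rest of the batch.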

To convert SAM/BAM files to FASTQ

You can also convert SAM/BAM files to FASTQ format using the following command:

$ genounzip inputfile.bam.genozip --fastq

For more options, type the following in your terminal:

$ genounzip --help


References

  1. Lan, D., Tobler, R., Souilmi, Y., & Llamas, B. (2020). genozip: a fast and efficient compression tool for VCF files. Bioinformatics (Oxford, England).

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
