How to Compress and Decompress FASTQ, SAM/BAM & VCF Files using genozip?

Dr. Muniba Faiza

genozip is a tool for lossless compression of large genomic files, including VCF, FASTQ, and SAM/BAM files [1]. In this article, we explain how to use genozip to compress and decompress these files.

To create a reference file

genozip can compress with or without a reference file, but using one generally gives a better compression ratio. Create the reference from a FASTA file as follows:

$ genozip --make-reference input.fa

It will output input.ref.genozip.
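
In a script, it can be convenient to skip this step when the reference already exists. The following is a minimal sketch, assuming the output name follows the input.fa to input.ref.genozip pattern shown above. The helper names ref_name_for and make_reference_once are hypothetical and not part of genozip.

```shell
#!/bin/sh
# Hypothetical helpers; not part of genozip.

# Derive the expected reference name by replacing the last extension
# with .ref.genozip (input.fa -> input.ref.genozip).
# Assumption: this mirrors genozip's naming only for simple names such as
# the example in this article.
ref_name_for() {
    printf '%s\n' "${1%.*}.ref.genozip"
}

# Build the reference only if the expected output file is missing.
make_reference_once() {
    ref=$(ref_name_for "$1")
    if [ -f "$ref" ]; then
        echo "reference already present: $ref"
    else
        genozip --make-reference "$1"
    fi
}
```

Calling make_reference_once input.fa would run the command above the first time and only print a message on later runs.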

To compress FASTQ file using a reference file

For example, if you have three FASTQ files (file1.fq, file2.fq, and file3.fq), compress them using the reference file as shown below:

$ genozip --reference input.ref.genozip file1.fq file2.fq file3.fq

To compress VCF file using a reference file

$ genozip --reference input.ref.genozip files.vcf.gz

To compress SAM/BAM file using a reference file

$ genozip --reference input.ref.genozip file.bam
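
The compression commands for FASTQ, VCF, and SAM/BAM files all follow the same pattern: the --reference option followed by the input file. As a rough sketch, a small wrapper can apply that pattern to a list of files and skip any that are missing. The function name compress_all and the REF and GENOZIP variables are hypothetical conveniences, not part of genozip; GENOZIP only lets you point the script at a different executable.

```shell
#!/bin/sh
# Hypothetical wrapper; not part of genozip.

# Reference file shared by all inputs (name taken from this article).
REF=${REF:-input.ref.genozip}

# Compress each existing input file with the shared reference.
# Missing files are reported on stderr and skipped; the first failed
# compression stops the loop with a non-zero status.
compress_all() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping missing file: $f" >&2
            continue
        fi
        "${GENOZIP:-genozip}" --reference "$REF" "$f" || return 1
    done
}
```

For example, compress_all file1.fq files.vcf.gz file.bam would run one genozip command per file that exists.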

To compress paired-end FASTQ files

$ genozip --reference input.ref.genozip --pair sample1.fastq.gz sample2.fastq.gz

To decompress paired-end FASTQ files

$ genounzip --reference input.ref.genozip --unbind sample1+2.fastq.genozip

To compress & test the compression

$ genozip --test inputfile.vcf
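
In a pipeline, you may want to act on the outcome of the test rather than read it from the terminal. The sketch below wraps genozip --test in a function and reports the result. It assumes that genozip exits with a non-zero status when compression or the test fails, which you should confirm for your installed version. The compress_and_test name and the GENOZIP variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the --test option; not part of genozip.
# Assumption: genozip returns a non-zero exit status when compression
# or the test fails.
compress_and_test() {
    if "${GENOZIP:-genozip}" --test "$1"; then
        echo "OK: $1"
    else
        echo "FAILED: $1" >&2
        return 1
    fi
}
```

A call such as compress_and_test inputfile.vcf can then be chained with other steps, for example removing the original file only after the function succeeds.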

To convert SAM/BAM files to FASTQ

You can also convert SAM/BAM files to FASTQ format using the following command:

$ genounzip inputfile.bam.genozip --fastq

For more options, type the following in your terminal:

$ genounzip --help


References

  1. Lan, D., Tobler, R., Souilmi, Y., & Llamas, B. (2020). genozip: a fast and efficient compression tool for VCF files. Bioinformatics (Oxford, England).

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has up-to-date knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
