Genozip: a new compression tool for VCF files

Tariq Abdullah

Variant Call Format (VCF) is a text file format used to store genetic variants, often for thousands of samples in a single file. Because these files hold a large number of variants and per-sample genotypes, they remain quite large even after compression. Recently, a new compression tool called genozip has been introduced [1].
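To make the format concrete, the sketch below parses a single, made-up VCF data line in Python. The sample names, the values, and the `parse_vcf_line` helper are illustrative assumptions and are not part of genozip; the column layout (eight fixed columns, a FORMAT column, then one column per sample) follows the VCF specification.

```python
# Hypothetical two-sample VCF fragment; all values are invented for illustration.
VCF_TEXT = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n"
    "1\t12345\trs1\tA\tG,T\t50\tPASS\t.\tGT\t0|1\t1/2\n"
)

FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line, sample_names):
    """Split one VCF data line into its fixed fields and per-sample genotypes."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(FIXED_COLUMNS, fields[:8]))
    # Alternate alleles are comma-separated within the ALT column.
    record["ALT"] = record["ALT"].split(",")
    # FORMAT lists the keys of each sample column; GT is the genotype.
    gt_index = fields[8].split(":").index("GT")
    record["genotypes"] = {
        name: sample.split(":")[gt_index]
        for name, sample in zip(sample_names, fields[9:])
    }
    return record


lines = VCF_TEXT.splitlines()
header = next(l for l in lines if l.startswith("#CHROM"))
samples = header.split("\t")[9:]
data_line = next(l for l in lines if not l.startswith("#"))
record = parse_vcf_line(data_line, samples)
print(record["ALT"], record["genotypes"])
```

Each additional sample adds one more genotype column to every data line, which is why files covering many samples grow so quickly.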

genozip compresses VCF files losslessly. It uses a compression algorithm tailored to genotype data, which is one of the data types represented in VCF files.
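"Lossless" here means that decompressing the file returns exactly the bytes that went in. The sketch below illustrates that property with `gzip` from the Python standard library as a stand-in; it is not genozip's algorithm, and the genotype matrix is invented for the example.

```python
import gzip

# Made-up genotype matrix: 1,000 variants x 4 samples, tab-separated.
rows = ["\t".join(["0|0", "0|1", "1|1", "0|0"]) for _ in range(1000)]
original = ("\n".join(rows) + "\n").encode("ascii")

# General-purpose compression (gzip), used here only as a stand-in.
compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# Lossless: the restored bytes are identical to the input.
assert restored == original

ratio = len(original) / len(compressed)
print(f"{len(original)} bytes -> {len(compressed)} bytes (ratio {ratio:.1f}x)")
```

Genotype columns are highly repetitive, so even a general-purpose compressor shrinks them considerably. A format-aware tool aims to exploit that structure further.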

Features of genozip:

  • stores data of any phasing structure, ploidy, and variant type, with up to 99 alternate alleles per variant.
  • offers lossless compression and integrates seamlessly into analytical pipelines.
  • supports secure storage and distribution; data can be encrypted with a password.
  • runs on the major operating systems (Linux, Windows, and macOS).
  • lets users tune the compression to their needs.
  • offers several other options.
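As a rough sketch of how the tool might be called from a Python pipeline, the snippet below builds a genozip command line and runs it only if the executable is found. The `genozip_command` and `compress` helpers and the file name are hypothetical, and the `--password` flag name is an assumption that should be confirmed with `genozip --help` for the installed version.

```python
import shutil
import subprocess


def genozip_command(vcf_path, password=None):
    """Build the argument list for compressing one file with genozip."""
    cmd = ["genozip"]
    if password is not None:
        # Assumed flag name; confirm with `genozip --help`.
        cmd += ["--password", password]
    cmd.append(vcf_path)
    return cmd


def compress(vcf_path, password=None):
    """Run genozip if it is installed; return True when the command was run."""
    if shutil.which("genozip") is None:
        return False  # genozip is not on PATH, so nothing is executed
    subprocess.run(genozip_command(vcf_path, password), check=True)
    return True


# Only prints the commands; "cohort.vcf" is a placeholder file name.
print(genozip_command("cohort.vcf"))
print(genozip_command("cohort.vcf", password="example-secret"))
```

Passing the arguments as a list avoids shell quoting problems. Note that a password given on the command line may be visible to other users of the same machine through the process list.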

On a benchmark dataset, genozip compressed faster and achieved higher compression ratios than the other tools tested [1]. For more details about this tool, see the publication [1].


References

  1. Lan, D., Tobler, R., Souilmi, Y., & Llamas, B. (2020). genozip: a fast and efficient compression tool for VCF files. Bioinformatics (Oxford, England).

Tariq is the founder of Bioinformatics Review and CEO of IQL Technologies. His areas of expertise include algorithm design, phylogenetics, microarrays, plant systematics, and genome data analysis. If you have questions, reach out to him via his homepage.
