IonCRAM: A New Tool for Compressing Ion Torrent Sequence Files

Ion Torrent is one of the major next-generation sequencing (NGS) technologies and is among the most frequently used in medical research. The software for Ion Torrent machines writes its output as BAM files, which are very large and remain expensive to store even after compression.

A new tool, IonCRAM [1], has been introduced to compress these BAM files efficiently into the CRAM format. IonCRAM is a lossless, reference-based compression tool. It converts BAM files into archival CRAM files and can save up to 43% of space. The software is open source and freely available on GitHub, CodeOcean, and at http://ioncram.saudigenomeproject.com.

How does IonCRAM work?

The algorithm is based on binning the flow signals and quality values, an approach initially introduced by Illumina [2]. First, it sorts the reads in the BAM file by genomic coordinate and then by prefix, taking the corresponding CIGAR strings into account, so that similar reads end up close to each other. The sorted reads are then scanned for blocks of reads that map to the same locus. Finally, the flow signals from each block are collected and compressed together.
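
The sort, block, and compress steps described above can be illustrated with a short Python sketch. This is not IonCRAM's actual code: the Read class, the function names, the prefix_len parameter, the text serialization, and the use of zlib are all assumptions made for illustration. The sketch only shows the general idea of grouping reads that share a locus so that their flow signals are compressed together.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterator, List, Tuple
import zlib


@dataclass
class Read:
    """Hypothetical, simplified stand-in for one aligned BAM record."""
    chrom: str
    pos: int
    cigar: str
    seq: str
    flow_signals: List[int]


def sort_reads(reads: List[Read], prefix_len: int = 8) -> List[Read]:
    # Sort by genomic coordinate first, then by sequence prefix and CIGAR,
    # so that similar reads end up next to each other.
    # prefix_len is an illustrative parameter, not an IonCRAM setting.
    return sorted(reads, key=lambda r: (r.chrom, r.pos, r.seq[:prefix_len], r.cigar))


def blocks_by_locus(sorted_reads: List[Read]) -> Iterator[Tuple[Tuple[str, int], List[Read]]]:
    # Scan the sorted reads and yield one block per (chromosome, position).
    for locus, group in groupby(sorted_reads, key=lambda r: (r.chrom, r.pos)):
        yield locus, list(group)


def compress_block(block: List[Read]) -> bytes:
    # Collect the flow signals of every read in the block (one line per read)
    # and compress them together. zlib is a stand-in for the real codec.
    payload = "\n".join(",".join(map(str, r.flow_signals)) for r in block)
    return zlib.compress(payload.encode("ascii"))


def decompress_block(blob: bytes) -> List[List[int]]:
    # Inverse of compress_block: recover the per-read flow signals unchanged.
    text = zlib.decompress(blob).decode("ascii")
    return [[int(x) for x in line.split(",") if x] for line in text.split("\n")]


if __name__ == "__main__":
    reads = [
        Read("chr1", 200, "10M", "GGGTACGTAA", [97, 4, 203]),
        Read("chr1", 100, "10M", "ACGTACGTAC", [101, 3, 198]),
        Read("chr1", 100, "10M", "ACGTACGTAA", [102, 0, 199]),
    ]
    for locus, block in blocks_by_locus(sort_reads(reads)):
        blob = compress_block(block)
        print(locus, len(block), "reads,", len(blob), "compressed bytes")
```

Reads that share a locus tend to have similar flow signals, so placing them in the same block gives a general-purpose compressor more redundancy to exploit than compressing each read on its own.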

IonCRAM is available for Ubuntu and CentOS.

References

  1. Shokrof, M., Abouelhoda, M. IonCRAM: a reference-based compression tool for ion torrent sequence files. BMC Bioinformatics 21, 397 (2020).
  2. Illumina Inc. Understanding Illumina Quality Scores (2012).
Tariq is founder of Bioinformatics Review and a professional Software Developer at IQL Technologies. His areas of expertise include algorithm design, phylogenetics, MicroArray, Plant Systematics, and genome data analysis. If you have questions, reach out to him via his homepage.
