GenVisR : A tool for genomic visualization

Faiza M

Genomics Software

GenVisR : A tool for genomic visualization

Last updated: May 20, 2020 5:48 pm

Dr. Muniba Faiza

7 Min Read

Illustration of rainbow DNA (deoxyribonucleic acid) with defocus on background

The ever-increasing progress of sequencing techniques has developed a massive amount of genomic data [1]. This has led to an exponential growth of genomic datasets which provide huge information to the scientists. For identifying patterns and investigating biological information, it is necessary to visualize the genomes, but it is quite difficult to develop such tools.

Contents

References:How to cite this article: Faiza, M., 2016. GenVisR : A tool for genomic visualization. Bioinformatics Review, 2(7):page 9-13. The article is available at https://bioinformaticsreview.com/20160712/genvisr-a-tool-for-genomic-visualization/

GenVisR is a Bioconductor R package which provides flexible, user-friendly suite of tools for easy visualization of genomic data. It allows to visualize and interpret genomic data for multiple species under study in three categories: Variants, Copy number alterations and data quality [2]. GenVisR is a compilation of various functions and tools developed for the easy visualization of genomic data.

1. Visualization of Variants

GenVisR provides many functions to analyze the small variants within a genome which is required to be studied during the investigation of the genetic basis of a disease. The available functions in GenVisR to visualize small variants are:

a. Lolliplot

It keeps a precise control over visualization options provided in GenVisR, for example, to visualize the protein domains, a user can opt for Ensemble annotation databases. It also enables the user to plot mutations (Fig.1).

Fig.1 Output from lollipop for selected TCGA breast cancer samples (Cancer Genome Atlas Network, 2012) shows two mutational hotspots in PIK3CA within the accessory and catalytic kinase domains [2].

b. Waterfall

It allows to track the variant recurrence across the multiple genes and illustrates all the mutations in variants and also further differentiates between the variant types. The results are displayed by arranging the samples in a hierarchical manner such that the most recent genes are ranked first and so on (Fig.2).

Fig.2 Output from waterfall showing mutations for five genes across 50 selected TCGA breast cancer samples with mutation type indicated by color in the grid and per sample/gene mutation rates indicated in the top and left sidebars [2].

c. TvTi

It is useful to find the rate of transition and transversion mutations occurred in a set of genes.

2. Visualization of alterations in Copy Number

Copy number alterations within the genome are identified in various diseases [3]. GenVisR provides various functions to easily visualize the copy number alterations.

a. GenCov

It displays the amplifications and deletions within the genomic region of interest (Fig.3).

Fig.3 Output from GenCon displaying coverage (bottom plots) showing focal deletions in sample A (last exon) and B (second intron) within a gene of interest. GC content (top plot) is encoded via a range of colors for each exon [2].

b. cnView

It allows the user to plot copy numbers in a broader view and shows an ideogram for an individual sample at the chromosome level.

c. cnSpec

It displays copy numbers on a larger scale than cnView. It shows a heat map arranged in a grid indexed by chromosomes and samples.

c. cnFreq

It displays the frequency of samples within the genomic dataset which has gained or lost the copy numbers at specific gene loci.

d. lohSpec

Loss of Heterozygosity (loh) is important for studying genomic diseases. The function lohSpec displays all the LOH regions within the genomic dataset (Fig.4).

Fig.4 Output from lohSpec for HCC1395 (Griffith et al., 2015), HCC38 and HCC1143 (Daemen et al., 2013) breast cancer cell lines shows LOH events, across all chromosomes, shaded as dark blue.

3. Visualization of Data Quality

The quality assessment of sequencing data is of utmost importance for the proper interpretation of variants within the genome. GenVisR provides few functions for the quality assessment of the data.

a. covBars

It is a framework which displays the sequencing coverage for the targeted bases (Fig.5).

Fig.5 Output from covBars shows cumulative coverage for 10 samples indicating that for each sample, at least 75% of targeted regions were covered at 35 depth [2].

b. compIdent

It helps to identify the mixed samples that are thought to originate from the same genome (Fig.6).

Fig.6 Output from compIdent for the HCC1395 breast cancer cell line (tumor and normal) shows variant coverage (bottom plot) and SNP allele fraction (main plot) indicating highly related samples [2].

WORKING OF GenVisR:

Since GenVisR is an R package, therefore, it requires a simple R script to run a particular function and it accepts a default file format known as MAF (Mutation Annotation Format). It was first developed for The Cancer Genome Atlas project (Cancer Genome Atlas Research Network, 2008). For example, as illustrated by Z.L.Skidmore et al. (2016), to create Fig. 2 the following script was written in a standard MAF file containing variant mutation data and choosing which genes to plot [2] :

genes ¼ c(“PIK3CA”, “TP53”, “USH2”, “MLL3”, “BRCA1”)

GENVISR::WATERFALL(X ¼ MAF_FILE, PLOTGENES¼GENES)

References:

Kodama,Y. et al. (2012) The Sequence Read Archive: explosive growth of sequencing data. Nucleic Acids Res., 40, D54–D56.
Zachary L. Skidmore1 , Alex H. Wagner1 , Robert Lesurf1 , Katie M. Campbell1 , Jason Kunisaki1 , Obi L. Griffith1,2,3,4,* and Malachi Griffith. Bioinformatics, 2016, 1–3 doi: 10.1093/bioinformatics/btw325.
Beroukhim,R. et al. (2010) The landscape of somatic copy-number alteration across human cancers. Nature, 463, 899–905.

How to cite this article: Faiza, M., 2016. GenVisR : A tool for genomic visualization. Bioinformatics Review, 2(7):page 9-13. The article is available at https://bioinformaticsreview.com/20160712/genvisr-a-tool-for-genomic-visualization/

Share This Article

ByDr. Muniba Faiza

Follow:

Dr. Muniba is a Bioinformatician based in New Delhi, India. She has completed her PhD in Bioinformatics from South China University of Technology, Guangzhou, China. She has cutting edge knowledge of bioinformatics tools, algorithms, and drug designing. When she is not reading she is found enjoying with the family. Know more about Muniba