

Biotite: A bioinformatics framework for sequence and structure data analysis

Dr. Muniba Faiza


Sequence and structural data in bioinformatics keep growing, and so does the need to analyze them. Just as bioinformaticians analyze data and draw important conclusions from it, bioinformaticists provide enhanced tools and software for that analysis. Some computational biology frameworks are available for analyzing structural data from molecular dynamics simulations, such as MDAnalysis [1] and MDTraj [2]. A new framework, Biotite, has now been introduced: a Python package for representing and analyzing sequence and structure data [3].

The package is open source and freely available on GitHub (https://github.com/biotite-dev/biotite). It is simple to use, especially for beginners in programming, and computationally efficient because it builds on NumPy and Cython. Biotite consists of four subpackages: sequence, structure, database, and application. The sequence and structure subpackages handle sequence and structural data, respectively; database downloads files from external databases such as the RCSB PDB; and application provides interfaces to external software [3].

The sequence subpackage encodes each character of a sequence into a symbol code, which is stored in a NumPy ndarray in the sequence object. Nucleotide and protein sequences can be read from and written to FASTA files. In addition, sequences can be aligned globally and locally using dynamic programming [4, 5], and the alignments can be visualized according to their similarity.
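To make the idea of symbol codes and dynamic-programming alignment concrete, here is a minimal sketch in plain Python. It does not use Biotite's own classes or functions. The four-letter alphabet, the function names, and the scoring values (match, mismatch, linear gap penalty) are all assumptions chosen for illustration, and a list of integers stands in for the NumPy ndarray.

```python
# Illustrative sketch only: plain Python, not Biotite's API.

ALPHABET = "ACGT"  # assumed nucleotide alphabet for this example
SYMBOL_TO_CODE = {symbol: code for code, symbol in enumerate(ALPHABET)}


def encode(sequence):
    """Map each character to its integer symbol code."""
    return [SYMBOL_TO_CODE[symbol] for symbol in sequence]


def decode(codes):
    """Map integer symbol codes back to characters."""
    return "".join(ALPHABET[code] for code in codes)


def global_alignment_score(codes_a, codes_b, match=1, mismatch=-1, gap=-1):
    """Score an optimal global alignment with dynamic programming.

    Uses a linear gap penalty; the scoring values are arbitrary assumptions.
    """
    rows, cols = len(codes_a) + 1, len(codes_b) + 1
    # table[i][j] holds the best score for aligning the first i symbols
    # of codes_a with the first j symbols of codes_b.
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if codes_a[i - 1] == codes_b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + pair,  # align the two symbols
                table[i - 1][j] + gap,       # gap in the second sequence
                table[i][j - 1] + gap,       # gap in the first sequence
            )
    return table[-1][-1]
```

Working on integer codes rather than characters means the comparison inside the loop is a plain integer equality, which is the kind of operation that array libraries can process efficiently.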

The structure subpackage uses AtomArrayStack to represent multi-model three-dimensional protein structures. Its coordinates are held in an (m×n×3) ndarray, where m is the number of models and n is the number of atoms. The subpackage can parse files in MMTF format [6] and load trajectory files from molecular dynamics simulations. It can measure distances, angles, and dihedrals between atoms, and users can also superimpose structures, calculate RMSD and RMSF, and assign secondary structure.
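The (m×n×3) layout and two of the measurements mentioned above can be sketched in plain Python as follows. This is not Biotite's implementation: nested lists stand in for the coordinate ndarray, the coordinate values are made up, the function names are hypothetical, and the RMSD here is computed without any prior superimposition.

```python
# Illustrative sketch only: nested lists stand in for the coordinate ndarray.
import math

# Two models (m = 2), three atoms each (n = 3), three coordinates per atom.
# The coordinate values are invented for this example.
stack = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
]


def shape(coord_stack):
    """Return (models, atoms, dimensions) for a nested coordinate list."""
    return (len(coord_stack), len(coord_stack[0]), len(coord_stack[0][0]))


def distance(atom_a, atom_b):
    """Euclidean distance between two atoms given as [x, y, z]."""
    return math.dist(atom_a, atom_b)


def rmsd(model_a, model_b):
    """RMSD between two models with matching atoms, without superimposition."""
    squared = [math.dist(a, b) ** 2 for a, b in zip(model_a, model_b)]
    return math.sqrt(sum(squared) / len(squared))
```

In this layout, `stack[k]` selects one model and `stack[k][i]` one atom of that model, so per-model quantities such as RMSD reduce to a loop over the first axis.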

Biotite is an efficient framework for common bioinformatics tasks such as downloading files and reading, writing, and modifying structure files.


References

1. Michaud‐Agrawal, N., Denning, E. J., Woolf, T. B., & Beckstein, O. (2011). MDAnalysis: a toolkit for the analysis of molecular dynamics simulations. Journal of Computational Chemistry, 32(10), 2319-2327.

2. McGibbon, R. T., Beauchamp, K. A., Harrigan, M. P., Klein, C., Swails, J. M., Hernández, C. X., … & Pande, V. S. (2015). MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophysical Journal, 109(8), 1528-1532.

3. Kunzmann, P., & Hamacher, K. (2018). Biotite: a unifying open source computational biology framework in Python. BMC Bioinformatics, 19(1), 346.

4. Smith, T. F., & Waterman, M. S. (1981). Identification of common molecular subsequences. Journal of Molecular Biology, 147(1), 195-197.

5. Gotoh, O. (1982). An improved algorithm for matching biological sequences. Journal of Molecular Biology, 162(3), 705-708.

6. Bradley, A. R., Rose, A. S., Pavelka, A., Valasatava, Y., Duarte, J. M., Prlić, A., & Rose, P. W. (2017). MMTF—An efficient file format for the transmission, visualization, and analysis of macromolecular structures. PLoS Computational Biology, 13(6), e1005575.

Dr. Muniba is a bioinformatician based in New Delhi, India. She completed her PhD in Bioinformatics at South China University of Technology, Guangzhou, China. She has cutting-edge knowledge of bioinformatics tools, algorithms, and drug design. When she is not reading, she enjoys spending time with her family.
