Protein structures are now determined on a large scale and deposited in the Protein Data Bank (PDB, www.rcsb.org) [1]. Once determined experimentally, these structures are reused in many scientific studies and serve as the starting point for further work such as mutagenesis and docking.
It is often important to study protein modifications such as glycosylation, acetylation, phosphorylation, and sulfonation. These modifications are dynamic processes in living cells and give rise to diverse protein functions. They may occur before, during, or after protein synthesis, although they are commonly referred to collectively as post-translational modifications (PTMs) [2]. Mapping these modifications helps us understand protein function under specific conditions, such as health and disease, which are associated with specific patterns of gene expression. Protein modifications are curated and made available in the UniProtKB database [3], based on data from sources such as the literature, observations in 3D structures, related proteins, and annotations in specialized databases. Several other resources also catalogue protein modifications and their annotations, including RESID [4], the PSI-MOD ontology [5], and Unimod [6].
Gao et al. (2017) [7] have developed new software to map these annotations onto the protein structures available in the PDB. Known as ModFinder, it is a BioJava [8] package that identifies protein modifications in 3D structures (https://github.com/biojava/biojava/tree/master/biojavamodfinder). The ModFinder module is run weekly after each update of the PDB, and the resulting modifications are loaded as annotations into the RCSB PDB database. They can be searched on the RCSB PDB website using the “Advanced Search/Sequence Features” option [9,10]. With ModFinder, protein modifications can be located directly within the structure [7].
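To give a feel for what locating modified residues in a structure file involves, here is a minimal Python sketch. It is not ModFinder's algorithm and does not use the BioJava API. It only reads the MODRES records that legacy PDB-format files use to flag modified residues, using the fixed column positions of that record type. The function name `parse_modres`, the `ModifiedResidue` type, and the sample entry with the placeholder ID `XXXX` are all hypothetical and exist only for illustration.

```python
from typing import List, NamedTuple


class ModifiedResidue(NamedTuple):
    """One MODRES record from a PDB-format file (illustrative type)."""
    pdb_id: str
    residue_name: str      # name of the modified residue, e.g. SEP
    chain_id: str
    sequence_number: int
    standard_residue: str  # parent residue, e.g. SER
    comment: str           # free-text description of the modification


def parse_modres(pdb_text: str) -> List[ModifiedResidue]:
    """Collect MODRES records using the fixed-column PDB layout."""
    records = []
    for line in pdb_text.splitlines():
        if not line.startswith("MODRES"):
            continue
        line = line.ljust(80)  # pad so column slices never run short
        records.append(ModifiedResidue(
            pdb_id=line[7:11].strip(),            # columns 8-11
            residue_name=line[12:15].strip(),     # columns 13-15
            chain_id=line[16].strip(),            # column 17
            sequence_number=int(line[18:22]),     # columns 19-22
            standard_residue=line[24:27].strip(), # columns 25-27
            comment=line[29:70].strip(),          # columns 30-70
        ))
    return records


# Hypothetical excerpt; "XXXX" is a placeholder, not a real PDB entry.
SAMPLE = (
    "HEADER    EXAMPLE ENTRY\n"
    "MODRES XXXX SEP A   10  SER  PHOSPHOSERINE\n"
    "MODRES XXXX TPO A  197  THR  PHOSPHOTHREONINE\n"
    "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00\n"
)

if __name__ == "__main__":
    for mod in parse_modres(SAMPLE):
        print(mod.chain_id, mod.sequence_number,
              mod.standard_residue, "->", mod.residue_name, mod.comment)
```

Reading depositor-supplied records like this finds only what has already been flagged in the file. A tool that inspects the 3D structure itself, as ModFinder is described as doing, can go beyond those records.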
ModFinder is a new software tool for identifying and visualizing protein modifications in the protein 3D structures available in the PDB. It is likely to prove useful to many researchers, and a similar approach might also make it easier to map annotations onto DNA sequences.
References:
1. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N. and Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Research, 28, 235-242.
2. Farriol‐Mathis, N., Garavelli, J. S., Boeckmann, B., Duvaud, S., Gasteiger, E., Gateau, A., … & Bairoch, A. (2004). Annotation of post‐translational modifications in the Swiss‐Prot knowledge base. Proteomics, 4(6), 1537-1550.
3. Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., … & Martin, M. J. (2004). UniProt: the universal protein knowledgebase. Nucleic acids research, 32(suppl 1), D115-D119.
4. Garavelli, J.S. (2004) The RESID Database of protein modifications as a resource and annotation tool. Proteomics, 4, 1527-1533.
5. Montecchi-Palazzi, L. et al. (2008) The PSI-MOD community standard for representation of protein modification data. Nat. Biotechnol., 26, 864-866.
6. Creasy, D. M., & Cottrell, J. S. (2004). Unimod: Protein modifications for mass spectrometry. Proteomics, 4(6), 1534-1536.
7. Gao, J., Prlić, A., Bi, C., Bluhm, W.F., Dimitropoulos, D., Xu, D., Bourne, P.E. and Rose, P.W. (2017) BioJava-ModFinder: identification of protein modifications in 3D structures from the Protein Data Bank. Bioinformatics, btx101. https://doi.org/10.1093/bioinformatics/btx101 (published 17 February 2017).
8. Prlić, A. et al. (2012) BioJava: an open-source framework for bioinformatics in 2012. Bioinformatics, 28, 2693-2695.
9. Rose, P.W. et al. (2011) The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res., 39, D392-D401.
10. Rose, P.W. et al. (2015) The RCSB Protein Data Bank: views of structural biology for basic and applied research and education. Nucleic Acids Res., 43, D345-D356.