Prediction of Protein-Protein Binding Affinity through their Amino Acid Sequence

By Dr. Muniba Faiza. Published on September 2, 2016, in Algorithms.

Protein-protein interactions (PPIs) are central to many biological processes, and studying them requires knowing how strongly the partner proteins bind. Measuring binding affinity experimentally requires an expensive setup and is tedious. Computational methods are therefore used to predict binding affinity; they take less time and can give reasonably accurate results.

Predicting the binding affinity of protein complexes is a problem that has been studied for the past two decades [1,2]. Computational methods proposed for it include empirical scoring functions [3,4,5], knowledge-based methods [6,7,8,9], and quantitative structure-activity relationship (QSAR) models [10,11]. These methods have limitations: they can handle only small amounts of data, and their results are not very accurate [12].

Yugandhar and Gromiha (2014) proposed a novel method that predicts binding affinity from the amino acid sequences of the interacting proteins, and reported it to be more accurate than earlier approaches [12]. In this method, the protein-protein complexes are first classified on the basis of their molecular weight, function, and percentage of binding site residues. The relationship between the sequence and structural properties and the binding affinity is then analyzed.

The sequence-based features include predicted binding site residues [13] and the property values of the 20 amino acids from the AAindex database [14]. The structure-based features include binding site residues predicted using the SPPIDER web server [15], the number of hydrogen bonds [16], accessible surface area [17], non-bonded interaction energy [18], electrostatic energy, and the energy due to bond length, bond angle, and torsion angle [19]. Only a reduced set of properties is used, because several of them are correlated with each other, which could bias the model [12]. To arrive at this set, the authors compared the correlation between all possible pairs of properties, which left them with 113 features [12]. To make it easier to identify the features that affect binding affinity, the protein complexes are classified into the following groups [12]:

Antigen-Antibody: Complexes formed by the interaction between an antigen and an antibody.
Enzyme-Inhibitor: Complexes formed by the interaction between an enzyme and an inhibitor.
Other enzymes: Complexes in which one of the interacting proteins is an enzyme and the other is anything other than an inhibitory protein.
G-protein containing: Complexes in which one of the interacting proteins is a G-protein.
Receptor containing: Complexes in which one of the interacting proteins functions as a receptor.
Miscellaneous: Complexes that do not fall into any of the above classes.
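
The pruning of inter-related properties described before the class list can be illustrated with a short Python sketch. It is only an illustration under assumptions: the greedy keep-or-drop rule, the 0.9 cutoff, the function names, and the toy values are all hypothetical and are not taken from [12].

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def prune_correlated(features, cutoff=0.9):
    """Keep a feature only if |r| with every already-kept feature is below cutoff.

    `features` maps a property name to its values across the complexes.
    The cutoff is an illustrative assumption, not the value used in [12].
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Toy data (made up): three properties measured on five complexes.
toy = {
    "hydrophobicity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "hydrophobicity_scaled": [2.1, 4.0, 6.2, 7.9, 10.0],  # nearly redundant
    "volume": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = prune_correlated(toy)
print(selected)
```

In this toy case the second property is almost a scaled copy of the first, so it is dropped, while the weakly correlated third property is retained.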

How is the binding affinity predicted using amino acid sequences?

An independent regression model is generated for each class by combining several features using multiple regression [20]. The performance of each model is validated with a jack-knife test, a resampling procedure in which each complex is left out in turn and predicted by a model fitted to the remaining complexes. A step-wise least-squares fit is then performed with multiple regression to identify the combination of features that predicts binding affinity with high accuracy [12], and a P-value is estimated to assess the statistical significance of the result. If the P-value is below 0.05, the combination is considered statistically significant; otherwise, other combinations of features are tried using the same procedure.
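
As a rough illustration of multiple regression with jack-knife (leave-one-out) validation, the following Python sketch fits a linear model by ordinary least squares and predicts each held-out complex in turn. It is a minimal sketch and not the authors' implementation: the helper names, the two hypothetical features, and the synthetic delta-G values are assumptions, and the step-wise feature search and P-value estimation are not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, row):
    """Apply the fitted coefficients to one feature row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def jackknife_predictions(rows, y):
    """Leave each complex out in turn, refit, and predict the held-out value."""
    preds = []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(fit(train_rows, train_y), rows[i]))
    return preds

# Toy data (made up): two features per complex; delta-G generated as -5 - 1.0*f1 - 0.5*f2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.5)]
dg = [-5.0 - 1.0 * a - 0.5 * b for a, b in rows]
loo = jackknife_predictions(rows, dg)
mae = sum(abs(p - t) for p, t in zip(loo, dg)) / len(dg)
print(f"jack-knife mean absolute error: {mae:.6f} kcal/mol")
```

Because the toy delta-G values are an exact linear function of the features, the held-out predictions match the true values up to rounding error. With real data, the jack-knife error gives an estimate of how well the model generalizes to unseen complexes.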

Yugandhar and Gromiha (2014) also developed a web server, PPA-Pred, which predicts the binding affinity of protein-protein complexes from their amino acid sequences (http://www.iitm.ac.in/bioinfo/PPA_Pred/). The server can handle protein sequences with a maximum length of 50 amino acids. It takes the functional information and the amino acid sequences in FASTA format as input and returns the predicted binding affinity as a delta-G value and a Kd value [12]. Kd is the dissociation constant, which is related to delta-G by the following equation:

ln Kd = delta-G / (RT)

where delta-G is the binding free energy (negative for favorable binding), Kd is the dissociation constant, R is the gas constant (1.987 × 10^-3 kcal mol^-1 K^-1), and T is the temperature (assumed to be room temperature, i.e. 25 °C or 298.15 K) [12].
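
The conversion between delta-G and Kd can be worked through in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, delta-G is taken in kcal/mol and is negative for favorable binding (the sign convention under which the equation above yields Kd below 1 M), and the example value of -10 kcal/mol is made up.

```python
from math import exp, log

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_ROOM = 298.15     # 25 degrees Celsius expressed in kelvin

def kd_from_dg(dg, temperature=T_ROOM):
    """Dissociation constant (M) from delta-G (kcal/mol), using ln Kd = delta-G / (R T)."""
    return exp(dg / (R_KCAL * temperature))

def dg_from_kd(kd, temperature=T_ROOM):
    """Inverse relation: delta-G (kcal/mol) from a dissociation constant (M)."""
    return R_KCAL * temperature * log(kd)

# Hypothetical example: a complex with delta-G of -10 kcal/mol.
kd = kd_from_dg(-10.0)
print(f"Kd = {kd:.2e} M")
```

At room temperature RT is about 0.59 kcal/mol, so a delta-G of -10 kcal/mol corresponds to a Kd in the tens-of-nanomolar range, and each additional 1.36 kcal/mol or so of binding free energy lowers Kd by roughly a factor of ten.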


References:

1. Horton,N. and Lewis,M. (1992) Calculation of the free energy of association for protein complexes. Protein Sci., 1, 169–181.
2. Kastritis,P.L. and Bonvin,A.M. (2010) Are scoring functions in protein-protein docking ready to predict interactomes? Clues from a novel binding affinity benchmark. J. Proteome Res., 9, 2216–2225.
3. Audie,J. and Scarlata,S. (2007) A novel empirical free energy function that explains and predicts protein-protein binding affinities. Biophys. Chem., 129, 198–211.
4. Jiang,L. et al. (2002) Potential of mean force for protein-protein interaction studies. Proteins, 46, 190–196.
5. Ma,X.H. et al. (2002) A fast empirical approach to binding free energy calculations based on protein interface information. Protein Eng., 15, 677–681.
6. Moal,I.H. et al. (2011) Protein-protein binding affinity prediction on a diverse set of structures. Bioinformatics, 27, 3002–3009.
7. Su,Y. et al. (2009) Quantitative prediction of protein-protein binding affinity with a potential of mean force considering volume correction. Protein Sci., 18, 2550–2558.
8. Vreven,T. et al. (2012) Prediction of protein-protein binding free energies. Protein Sci., 21, 396–404.
9. Zhang,C. et al. (2005) A knowledge-based energy function for protein-ligand, protein-protein, and protein-DNA complexes. J. Med. Chem., 48, 2325–2335.
10. Tian,F. et al. (2012) Structure-based prediction of protein-protein binding affinity with consideration of allosteric effect. Amino Acids, 43, 531–543.
11. Zhou,P. et al. (2013) Biomacromolecular quantitative structure-activity relationship (BioQSAR): a proof-of-concept study on the modeling, prediction and interpretation of protein–protein binding affinity. J. Comput. Aided Mol. Des., 27, 67–78.
12. Yugandhar,K. and Gromiha,M.M. (2014) Protein–protein binding affinity prediction from amino acid sequence. Bioinformatics, 30, 3583–3589. doi:10.1093/bioinformatics/btu580
13. Ofran,Y. and Rost,B. (2007) Interaction sites identified from sequence. Bioinformatics, 23, e13–e16.
14. Kawashima,S. et al. (2008) AAindex: amino acid index database, progress report 2008. Nucleic Acids Res., 36, D202–D205.
15. Porollo,A. and Meller,J. (2007) Prediction-based fingerprints of protein-protein interactions. Proteins, 66, 630–645.
16. McDonald,I.K. and Thornton,J.M. (1994) Satisfying hydrogen-bonding potential in proteins. J. Mol. Biol., 238, 777–793.
17. Hubbard,S.J. and Thornton,J.M. (1993) NACCESS 2.1.1. Department of Biochemistry and Molecular Biology, University College, London.
18. Gromiha,M.M. et al. (2009) Energy based approach for understanding the recognition mechanism in protein-protein complexes. Mol. Biosyst., 5, 1779–1786.
19. Guex,N. and Peitsch,M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.
20. Grewal,P.S. (1987) Numerical Methods of Statistical Analysis. Sterling Publishers, New Delhi.

How to cite this article:

Faiza, M. (2016) Prediction of Protein-Protein Binding Affinity through their Amino Acid Sequence. Bioinformatics Review, 2(9): 4-8.

The article is available at http://bioinformaticsreview.com/20160902/prediction-of-protein-protein-binding-affinity-through-their-amino-acid-sequence/
