Molecular modelling of peptides and protein-ligand complexes using knowledge-based potentials


 

Principal Investigator: Debasisa Mohanty

PhD Student
Gitanjali Yadav (since Jan 2001)

Collaborator
Rajesh S Gokhale

The main theme of the research project is to understand the structural principles that govern the folding of peptides into stable conformations and the binding of various ligands to proteins, and to use these principles to develop computational approaches for predicting the structures of peptides and protein-ligand complexes. The specific objective is to investigate whether knowledge-based potentials, i.e. scoring functions obtained from analysis of structural features in databases of known protein structures, can be used to predict (i) the experimentally observed monomeric hairpin structures of short peptides, (ii) the bound conformation of peptides in MHC-peptide complexes and the ranking of peptides according to their binding free energy, and (iii) the substrate specificity of β-ketoacyl synthase like condensing enzymes, which catalyze carbon-carbon bond formation during polyketide biosynthesis.

A.    Prediction of β hairpin structure

The explicit solvent molecular dynamics protocol, which had been used successfully to demonstrate a strong tendency for structure in the short peptide ITVNGKTY, was applied to a few other peptides to check whether they can adopt β hairpin structures in solution. Ensembles of structures were generated for these sequences, and plots of energy vs RMSD (from the β hairpin conformation) were obtained using a residue based statistical potential to check whether β hairpin structures can be predicted as low energy conformations. These plots showed that residue based statistical potentials could not unambiguously distinguish β hairpin structures from other possible conformations. This may be attributed to the relatively few contacts formed in a short β hairpin. Hence, it was decided to use the rotamer library approach to construct all atom models and to rank the various conformations using atom based statistical potentials.
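As a rough illustration of how a residue based statistical potential can score one conformation from an ensemble, the sketch below sums pair energies over residue pairs whose Cα atoms lie within a distance cutoff. The function name `contact_energy`, the cutoff, the minimum sequence separation and the `pair_energy` table are all assumptions made for illustration; they are not the potential or the parameters used in this project.

```python
import math


def contact_energy(sequence, ca_coords, pair_energy, cutoff=6.5, min_separation=3):
    """Score a conformation with a residue-level contact potential.

    sequence       : one-letter residue codes, one per residue
    ca_coords      : list of (x, y, z) C-alpha coordinates in Angstrom
    pair_energy    : dict mapping frozenset of residue codes -> energy
                     (hypothetical table; unknown pairs contribute 0.0)
    cutoff         : contact distance in Angstrom (illustrative value)
    min_separation : skip pairs closer than this along the chain

    Returns (total_energy, number_of_contacts).
    """
    if len(sequence) != len(ca_coords):
        raise ValueError("sequence and coordinates must have the same length")

    total = 0.0
    n_contacts = 0
    for i in range(len(sequence)):
        for j in range(i + min_separation, len(sequence)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                # frozenset makes the lookup symmetric in the two residues
                key = frozenset((sequence[i], sequence[j]))
                total += pair_energy.get(key, 0.0)
                n_contacts += 1
    return total, n_contacts
```

Because the function also returns the number of contacts, it makes the limitation noted above easy to see: a short hairpin contributes only a handful of terms to the sum, so small differences between conformations are hard to resolve.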

B.    MHC-peptide interactions

In order to test the predictive ability of the MHC-peptide modelling software, it was decided to check whether the program can rank the sequences of the bound peptides in MHC-peptide complexes as low energy binders among all possible overlapping 9-mer or 10-mer fragments obtained from the sequences of the proteins from which the bound peptides originated. The results of this analysis indicated that, using a residue based statistical potential, it is possible in the majority of cases to rank the bound peptides within the top 20%. Hence, the residue based statistical potential can serve as a quick screening tool for identifying potential MHC binding sequences, but detailed atomic models have to be taken into consideration for more accurate prediction of peptide specificity for MHC.

Analysis of all the known MHC-peptide complex crystal structures indicated that the peptides bind in a similar extended conformation. Hence, it was decided to use the extended backbone conformation seen in MHC-peptide complexes, together with a backbone dependent rotamer library, for building all atom models of peptides in the MHC groove. The side chains were attached to the extended backbone using their most probable rotameric state and steric criteria. The accuracy of the generated conformations was tested by comparison with the known MHC-peptide complexes. The results indicate that 73% of the peptide side chains generated by the rotamer library approach have χ values within ± 30° of their respective values in the crystal structure. This relatively simple approach is thus able to predict peptide conformations in the MHC groove with reasonable accuracy. However, a detailed comparison of the predicted conformations with crystal structures is being carried out to see if prediction accuracy can be improved further.
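The ± 30° criterion for side chain dihedral angles has to respect the periodicity of angles, since 175° and -175° differ by only 10°. A minimal sketch of such a comparison is given below; the function names and the example inputs are hypothetical, and the sketch ignores the extra symmetry of some side chains (for example the terminal dihedral of Phe, Tyr or Asp), which a careful evaluation would also need to handle.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def fraction_within_tolerance(predicted, observed, tolerance=30.0):
    """Fraction of predicted dihedral angles within `tolerance` degrees
    of the corresponding observed (crystal structure) values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty angle lists of equal length")
    hits = sum(
        1
        for p, o in zip(predicted, observed)
        if angular_difference(p, o) <= tolerance
    )
    return hits / len(predicted)
```

For example, `angular_difference(175.0, -175.0)` evaluates to `10.0`, so a pair of angles on either side of the ±180° boundary is correctly counted as being in agreement.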
The generated all atom MHC-peptide complexes are being ranked using various atom based statistical potentials and combinations of molecular mechanics and statistical potentials to achieve more accurate prediction of peptide sequence specificity for MHC.
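The fragment ranking test described in this section can be sketched as follows: enumerate every overlapping fragment of the source protein that has the same length as the bound peptide, score each one, and report where the bound peptide falls in the sorted list. The `score` argument is a placeholder for whichever potential is being evaluated (lower meaning more favourable); it and the function names are assumptions for illustration, not the interface of the modelling software itself.

```python
def overlapping_fragments(protein_seq, length):
    """All overlapping fragments of the given length (sliding window of step 1)."""
    return [protein_seq[i:i + length] for i in range(len(protein_seq) - length + 1)]


def rank_percentile(protein_seq, bound_peptide, score):
    """Percentile rank of the bound peptide among all same-length fragments.

    score : callable taking a peptide string and returning an energy;
            lower values are treated as better binders (assumption).
    Returns a value in (0, 100]; e.g. 20.0 means the peptide is in the top 20%.
    """
    fragments = overlapping_fragments(protein_seq, len(bound_peptide))
    if bound_peptide not in fragments:
        raise ValueError("bound peptide does not occur in the protein sequence")

    target = score(bound_peptide)
    # Count fragments that score strictly better than the bound peptide
    better = sum(1 for frag in fragments if score(frag) < target)
    return 100.0 * (better + 1) / len(fragments)
```

Fragments that tie with the bound peptide are not counted against it here, which is one of several reasonable conventions; a stricter test would count ties as ranking ahead.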

C.    Substrate specificity of β-ketoacyl synthase (KS) like condensing enzymes

The formation of carbon-carbon bonds during polyketide biosynthesis is catalyzed by the KS domains of modular polyketide synthases or by chalcone synthase like single mono-functional enzymes. Such domains or single enzymes have also been found in the genomes of Mycobacterium tuberculosis (M.tb) and other microbes, though their substrates are unknown. We have initiated molecular modelling studies to complement the experimental approaches pursued in Dr. Gokhale’s group for identifying possible substrates for such enzymes. For computational prediction of possible substrates, we are making use of the available knowledge base of crystal structures of similar condensing enzymes and enzyme-substrate complexes, and of sequence information for a large number of KS domains or chalcone synthase like proteins with known substrates. Since sequences of 83 different KS domains with known substrates were available, a sequence based phylogenetic analysis was carried out to see if variation in the length of the substrate and in the chemical substitutions on these substrates can be correlated with sequence variations in the KS domains. However, no such correlation was obvious from the phylogenetic analysis, as the KS domains show a high degree of sequence variation among themselves. Hence, it was decided to build 3D structures for these sequences by a homology modelling approach, using the crystal structure of a condensing enzyme from the fatty acid biosynthetic pathway as template, and to look for sequence variations among the residues lining the active site cavity. Homology models have been built for several KS domains and chalcone synthase like proteins using the MODELLER package. Analysis of the shape and chemical nature of the active site cavity is being carried out to correlate them with the substrate types, and the results of this analysis would help us in predicting unknown substrates.
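One simple way to look for sequence variation restricted to the active site, once cavity-lining positions have been identified from the models, is to reduce each aligned sequence to a short "signature" of the residues at those positions and then group the known substrates by signature. The sketch below assumes a pre-computed multiple sequence alignment and a list of alignment columns judged to line the cavity; both inputs, and the function names, are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def active_site_signature(aligned_seq, columns):
    """Residues of one aligned sequence at the chosen alignment columns."""
    return "".join(aligned_seq[c] for c in columns)


def group_by_signature(alignment, substrates, columns):
    """Map each active site signature to the set of substrates seen with it.

    alignment  : dict of name -> aligned sequence (equal lengths, '-' for gaps)
    substrates : dict of name -> known substrate label
    columns    : 0-based alignment columns assumed to line the cavity
    """
    groups = defaultdict(set)
    for name, aligned_seq in alignment.items():
        signature = active_site_signature(aligned_seq, columns)
        groups[signature].add(substrates[name])
    return dict(groups)
```

A signature that maps to a single substrate type across many domains would be a candidate specificity determinant, whereas signatures shared by several substrate types would suggest that the chosen positions alone do not explain specificity.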