Research in protein structure modeling:

1.   Protein folding and structure prediction

·         A new Monte Carlo sampling method for structure prediction.

The sampling method was recently suggested to be the bottleneck in protein structure prediction. We developed a new sampling algorithm for structure prediction and tested it on a popular protein folding model, the HP model. The new method performed significantly better than all previous methods. As a powerful global optimization algorithm, it should also find application in other problems.
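
To make the HP model concrete, here is a minimal Python sketch of its energy function on a two-dimensional square lattice, paired with a naive random chain-growth search. This is not the new sampling algorithm described above, which the summary does not specify. The function names, the 2D lattice, and the number of trials are all assumptions chosen for illustration.

```python
import random

# Unit steps on a 2D square lattice.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hp_energy(sequence, coords):
    """HP energy: -1 for each pair of H residues on adjacent lattice sites
    that are not consecutive along the chain."""
    index_at = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def grow_chain(length, rng):
    """Grow one self-avoiding chain residue by residue.
    Returns None if the walk traps itself before reaching full length."""
    coords = [(0, 0)]
    occupied = {(0, 0)}
    for _ in range(length - 1):
        x, y = coords[-1]
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in occupied]
        if not free:
            return None
        nxt = rng.choice(free)
        coords.append(nxt)
        occupied.add(nxt)
    return coords


def sample_lowest_energy(sequence, n_trials=2000, seed=0):
    """Naive baseline (an assumption, not the method in the text):
    keep the lowest-energy chain among n_trials random growths."""
    rng = random.Random(seed)
    best_energy, best_coords = None, None
    for _ in range(n_trials):
        coords = grow_chain(len(sequence), rng)
        if coords is None:
            continue
        energy = hp_energy(sequence, coords)
        if best_energy is None or energy < best_energy:
            best_energy, best_coords = energy, coords
    return best_energy, best_coords
```

Note that growing a chain by choosing uniformly among free neighbours does not sample conformations uniformly. Practical chain-growth methods correct for this bias with importance weights, and more elaborate samplers add resampling or move sets on top.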

·         Estimation of entropy, free energy and stability of proteins by studying near-native structures (NNS).

Because a protein's dynamic fluctuations inside cells affect its biological properties, we present a novel method to study the ensemble of near-native structures (NNS) of proteins, namely, the conformations that are very similar to the experimentally determined native structure. We show that this method enables us to (i) quantify the difficulty of predicting a protein's structure, (ii) choose appropriate simplified representations of protein structures, and (iii) assess the effectiveness of knowledge-based potential functions. We found that well-designed simple representations of protein structures are likely as accurate as more complex ones for certain potential functions. We also found that the widely used contact potential functions stabilize NNS poorly, whereas potential functions incorporating local structure information significantly increase the stability of NNS.

·         Potential function for simplified protein models.

An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models, such as those requiring only Cα or backbone atoms, are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete-state models can represent the backbone conformations of proteins with small RMSD values. However, no existing potential functions are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. We evaluate how well the potential function discriminates native protein structures from decoys using several benchmark decoy sets. Our potential function, which requires only backbone or Cα atoms, performs comparably to or better than several residue-based potential functions that require additional coordinates of side-chain centers or of all side-chain atoms.
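
A decoy discrimination test of this kind is commonly summarized by the rank of the native structure among the decoys and by a z-score of its energy. The summary above does not say which metrics were used, so the Python sketch below is only one plausible scoring scheme. The function name and the sign convention (lower energy is better, more negative z-score is better) are assumptions.

```python
from statistics import mean, pstdev


def native_rank_and_zscore(native_energy, decoy_energies):
    """Score one decoy set, assuming that lower energy means more favourable.

    rank   : 1 if no decoy has lower energy than the native structure.
    zscore : (E_native - mean(E_decoys)) / std(E_decoys); a more negative
             value means the native structure is better separated.
    """
    rank = 1 + sum(1 for e in decoy_energies if e < native_energy)
    mu = mean(decoy_energies)
    sigma = pstdev(decoy_energies)  # population standard deviation
    zscore = (native_energy - mu) / sigma if sigma > 0 else float("nan")
    return rank, zscore
```

The energies passed in would come from whatever potential function is being assessed; the function above only compares numbers and knows nothing about protein structure.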

2.   Protein packing and stability.

·         Effect of side-chain entropy on protein stability and packing.

The role of side-chain entropy (SCE) in protein folding has long been speculated about but is still not fully understood. Using a newly developed Monte Carlo method, we conducted a systematic investigation of how the SCE relates to the size of the protein and how it differs among a protein's X-ray, NMR, and decoy structures. For a set of 675 non-homologous proteins, we observed significant SCE for both exposed and buried residues -- the contribution of buried residues approaches 40% of the overall SCE. Furthermore, the SCE can be quite different for structures with similar compactness or even similar conformations. As a striking example, we found that proteins' X-ray structures appear to pack more "cleverly" than their NMR or decoy counterparts in the sense of retaining higher SCE while achieving comparable compactness, which suggests that the SCE plays an important role in favouring native protein structures. By including an SCE term in a simple free energy function, we can significantly improve the discrimination of native protein structures from decoys.
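
As a rough illustration of what an entropy over side-chain states means, the Python sketch below computes the Gibbs-Shannon entropy of a discrete set of rotamer states from their relative weights. It is not the Monte Carlo estimator used in the study. The function name and the idea of feeding in per-residue rotamer weights are assumptions, and adding up per-residue values would ignore the steric coupling between neighbouring side chains that a packed protein imposes.

```python
import math


def state_entropy(weights):
    """Entropy S / k_B = -sum(p * ln p) over discrete states.

    `weights` are non-negative relative populations (for example counts);
    they are normalised to probabilities here.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one state must have positive weight")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)
```

For W equally populated states this reduces to ln W, so a residue locked into a single rotamer contributes zero and a residue with more accessible rotamers contributes more.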

·         What makes the 20 natural amino acids different from other building blocks?

There are only 20 natural amino acids. What makes them so special? We study side-chains using a two-dimensional square lattice model and a three-dimensional tetrahedral lattice model, with explicitly constructed side-chains formed by two atoms of different chirality and flexibility. Using enumeration and a sequential Monte Carlo technique, we found that both chirality and reduced side-chain flexibility lower the folding entropy significantly for globally compact conformations, suggesting that they are important properties of residues for ensuring fast folding and a stable native structure. This corresponds well with our finding that natural amino acid residues have reduced effective flexibility, as evidenced by statistical analysis of rotamer libraries and side-chain rotatable bonds. We also found that among compact backbones with maximum side-chain entropy, helical structures emerge as the dominant configurations. Our results suggest that compactness and conformational entropy are important factors contributing to the formation of helices.
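
The enumeration step can be illustrated with a short Python sketch that counts all self-avoiding walks of a given length on a two-dimensional square lattice and takes the logarithm of the count as a conformational entropy. This covers only a bare backbone: the side-chain atoms, chirality, and compactness restrictions of the models described above are left out, and the function names are assumptions.

```python
import math

# Unit steps on a 2D square lattice.
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_self_avoiding_walks(n_bonds):
    """Count self-avoiding walks with n_bonds steps starting at the origin.
    Walks related by rotation or reflection are counted separately."""

    def extend(head, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in STEPS:
            nxt = (head[0] + dx, head[1] + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n_bonds)


def conformational_entropy(n_bonds):
    """Entropy S / k_B = ln(number of conformations)."""
    return math.log(count_self_avoiding_walks(n_bonds))
```

The count grows exponentially with chain length, so exhaustive enumeration is practical only for short chains; this is why longer chains call for a sampling technique such as sequential Monte Carlo.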

·         Voids in proteins.

Voids exist in proteins as packing defects and are often associated with protein function. Using lattice and off-lattice simplified protein models, we found that the packing density of single-domain proteins decreases with chain length. We further demonstrate that a protein-like scaling relationship between packing density and chain length is observed in off-lattice self-avoiding walks. Our studies suggest that high packing density is characteristic only of short-chain proteins. We found that the scaling behavior of packing density with chain length in proteins is a generic feature of random polymers that satisfy a loose compactness constraint.

§         Voids modeled by lattice polymers.

§         Protein packing behavior in off-lattice models.

3.   Atom-level free energy functions for structure modeling.
