Stephen Mulcahy (9224076), 4th Industrial Biochemistry


A long-standing problem in biology has been the question of what makes proteins fold, i.e. what causes linear amino acid sequences to fold into complex three-dimensional structures, structures which are vital to the function of the protein within a living system. In the 1950s, Christian B. Anfinsen proposed that the amino acid sequence itself determines the three-dimensional structure of the protein (2). This theory has been found to apply to a greater or lesser extent to most small globular proteins. Larger proteins have been found to need assistance from other proteins, such as chaperonins, to carry out the folding process. Nevertheless, "The Protein Folding Problem" has the primary sequence of a protein as one of its central features. It would be very useful, for a number of reasons both academic and industrial, to be able to predict the structure of a protein from its primary sequence.

Small, globular proteins of up to approximately 300 residues have been the most studied, and the results of this work will undoubtedly not be directly applicable to other protein categories such as long fibrous proteins or proteins residing in cellular membranes. However, if the basic rules of folding can be determined for the simpler globular proteins, subsequent elucidation of similar mechanisms for more complex proteins should be possible.

The polarity of proteins and hydrophobic effects between the protein and the surrounding solvent are the main factors driving the protein folding process. In aqueous solutions, polar amino acids tend to be hydrophilic, attracting polar water molecules, while nonpolar amino acids (most of which contain hydrocarbon side-chains) tend to be hydrophobic. The hydrophobic parts of the protein mix poorly with water and are more inclined to associate with each other (2). This, together with the peptide bonds between consecutive amino acids in a sequence, restricts the conformations available to a protein. Thus, there appear to be certain pathways along which a protein will fold. This folding process may involve the formation of intermediates, as described by Oleg Ptitsyn, who speaks of a compact intermediate that is larger than the native form of the protein and has an intact secondary structure. This has been described as the molten globule (1,2).

Frauenfelder and Wolynes (3) state that the final stages of folding will depend on the specific sequence of amino acids, whereas earlier folding stages should be mostly insensitive to details of sequence. Another important aspect of protein structure is that native structures appear to be quite robust to mutations in their primary sequence, with practically any residue in a sequence being replaceable without causing any change in the protein's structure or orientation. The hydrophobic core seems to be the most important feature of the protein in relation to its normal folded state. Enzymatic activity, by contrast, is not subject to the same mutational freedom, with even single residue changes threatening total disruption of activity (2).

Thermodynamics of Protein conformational changes

Protein stability depends on the free energy change between the folded and unfolded states, which is expressed by the following:

[ Equation 1: \Delta G = -RT \ln K = \Delta H - T \Delta S ]

where R represents the gas constant, T the absolute temperature, K the equilibrium constant, ΔG the free energy change between the folded and unfolded states, ΔH the enthalpy change and ΔS the entropy change between the same two states. The enthalpy change, ΔH, corresponds to the binding energy (dispersion forces, electrostatic interactions, van der Waals potentials and hydrogen bonding), while hydrophobic interactions are described by the entropy term, ΔS. Proteins become more stable with increasingly negative values of ΔG, i.e. as the free energy of the unfolded protein (GU) increases relative to the free energy of the folded or native protein (GN). In other words, as the binding energy increases or the entropy difference between the two states decreases, the folded protein becomes more stable (13). The folded conformation of a domain is apparently in a relatively narrow free energy minimum, and substantial perturbations of that folded conformation require a significant increase in free energy. The large heat capacity change upon protein unfolding means that there is a temperature at which the stability of the folded state is at a maximum. Measured by free energy, the maximum occurs when ΔS=0, while that measured by the equilibrium constant occurs when ΔH=0. These maximum stabilities can occur at quite different temperatures, and both are used in different situations. Regardless of which one is used, however, the stability of the folded state decreases at both higher and lower temperatures. While factors such as binding interactions obviously play a part in stabilising the protein, they cannot account for a very significant portion of the stabilisation, since similar interactions occur in the unfolded state (although the interaction between folded protein and solvent would be expected to be stronger than the interaction between the unfolded protein coil and the solvent). The hydrophobic effect is probably the major stabilising effect (1).
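The two temperatures of maximum stability can be illustrated with a minimal sketch. It assumes a two-state protein with a temperature-independent heat capacity change on unfolding, and it works with the free energy of unfolding, so a positive value means the folded state is favoured (the opposite sign to ΔG as used in the text). The parameter values `T_M`, `DH_M` and `DCP` are hypothetical and chosen only for illustration; they are not taken from the references.

```python
import math

# Hypothetical parameters for a small globular protein (illustrative only).
T_M = 333.15   # melting temperature in K, where folded and unfolded are equally stable
DH_M = 300.0   # enthalpy of unfolding at T_M, kJ/mol
DCP = 6.0      # heat capacity change on unfolding, kJ/(mol K), assumed constant
R = 8.314e-3   # gas constant, kJ/(mol K)


def delta_h_unfold(t):
    """Enthalpy of unfolding at temperature t (K)."""
    return DH_M + DCP * (t - T_M)


def delta_s_unfold(t):
    """Entropy of unfolding at temperature t (K), in kJ/(mol K)."""
    return DH_M / T_M + DCP * math.log(t / T_M)


def delta_g_unfold(t):
    """Free energy of unfolding; positive means the folded state is stable."""
    return delta_h_unfold(t) - t * delta_s_unfold(t)


def ln_k_fold(t):
    """Natural log of the folding equilibrium constant [N]/[U]."""
    return delta_g_unfold(t) / (R * t)


# Temperature where the entropy of unfolding vanishes: free energy stability is maximal.
T_S = T_M * math.exp(-DH_M / (T_M * DCP))
# Temperature where the enthalpy of unfolding vanishes: the equilibrium constant is maximal.
T_H = T_M - DH_M / DCP

print(f"T_S = {T_S:.2f} K, stability there = {delta_g_unfold(T_S):.2f} kJ/mol")
print(f"T_H = {T_H:.2f} K, ln K there = {ln_k_fold(T_H):.2f}")
```

With these assumed numbers the two maxima fall a few kelvin apart; how far apart they lie for a real protein depends on its measured enthalpy and heat capacity changes.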

The thermodynamics of protein stability is modelled quite well by energy landscape theory. While we can speak of discrete ground and excited states in simple systems such as atoms and nuclear particles, the description of complex systems like proteins requires more than such simple models. The ground state of the folded protein is highly degenerate, so we use the energy landscape to describe it more adequately; here the energy of a protein is a function of the topological arrangement of its atoms. We deal with a surface in a space with a very large number of co-ordinates, in which energy minima are separated by mountains and ridges. Each point on this surface describes the protein in a specific conformation, and there is an energy landscape for each state of the protein (e.g. neutral, charged, folded, intermediate or unfolded) (3).

[Figure: diagram of the folding-energy landscape for a protein]

Fig. 1: Folding-energy landscape for a protein molecule, depicted schematically in one-dimensional cross-section. The inset depicts a blow-up of the main diagram with multiple substates (3)

The thermodynamic behaviour of proteins, as determined in various temperature-jump experiments, is best described by stretched exponentials rather than by Arrhenius's law: the rate coefficient decreases ever more rapidly as the temperature is reduced, as follows,

[ Equation 2 ]
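The qualitative difference from Arrhenius behaviour can be sketched numerically. The non-Arrhenius form used here, k = A exp(-(E/RT)^2), is an assumption made purely for illustration and is not necessarily the expression given as Equation 2; the prefactor `A` and the energies `E_ARRHENIUS` and `E_NON_ARRHENIUS` are likewise hypothetical. The sketch compares the apparent activation energy, -R d(ln k)/d(1/T), of the two laws.

```python
import math

R = 8.314e-3             # gas constant, kJ/(mol K)
A = 1.0e12               # hypothetical prefactor, 1/s
E_ARRHENIUS = 40.0       # hypothetical activation energy, kJ/mol
E_NON_ARRHENIUS = 10.0   # hypothetical energy scale for the assumed law, kJ/mol


def k_arrhenius(t):
    """Arrhenius rate coefficient at temperature t (K)."""
    return A * math.exp(-E_ARRHENIUS / (R * t))


def k_non_arrhenius(t):
    """Assumed glass-like rate law: the exponent grows as 1/T squared."""
    return A * math.exp(-((E_NON_ARRHENIUS / (R * t)) ** 2))


def apparent_activation_energy(rate, t, dt=0.01):
    """Estimate -R * d(ln k)/d(1/T) at temperature t by a finite difference."""
    inv_cold = 1.0 / (t - dt)
    inv_hot = 1.0 / (t + dt)
    slope = (math.log(rate(t - dt)) - math.log(rate(t + dt))) / (inv_cold - inv_hot)
    return -R * slope


for temp in (300.0, 250.0, 200.0):
    e_arr = apparent_activation_energy(k_arrhenius, temp)
    e_non = apparent_activation_energy(k_non_arrhenius, temp)
    print(f"T = {temp:.0f} K: Arrhenius {e_arr:.1f} kJ/mol, assumed law {e_non:.1f} kJ/mol")
```

Under the Arrhenius law the apparent activation energy is the same at every temperature, whereas under the assumed law it grows as the system is cooled, which is one way of reading the statement that the rate falls off with increasing speed.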

This behaviour corresponds to that of glasses or spin-glasses, which undergo a transition whose transition temperature depends on the characteristic observing time. The random energy model put forward by Bernard Derrida correlates well with the rough energy landscape diagram for proteins. Wolynes and Bryngelson explained this by proposing that the random-energy model describes the misfolded protein states on the energy landscape, with the misfolding minima acting as "traps" that slow down the protein molecule's folding process. These traps become successively more difficult to escape as the temperature is lowered. This suggested that proteins are not random heteropolymers, but the products of biological evolution, which have a tendency not to get trapped in deep local minima. Because there are a limited number of amino acids from which a protein can be constructed, it is inevitable that some degree of frustration and landscape roughness will occur, but the landscape is what can be described as minimally frustrated (3).
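The trapping idea can be made concrete with a toy sketch in the spirit of the random energy model: each state is given an independent Gaussian energy, and the equilibrium population is examined at different temperatures. This is not the calculation of Wolynes and Bryngelson; the number of states, the energy spread `sigma` and the temperatures are all hypothetical. The participation ratio (the sum of squared Boltzmann probabilities) is close to 1/M when M states share the population evenly and approaches 1 when a single low-energy trap dominates.

```python
import math
import random


def sample_random_energies(n_states, sigma, seed):
    """Draw independent Gaussian energies, one per state (toy random energy model)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_states)]


def boltzmann_weights(energies, kt):
    """Normalised Boltzmann probabilities at thermal energy kt."""
    e_min = min(energies)  # shift energies to avoid overflow in exp
    raw = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(raw)
    return [w / total for w in raw]


def participation_ratio(energies, kt):
    """Sum of squared probabilities; its inverse is an effective number of occupied states."""
    return sum(p * p for p in boltzmann_weights(energies, kt))


energies = sample_random_energies(n_states=4096, sigma=1.0, seed=1)
for kt in (100.0, 1.0, 0.3, 0.05):
    y = participation_ratio(energies, kt)
    print(f"kT = {kt:6.2f}: effective number of occupied states = {1.0 / y:.1f}")
```

As the temperature is lowered, the effective number of occupied states shrinks: the population collects in a handful of deep minima, which is the sense in which the traps become harder to escape.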

The robustness of protein native structure to conformational change is a consequence of the funneled nature of the energy landscape of a minimally frustrated protein. The geometry of this landscape cannot be significantly changed by the modification of a few isolated residues. Random heteropolymers, on the other hand, have an energy landscape consisting of multiple funnels, each one leading to a different structure, making them more inclined to conformational change as a result of sequence modification.

In terms of the hydrophobic interactions and their effect on stability, the heat capacity of nonpolar compounds dissolved in water is found to be directly proportional to the surface area, or the number of water molecules in the first solvation shell. Privalov suggested that the hydrophobic contribution to stabilising the native (folded) protein could be evaluated from the change in the heat capacity, ΔCp, combined with the temperature parameters characterising the dissolution of liquid hydrocarbons. It has also been determined that the ratio of the entropy change, ΔS, to the heat capacity change, ΔCp, for the dissolution of a variety of hydrophobic compounds is a constant. Thus, the heat capacity change appears to define the hydrophobic effect, and this in turn is related to the exposure of non-polar groups to the solvent, water. The stability of a structure will rise to a maximum and then decrease with increasing temperature due to the dominance of non-hydrophobic entropy effects. Stability also decreases at lower temperatures; this cold denaturation is brought about by increasing hydrophobic solvation at lower temperatures (4).

Bryngelson and Wolynes have obtained a phase diagram for folding transitions as a function of a set of theoretical parameters. The phase diagram consists of three distinct regions.

Only in the folded region can a protein attain its native structure, while in a glass transition the folding of the system depends on its history, i.e. the system has many deep local minima separated by energy barriers which the thermal motions (the vibrational energy) of the molecule cannot overcome. Plots based on simulations exhibit a sharp S-shape, indicating that model polypeptides at least undergo rapid conformational transitions at specific temperatures. This speed of transition makes it difficult to assess the folding characteristics of the models, but it does seem to indicate an all-or-none transition.
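The sharp S-shape of an all-or-none transition can be reproduced with a minimal two-state sketch. It assumes only folded and unfolded populations and a temperature-independent enthalpy of unfolding (no heat capacity term), so it is a simplification rather than a model of the simulations mentioned above. The melting temperature and enthalpy values are hypothetical.

```python
import math

R = 8.314e-3   # gas constant, kJ/(mol K)
T_M = 333.15   # hypothetical midpoint (melting) temperature, K


def fraction_folded(t, dh_unfold=300.0):
    """Fraction of molecules folded at temperature t (K) in a two-state model.

    dh_unfold is the enthalpy of unfolding at T_M in kJ/mol (hypothetical value).
    """
    dg_unfold = dh_unfold * (1.0 - t / T_M)      # free energy of unfolding
    k_unfold = math.exp(-dg_unfold / (R * t))    # equilibrium constant [U]/[N]
    return 1.0 / (1.0 + k_unfold)


for temp in range(313, 354, 5):
    sharp = fraction_folded(float(temp), dh_unfold=300.0)
    broad = fraction_folded(float(temp), dh_unfold=100.0)
    print(f"T = {temp} K: folded {sharp:.3f} (300 kJ/mol), {broad:.3f} (100 kJ/mol)")
```

The larger the enthalpy change, the narrower the temperature window over which the population switches from folded to unfolded, which is why a highly cooperative transition looks like a step.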

Protein folding

Protein folding can be compared to crystallisation in that a protein freezes to a unique stable structure, while ordinary polymers typically freeze to form amorphous globules, i.e. polypeptides with random sequences will generally not fold to unique structures (9). Folded proteins demonstrate varying degrees of flexibility, which is of direct relevance to protein folding in that it reflects the free energy constraints on unfolding and refolding. Flexibility is greatest at the protein surface, where some side-chains and a few loops have alternative conformations or no particular conformation that is energetically preferred. Small, globular proteins (fewer than 300 residues) exhibit the greatest plasticity of conformation, although it is noted that no protein is known to adopt alternative fully folded conformations. This flexibility ties in with the fact that natural proteins do not appear to have been selected for maximum stability, for a synthetic protein designed empirically is much more stable, with ΔG = -94 kJ/mol. Natural proteins seem to require some degree of flexibility either for their function or to allow them to fold into their native conformation more rapidly. Both of these characteristics would be hindered by a maximally stable conformation (1).

All of the literature agrees that the hydrophobic interaction is the major contributor to the stability of the folded or native state of the protein, although other interactions such as hydrogen bonds, van der Waals forces and electrostatic interactions are also believed to play a role. One important feature of the folded state is that it is only marginally more stable than the unfolded state. This is due to various compensating factors (basically, any factor involved in stabilising the folded state will also play a role in stabilising the unfolded state, albeit a more minor one due to simple entropic factors) and to the unfolded protein's large favourable conformational entropy (1). The hydrophobic interaction between exposed non-polar amino acid residues on the surfaces of the protein molecule is, in general, attractive, short-range and orientation dependent (7).

There are various concepts that attempt to explain how exactly a protein undergoes folding and organisation from the unfolded or random coil state to the native folded state, but the most promising is that, for most proteins, the secondary structure forms before the chain is able to compact extensively (2). Native proteins folding in aqueous solution at physiological temperature do not get trapped in deep local minima. The folding appears to proceed from a restricted conformational ensemble, by condensation and secondary structure growth, through an even smaller ensemble of "molten globules" to a thermally jittered, tightly packed final "single" structure. Molecules of the same protein can follow different pathways to the same end, but the choice of pathways is limited by the thermodynamics of the process (6). The thermodynamic guiding forces of protein folding will be most active in the early stages of folding, because that is when the density of states is quite large, while in the last stages of folding, when entropy has been reduced, a glass transition could well intervene. These transitions have been observed to some degree in a number of experiments, which have noted very large activation energies, characteristic of glassy systems, appearing in the last stages of protein folding (3).

The Compact Intermediate or "Molten Globule"

Experimental observations of unfolded states induced by different conditions describe structures with different physical properties which are nevertheless indistinguishable thermodynamically. They suggest a collapsed molecule with native-like secondary structure and a liquid-like interior: the so-called "Molten Globule" or compact intermediate. The compact intermediate state appears to be the preferred conformational state of the unfolded protein under refolding conditions, where it is usually only transient. There may be a continuum of unfolded conformations, with the compact intermediate state at one extreme and the fully folded native protein at the other (1).

Oleg Ptitsyn described this structure as an intermediate that is larger than the native form of the protein and has its secondary structure intact. It is believed that any polypeptide chain of near-native composition and length (80 to 300 residues) will exist in this loose globular state, or as an ensemble of such states, when placed in water (2, 6). These ensembles will have the majority of the hydrophobic residues on the inside and the hydrophilic residues on the outside, and will contain numerous secondary structure seeds composed of short helices and beta hairpins. These seeds are continuously being regenerated in a process whereby the structures develop regions of compatibility and hydrogen bonding. Peptide chains will fold from these limited ensembles containing numerous short, transient, hydrogen-bonded substructures and proceed down only those pathways or funnels that are highly insensitive to sequence differences within the neighbourhood of the native sequence or sequences (6).

Protein Unfolding

The ideal unfolded protein is the random coil, in which the rotation angle about each bond of the backbone and side-chains is independent of that of bonds distant in the sequence, and where all conformations have comparable free energies, except when atoms of the polypeptide chain come into too close proximity. Steric repulsions are significant between atoms close in the covalent structure, and place limitations on the local flexibility. Unfolded proteins in strong denaturants such as 6M GdmCl or 8M urea, and disordered polypeptide copolymers, have been demonstrated to have the average hydrodynamic properties expected of random coil polypeptides. However, other experimental evidence suggests that unfolded proteins are not true random coils under other conditions, such as pH or temperature extremes in the absence of denaturants. Polypeptides will tend to be less disordered and more compact in situations where interactions between different regions of the polypeptide are more energetically favoured than those with the solvent, i.e. they tend not to behave like a random coil. Alternatively, in situations where the interactions between solvent and polypeptide are especially favourable, stronger random coil behaviour becomes apparent (1).
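The statistical meaning of an ideal random coil can be sketched with a freely jointed chain, in which each bond points in a random direction independently of the others. This is an idealisation that ignores the steric repulsions and fixed bond geometry mentioned above. For such a chain the mean squared end-to-end distance is the number of bonds times the bond length squared. The bond length of 3.8 (roughly the distance between consecutive alpha-carbons, in angstroms) and the chain counts are assumptions chosen for illustration.

```python
import math
import random


def random_unit_vector(rng):
    """Return a direction drawn uniformly from the surface of the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def end_to_end_squared(n_bonds, bond_length, rng):
    """Squared end-to-end distance of one freely jointed chain."""
    x = y = z = 0.0
    for _ in range(n_bonds):
        ux, uy, uz = random_unit_vector(rng)
        x += bond_length * ux
        y += bond_length * uy
        z += bond_length * uz
    return x * x + y * y + z * z


def mean_end_to_end_squared(n_bonds, bond_length, n_chains, seed):
    """Average squared end-to-end distance over many independent chains."""
    rng = random.Random(seed)
    total = sum(end_to_end_squared(n_bonds, bond_length, rng) for _ in range(n_chains))
    return total / n_chains


BOND = 3.8  # assumed bond length, angstroms
simulated = mean_end_to_end_squared(n_bonds=100, bond_length=BOND, n_chains=2000, seed=0)
print(f"simulated <R^2> = {simulated:.0f}, ideal N*b^2 = {100 * BOND ** 2:.0f}")
```

A chain that is more compact than this ideal value, as described for unfolded proteins in the absence of denaturant, would show a smaller mean squared end-to-end distance for the same number of residues.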

Protein structural equilibrium is usually described by the following expression,

[ N <=> U ]

where N represents the native or folded state and U represents the unfolded state, although it is suggested that the compact intermediate state could represent a subset of U which is continuously alternating between various energetically unfavourable states (1).

Multi-domain proteins usually unfold step-wise, with the domains unfolding individually, either independently or with varying degrees of interaction between them. Multi-subunit proteins usually dissociate first, then the subunits unfold, unless domains are on the periphery of the aggregate, where they can unfold independently (1). The enthalpies (ΔH) and entropies (ΔS) of unfolding are very temperature dependent because the heat capacity of the unfolded state is significantly greater than that of the folded state. One explanation for this dependence is that the heat capacity difference results mainly from the temperature-dependent ordering of water molecules around the non-polar portions of the protein molecules, more of which are solvent accessible in the unfolded state, although other factors may contribute (4).

High concentrations of salt in the solvent induce the precipitation of protein. This is believed to be caused by aggregation of the protein due to hydrophobic interactions, with the level of aggregation being proportional to the number of exposed associating sites on the protein. This is further supported by the fact that decreases in temperature can precipitate proteins in solution with little added electrolyte. Precipitation may also be caused by osmotic effects (7).

Some recent research has been conducted on the thermodynamics of unfolding of azurin, a small blue copper protein that acts as an electron transfer agent in the redox systems of certain bacteria. The thermodynamic stability of this protein's tertiary structure has been the focus of much research in the past few years. It is highly resistant to thermal unfolding, with irreversible unfolding only occurring at temperatures in excess of 70°C. This unusual thermal resistance has been ascribed to a number of different features of the protein, including the presence of disulphide bridges, intramolecular hydrogen bonds, hydrophobic effects and stabilisation by Cu++ binding. A model of the denaturation or unfolding path was developed which involved two effects: a reversible endothermic process which involves the destruction of the three-dimensional structure of the protein, and an irreversible and kinetically controlled exothermic process which involves the aggregation of the polypeptide chain network. Differential scanning calorimetry was used to analyse the denaturation process in detail and did not indicate any endothermic peak, suggesting that the thermal unfolding of azurin is, overall, an irreversible process, i.e. the second stage predominates, with the first stage occurring as a time-independent, all-or-none process (8).

The thermal denaturation of azurin was also examined with optical density scanning equipment, which gave a series of curves suggesting a scheme like the following,

[ N <=> N1 <=> N0 ]

where N represents the native protein, which maintains its structure unchanged up to 67°C. In state N1, obtained after heating the solution up to 76°C, the copper ion in the protein is still bound to its ligands, but an enfeeblement of the copper-sulphur bond occurs, as evidenced by a slow decrease of optical absorbance at 650 nm. The transition from N to N1 is completely reversible. Beyond 76°C, an irreversible process occurs which involves a change in the co-ordination of the copper. This second process has yet to be explained (8).

Recent Developments and Research into Protein Unfolding

Charles Brooks and his Computational Biophysics research team at the Brooks Institute have in the past year used molecular dynamics simulations to calculate protein folding thermodynamics and to explore the dominant "flow" from folded to unfolded states, starting from a first-principles, atomic-level description of the protein and solvent environment. The thermodynamic surfaces (free energy and energies), projected onto the radius of gyration, were computed for a 48-residue protein. The calculated free energy surface they obtained suggests that the model protein is stable by ~3 kcal/mol and that a thermodynamic folding intermediate may exist. They have also examined the flow in terms of the idea of folding funnels. Their analysis gives a consistent thermodynamic picture of folding which first involves the formation of the N-terminal helix-turn-helix motif, followed by the "docking" of the C-terminal helix onto this substructure (15).
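The radius of gyration used as the projection co-ordinate is the mass-weighted root-mean-square distance of the atoms from their centre of mass, so it is small for a compact conformation and large for an extended one. A minimal sketch of that calculation follows; the function name and the example co-ordinates are hypothetical and are not taken from the Brooks group's work.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of points from their centre of mass.

    coords is a list of (x, y, z) tuples; masses defaults to equal weights.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    centre = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    mean_sq = sum(
        m * sum((c[axis] - centre[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    ) / total_mass
    return math.sqrt(mean_sq)


# Five beads in a straight line versus the same beads folded back on themselves.
extended = [(float(i), 0.0, 0.0) for i in range(5)]
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)]
print(f"extended: {radius_of_gyration(extended):.3f}")
print(f"compact:  {radius_of_gyration(compact):.3f}")
```

Projecting a free energy surface onto this single number, as described above, amounts to grouping simulated conformations by their radius of gyration and comparing the populations of the groups.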


  1. Review Article on Protein folding, Creighton, T.E., Biochem J. (1990) 270:1-16
  2. The Protein Folding Problem, Richards, F.M., Scientific American (Jan 1991):34-41
  3. Biomolecules: Where the Physics of Complexity and Simplicity meet, Frauenfelder, H, and Wolynes, P.G., Physics Today (Feb 1994):58-64
  4. Common Features of Protein Unfolding and Dissolution of Hydrophobic Compounds, Murphy, K.P., Privalov, P.L., Gill, S.J., Science (Feb 1990) 247:559-561
  5. The thermodynamic properties of protein, Jia-tih and Shukao, L, J. Phys. D: Appl. Phys 23 (1990) 976-980
  6. Models of Protein Folding, Smith, T.F, Science (May 19, 1995) 507:959-961
  7. Molecular Thermodynamics for Salt-Induced Protein Precipitation, Chiew, Y.C, Kuehner, D., Blanch, H.W and Prausnitz, J.M., AIChE Journal (Sept 1995) Vol 41, No. 9:2150-2159
  8. Thermodynamics of the Thermal Unfolding of Azurin, La Rosa, C., Milardi, D., Grasso, D., Guzzi, R. and Sportelli, L., J. Phys. Chem. (1995) 99:14864-14870
  9. Statistical Thermodynamics of Protein Folding: Sequence Dependence, Hao, M.H., and Scheraga, H.A., J. Phys. Chem. (1994) 98:9882-9893
  10. Differential scanning calorimetry: applications in biotechnology, Chowdhry, B.Z., and Cole, S.C., Tibtech (Jan 1989) Vol 7:11-18
  11. Time-resolved Fluorescence of Proteins, Beechem, J.M., and Brand, L., Ann. Rev. Biochem (1985) 54:43-71
  12. Time-resolved Circular Dichroism Spectroscopy: Experiment, Theory, and Applications to Biological Systems, Lewis, J.W., Goldbeck, R.A., Kliger, D.S., Xie, X., Dunn, R.C., and Simon, J.D., J. Phys. Chem. (1992) 96:5243-5254
  13. Protein Stability and Stabilization through Protein Engineering, Nosoh, Y., and Sekiguchi, T., ISBN: 0-13-721788-9
  14. Physical Chemistry - 4th ed., Atkins, P.W., ISBN: 0-19-855284-X
  15. The Brooks Group home page
  16. C. L. Brooks, III personal communication


Copyright © 1997 by Stephen Mulcahy