A Toolkit of Protein Folds

In 2002, Michael Denton, Craig Marshall and Michael Legge published a paper in the Journal of Theoretical Biology entitled “The Protein Folds as Platonic Forms: New Support for the Pre-Darwinian Conception of Evolution by Natural Law.” The abstract reads as follows:

Before the Darwinian revolution many biologists considered organic forms to be determined by natural law like atoms or crystals and therefore necessary, intrinsic and immutable features of the world order, which will occur throughout the cosmos wherever there is life. The search for the natural determinants of organic form, the celebrated ‘‘Laws of Form’’, was seen as one of the major tasks of biology. After Darwin, this Platonic conception of form was abandoned and natural selection, not natural law, was increasingly seen to be the main, if not the exclusive, determinant of organic form. However, in the case of one class of very important organic forms, the basic protein folds, advances in protein chemistry since the early 1970s have revealed that they represent a finite set of natural forms, determined by a number of generative constructional rules, like those which govern the formation of atoms or crystals, in which functional adaptations are clearly secondary modifications of primary ‘‘givens of physics.’’ The folds are evidently determined by natural law, not natural selection, and are ‘‘lawful forms’’ in the Platonic and pre-Darwinian sense of the word, which are bound to occur everywhere in the universe where the same 20 amino acids are used for their construction. We argue that this is a major discovery which has many important implications regarding the origin of proteins, the origin of life and the fundamental nature of organic form. We speculate that it is unlikely that the folds will prove to be the only case in nature where a set of complex organic forms is determined by natural law, and suggest that natural law may have played a far greater role in the origin and evolution of life than is currently assumed.

Denton et al. begin with an interesting historical survey of pre-Darwinian thinking, where form took priority over function:

The widespread belief that organic forms are lawful ‘‘givens of nature’’ explains why it was that throughout the pre-Darwinian period from the naturphilosophie of the late 18th century, right up to the period just before the publication of the Origin, although it was universally accepted that organisms exhibited functional adaptations, for Goethe, Carus, Geoffroy and Owen, it was always form which was of primary concern. Form came first and function was viewed as a secondary and derived adaptive feature (Russell, 1916; Richards, 1992).

Denton et al. explain the decline of this thinking as follows:

The Platonic biology of the pre-Darwinian era with its emphasis on evolution by natural law and its conception of a rational order underlying the diversity of life, represented a grand scientific vision, whose heroic goal was nothing less than the unification of biology and physics. It collapsed primarily because it failed to identify the elusive laws of form which might have provided a rational account of organic form and explained how the evolution of the basic invariant forms or types, from cell forms to the body plans of the major phyla, and deep homologies such as the pentadactyl limb, might have come about as a result of natural law. That they had no convincing explanation was explicitly conceded by Owen (1849) in the final paragraph of ‘‘On the Nature of Limbs’’: ‘‘To what natural laws or secondary causes the succession and progression of such organic phenomena may have been committed we as yet are ignorant.’’

The authors also note:

Of course no serious biologist doubts that some biological forms may be given by natural law and arise spontaneously out of the intrinsic self-organizing properties of their constituents and may not need any genetic program for their specification. The spherical form of the cell and the flat form of the cell membrane are two well known examples. Other more complex examples cited by Waddington (1962) are the various cytoplasmic structures made up of multiple layers of membranes such as the grana and intergrana regions of chloroplasts, the hexagonal arrangement of the rhabdomeres in the eyes of insects and the many forms described by Thompson (1942) in Growth and Form, including radiolarian skeletons, the shapes of mollusk shells, the curved shape of animal horns. But on the whole, natural law is considered to play a very trivial role in the generation of biological form and particularly in the generation of complex seemingly asymmetric biological forms such as protein folds, cell forms, body plans, etc.

Denton et al. then argue that protein folds are a genuine case in which pre-Darwinian thinking has been validated. Let me quote their argument at length:

The protein folds are the basic building blocks of proteins and therefore of the cell and indeed of all life on earth. Each is a polymer between 80 and 200 amino acids long consisting of from about 1000 to 3000 atoms folded up into a complex intricate three-dimensional shape. Most folds exhibit a hierarchical structure composed of basic secondary structural elements such as α helices and beta sheet conformations which are often arranged into more complex motifs which are in turn combined together to make up the native conformation of the fold.

It is important at this stage to note that the great majority of functional proteins in the cell consist of two or more basic folds linked together into multidomain or multifold complexes. In this paper we are considering only the fundamental nature and evolutionary origin of the folds and not of the higher order adaptive structures into which they are combined. These higher order complexes resemble ‘‘Lego-like’’ contingent assemblages put together by natural selection for various biological functions during the course of evolution by gene duplication and fusion (Brandon & Tooze, 1999).

Despite these early successes the lack of any apparent regularity in protein structures, and the great dissimilarity among those that had been determined, provided no basis for a rational classification (Ptitsyn & Finkelstein, 1980; Richardson, 1981). The picture was still in those early days compatible with the Lego model, that the folds in living organisms on earth might be individual members of a near infinite set of contingent material assemblages put together by natural selection over millions of years of evolution. It was only during the 1970s, as the number of 3D structures began to grow significantly, that it first became apparent that there might not be an unlimited number of protein folds, that the folds might not belong to a potentially infinite set of artifactual Lego-like constructs. On the contrary, it became increasingly obvious as more structures were determined that the protein folds could be classified into a finite number of distinct structural families containing a number of related but variant forms, i.e. that the classification system of fold structures was typological (Ptitsyn & Finkelstein, 1980; Richardson, 1981; Orengo et al., 1997). This was an important finding as the very fact that protein folds can be grouped in such a way was itself significant, for it provided the first line of evidence that the folds might be natural forms determined by physical law.

It also became apparent that the 3D structures of individual folds were essentially invariant, some such as the Globin fold and the Rossman fold for example, having remained essentially unchanged for thousands of millions of years. Both their invariance and the typological classification schemes into which they could be grouped argued for their being a finite set of ‘‘real timeless structures’’ determined by physics rather than being mutable ‘‘Lego-like’’ aggregates of amino acids determined by selection.

Consideration of the various physical constraints which restrict the folded spatial arrangements of linear polymers of amino acids, the laws of fold form, suggests that the total number of permissible folds is bound to be restricted to a very small number. One recent estimate based on possible arrangements of typical structural elements gave a maximum of 4000 folds (Lingard & Bohr, 1996). Based on similar considerations, the authors of another recent paper suggested that the maximum is likely to be no more than a few thousand (Chothia et al., 1997). A different type of estimate based on the rate of discovery of new folds, rather than permissible spatial arrangements, suggests that the total number of folds utilized by organisms on earth might not be more than 1000 (Chothia, 1993). In many recent reports the total number of different folds is often cited to be somewhat less than 1000 (Holm & Sander, 1996; Orengo et al., 1997; Zhang & DeLisi, 1999; Holm & Sander, 1999).

Whatever the actual figure, the fact that the total number of folds represents a tiny stable fraction of all possible polypeptide conformations, determined by the laws of physics, reinforces further the notion that the folds like atoms, represent a finite set of allowable physical structures which would recur throughout the cosmos wherever there is carbon-based life utilizing the same 20 amino acids.

If there are only a few thousand possible protein folds, this has very significant implications for our reconstruction of evolutionary history. Denton et al. don’t fully draw this out (in my opinion), but it is hinted at here:

Further evidences consistent with the Platonic conception that the protein folds represent a set of lawful immutable natural forms, ‘‘primary givens of physics,’’ are those many cases where protein functions are clearly secondary adaptations of a primary, immutable form (Gerlt & Babbitt, 2001). This is spectacularly true in the case of some of the more common folds also known as superfolds (Orengo et al., 1994; Gerlt & Babbitt, 2001). In the case of one superfold the so-called triosephosphate isomerase (TIM) barrel, an eight-stranded alpha/beta bundle (see Fig. 1), essentially the same fold, has been secondarily modified for many completely unrelated enzymic functions occurring in such diverse enzymes as triosephosphate isomerase, enolase and glycolate oxidase (Orengo et al., 1994). Another example, where a basic fold has been secondarily modified for various biochemical functions, in this case closely related functions, is the various elegant functional adaptations to oxygen uptake and carriage exhibited by the globin fold in myoglobin and the various vertebrate hemoglobins. The fact that in many cases where the same fold is adapted to different functions, no trace of homology can be detected in the amino acid sequences, suggesting multiple separate discoveries of the same basic structure during the course of evolution (Orengo et al., 1994; Brandon & Tooze, 1999), further reinforces the conclusion that the folds are a finite set of ahistoric physical forms.

Let me expand on this slightly. If there are only a few thousand protein folds, then our confidence in an inference of homology is greatly weakened when structural similarity is the main pillar of that inference. That is, if we were dealing with a nearly infinite number of potential protein folds, then the fact that two proteins share a fold would be strongly suggestive of common descent. But if the number of structures is quite limited, then an origin through convergence, or common design, is equally plausible. This is quite significant in our post-genomic age, given that biologists seem to be relying more and more on structural similarity to infer ‘homology.’
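To put rough numbers on this, here is a minimal Python sketch. It rests on a loud simplifying assumption that is mine, not Denton et al.'s: every fold is equally likely to be used, and unrelated single-domain proteins pick folds independently. Real fold usage is skewed (the superfolds quoted above are far more common than others), so treat the output as an order-of-magnitude illustration only. The proteome size of 1000 and the figure of a billion folds standing in for "nearly infinite" are hypothetical values chosen for the example.

```python
from math import comb


def chance_of_shared_fold(total_folds: int) -> float:
    """Probability that two unrelated single-domain proteins use the same fold,
    ASSUMING all folds are equally likely (a simplification)."""
    return 1.0 / total_folds


def expected_coincidental_pairs(num_proteins: int, total_folds: int) -> float:
    """Expected number of protein pairs that share a fold purely by chance,
    under the same uniform, independent-choice assumption."""
    # Number of distinct pairs times the per-pair chance of a match.
    return comb(num_proteins, 2) / total_folds


if __name__ == "__main__":
    num_proteins = 1000  # hypothetical proteome size, for illustration only
    for total_folds in (1_000, 1_000_000_000):
        pairs = expected_coincidental_pairs(num_proteins, total_folds)
        print(f"{total_folds:>13,} possible folds: {pairs:g} chance pairs expected")
```

With only 1000 folds available, hundreds of coincidental shared-fold pairs are expected among 1000 unrelated proteins, whereas with a billion possible folds essentially none are. Under these assumptions, a shared fold carries much less weight as evidence of common descent when the fold set is small.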

For example, let’s assume there are only about 1000 different protein folds. Let’s now imagine that an intelligent designer sought to seed this planet with microbial life forms containing proteins whose average number of domains was three. Such a designer could design only about 300-350 proteins (roughly 1000 ÷ 3) before having to reuse a fold. Since a heterogeneous pool of bacteria could not be built from so few proteins, the designer would have to reuse protein folds. From the design perspective, the mere sharing of protein folds is not evidence of homology.
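The arithmetic behind that estimate can be sketched in a few lines of Python. The function names are hypothetical helpers written for this illustration, and the proteome size of 2000 proteins is an assumed figure, not one taken from the paper.

```python
def proteins_before_reuse(total_folds: int, domains_per_protein: int) -> int:
    """Largest number of proteins that can be built if no fold is used twice."""
    return total_folds // domains_per_protein


def minimum_average_uses(num_proteins: int, domains_per_protein: int,
                         total_folds: int) -> float:
    """Pigeonhole bound: average number of times each fold must appear
    when every domain slot has to be filled from a fixed set of folds."""
    domain_slots = num_proteins * domains_per_protein
    return domain_slots / total_folds


if __name__ == "__main__":
    # Figures from the text: about 1000 folds, three domains per protein.
    print(proteins_before_reuse(1000, 3))       # 333
    # Hypothetical proteome of 2000 proteins (assumed for illustration).
    print(minimum_average_uses(2000, 3, 1000))  # 6.0
```

So a proteome of a couple of thousand three-domain proteins would need each fold to appear about six times on average, however the folds came to be there.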


2 responses to “A Toolkit of Protein Folds”

  1. Like the new look! Keep up the good work.

  2. “From the design perspective, the mere sharing of protein folds is not evidence of homology.”

    I know where I would go with this. I’m wondering if you’re thinking of going there, also.
