Special Stop Codons in the Exceptional Code

The genetic code employs three stop codons – UGA, UAA, and UAG. We have already seen that these codons are perfectly immune to the effects of cytosine deamination. In other words, the code buffers against mutations that will mistakenly produce elongated proteins by turning a stop codon into a sense codon (a codon that codes for an amino acid).
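That immunity is easy to check by hand, but here is a minimal sketch of the check. It assumes a cytosine deamination shows up in the mRNA either as C→U (damage on the coding strand) or as G→A (damage on the template strand). That two-strand framing and the function name are my own choices for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def deamination_products(codon):
    """Return the codons reachable from `codon` by one C->U or G->A change."""
    swaps = {"C": "U", "G": "A"}  # assumed mRNA-level footprint of deamination
    products = set()
    for i, base in enumerate(codon):
        if base in swaps:
            products.add(codon[:i] + swaps[base] + codon[i + 1:])
    return products

for stop in sorted(STOP_CODONS):
    products = deamination_products(stop)
    # An empty set means the codon has no C or G to deaminate at all.
    print(stop, "->", sorted(products), "| still a stop:", products <= STOP_CODONS)
```

No stop codon contains a C, and the only G→A changes available turn UAG or UGA into UAA, which is itself a stop.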

But another question arises – why are there three stop codons? Since one stop codon would be sufficient for the purposes of signaling termination during protein synthesis, why the extra two? What’s more, by having three stop codons instead of one, we increase the chance of having a nonsense mutation, where a sense codon is mutated into a stop codon. Nonsense mutations would thus produce truncated proteins. Such nonsense mutations are a problem for the cell, as evidenced by the need for an RNA surveillance system known as nonsense-mediated decay. So again, why not just use one stop codon?
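To put a rough number on that increased chance, the sketch below counts the single-base substitutions that turn a non-stop codon into a stop codon, first for the standard code and then for a hypothetical code that keeps only UAA as a stop. It treats every substitution as equally likely and ignores codon usage, so it is a back-of-the-envelope comparison only; the function name is mine.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def nonsense_paths(stops):
    """Count single-base substitutions that turn a non-stop codon into a stop."""
    count = 0
    for codon in CODE:
        if codon in stops:
            continue
        for i in range(3):
            for base in BASES:
                if base == codon[i]:
                    continue
                mutant = codon[:i] + base + codon[i + 1:]
                if mutant in stops:
                    count += 1
    return count

real_stops = {codon for codon, aa in CODE.items() if aa == "*"}
print("three stops:", nonsense_paths(real_stops))
print("UAA only (hypothetical):", nonsense_paths({"UAA"}))
```

Under these simplifying assumptions, the count is 23 substitutions with three stop codons versus 9 with a single one.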

To understand why life uses three stop codons, I propose that we tap into the engineering concept of design tradeoffs. NASA explains this strategy as follows:

Conceptual design involves a series of tradeoff decisions among significant parameters – such as operating speeds, memory size, power, and I/O bandwidth – to obtain a compromise design which best meets the performance requirements. Both the uncertainty in these requirements and the important tradeoff factors should be ascertained. Those factors which can be used to evaluate the design tradeoffs (usually on a qualitative basis) include:
• Reliability
• Expandability
• Programmability
• Maintainability
• Compatibility
• Adaptability
• Availability
• Development Status and Cost
(emphasis added)

We have good reason to think a design tradeoff is in play. Some ciliates have reassigned two of their three stop codons such that they code for glutamine. Thus, the blind watchmaker has the ability to strip away two of the three stop codons, yet has not done this with most organisms. This tells us that there is a long-term beneficial aspect of the code’s design. So what might it be?

A first possible explanation is that the use of three codons emphasizes expandability and adaptability, where the code was designed to facilitate future evolution. That is, two of the three stop codons are poised as place-holders, whereby the blind watchmaker could more readily reassign them a new amino acid if needed. And sure enough, a 21st and 22nd amino acid have been incorporated into living organisms since the code came into existence. Selenocysteine, the 21st amino acid which is used by bacteria, archaea, and eukarya, is coded for by UGA. Pyrrolysine, the 22nd amino acid which is used by some methanogenic archaea, is coded for by UAG. We would thus predict that if further new amino acids have been added to specific lineages, they will be coded for by one of the stop codons.

Another explanation concerns maintainability, whereby the risk of nonsense mutations is not only addressed with nonsense-mediated decay, but is also balanced against the risk of frameshift mutations.

In a frameshift mutation, one or more bases are added or deleted (in a number not divisible by three), thus altering the reading frame of the transcript downstream of the site of mutation. This means that the coding sequence downstream of the mutation codes for a random string of amino acids that could gunk up the cell. Also lost in the frameshift is the original termination codon, which is no longer in frame, meaning this string of random gunk could be quite long. Having three stop codons instead of one increases the chance that a new stop codon will be encountered in the shifted frame soon after the mutation. In other words, just as three stop codons increase the chance of a nonsense mutation, they likewise increase the chance a stop codon will be encountered after a frameshift has occurred.
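Here is a small sketch of that effect. The mRNA string is made up for illustration, and the function names are mine. The last function assumes every codon is equally likely in the shifted frame, which is a much cruder model than a real analysis that accounts for codon usage.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons_until_stop(mrna, stops, start=0):
    """Read triplets from `start`; return how many codons precede the
    first stop, or None if the message runs out before a stop appears."""
    for n, i in enumerate(range(start, len(mrna) - 2, 3)):
        if mrna[i:i + 3] in stops:
            return n
    return None

def expected_codons_to_stop(n_stops):
    """Mean number of codons read up to and including the first stop,
    assuming all 64 codons are equally likely (a simplifying assumption)."""
    return 64 / n_stops

original = "AUGGCAUUGACCUAAGG"            # hypothetical message: AUG GCA UUG ACC UAA
shifted = original[:3] + original[4:]     # delete one base right after AUG

print(codons_until_stop(original, STOP_CODONS))  # intact frame
print(codons_until_stop(shifted, STOP_CODONS))   # shifted frame, three stops
print(codons_until_stop(shifted, {"UAA"}))       # shifted frame, one stop only
print(expected_codons_to_stop(1), expected_codons_to_stop(3))
```

In this toy message the deletion exposes an out-of-frame UGA after two codons. A code that recognized only UAA would read to the end of the shifted message without terminating.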

What is most intriguing here is that a recent scientific study by Itzkovitz and Alon has shown that life’s code is among the very best at terminating frameshifts [1]:

Here, we consider whether robustness to translational frame-shift errors may be linked to the structure of the genetic code. We tested all alternative codes for the mean probability of encountering a stop in a frame-shifted protein-coding message. We find that the real genetic code encounters a stop more rapidly on average than 99.3% of the alternative codes (Fig. 3). The real code aborts translation eight codons earlier than the average alternative code (15 codons vs. 23 codons).

Feast your eyes on figure 3:

[Image “termcode”: Figure 3 from Itzkovitz and Alon (2007), not reproduced here]

So we can begin to see that the engineering concept of design tradeoffs helps to explain why life uses three stop codons instead of one. If life used only one stop codon, it is doubtful a 21st and 22nd amino acid could have been tapped, and the effects of frameshifts would likely be more severe. By using three stop codons, the risk of nonsense mutations is increased, yet this can be handled by processes such as nonsense-mediated decay.

But let me end with a tease. The explanation most likely goes much deeper than this and perhaps takes us into the realm of cybernetics.

1. Shalev Itzkovitz and Uri Alon. 2007. The genetic code is nearly optimal for allowing additional information within protein-coding sequences. Genome Res. 17: 405-412

8 responses to “Special Stop Codons in the Exceptional Code”

  1. It sounds like the nonsense-mediated decay system might be a separate molecular machine, or more.

  2. Indeed. But it gets better. One of the classic bad design arguments has long been the existence of introns. The world of NMD helps us to understand the brilliance of introns (and more). Just another set of essays for the future. 🙂

  3. Poor “John” keeps wondering why only one start codon (that codes for an amino acid) but three stop codons (that don’t). Any ideas?

  4. Hi Bilbo,

    So what’s the problem?

  5. Having a start codon that codes for an amino acid means that it must be removed most of the time before the protein can fold:

    http://bilbos1.blogspot.com/2011/03/good-challenge-to-id/

    It seems a more rational design would have a start codon that does not code for an amino acid, just as the stop codons do not.

  6. Hi Bilbo,

    On your blog, you wrote, “Proteins achieve their function by folding into their specific shapes. In order to do this, it is important to have many or most of the correct amino acids in the correct positions. Because the stop codons do not code for any amino acids, when the ribosomes produce the protein, there is no problem of the stop codons coding for an unwanted amino acid. However, this is not the case with the start codon. Whether or not the protein calls for methionine at the first position, methionine will be at the first position. Often this could interfere with the correct folding of the protein. So a second process is needed to remove the methionine before the protein can fold and function. From a design perspective this does indeed seem needlessly inefficient. It would seem that a much better process would be for the start codon not to code for an amino acid, also.”

    First of all, consider the NASA quote from above – “Conceptual design involves a series of tradeoff decisions among significant parameters – such as operating speeds, memory size, power, and I/O bandwidth – to obtain a compromise design which best meets the performance requirements.” Y’see the start codon has an important job not shared by any other codon – it sets the reading frame. If you want a translation system that does not encode an amino acid for its start codon, you’ll need a completely different mechanism to set the reading frame. Has anyone ever proposed such a mechanism? And if so, is there any evidence that such a system would do a better job at setting the proper reading frame? Until those questions can be answered, no problem has been demonstrated with using the start codon to code for an amino acid.

    But what about your claim that “methionine often interferes with the correct folding of the protein?” Really? How often is often?

    The second process that removes the methionine from many proteins is known as methionine excision and depends on the protein methionine aminopeptidase. You say that it seems needlessly inefficient, but again, think design tradeoffs and remember that efficiency is not the sole criterion of good design.

    So at this point I would direct your attention to something called the N-end rule. Here is a review paper written by the scientist who uncovered this rule.

    [Link: JHU_0708_paper_TheN-endRule.pdf]

    It basically states that the N-terminal amino acid plays a key role in determining the half-life of a protein. If you look at table 1, you’ll see that a protein whose first amino acid is methionine has a half-life of 30 hours. If it were tyrosine, it would be 10 minutes. And if you check out figure 1, you’ll see that methionine is considered a stabilizing residue (one of the three universal stabilizing residues). So the start codon represents a default state for giving a protein a long lifespan. By cutting away the methionine, the cell can in effect set a shorter lifespan for the newly made protein. Looks pretty elegant to me.

    But there is more meaty stuff to get out of the N end rule.

  7. Yes, I was thinking that an answer to John would be that there is a tradeoff of some kind. I’m not sure what it means to “set the reading frame.” And I’m not sure how often methionine would interfere with protein folding. I assumed, perhaps mistakenly, that since methionine is regularly removed before protein folding that it was done so because it would interfere with protein folding. You give a hint in your last paragraph that methionine is removed for a different reason. I’ll read the linked paper as I have time. Meanwhile, I’ll post your response at my blog for my millions of readers.

    I like it when people come up with examples of what appear to be irrational design, especially of features that were probably very ancient. It means the teleologist must dig deeper for an explanation and often uncovers an unexpected mother lode.

    I hope you write about the “more meaty stuff.”
