Introns and Design

[I’ve combined all the previous intron entries together to make it easier to read.  However, I did not have the time to thoroughly edit, so some parts might seem a little repetitive.]

Since I will be discussing introns, let me begin with a few points of clarification.

First, I will be focusing on introns found in protein-coding genes.  In other words, these are the introns that interrupt sequences that code for amino acids and are removed by spliceosomes in order to form the mature mRNA.  There are other introns that may have front-loaded the existence of these protein-coding introns, but that is another topic for another day.  For now, when I refer to ‘introns,’ I am referring to introns found in protein-coding genes.

Second, I am going to use the following hypothesis as a guide: introns facilitated the evolution of metazoan life.

This hypothesis stems from two teleological vantage points:

1. Since the early 2000s, I have proposed a modest front-loading hypothesis where unicellular life was designed to front-load the appearance of metazoan life.  Since that time, this hypothesis has become increasingly plausible with the discovery of all sorts of “metazoan-specific” genes in protozoan life forms (a common topic on this blog).  Since we can safely assume that introns were present in these ancient unicellular life forms, we might also suspect that they too helped to front-load the appearance and subsequent evolution of metazoans.

2. The Design Matrix proposes the criterion of Rationality as one means of investigating origins.  In this case, introns, upon superficial glance, appear to be irrational and would nudge the DM score toward the non-teleological spectrum.  From the perspective of translation, they are just long stretches of meaningless gibberish inserted into coding text and thus need to be cut out.  They are not essential to life itself, as bacteria do quite well without any such introns.  Yet if introns facilitate the evolution of more complex, metazoan life, the rational essence of the system begins to come into focus.  We might even wonder if something as complex as the human brain could emerge without them.

Third, I am not proposing or defending the notion that each and every intron has a function that serves the cell.  Instead of this reductionist approach, we’ll consider introns from the larger perspective, as a class, as a whole.  In other words, most introns can indeed be examples of “junk,” but the question is whether such junk can be used to carry out an objective over time. We can explore individual cases of introns, but for them to be ultimately meaningful, they would need to represent an opportunity that would not be too narrowly restricted for use throughout all of evolution.

Let me now provide a couple of clues to support the hypothesis that introns facilitated the evolution of multicellular life.

First, as a general rule, introns are far more common in multicellular genomes than in single-celled genomes.  Consider the human genome.  It has 21,746 genes and only 1,760 are without introns.  Compare this to the genome of baker’s yeast.  It has about 6,200 genes and only about 250 have introns.  In other words, while 92% of human genes have introns, only about 4% of the single-celled yeast genes have introns. What’s more, only about a dozen yeast genes have more than one intron, while the typical human gene has around ten introns.

And this pattern is reflected in many other genomes. For example, Plasmodium falciparum, the protozoan that causes malaria, has about 5,300 genes and only 121 have introns (about 2%).  And trypanosomes, the protozoa that cause African Sleeping Sickness, do not appear to have any introns.
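
For anyone who wants to check the percentages, here is a minimal Python sketch that recomputes them from the gene counts quoted above. The counts are the approximate figures given in the text; the names `genomes` and `percent_with_introns` are my own, chosen only for illustration.

```python
# Gene counts as quoted in the text (approximate figures).
# Each entry: (total genes, genes WITH at least one intron).
genomes = {
    "human": (21746, 21746 - 1760),   # 1,760 human genes lack introns
    "baker's yeast": (6200, 250),
    "Plasmodium falciparum": (5300, 121),
}

def percent_with_introns(total, with_introns):
    """Return the percentage of genes that contain at least one intron."""
    return 100.0 * with_introns / total

for name, (total, with_introns) in genomes.items():
    pct = percent_with_introns(total, with_introns)
    print(f"{name}: {pct:.1f}% of genes have introns")
```

Rounded to whole numbers, this gives roughly 92% for human, 4% for yeast, and 2% for Plasmodium, matching the figures in the text.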

So as you can see, “In general, nuclear introns are widespread in complex eukaryotes, or higher organisms. Simple prokaryotes and eukaryotes (such as fungi and protozoa) lack them.”

Our second clue was already provided in the previous posting.  Bacteria, which lack introns (remember, we’re talking about protein-coding genes) also have not been successful in spawning an organism as complex as a mammal.  There are probably several reasons for this, and the lack of introns may be one of them.

So not only is high intron density correlated with complex, metazoan life, but cells that lack introns have never succeeded in generating something analogous to complex, metazoan life.

One way to explain this difference is that single-celled life forms come in large populations and would thus experience stronger purifying selection that would remove introns, streamlining such genomes over time.  Yet even if true, this would not negate the putative role of introns, but would only help to explain part of the process by which they would exert their influence.  But as it stands, the purifying selection explanation is probably incomplete.

We saw that the general rule was that complex multicellular genomes tend to be intron-rich, while the genomes of single-celled organisms tend to have very few introns.

But there is a glaring exception.  Recall the choanoflagellates – the single-celled organisms thought to be most closely related to metazoans.  When their genome was sequenced, it provided a big boost to the hypothesis of front-loading, as it contained a whole toolkit of genes needed for metazoan existence, including the information to make cell adhesion domains, extracellular-matrix-associated protein domains, and an elaborate phosphotyrosine signalling machinery (all of these once believed to be specific to metazoans).

So do the choanoflagellates have introns?

Oh yeah.

Choanoflagellates are the closest known relatives of metazoans. To discover potential molecular mechanisms underlying the evolution of metazoan multicellularity, we sequenced and analysed the genome of the unicellular choanoflagellate Monosiga brevicollis. The genome contains approximately 9,200 intron-rich genes, including a number that encode cell adhesion and signalling protein domains that are otherwise restricted to metazoans.


Whereas the M. brevicollis genome is compact, its genes are almost as intron-rich as human genes (6.6 introns per M. brevicollis gene versus 7.7 introns per human gene). M. brevicollis introns are short (averaging 174 bp) relative to metazoan introns, and with few exceptions do not include the extremely long introns found in some metazoan genes.

From Nicole King et al. 2008. The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature Vol 45

So while introns are usually sparse in protozoan genomes, in the case of this protozoan genome, they exist at a density that is comparable to that of the human genome.  So it is reasonable to propose that an intron-rich genome did precede the appearance of metazoans.  Such a genome might have constituted a preadaptation that would facilitate the emergence of metazoans.

In fact, feast your eyes on the abstract from Sullivan, J. C., Reitzel, A. M. & Finnerty, J. R. 2006. A high percentage of introns in human genes were present early in animal evolution: evidence from the basal metazoan Nematostella vectensis. Genome Inform. 17, 219–229.

Intronic sequences represent a large fraction of most eukaryotic genomes, and they are known to play a critical role in genome evolution. Based on the conserved location of introns, conserved sequence within introns, and direct experimental evidence, it is becoming increasingly clear that introns perform important functions such as modulating gene expression. Here, we demonstrate that the positions of 69% (862/1246) of human introns in 343 orthologous genes are conserved in the starlet sea anemone Nematostella vectensis, a phylogenetically basal animal (phylum Cnidaria; class Anthozoa). This degree of intron concordance greatly exceeds that between humans and three more closely related animals: fruitfly (14%), mosquito (13%) and nematode worm (19%). Surprisingly, the fruitfly and mosquito, two members of the order Diptera, share only 43% of intron locations, fewer than the percentage of cumulative introns shared between human and sea anemone (47%), despite sharing a much more recent common ancestor. Our analysis indicates (1) that early animal genomes were intron-rich, (2) that a large fraction of introns present within the human genome likely originated early in evolution, before the cnidarian-bilaterian split, at least 600 million years ago, and (3) that there has been a high degree of intron loss during the evolution of the protostome lineage leading to the fruitfly, mosquito, and nematode. These data also reinforce the conclusion that there are functional constraints on the placement of introns in eukaryotic genes. (emphasis added)

Having laid out the clues to support my hypothesis that introns facilitated the evolution of metazoan life, we should start speculating how introns facilitated the evolution of metazoan life.  But before going there, we need to unpack this hypothesis a bit.

In essence, there are two possible ways to interpret this hypothesis.

1. Introns facilitated the emergence of metazoan life.

2. Introns facilitated the evolutionary spread of metazoan life forms.

The first version would have us focus on how introns helped evolution transition from a single-celled existence to a multicellular existence. The second version would have us focus on how introns helped multicellular organisms expand into their various phyla.

Both possibilities are in play, although the second version entails a front-loading event that attempts to reach much further into the future.

Hey, there is one more clue we can add to the mix when considering introns.  Let me quote from Puzzles of the Human Genome: Why Do We Need Our Introns? by L. Fedorova and A. Fedorov (Current Genomics, 2005, 6, 589-595):

One could argue that, in theory, removing “junk” DNA from the genome would have no negative effects on the organism. This has in fact happened in one vertebrate species, the puffer fish Takifugu rubripes, whose genome shrank several times millions of years ago [1]. The general phenotype is essentially the same as that of closely related genera, even though it has lost vast sections of its genome.

Let’s now add this:

One advantage is that it is much faster to get from one end of a pufferfish gene to the other end and from one gene to the next when determining DNA sequence on continuous stretches of chromosomes. This is because the pufferfish genome is only about an eighth of the size of the human genome—400 million DNA bases. But the pufferfish is not deficient in its total number of genes. Rather, the pufferfish genome contains less of what seems to be irrelevant DNA, sometimes called “junk.” This junk DNA separates genes from one another like the space that separates words in a sentence. It also breaks genes into sections like syllables. The human genome is diluted with so much junk DNA that genes are contained in only three percent of it—compared to fifteen percent in the pufferfish.

So here are two vertebrates that have roughly the same number of genes, but while the human genome is filled with 3 billion nucleotides, the puffer fish genome is only 400 million nucleotides long.
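
As a rough consistency check on those numbers, here is a small Python sketch using only the figures quoted above (3 billion vs. 400 million bases, and genes occupying about 3% vs. 15% of each genome). The variable names are mine, and the quoted passage does not spell out exactly what it counts as gene-containing DNA, so treat this as back-of-the-envelope arithmetic and nothing more.

```python
# Figures as quoted above (all approximate).
human_genome_bp = 3_000_000_000      # ~3 billion nucleotides
puffer_genome_bp = 400_000_000       # ~400 million nucleotides
human_gene_fraction = 0.03           # genes occupy ~3% of the human genome
puffer_gene_fraction = 0.15          # genes occupy ~15% of the pufferfish genome

# How much larger is the human genome overall?
size_ratio = human_genome_bp / puffer_genome_bp

# How much DNA is gene-containing in each genome, by these figures?
human_genic_bp = human_genome_bp * human_gene_fraction
puffer_genic_bp = puffer_genome_bp * puffer_gene_fraction

print(f"human genome is {size_ratio:.1f}x the pufferfish genome")
print(f"human gene-containing DNA:      ~{human_genic_bp / 1e6:.0f} million bases")
print(f"pufferfish gene-containing DNA: ~{puffer_genic_bp / 1e6:.0f} million bases")
```

By these figures the genomes differ about 7.5-fold in total size, while the gene-containing portions (about 90 million vs. 60 million bases) differ only about 1.5-fold.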

But here’s the catch.

From the paper by  Fedorova and Fedorov:

An interesting example of fast evolutionary genome shrinkage was observed in Takifugu fish. In this case, the diminution of Takifugu intron lengths and the length of its intergenic regions were highly coordinated. Despite dramatic shortening of the Takifugu genome, the number of introns remains the same as other vertebrates [22]. The process of intron loss is extremely rare in vertebrates.

And here is a specific example:

In other experiments, Brenner’s team studied certain large complex human disease genes. For example, in 1995 Elgar and colleagues identified and sequenced the pufferfish counterpart of the human Huntington’s disease gene, which had already been sequenced. The pufferfish gene turned out to be only 23,000 DNA bases long—seven and a half times shorter than the human gene. Although the pufferfish gene has the same sixty-seven interruptions, they are rarely over 1,000 DNA bases—compared to interruptions as long as 12,500 DNA bases in the human gene for Huntington’s. The actual gene, however, is very similar to the human gene and provides no further information about the protein.

The puffer fish has not shed one of the sixty-seven introns in the Huntington’s disease gene.

So here’s the thing.  For whatever reason, the puffer fish genome has shed gobs and gobs of its “junk” DNA.  Yet despite this massive pruning, most, or all, of the introns have been retained.   Apparently, the loss of introns would be deleterious to this fish, which is serving  as a model for all vertebrates. The fact that the puffer fish has not been able to shed its introns fits perfectly with the observation that the “process of intron loss is extremely rare in vertebrates.”  But such intron loss does not appear to be rare in single-celled eukaryotic organisms.  Now why is that?

Add it all up, and it indicates introns are playing some role in metazoan life and this would support my hypothesis that introns facilitate the evolutionary spread of metazoans.

So how would introns work to facilitate the origin and/or spread of metazoan life? Let’s begin with one obvious example – alternative splicing. Below is a figure to help you visualize this.

We have one gene with 4 exons, the stretches of nucleotides that code for amino acids. Between them are three introns (in green). When the DNA is transcribed into RNA, the introns will be cut out by the spliceosome. But as we can see from this figure, in one case exon four was cut away along with the third intron, making a protein with exons 1-2-3. In the other case, the RNA was processed such that exon three and its upstream intron were cut out, giving us a different version of the protein with exons 1-2-4.

Now simply imagine a gene with 10 exons and 9 introns and all the possible proteins that could be produced.
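
To get a feel for the numbers, here is a minimal Python sketch that counts exon combinations under a deliberately simple toy model of my own: exons always stay in their original order, the first exon is always kept, and every other exon is independently either kept or skipped. Real splicing is far more constrained than this (and has other modes besides exon skipping), so the result is an upper bound for the toy model, not a count of real proteins. The function name `exon_combinations` is hypothetical.

```python
from itertools import combinations

def exon_combinations(n_exons):
    """Enumerate exon combinations in a toy exon-skipping model.

    Assumptions (illustrative only): exon order is preserved, exon 1 is
    always included, and each remaining exon may be kept or skipped.
    """
    if n_exons < 1:
        raise ValueError("a gene needs at least one exon")
    rest = range(2, n_exons + 1)          # exons 2 .. n_exons
    result = []
    for k in range(len(rest) + 1):
        for kept in combinations(rest, k):
            result.append((1, *kept))
    return result

four = exon_combinations(4)
print(len(four))                          # 2**3 = 8 combinations
print((1, 2, 3) in four, (1, 2, 4) in four)

ten = exon_combinations(10)
print(len(ten))                           # 2**9 = 512 combinations
```

The two versions from the four-exon example above (1-2-3 and 1-2-4) both show up among the eight combinations, and under the same toy rules a ten-exon gene has 512.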

When you grasp the beauty of alternative splicing, the supposed kludgy inefficiency of introns fades from view. Cutting away stretches of “junk” would appear needlessly wasteful by itself. But a more holistic perspective allows us to see that this junk allows genes to get “more bang for their buck.” Without introns, it’s basically one gene – one protein. With introns, one can spawn dozens of proteins, all variants on a theme. And this looks like a designed search strategy. More on that in a bit, but next, let’s see how this process would so beautifully facilitate the evolution of metazoan life forms.

Hopefully, you too can appreciate the brilliance and beauty of alternative splicing: 

Instead of three different, but similar, genes each with its own regulatory sequence and set of transcription factors, we more efficiently package it all into one gene/one promoter that is capable of making three different proteins.

So how would this process facilitate the evolution of metazoan life? Consider the picture below:


What this is telling you is that different cell types can express different versions of the same gene. And this is precisely the strategy that is used by the various cell types of your body (See my blog, “Why are there no prokaryotic mice?”). Let me give you a classic example below the fold.

All of your cells possess the proteins actin and myosin.  Actin is a core cytoskeletal protein and myosin is the motor protein that travels along it.  In your muscles, many other proteins have organized and bundled the actin/myosin into thick protein cables that run the length of the muscle.  When stimulated, the cables shorten, resulting in the contraction of muscle.

One protein that works in conjunction with actin and myosin is a protein known as tropomyosin.  It binds to the actin and prevents myosin from binding when not needed.

Now closely survey the figure below to see what happens with this tropomyosin gene in the different cells of your body:


In this figure, the tropomyosin gene has 11 exons, but nowhere are they all used at the same time.  On the contrary, skeletal muscle, like your biceps muscle, does not use exon 2.  Smooth muscle, the muscle that lines your stomach and intestines, uses exon 2, but not exon 3 or 10.  Non-muscle tissues, like fibroblasts, liver, and brain, don’t use either exon 2 or 3 (those two must be muscle-specific).  But even among these, they differ: fibroblasts don’t use exon 10, liver does not use exon 7, and brain does not use exon 11.

Alternative splicing allows the body to fine-tune a single gene and its gene product to meet the unique needs of each cell in different types of tissue.  In other words, because of this process, your body can take any gene/protein and synthesize a brain-version, a stomach-version, a muscle-version, etc.  The proteins are all likely to have the same core function (note that all tropomyosins contain exons 1-4-5-6-8-9), but the activity, location, and specificity of each can be fine-tuned to fit the functional context of the different cell types.
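
The tissue-by-tissue pattern described above can be written down as a small data structure. Here is a minimal Python sketch that does so and recovers the shared core by set intersection. The exon lists are transcribed from my description of the figure, and I am assuming that any exon not mentioned as skipped is included in that tissue’s version; the variable names are only for illustration.

```python
# Exon usage per tissue, transcribed from the description in the text.
# Assumption: any exon the text does not list as skipped is included.
ALL_EXONS = set(range(1, 12))   # exons 1 through 11

skipped = {
    "skeletal muscle": {2},
    "smooth muscle": {3, 10},
    "fibroblast": {2, 3, 10},
    "liver": {2, 3, 7},
    "brain": {2, 3, 11},
}

# The exons each tissue actually uses.
tissue_exons = {tissue: ALL_EXONS - s for tissue, s in skipped.items()}

# Exons present in every tissue's version of tropomyosin.
core = set.intersection(*tissue_exons.values())
print("core exons:", sorted(core))

# What each tissue adds on top of the shared core.
for tissue, exons in tissue_exons.items():
    print(f"{tissue}: core + {sorted(exons - core)}")
```

The intersection comes out as exons 1, 4, 5, 6, 8, and 9, which is the shared core noted above; everything else is the tissue-specific tuning.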

It should now be becoming clear to you why introns are so useful in a multicellular state and, conversely, why the cell design of a prokaryote could never have evolved something like a mouse.  Introns impart extreme flexibility that would facilitate the emergence of different cell types under the constraint of the same genome.

In the previous entry, I noted that the alternative splicing of introns facilitates metazoan existence because it allows the organism to produce different versions of the same protein to fit the different versions of cells that make up a body. In other words, a brain cell and muscle cell might express mostly the same genes for their cytoskeleton, but they express different versions of those cytoskeletal genes to facilitate the functions associated with being a neuron vs. a muscle fiber.

Of course, since you might be thinking, “What does that dumb bunny know anyway?,” let me point to a recent study.

First, it was determined that the vast majority of human genes are alternatively spliced:

Scientists have long known that it’s possible for one gene to produce slightly different forms of the same protein by skipping or including certain sequences from the messenger RNA. Now, an MIT team has shown that this phenomenon, known as alternative splicing, is both far more prevalent and varies more between tissues than was previously believed.
Nearly all human genes, about 94 percent, generate more than one form of their protein products, the team reports in the Nov. 2 online edition of Nature. Scientists’ previous estimates ranged from a few percent 10 years ago to 50-plus percent more recently.
“A decade ago, alternative splicing of a gene was considered unusual, exotic … but it turns out that’s not true at all — it’s a nearly universal feature of human genes,” said Christopher Burge, senior author of the paper and the Whitehead Career Development Associate Professor of Biology and Biological Engineering at MIT.

They also found:

Burge and his colleagues also found that in most cases the mRNA produced depends on the tissue where the gene is expressed. The work paves the way for future studies into the role of alternative proteins in specific tissues, including cancer cells.
They also found that different people’s brains often differ in their expression of alternative spliced mRNA isoforms.
Two different forms of the same protein, known as isoforms, can have different, even completely opposite functions. For example, one protein may activate cell death pathways while its close relative promotes cell survival.
The researchers found that the type of isoform produced is often highly tissue-dependent. Certain protein isoforms that are common in heart tissue, for example, might be very rare in brain tissue, so that the alternative exon functions like a molecular switch. Scientists who study splicing have a general idea of how tissue-specificity may be achieved, but they have much less understanding of why isoforms display such tissue specificity, Burge said.

From: Human genes sing different tunes in different tissues

So to evolve something as complex as a mammal, the blind watchmaker was given a ready-made strategy for taking the same gene and tweaking it in different tissues as they emerged. Without alternative splicing, there is no reason to think the blind watchmaker could ever have constructed something like a mouse or man.

More support for my hypothesis that introns facilitate the evolution of metazoans from:  Kim E, Magen A, Ast G. 2007. Different levels of alternative splicing among eukaryotes.  Nucleic Acids Research 35:125-131.

Here, the researchers compared levels of alternative splicing in eight different organisms: human, mouse, rat, chicken, sea squirt, fruit fly, worm, and plant. Here is a figure from their paper that reports the core finding:

And here are a couple of key excerpts:

In this study we compare the level of alternative splicing among eight different organisms. By employing an EST independent approach we reveal that the percentage of genes and exons undergoing alternative splicing is higher in vertebrates compared with invertebrates.


The difference in the level of alternative splicing suggests that alternative splicing may contribute greatly to the mammal higher level of phenotypic complexity, and that accumulation of introns confers an evolutionary advantage as it allows increasing the number of alternative splicing forms.

Hmmm. Not only do introns appear to facilitate the evolution of metazoans, it may be the case they facilitate the evolution of complexity among metazoans.

Responses to “Introns and Design”

  1. So bacteria have introns, just not in protein-coding genes.

  2. OK learning mode-

    Introns in bacteria are self-splicing (didn’t know that)

    And a domain of these self-splicing introns can be substituted for part of the snRNA (U6) of the euk’s spliceosome (evidence for common ancestry)- didn’t know that either.

