I’ve combined the essays about the signal recognition particle, Alu elements, and cytosine deamination, all connected by front-loading. All 11,465 words of it.
Life is a balancing act existing at the interface of opposing demands. This realization comes from many directions and even finds itself entwined within the pillars of PICERAS. As we saw earlier, PICERAS represents the seven universal pillars, or themes, found in all living cells. One of these themes was Compartmentalization, where it is important to sequester the contents of the cell from its external environment, thereby allowing the cell to maintain an internal state that is completely different from the outside. To compartmentalize the contents, we require a barrier that effectively cuts off all the internal activity of the cell from its outer environment. But this poses a problem for other elements of PICERAS. Adaptability, for example, is the process whereby cells communicate with the environment and respond to it in order to maintain their internal states. If the cell was completely cut off from its environment, how could it detect and respond to it? Furthermore, the pillar of Improvisation will work best if cells can communicate with their environment, evolving new solutions to the problems posed by this very environment the cell contents must be protected from. So on one hand, the cell needs to be left alone, but on the other hand, the cell needs to be plugged in. Shall it be an introvert or an extrovert? Actually, the cell doesn’t have to choose because it has a very special “skin” – the membrane.
At first glance, a membrane might seem to be little more than a hodgepodge conglomeration, where two very different ingredients are tossed together to create a messy, oily film. One ingredient consists of two layers of phospholipids. These lipids form a very thin layer of oil and, given that carbon-based nanotechnology exists in a water-based medium, the oily layer serves as an extremely effective barrier for molecules dissolved in the water. Anything that is dissolved in water would rather stay embedded in the water than travel through a wall of oil. The lipid layers thus serve as the backbone of Compartmentalization, preventing all that dissolved material from leaving or entering the cell. It is thus no surprise that all cells, without exception, require a lipid-based membrane. The other ingredient is the proteins. The proteins can stick to one side of the phospholipids or penetrate them, acting as conduits or channels connecting the inside of the cell with the environment. Since the cell also controls the shape and activity of these proteins, it controls its connection with the environment. The proteins thus satisfy the needs of Adaptability and Improvisation. And this explains why all membranes are embedded with many different proteins.
The need to insert proteins into the membrane poses another design problem for the cell. Since all proteins are synthesized on ribosomes that are located in the cytoplasm of the cell, how do you get the proteins into the membranes? Do you just make them in the cytoplasm and let them float to the membrane where they will insert themselves? This is a bigger problem than you might think. Cytoplasmic proteins are dissolved in water and thus fold up into structures with hydrophobic cores and an outer surface that is hydrophilic. But proteins embedded in the membrane tend to be “inside-out.” Their outer surface is hydrophobic, allowing them to interact with the surrounding lipids, while their inner channel-like regions are hydrophilic so they can conduct dissolved material across the membrane. This means the cytoplasm doesn’t provide the correct arena for the proper folding of membrane proteins. In fact, in the cytoplasm, proteins with oily surfaces would stick to each other, forming large and growing clumps of oily goo that would gunk up the machinery of the cytoplasm.
A further design problem comes to mind. Consider the bacterium E. coli. A typical cell contains about 24,000 ribosomes. Each ribosome strings together 40 amino acids per second. Let’s imagine that at any given point in time, only half the ribosomes are actively translating. Furthermore, we’ll have them synthesize proteins with an average length of 200 amino acids. This would mean that about 2400 proteins are being added to the cytoplasm every second. The newly made proteins that are part of this massive and constant stream have three possible fates: 1) remain in the cytoplasm; 2) be inserted into the membrane; or 3) be secreted out of the cell. Roughly 20-30% of the proteins will end up in the membrane and about 500 proteins per second will be secreted. The design problem is that of sorting all these proteins on a moment-by-moment basis to ensure they all reach their proper destination.
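The estimate above is simple arithmetic, and a short Python sketch makes it easy to check or to vary the assumptions. The numbers are the ones quoted in the text; the function and variable names are invented here purely for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
# All names below are illustrative, not taken from any source.
RIBOSOMES_PER_CELL = 24_000   # typical E. coli cell
ACTIVE_FRACTION = 0.5         # assume half the ribosomes are translating
AA_PER_SECOND = 40            # amino acids added per ribosome per second
AVG_PROTEIN_LENGTH = 200      # average protein length in amino acids


def proteins_per_second(ribosomes, active_fraction, aa_per_second, avg_length):
    """Proteins completed per second across the whole cell."""
    active_ribosomes = ribosomes * active_fraction
    return active_ribosomes * aa_per_second / avg_length


def membrane_range(total, low_percent=20, high_percent=30):
    """Proteins per second headed for the membrane, given a percent range."""
    return total * low_percent / 100, total * high_percent / 100


total = proteins_per_second(
    RIBOSOMES_PER_CELL, ACTIVE_FRACTION, AA_PER_SECOND, AVG_PROTEIN_LENGTH
)
low, high = membrane_range(total)
print(f"proteins made per second: {total:.0f}")
print(f"membrane proteins per second: {low:.0f} to {high:.0f}")
```

Changing any one assumption (say, the fraction of active ribosomes) shows how quickly the sorting burden scales.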
The obvious engineering solution to these twin problems is to design a device that would allow the ribosome to interface with the membrane on an “as needed” basis. That way, the ribosome can insert membrane proteins or secrete proteins when appropriate, but remain in the cytoplasm when it is time to make cytoplasmic proteins. It turns out that all cells have such a device, solving both the problem of relying on a cytoplasmic machine to make proteins that belong in the membrane and the problem of sorting thousands of proteins that are made every second – the signal recognition particle and its receptor.
THE SIGNAL RECOGNITION PARTICLE
Let’s sketch out the basic events associated with getting a protein across the membrane. We’ll join the story after the gene for this protein has been expressed and an RNA molecule coding the amino acid sequence is synthesized. This RNA is known as messenger RNA (mRNA) and it is ultimately fed into the ribosome where its sequence of nucleotides will be decoded and used to string together a particular sequence of amino acids.
Figure 1. The first half of the signal recognition particle pathway known as elongation arrest.
Figure 1 shows a simplified representation of some of the players involved in this drama. The mRNA that is threaded into the ribosome is not shown. But what you can see from Figure 1a is a small “window” where the tRNA (also known as transfer RNA) and elongation factor enter the ribosome. The tRNA carries a specific amino acid that will correspond to a specific codon sequence on the mRNA. The elongation factor will guide the tRNA to the proper arena within the ribosome for this interaction to occur. Thus, as the ribosome reads the string of nucleotides on the mRNA, a stream of elongation factors and tRNAs pours into the ribosome to hand over the specified amino acids. As synthesis proceeds, the growing chain of amino acids emerges from an exit tunnel (the “window” on the bottom left of the ribosome in Figure 1a).
Now, here’s the catch. If the newly forming protein is destined for secretion (or insertion into the membrane), the first amino acids to emerge from the ribosome will contain a positively charged amino acid followed by about 10-20 hydrophobic amino acids. This characteristic arrangement constitutes the signal sequence which in turn will be recognized by the signal recognition particle (SRP).
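As a rough illustration of the pattern just described, here is a toy Python check for a positively charged residue followed by a hydrophobic stretch. It is only a sketch under loud assumptions: the set of residues treated as hydrophobic, the 30-residue search window, the requirement that the hydrophobic run start immediately after the charged residue, and both example sequences are choices made here for illustration. It does not describe how the SRP itself recognizes a signal sequence.

```python
# Toy heuristic only. The residue sets, window size, and example sequences
# are assumptions made for illustration, not data from the essay.
POSITIVE = set("KR")           # lysine, arginine
HYDROPHOBIC = set("AVLIMFWC")  # one common, simplified choice


def looks_like_signal_sequence(n_terminus, min_run=10, window=30):
    """Return True if, within the first `window` residues, a positively
    charged residue is immediately followed by at least `min_run`
    consecutive hydrophobic residues."""
    segment = n_terminus[:window].upper()
    for i, residue in enumerate(segment):
        if residue not in POSITIVE:
            continue
        run = 0
        for nxt in segment[i + 1:]:
            if nxt not in HYDROPHOBIC:
                break
            run += 1
        if run >= min_run:
            return True
    return False


# Made-up sequences: one with a K followed by 13 hydrophobic residues,
# one that is mostly charged and polar.
print(looks_like_signal_sequence("MKLLLLVVAALLAIAGSQ"))
print(looks_like_signal_sequence("MSDEQKTGNSPEDRTAQLEK"))
```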
In human cells, the SRP is composed of an RNA molecule, about 300 nucleotides in length, attached to six different proteins. Four of the proteins bind to one end of the RNA and form the so-called S domain, a region of the SRP that binds near the exit tunnel as shown in Figure 1a. The other two proteins bind to the other end of the RNA and form the Alu domain. When the signal sequence emerges from the exit tunnel of the ribosome, the entire SRP complex undergoes a shape change that allows the Alu domain to attach to the window where the tRNAs enter the ribosome. The resulting complex temporarily pauses protein synthesis (Figure 1b).
Once protein synthesis has been paused, the complex formed from the ribosome, the signal sequence, and the SRP will then dock into a receptor on the membrane (Figure 2).
Figure 2. The second half of the signal recognition particle pathway.
The S domain from the SRP interacts with this receptor, causing the SRP to let go of the signal sequence. The ensuing shape change also causes the Alu domain to move away from the tRNA window (Figure 2a) and now protein synthesis can resume. However, now the exit tunnel is positioned right next to a membrane channel known as the translocon. When the ribosome resumes protein synthesis, the newly forming amino acid chain is threaded into the tunnel provided by the translocon, allowing the protein to pass through the membrane. The whole process is also coordinated with the GTP molecule, which is very similar to ATP. Upon binding to each other, both the SRP and its receptor also bind GTP and then break it down. Once the GTP has been broken down, the energy released allows the SRP and its receptor to split apart, and the SRP dissociates from the ribosome so that it can be reused with another ribosome.
The story thus ends with the ribosome snapped on to the translocon, pumping its newly made protein across the cell membrane while the SRP is released for use by another ribosome looking to secrete its protein product across the membrane.
PARTICLE PROTEIN AND FRIENDS
To really appreciate the beauty of the SRP system, we should look more closely at the major players. But first, let’s make things more manageable. Lucky for us, the bacterium E. coli has a scaled-down version of the system that nevertheless functions much like the system seen in human cells. The RNA is much smaller, being only 114 nucleotides in length and thus lacking the Alu domain. Furthermore, instead of having six different proteins as part of the SRP, the E. coli version has only one, known as Ffh. Since there is only one, we’ll just call Ffh the ‘particle protein.’ E. coli also has the receptor (FtsY) and the translocon (SecY). Thus, the system is actually quite simple, being composed of a small RNA molecule (4.5S RNA) that is bound by the particle protein which in turn binds to the receptor and the translocon.
Let’s first put the particle protein under the microscope.
This protein is truly multifunctional. It has the ability to bind to the ribosome, the signal sequence, the 4.5S RNA, GTP, and the receptor. Each function is essential and to carry them all out, the 450 amino acids that make up the protein fold up to form three distinct domains: the N domain, followed by the G domain, followed by the M domain (see Figure 3). The M domain has the ability to bind to the signal sequence and the RNA. The G domain binds GTP and is an enzyme that also breaks it down. Together, the N and G domains form a structural and functional unit that binds to the receptor. The N domain also plays an important role in binding to the ribosome and contributes to the binding of the signal sequence.
Figure 3. The particle protein (Ffh) bound to the 4.5S RNA. The relative orientation of the domains and RNA is from ref. 10. The functions associated with each domain and the RNA are also indicated.
The receptor is the other major player that interacts with the particle protein. The receptor too has the N and G domains, but lacks the M domain. Instead, another chunk of amino acids, known as the A domain, is typically present in front of the N and G domains. The A domain is believed to help anchor the receptor to the membrane, but studies have shown it to be dispensable, meaning it is the N and G domains of the receptor that provide the core functions of binding to the membrane, the particle protein, and the translocon.
The remaining players include the RNA, which binds to the ribosome and may also regulate the interaction of the domains within the particle protein, and the translocon itself, which forms the membrane channel. The translocon forms a ring-like tunnel with a central cavity that is sufficiently large to conduct the newly made protein across the membrane.
Now that we have all the players in place, let’s think about them in motion. Even though we have relatively few components to tend to, they play an essential role in targeting the membrane and secretory proteins to their proper destination, and as SRP researcher Peter Walter, from the Howard Hughes Medical Institute, notes, “like many other cellular processes, the targeting reaction involves a series of ordered steps that need to be closely coordinated.” We should not underestimate the difficulty of this task. To properly carry out the job, such precise coordination requires that the SRP and its receptor “switch between multiple functional states in response to cargo occupancy, spatial information, and time constraints.”
It all starts when the SRP, composed of particle protein bound to RNA, samples the ribosomes. This sampling requires that the SRP bind weakly to ribosomes near the exit tunnel where the signal sequence from the newly made protein will emerge. Keep in mind that the ribosome is much larger than the SRP. While the SRP is composed of one protein and an RNA molecule 114 nucleotides in length, the ribosome is composed of over 50 proteins and three RNA molecules comprising thousands of nucleotides. Thus, the smaller SRP must find the particular patch on the larger ribosome and it does so by binding to a particular ribosomal protein known as L23, which happens to be adjacent to the exit tunnel. Also, when anchored at this point, the RNA that is part of the SRP makes contact with RNA on the ribosome. But we don’t want the SRP to bind too tightly yet, as it would be wasting its time latched on to a ribosome that was making a cytoplasmic protein. So, the SRP binds weakly and briefly, allowing it to check if it is needed, and if not, it dissociates to sample another ribosome.
If a signal sequence emerges from the ribosome, the M domain of the sampling SRP will bind it tightly. The M domain contains a deep groove that can house the signal sequence and once they bind, the SRP itself is now tightly latched on to the ribosome. What’s more, the interaction between the M domain and signal sequence communicates conformational changes throughout the particle protein, causing the N and G domains to swivel out such that the G domain can now bind GTP. When the G domain binds GTP, it, along with the N domain, becomes more compact and now is effectively primed for interaction with the receptor. In the meantime, the receptor itself apparently undergoes priming in an analogous manner, where it binds GTP as a function of its association with the membrane or translocon. These priming steps are important because they prevent the SRP and receptor from interacting until both are loaded with their appropriate cargo. But once both are primed, they can finally bind to each other, bringing the exit channel of the ribosome into close proximity to the translocon.
The interaction between the SRP and its receptor is more complex than first suspected. They interact pretty much along the entire interface of their respective N and G domains. The interface is unusually large, being 3-6 times more extensive than the interface formed between antibodies and their antigens, and there are many sites along this interface that are crucial for SRP function. After the particle protein and its receptor bind, a cascade of conformational changes ensues. Both N and G domains re-adjust their shape to form a tighter binding pocket for their respective GTP molecules and further changes then form a composite active site, where the two GTP molecules are paired together within this interface. This GTP twinning is essential to the mechanism of breaking down the GTP molecules: the receptor helps the particle protein break down its GTP molecule while the particle protein helps the receptor break down its GTP molecule. Once the GTP molecules are broken down into GDP and Pi (a process known as hydrolysis), this triggers more conformational changes that allow the receptor and particle protein to dissociate. The breakdown of the GTP molecules acts as a timing device. Prior to such activity, the M domain on the particle protein transfers the signal sequence on the newly formed protein to the translocon where it is then threaded into or across the membrane. The exact mechanism of this transfer remains obscure. But once the handoff to the translocon is complete, the breakdown of the GTP molecules occurs and the particle protein is released from the receptor and ribosome for another round of reuse.
A CLEVER CONNECTION
When reading through the works of those who study the SRP pathway, it becomes clear that these scientists have tremendous respect for the subject of their research. There are no complaints about the system being jury-rigged, a hodgepodge, or messy. On the contrary, as we have come to better understand how this system works, the researchers are under the impression that it is a very sophisticated system. It has been independently described as an “elegant pathway,” an “elegant mechanism,” and “a very elegant solution.” We can likewise get a feel for the way scientists greatly respect the sophistication of this system by considering some excerpts from their studies. One team of researchers notes that “structural studies suggest that the relative position of the N and GTPase domains change during the targeting cycle in response to external cues and serves as an important indicator of the status of the protein.” They also observe “it appears that two half-reactions (binding of the signal peptide and assembly or activation of the translocon) must be monitored independently and then brought together before a translocation event can be initiated.” Another team notes, “targeting involves a series of ordered steps in which cargo binding and release must occur at the proper stages. Each of the conformational changes in the GTPase domains of the SRP and SR described above provides a potential point at which such control can be exerted, thereby coordinating the loading and unloading of cargoes.” A team of reviewers comments that an “intrinsic advantage of cotranslational protein targeting is that the coupling of translation and translocation should prevent misfolding of the nascent chain in the cytoplasm.” And yet another set of SRP researchers comment on the way the M domain interacts with the NG domain after binding of the signal sequence, observing that this “would elegantly link signal sequence binding to the M domain with GTP binding to the G protein.”
The same researchers also describe the central role of the particle protein (SRP54): “SRP function relies on the tightly controlled communication of SRP54 with the external regulators (e.g., the ribosome, the SR, and the translocon) and on internal communication between the domains of SRP54.” Since “the efficiency and fidelity of the targeting process are crucial for maintaining the remarkable organization that is essential for life,” it is not surprising that such an elegant and logical system would have been put into place to carry out the task.
Yet there is a better way to appreciate the inherent rationality of this system. In The Design Matrix: A Consilience of Clues, I argued that an engineered system should succumb to structural and functional decomposition. Because the SRP system is so relatively simple, and has been the subject of a good deal of research, it should be possible to cleanly decompose it. As can be seen from the figure below, I was easily able to break down the SRP system into a discrete set of eight functions.
Functional Decomposition of the SRP Pathway.
Each function, in turn, depends on distinct components of the system. The SCAN function occurs when the N domain from the particle protein interacts with ribosomal protein L23 and surveys the amino acid chain that emerges from the exit tunnel. If a signal emerges, the BIND function kicks in, where the M and N domains attach to the signal sequence, creating a tight interaction between the SRP and the ribosome. The BIND function also results in a conformational change in the SRP, exposing the G domain such that it can now bind GTP, causing the PRIME function to be activated. Once the PRIME function is activated, the SRP is licensed to interact with the receptor. In eukaryotes, the SRP contains the Alu domain which then interacts with the ribosome to PAUSE protein synthesis. Once both the SRP and receptor are primed, they can now DOCK through the interactions of their respective N and G domains. This docking not only activates the breakdown of the GTP molecules, but triggers the TRANSFER function, where the M domain (and perhaps the RNA) guides the signal sequence to the translocon. Once handed off, the THREAD function is carried out, as the ribosome resumes protein synthesis where the exit tunnel of the ribosome and the central pore of the translocon show “perfect alignment,” so that the amino acid chain emerging from the exit tunnel is guided immediately into the membrane channel formed by the translocon (SecY). Finally, after GTP is hydrolyzed, the RELEASE function is implemented, whereby the SRP and receptor dissociate and now exist in forms that must be primed again to start the whole cycle over again.
The rationality of this system goes beyond the ability to structurally and functionally decompose it. Embedded within this system is a logical circuit that cycles the SRP in a unidirectional pattern, guided by the binding and then breakdown of GTP. The translocon is strategically plugged into this system for just long enough to receive the ribosome. What’s more, the PAUSE step is likewise strategically positioned such that bacteria, which apparently do not require it, can skip it and proceed directly to the DOCK function. In eukaryotes, where protein synthesis is paused, the THREAD function could likewise be re-labeled RESUME. That bacteria do not require the PAUSE function is intriguing. In bacteria, the processes of transcription and translation are usually coupled such that the ribosome is translating the RNA that is still in the process of being synthesized by the RNA polymerase. Given that this linked process occurs near the membrane when membrane proteins are being synthesized, it suggests no PAUSE function is required, as the ribosome would already exist in the vicinity of the translocon and could be shuttled there almost instantly. In fact, there are data that show that the receptor itself can bind DNA and that such DNA binding stimulates the GTP hydrolysis activity of the receptor. All of this suggests a sophisticated relationship between transcription, translation, and membrane protein insertion mediated by localization and the interaction between specific machine parts, where it is possible at least some membrane proteins may be inserted into the membrane while the genes for those proteins are still in the process of being transcribed. One couldn’t ask for a more rational explanation for the missing PAUSE function in the bacterial SRP system. On the other hand, since eukaryotes have a nucleus that separates the processes of transcription and translation, the need for a PAUSE makes sense, as SRP needs time to dock the ribosome to a distant membrane.
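The eight-function decomposition, with its optional PAUSE step, can be written down as a tiny ordered sequence. The Python sketch below is only a way of restating the flowchart in code; the names `Step` and `srp_cycle` are invented here, and nothing in it models the underlying chemistry.

```python
from enum import Enum
from typing import List


class Step(Enum):
    """The eight functions named in the decomposition above."""
    SCAN = "SCAN"
    BIND = "BIND"
    PRIME = "PRIME"
    PAUSE = "PAUSE"
    DOCK = "DOCK"
    TRANSFER = "TRANSFER"
    THREAD = "THREAD"
    RELEASE = "RELEASE"


def srp_cycle(has_signal_sequence: bool, eukaryote: bool) -> List[Step]:
    """Return the ordered steps for one ribosome encounter (illustrative)."""
    steps = [Step.SCAN]
    if not has_signal_sequence:
        # No signal sequence: the SRP lets go and samples another ribosome.
        return steps
    steps += [Step.BIND, Step.PRIME]
    if eukaryote:
        # Elongation arrest via the Alu domain; bacteria skip this step.
        steps.append(Step.PAUSE)
    steps += [Step.DOCK, Step.TRANSFER, Step.THREAD, Step.RELEASE]
    return steps


print([s.value for s in srp_cycle(has_signal_sequence=True, eukaryote=True)])
print([s.value for s in srp_cycle(has_signal_sequence=True, eukaryote=False)])
```

The bacterial path is the eukaryotic path with a single step removed, which is the point made above about where PAUSE sits in the circuit.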
When everything is considered together, it is hard to imagine how this system could be any more rational. It has impressed the researchers who study it, demonstrates efficiency, flexibility, and sophistication, and is so logical that it nicely breaks down into a design flowchart of discrete functions.
CONNECTING TO THE FUTURE
The trickiest part in the Design Matrix is to detect the echoes of foresight. In The Design Matrix: A Consilience of Clues, we sketched out two possible ways of recognizing foresight. One way is to look for molecular machines that exhibit Original Mature Design (OMD). What we would have is a machine that appears abruptly in the biotic landscape and then is not significantly improved or changed after extensive subsequent evolution. In other words, the designer got it right from the start. OMD can be viewed as an echo of foresight because there is no reason to think the blind watchmaker would have a decent chance of getting things right from the beginning. When the blind watchmaker cobbles something together through cooption, it doesn’t know what it’s producing and is only “solving” the immediate problem at hand. And there is no reason to think a solution to an immediate problem would just happen to work, much as is, billions of years later. A rational mind, on the other hand, can see beyond the immediate state and contemplate how something might need to be constructed in light of possible future contingencies.
The 3-component system that I have described earlier (particle protein, RNA, and receptor) appears to qualify as OMD. It is a system as ancient as the ribosome itself and represents a unique and universal design that solves the problem of coupling the universal ribosome to a universal membrane channel. The highly conserved bacterial nature of this core also speaks to foresight, as whatever designed the core got it right from the start: the same basic strategy that works in the simplest of bacterial cells is at work in complex human cells. Even individual parts can be swapped. Of course, says the Duck, it is possible to view all of this as some Frozen Accident. But then, says the Rabbit, we are left pondering how such an accident turned out to be so darn rational.
Remember, these are all just clues, not proofs or powerful evidence. But it gets much more interesting if we view the OMD as a seed to nudge evolution. In Chapter 7 of The Design Matrix we discussed how intelligence can use chance by carefully selecting bait to help the blind watchmaker fish for certain functions at a later time. In this case, given its universal nature, and the fact that a designer could rely on it to be present in the future, might the SRP system have functioned as such bait? If we return to our original discussion of the SRP system as illustrated in this essay, the eukaryotic system was much more complex than the bacterial system. To keep things simple for our analysis, we cut away this added complexity by shifting our focus to the simpler bacterial system. However, it may be the case that billions of years of bacterial evolution have simplified what was once a much more complex state.
Recall that the eukaryotic SRP has an RNA component known as the Alu domain that is crucial for elongation arrest (the PAUSE function).
The Alu domain is also bound by two small proteins, SRP9 and SRP14. These proteins first stick to each other and then bind to the Alu domain and are likewise important for elongation arrest. Because of its large size and cellular architecture, the eukaryotic cell design may require the PAUSE function as the ribosome is shuttled to a membrane.
The smaller cell size of bacteria, coupled with their different architecture, may not require such a function and, in fact, it is not known if they can even carry out elongation arrest. Bacteria such as E. coli have no SRP9 and SRP14 proteins, and also lack the Alu domain in their RNA component. So which state is more like the ancestral state – the eukaryotic system with its larger RNA molecule, Alu domain, and SRP9/14 proteins or the bacterial system with its smaller RNA molecule lacking an Alu domain and SRP9/14 proteins?
PAUSE IN WAITING
I just asked which state is more like the ancestral state – the eukaryotic system with its larger RNA molecule, Alu domain, and SRP9/14 proteins or the bacterial system with its smaller RNA molecule lacking an Alu domain and the SRP9/14 proteins?
Consider the secondary structure of the SRP RNA from mammals and E. coli*. Here is the complex mammalian RNA:
and here is the much simplified E. coli RNA, where the entire left half (the Alu domain) is missing:
So which is more like the ancestral state?
We can first look to protozoa, which constitute a diverse set of unicellular eukaryotes. The Alu domain is present in most protozoa that have been analyzed, albeit with some significant variations. The same theme holds true with fungi. The SRP9/14 proteins are found in algae, amoeba, and the protozoa that cause malaria, but missing in other protozoa. These data are consistent with the blind watchmaker gradually stringing together the process of elongation arrest during the evolution of eukaryotes. But they are also consistent with independent loss of these proteins. So let’s take a closer look at the prokaryotes, as a much more interesting picture emerges.
If we first consider Archaea, they have the ‘eukaryotic’ version of the large RNA, complete with an Alu domain that is strongly conserved in terms of size and structure!  Have a look at the archaeal SRP RNA (and compare it to the mammalian and E. coli versions above):
What is most interesting is that their Alu domain closely resembles that found in humans, where the entire RNA “can be folded into a series of helices that are virtually identical to the secondary structure of human SRP RNA.” In other words, the human Alu domain looks more like the archaeal Alu domain than other protozoan Alu domains, even though the latter organisms, being eukaryotes like humans, should be more closely related. Archaea are also completely missing the SRP9 and SRP14 proteins and it is not known if their ribosomes undergo elongation arrest. Yet it gets better.
While it is true that most bacteria lack the Alu domain, many gram positive bacteria do not. In fact, the RNA from the bacterium B. subtilis looks very much like that found in humans and archaea. Have a look at the SRP RNA from gram positive bacteria:
And while B. subtilis likewise does not contain the SRP9 and SRP14 proteins, unlike archaea, its Alu domain binds to a protein known as HBsu. HBsu is a small protein that is very similar to proteins that normally bind DNA and it seems to be an important component in SRP function in B. subtilis.
Since eukaryotes, archaea, and gram positive bacteria all have the same RNA component of the SRP system, this strongly suggests the larger version of the RNA is ancestral and that the Alu domain was lost very early in most bacterial lineages through reductive evolution, leaving behind the smaller version. This view is supported by the fact that chloroplasts, the highly simplified and extensively streamlined descendants of cyanobacteria, have completely lost their entire RNA component through reductive evolution.
If the RNA with the Alu domain is indeed ancestral, we have a fairly decent echo of front-loading. Clearly, this Alu domain is not needed for the bacterial way of life as seen by the majority of bacteria that lack it. Yet it is an essential component of elongation arrest in eukaryotes. If the first cells were front-loaded to evolve the eukaryotic cell plan, might the ancestral Alu domain have nudged into existence the necessary appearance of elongation arrest once eukaryotes arose? After all, the structure of the Alu domain is remarkably conserved in gram positive bacteria and archaea, indicating that it has hung around for vast spans of evolutionary time, despite being jettisoned by most bacteria. This means it could have served as the bait to fish out components that would interact with the Alu domain to elicit a full blown elongation arrest. Since SRP9, SRP14, and HBsu are all small, simple proteins, the baiting might work. In fact, what is striking is that even though these proteins are not related, and SRP9/14 need to work as a team to bind the Alu domain, while HBsu can do it alone, the three dimensional structures of HBsu and the SRP9/14 complex show remarkable similarity. Apparently the conserved structure of the Alu domain fished out similarly shaped proteins in two different evolutionary lineages.
Oh, but it gets even better, as the nudging potential of the Alu domain may have reached much further into the future than the evolution of the eukaryotic cell plan. Time to buckle those seat belts.
THE DISTANT POTENTIAL OF FRONTLOADING
The reach of the Alu domain may have extended much further into the future than the evolution of eukaryotes. Shortly after the dawn of the primate lineage, the gene for the SRP RNA was duplicated and the Alu domain was freed of its SRP function. A second round of duplication occurred and the two Alu domains were merged to form what is now known as the Alu element.
The Alu elements have played an important role in the evolution of the primate genome. Since escaping from its SRP function, the Alu element has expanded to 1.1 million copies in the human genome, making up 11% of human DNA. Since Alu elements do not code for any protein, they were once considered a classic example of “junk DNA.” In reality, they are retrotransposons. A retrotransposon is a sequence that is transcribed into RNA, which an enzyme known as reverse transcriptase then uses to make a DNA copy that can be inserted some other place in the genome. Since the Alu element does not code for reverse transcriptase, where does the enzyme come from? When the Alu elements were born early in the primate lineages, they commandeered the reverse transcriptase from an older retrotransposon in the genome known as L1.
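As a rough sanity check on those figures, the arithmetic can be sketched in a few lines. The Alu length (about 300 base pairs) and the haploid genome size (about 3.1 billion base pairs) are round-number assumptions that do not come from the text above, so the result is only a ballpark.

```python
# Back-of-envelope check of the Alu share of the human genome.
ALU_COPIES = 1_100_000           # copy number quoted in the text
ALU_LENGTH_BP = 300              # ASSUMPTION: typical full-length Alu element
GENOME_SIZE_BP = 3_100_000_000   # ASSUMPTION: approximate haploid genome size


def alu_fraction(copies=ALU_COPIES, length_bp=ALU_LENGTH_BP,
                 genome_bp=GENOME_SIZE_BP):
    """Return the fraction of the genome occupied by Alu sequence."""
    return copies * length_bp / genome_bp


print(f"Alu share of the genome: {alu_fraction():.1%}")
```

Under these assumptions the estimate lands between 10 and 11 percent, in line with the quoted figure. Many real Alu copies are truncated, so the agreement should not be read as more than a consistency check.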
What is the purpose of spreading all these Alu elements around during the dawn of human evolution?
Biologists James Shapiro and Richard von Sternberg propose that they are one means of reformatting a genome. Since Alu elements have the ability to influence the activity of the genes in their neighborhoods, by spreading such elements around, the regulatory activity of the entire genome is being reorganized. For example, in one study of two human chromosomes, the Alu elements were far more likely to be clustered with genes involved in metabolism, transport across the cell membrane, and the signaling network inside the cell. In contrast, the Alu elements were rarely found in the neighborhood of genes that coded for components of DNA replication, translation, or structural proteins. It’s as if the Alu elements are tinkering with the processes that most closely interface with the environment while preserving the basic identity of the cell. And while natural selection would act as the final editor to determine if the reorganization passes the fitness test, the actual reorganization is being driven by machinery intrinsic to the genome. In other words, the Alu elements certainly look like they are part of a larger activated program that facilitates the front-loaded potential of the genome.
The most intriguing feature of the Alu elements is their apparent role in the evolution of the human brain. In the 1980s, it was determined that one type of Alu element was expressed only in the brain. Later it was shown that this Alu domain also binds to the SRP9 and SRP14 proteins inside the brain cells, forming a complex that is distinct from the SRP. The function of this complex is not known, but it does tend to localize in the dendrites, which are extensions off the brain cell used to make connections to other brain cells. All of this raises the intriguing possibility that this Alu/protein complex may actually be involved in learning and memory. And what’s more, this appears to be another example of moonlighting among ribosomal components.
Another role Alu elements played during brain evolution revolves around the process of alternative splicing. When a gene is transcribed in eukaryotes, the RNA is typically cut into pieces, where non-coding sequences known as introns are removed, while the coding sequences (exons) are spliced back together to form the mature RNA that will be translated by the ribosomes. Alternative splicing occurs when different combinations of exons are spliced together, generating different versions of a protein from the same gene. The process of alternative splicing is very important in the brain. Diane Lipscombe, a neuroscientist from Brown University, notes that “alternative splicing might be the primary mechanism for generating the spectrum of protein activities that support complex brain functions”. And the Alu elements may have played a significant role in generating many of the alternatively spliced proteins in the brain. These elements contain a sequence that is similar to a splice site (the region where the RNA is cut and spliced together) and there is growing evidence that they have spawned a good deal of the alternative splicing in the human genome.
Yet another role for the Alu elements involves a sugar known as sialic acid. This sugar is attached to the surface of the cell in almost all animals. In fact, the only mammal that doesn’t coat its cells with this sugar happens to be reading this sentence. Very early in the human lineage, before brain size began to expand, an Alu element jumped into a gene needed for the synthesis of sialic acid and disrupted it. As a result, human cells cannot synthesize this sugar. What makes this loss so intriguing is that in all mammals, the expression of sialic acid in the brain is significantly dampened. Perhaps the sugar acts as a brake that prevents significant brain size expansion, and by removing it from the genome with the help of the Alu elements, the human brain was uniquely released from this constraint and was freed to evolve into a more complex state.
Finally, if we turn our attention to the manner in which the Alu elements spread throughout the primate genomes, we’ll see a pattern that is quite friendly to the hypothesis of front-loaded evolution. The million Alu elements that are found in the human genome can be categorized into 200 or so families and subfamilies that in turn can be arranged in a hierarchy that reflects their evolutionary origin. One such subfamily is known as AluYb. This group is one of the largest and most active groups of retrotransposons in the human genome. It originated approximately 25 million years ago, just after the Old World monkeys split from the lineages that would lead to apes, chimpanzees, and humans. During the subsequent twenty million years or so, the AluYb family would remain largely dormant. For example, when the chimpanzee genome was searched, only 12 copies of AluYb were found. In comparison, the human genome has over 2000 copies, indicating that this Alu subfamily underwent a burst of activity and expansion specific to the human lineage about 3-4 million years ago. What is most striking is the very long span of dormancy, where the AluYb elements didn’t do much until it was time to evolve human beings.
Mark Batzer, of Louisiana State University, is the one who discovered this pattern of Alu element evolution and has come up with a model to describe their evolution known as the ‘stealth driver’ hypothesis. Essentially, the idea is that Alu elements have the ability to slowly and quietly propagate across deep time without having much of an impact on their genome. However, periodically, they can spawn copies that in turn are much more active. These progeny elements undergo rapid rates of retrotransposition and thus reformat the genome. If the reformatting efforts fail the test of natural selection, those genomes go extinct. Yet the stealth drivers remain in the genomes that didn’t actively evolve, waiting for another opportunity to catalyze the evolution of their genome. It’s as if the ability to radically evolve lies in wait.
What makes this phenomenon even more interesting is that the ability to significantly reformat the genome may be a response to stress, as experiments have shown a higher level of Alu expression and activity as a consequence of stressing cells. In other words, the bursts of reformatting activity may be homeostatic responses to environmental stresses, as the Alu elements help the genome search for a program that better enables the organism to survive through some form of accelerated evolution. In the human lineage, one such program may have been the further development of the brain.
How interesting this all is. The Alu domain, which is completely unnecessary for the bacterial way of life, is nevertheless present in gram positive bacteria and archaebacteria. In essence, it exists as a preadaptation that will help nudge the emergence of eukaryotic protein processing. Then, its preadaptive potential lies dormant for billions of years, until it is unleashed through gene duplication to help facilitate not only primate evolution itself, but also the evolution of the human brain.
NO REST FOR THE SRP
Earlier in the summer, I pointed to a study that shows evidence of genome reformatting during human evolution:
In new research the Leeds team reports that a protein known as REST plays a central role in switching specific genes on and off, thereby determining how specific traits develop in offspring.
The study shows that REST controls the process by which proteins are made, following the instructions encoded in genes. It also reveals that while REST regulates a core set of genes in all vertebrates, it has also evolved to work with a greater number of genes specific to mammals, in particular in the brain – potentially playing a leading role in the evolution of our intelligence.
Says lead researcher Dr Ian Wood of the University’s Faculty of Biological Sciences: “This is the first study of the human genome to look at REST in such detail and compare the specific genes it regulates in different species. We’ve found that it works by binding to specific genetic sequences and repressing or enhancing the expression of genes associated with these sequences.
“Scientists have believed for many years that differences in the way genes are expressed into functional proteins is what differentiates one species from another and drives evolutionary change – but no-one has been able to prove it until now.”
Consider the abstract of this study:
Specific wiring of gene-regulatory networks is likely to underlie much of the phenotypic difference between species, but the extent of lineage-specific regulatory architecture remains poorly understood. The essential vertebrate transcriptional repressor REST (RE1-Silencing Transcription Factor) targets many neural genes during development of the preimplantation embryo and the central nervous system, through its cognate DNA motif, the RE1 (Repressor Element1). Here we present a comparative genomic analysis of REST recruitment in multiple species by integrating both sequence and experimental data. We use an accurate, experimentally validated Position-Specific Scoring Matrix method to identify REST binding sites in multiply aligned vertebrate genomes, allowing us to infer the evolutionary origin of each of 1,298 human RE1 elements. We validate these findings using experimental data of REST binding across the whole genomes of human and mouse. We show that one-third of human RE1s are unique to primates: These sites recruit REST in vivo, target neural genes, and are under purifying evolutionary selection. We observe a consistent and significant trend for more ancient RE1s to have higher affinity for REST than lineage-specific sites and to be more proximal to target genes. Our results lead us to propose a model where new transcription factor binding sites are constantly generated throughout the genome; thereafter, refinement of their sequence and location consolidates this remodeling of networks governing neural gene regulation.
In other words, RE1 is a piece of DNA that is spread about the genome, where it can bind the protein REST and alter the level of expression of nearby genes. And during human evolution, RE1 may have been tweaking the expression of genes involved in brain evolution. So why is this so interesting?
The authors note:
Emerging evidence, including that presented in this manuscript, points to highly divergent transcription factor recruitment between mammalian species. What is the basis for this divergence? Many transcription factors bind short degenerate sequences that can be readily created by single base pair mutations of a similar sequence. However, this is unlikely to be the case for transcription factors with long recognition elements, such as REST, p53 or CTCF: For simple probabilistic reasons, long periods of time must pass before long regulatory motifs can arise through DNA mutation in a given stretch of random sequence. What processes can explain the genomic remodeling of transcriptional regulatory networks observed in vertebrates?
The process of simple point mutation, ticking away over time like a clock, is insufficient for distributing these RE1 sites around the genome. So how did they spread all over the genome?
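The probabilistic point can be made concrete with a small calculation. This is only a sketch under simplifying assumptions that are mine, not the authors': bases are treated as independent and equally likely, the motif must match exactly on one strand, and the genome size (3.1 billion base pairs) and the two motif lengths (6 and 21) are illustrative choices.

```python
def expected_exact_matches(motif_length, genome_bp=3_100_000_000):
    """Expected number of exact matches to one fixed motif in random sequence.

    ASSUMPTION: each base is independent and drawn uniformly from A, C, G, T.
    """
    positions = genome_bp - motif_length + 1   # possible start positions
    return positions * 0.25 ** motif_length    # chance of a match at each one


for length in (6, 21):
    print(f"{length:>2}-mer: {expected_exact_matches(length):.3g} expected matches")
```

With these numbers a short 6-base motif is expected to turn up hundreds of thousands of times by chance, while a 21-base motif is expected far less than once. Real binding sites tolerate some mismatches, so the true gap is smaller, but the direction of the argument holds: long motifs are hard to reach by point mutation of random sequence, which is why copying pre-built sites around the genome is an attractive alternative.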
A couple of years ago, the same researchers published a paper entitled, “Identification of the REST regulon reveals extensive transposable element-mediated binding site duplication.” Here are some excerpts:
We reasoned that duplication and insertion by TEs [transposable elements – MG] might be a potential mechanism of RE1 duplication. We therefore tested the duplicated RE1s for repetitive or transposon characteristics. We submitted the flanking sequences of duplicated RE1s to the online tool RepeatMasker, which indicated that the majority of duplicated RE1s are located in TEs of most major classes, including long interspersed repeats (LINEs, principally LINE2s), short interspersed repeats (SINEs, principally Alus) and hERV sequences.
Most of those sequences tested, including those associated with Alu, LINE1 and LINE2 sequences, as well as two pairs residing in non-repetitive DNA, were capable of interacting with REST.
TEs have gone through bursts of active transposition during distinct periods of evolutionary history: although LINE2 elements were thought to be active ∼200 million years ago and before human–mouse divergence, LINE1 and Alu elements continue to retrotranspose in humans. This is reflected in the phylogenetic conservation of human TE-associated RE1s: those associated with Alu and LINE1 elements have no aligned sequences other than in chimp, while a number of ancient LINE2 elements are conserved amongst multiple species.
Whoa. The Alu elements, derived from the SRP, seem to have been involved in spreading the RE1 elements around the genome and thus influencing brain evolution.
But it just won’t stop getting better……
[Don’t forget some related context].
WHY RARES ARE NOT SO RARE
From this site, we learn
Retinoic acid (RA) may act as a regulator of differentiation at various stages of vertebrate embryogenesis. In particular, the results of exogeneous RA treatment have implicated RA in antero-posterior patterning both along the body axis and in developing Limb bud.
Retinoic Acid receptors (RARs) are nuclear receptors related to the steroid and thyroid hormone Receptors, a family of proteins that function as ligand-dependent transcription factors.
RAR’s are not membrane receptors, but instead exist as proteins in the cytoplasm.
Because retinoic acid is a hydrophobic molecule, it easily slips across the membrane. When it binds to the RAR, it is then able to bind the DNA to alter the expression of genes involved in embryonic development, as shown in the figure below:
The RAR binds to a DNA motif known as the RARE (RAR element). There are slightly different versions of the RAREs, and one form, known as DR2, is actually found as part of the Alu element:
Notably, a motif, AGGTCAnnAGTTCG, found within most subclasses of AluS sequences, corresponds to a non-consensus DR2 element recognized by RARs, and has been shown to function as a RARE.
From: David Laperriere, Tian-Tian Wang, John H White and Sylvie Mader. 2007. Widespread Alu repeat-driven expansion of consensus DR2 retinoic acid response elements during primate evolution. BMC Genomics 8:23.
So embedded within the Alu sequence is a DNA sequence that is able to recruit RARs! But it gets even more interesting, as the DNA motif contained within the Alu element can become an even better binding site by one of the most common mutations:
We have mapped the positions of all consensus DR-type hormone response elements in the human genome, and found that DR2 motifs, recognized by retinoic acid receptors (RARs), are heavily overrepresented (108,582 elements). 90% of these are present in Alu repeats….. 95.5% of Alu-DR2s are distributed throughout subclasses of AluS repeats, and arose largely through deamination of a methylated CpG dinucleotide in a non-consensus motif present in AluS sequences. We find that Alu-DR2 motifs are located adjacent to numerous known retinoic acid target genes, and show by chromatin immunoprecipitation assays in squamous carcinoma cells that several of these elements recruit RARs in vivo.
The researchers conclude their study by noting
We find that consensus DR2 motifs are heavily overrepresented in the human genome relative to other DR response elements due to their presence in a subset of Alu motifs, in particular in AluS sequences…. Consensus Alu-DR2 elements arose predominantly through deamination of a methylated CpG dinucleotide present in AluS elements rather than through random base substitutions.
Readers of The Design Matrix may have perked up. Did someone say deamination? Deamination? Alu elements, reformatting the genome, also contain a sequence that can be converted to a consensus RAR binding site through the process of cytosine deamination.
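To see how a single deamination event can do this, here is a minimal sketch. It takes the AluS motif quoted above, AGGTCAnnAGTTCG, and tries every single CpG deamination. Several things here are assumptions for illustration only: the two-base spacer is filled with the placeholder GG, the consensus half-site is written as (A/G)G(G/T)TCA, and the function names are invented. A deaminated 5-methylcytosine reads as CG to TG when the event is on the strand shown, and as CG to CA when it is on the opposite strand.

```python
import re

# ASSUMPTION: consensus RAR half-site written as (A/G)G(G/T)TCA; a DR2
# element is two half-sites in direct repeat separated by any two bases.
HALF_SITE = "[AG]G[GT]TCA"
DR2 = re.compile(f"{HALF_SITE}[ACGT]{{2}}{HALF_SITE}")

# Motif quoted in the text, AGGTCAnnAGTTCG, with the spacer "nn" filled
# in as "GG" purely as a placeholder.
ALU_MOTIF = "AGGTCAGGAGTTCG"


def single_cpg_deaminations(seq):
    """Return every sequence reachable by one deamination at a CpG site."""
    products = set()
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "CG":
            products.add(seq[:i] + "TG" + seq[i + 2:])  # C -> T on this strand
            products.add(seq[:i] + "CA" + seq[i + 2:])  # C -> T on the other strand
    return products


def is_consensus_dr2(seq):
    """True if the whole sequence matches the assumed DR2 consensus."""
    return DR2.fullmatch(seq) is not None


consensus_hits = sorted(p for p in single_cpg_deaminations(ALU_MOTIF)
                        if is_consensus_dr2(p))
print("original is consensus:", is_consensus_dr2(ALU_MOTIF))
print("consensus after one deamination:", consensus_hits)
```

In this toy version the original motif misses the consensus by one base, and exactly one of the two possible deamination products (the one ending in TCA) matches it. That is the kind of one-step conversion the quoted study describes, though the real analysis was done genome-wide and with experimental validation.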
The non-telic perspective has insisted, “Any engineer would have replaced cytosine, but evolution is a tinkerer not an engineer.” But as I noted years ago:
A second possible explanation was that cytosine was chosen because of its predisposition to undergo deamination. This explanation may also intersect with the hypothesis of necessity, as a good designer often finds ways to turn a “design problem” into an opportunity. In this case, let me propose that cytosine, far from being something any engineer would replace, may actually have played an instrumental role in the front-loading of evolution. Put simply, C-to-T transitions, as a function of deamination, may have posed a form of “direction” on evolution.
We might now catch a glimpse of one possible such direction – the creation of RAR binding sites during the reformatting of vertebrate developmental programs.
YOU TOO, PAX6?
From Novel PAX6 Binding Sites in the Human Genome and the Role of Repetitive Elements in the Evolution of Gene Regulation, by Yi-Hong Zhou, Jessica B. Zheng, Xun Gu, Grady F. Saunders, and W.-K. Alfred Yung.
Pax6 is a transcription factor that plays a very important role in development:
Pax6 is an important regulator of transcription in the development of the eye and central nervous system in vertebrates and invertebrates. The protein sequence and function of Pax6 are evolutionarily conserved so that murine and human Pax6 proteins are identical. Even human and zebrafish Pax6 proteins share a 97% homology. With few exceptions, Pax6 is expressed during central nervous system development and during fundamental sensory processes, particularly of the photoreceptive organ. The expression of Pax6 in adult mammals is restricted to the eye, brain, and pancreas.
And guess what?
Through an in vitro protein–DNA binding approach, we identified three new types of PAX6 binding sequences in the human genome. Two exist in a single copy in genome, and one within Alu repetitive elements.
These observations led to a putative evolutionary scenario that describes how a transcription factor (Pax6) recruits new target genes in the genome. Mediated by repetitive elements and mutations in CpG dinucleotide hot spots (e.g., Alu or B1), several types of Pax6 binding sites have been generated and were spread over the entire genome.
Mutations in CpG hotspots – cytosine deamination.
Need I say more?
What’s that? B1? Oh, that’s a good one too.
YOU TOO, P53?
p53 has been called the “Guardian of the genome” and is commonly known as a tumor-suppressor gene – a gene that suppresses the formation of cancer. Normally, the cell expresses low levels of the p53 protein, but if the genome is damaged, p53 levels rise and in turn activate several programs that will arrest the cell cycle and attempt to repair the DNA damage. If the genome cannot be repaired, p53 will then activate programmed cell death and the cell will die rather than pass on the damage to future generations.
p53 brings about these responses to DNA damage either by activating pre-existing proteins, or by specific binding to promoters or introns (the p53 element) of various target genes involved in cell cycle arrest, DNA repair, and cell suicide.
Well, guess what? It turns out that our friend the Alu element is poised to generate these p53 binding sites with a little help from cytosine deamination. This was shown in a recent study entitled, “Methylation and deamination of CpGs generate p53-binding sites on a genomic scale” by Tomasz Zemojtel, Szymon M. Kielbasa, Peter F. Arndt, Ho-Ryun Chung and Martin Vingron. Let me simply provide a few excerpts:
Our findings indicate that Alu sequences can serve as templates for the generation of p53-binding-sites on a genome-wide scale.
Thus, we conclude that methylation and deamination of cytosines generates a high number of preferred p53-binding sites in Alu elements, some of which were recruited to regulate target gene expression.
Finally, we used a statistical model to identify and characterize 20-mers in the human genome that lie on the fastest evolutionary trajectories of p53-site formation. Up to ~151 000 20-mers resided in the highest probability range and required only one cytosine deamination to become a p53 site. As expected, most of these (~119 000) were located in Alu sequences and ~10 000 resided in the non-repetitive portion of the genome.
And here is the basic conclusion:
Our findings strongly indicate that the formation of p53-binding sites by CpG deamination, in particular in Alu repeats but also in non-repetitive DNA, is an important evolutionary process. Alu repeats, which amplified to over one million copies, harbor one-third of the total number of CpGs in the human genome, resulting from which, most Alus are transcriptionally silenced by methylation. Because Alu elements are associated with gene-rich regions, the process of cytosine deamination is capable of transforming numerous silent Alus into functional regulatory elements. As we pointed out here, this process has assigned a role for Alus in spreading of p53-binding sites and in recruiting new target genes to the p53 regulatory network in a species-specific manner.
Let’s add this up. We’ve seen that the Alu element, which was in effect contained within the SRP (a device that is as ancient as the ribosome), has played an important role in primate and human evolution. It has likely reformatted the genome, spreading REST binding sites around to help facilitate brain evolution. And with the help of the most common mutation in mammals, cytosine deamination, it has spread and created binding sites for two transcription factors crucial to development, the retinoic acid receptor and Pax6, and now we see the Alu elements spreading and creating p53 sites to fine tune the genomic surveillance pathways of primate cells. It’s no wonder the authors of the p53 study conclude that “deamination of CpGs constitutes a universal mechanism for generation of different transcription-factor-binding sites in Alus.”
Luck? Or is it nudging?
Anyway, let me again quote myself from around 2002:
A second possible explanation was that cytosine was chosen because of its predisposition to undergo deamination. This explanation may also intersect with the hypothesis of necessity, as a good designer often finds ways to turn a “design problem” into an opportunity. In this case, let me propose that cytosine, far from being something any engineer would replace, may actually have played an instrumental role in the front-loading of evolution. Put simply, C-to-T transitions, as a function of deamination, may have posed a form of “direction” on evolution.
IT JUST DOESN’T STOP!
The title of yet another paper speaks for itself: Alu elements contain many binding sites for transcription factors and may play a role in regulation of developmental processes (Paz Polak and Eytan Domany. BMC Genomics. 2006; 7: 133).
Let’s look at the abstract.
This research suggests that evolution used transposable elements to insert modules of transcription factor binding motifs into promoters and, by means of their presence, assemble higher level regulatory networks. In order to explore this question we focused on Alu elements, which are good potential candidates to be part of the building blocks of regulatory networks for two reasons. First, Alu elements are abundant in the upstream region of the TSS of genes, and second, Alu elements contain dozens of putative BSs for TFs. Some of these BSs were found before and their association with Alu was also reported, whereas in some cases although the BSs were found, the fact that they reside on Alu went unnoticed. Finally, we list here also BSs on Alu that were not identified previously. Our findings imply that the biological pathway on which Alu-mediated regulation appears to have the most significant impact is the development process. Many of the TFs that have binding motifs on Alu are associated with development; moreover, some of these BSs were previously demonstrated to be functional in vivo and essential to regulation of some target genes.
TF stands for transcription factors, proteins that bind to specific DNA sequence to activate the process of gene expression. TSS stands for transcription start site, the precise point at which the copying of the DNA into an RNA format begins. BS stands for binding site, the region of the DNA that binds with the transcription factors.
Given that Alu elements contain so many potential binding sites for so many genes involved in stress response, blood formation, heart development, muscle development, brain development, eye development, pancreas development, overall embryonic development, and sterol biosynthesis, their ability to reformat the genome, and its developmental programs, makes them a very powerful tool for evolution. And to think they were once dismissed as junk DNA.
NUDGING THE NUDGE
We have seen that the Alu element is poised to generate binding sites for multiple transcription factors involved in development. Even more interesting is the manner in which the process of cytosine deamination can easily create several of these transcription factor binding sites. It’s as if we have two nudges, working together, to facilitate the evolution of primates.
Yet there is more to the story. Recall that the cytosine deamination events occur at CpG sites. This is simply where a cytosine (C) is followed by a guanine (G). Why is this?
Let me quote from this site:
CpG islands are often located around the promoters of housekeeping genes (which are essential for general cell functions) or other genes frequently expressed in a cell. At these locations, the CG sequence is not methylated. By contrast, the CG sequences in inactive genes are usually methylated to suppress their expression. The methylated cytosine may be converted to thymine by accidental deamination. Unlike the cytosine to uracil mutation which is efficiently repaired, the cytosine to thymine mutation can be corrected only by the mismatch repair which is very inefficient. Hence, over evolutionary time scales, the methylated CG sequence will be converted to the TG sequence. This explains the deficiency of the CG sequence in inactive genes.
To this, let me add a couple more facts. First, transposons contain roughly 35% of the CG sequences in a genome. Second, when transposons jump, the response from the genome is to methylate them. By placing methyl groups on the C’s of the CG transposon sequence, the genome is effectively silencing them. Yet the silencing of the transposon poises the system for cytosine deamination. Built into the fabric of jumping genes is the recruitment of cytosine deamination.
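One common way to see this depletion in sequence data is the observed/expected CpG ratio: the number of CG dinucleotides actually found, divided by the number expected from the C and G content alone. The sketch below is illustrative only. The function name and the toy sequence are made up for the example, and the decay of methylated CpGs is modelled crudely as replacing CG with TG on one strand.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0
    return seq.count("CG") * len(seq) / (c_count * g_count)


# Toy sequence (hypothetical, not real Alu sequence) with three CpG sites.
before = "ACGTCCATCGGACTCGCAGC"
# Crude model of methylation followed by deamination: every CG becomes TG.
after = before.replace("CG", "TG")

print("before:", cpg_obs_exp(before))
print("after: ", cpg_obs_exp(after))
```

The ratio falls as methylated CG sites decay to TG, which is the signature the quoted passage describes for silenced sequence. Unmethylated CpG islands would be expected to keep a ratio much closer to what their base composition predicts.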
And as if that was not enough, one study determined that this effect would become most pronounced in warm-blooded animals, where “C→T and G→A transitions occur at higher rates than other base substitutions in mammals.” Why is this?
the rate of cytosine deamination is strongly temperature-dependent. Given a typical body temperature of 20°C in fish and amphibians versus 37°C in mammals, cytosine deamination should occur 20.6-fold more slowly in fish and amphibians (based on eq. 3, k(37°C)/k(20°C) = (7.0 × 10⁻¹³/s)/(0.34 × 10⁻¹³/s) = 20.6). This indicates that positive feedback between cytosine deamination and GC content is insignificant in fish and amphibians, which is consistent with the lack of distinct classes of isochores in fish and amphibians. Reptiles are intermediate between cold-blooded vertebrates (i.e., fish and amphibians) and homeothermic vertebrates (i.e., birds and mammals) in terms of body temperature, remaining levels of 5-methylcytosine, presence of GC-rich isochore structures, and presence of cytological chromosome bands.
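The arithmetic in that passage is easy to reproduce. The two rate constants below are the ones quoted in the excerpt; the variable and function names are just for illustration.

```python
# Deamination rate constants quoted in the excerpt, in events per second.
K_37C = 7.0e-13    # at 37 °C (mammalian body temperature)
K_20C = 0.34e-13   # at 20 °C (typical for fish and amphibians)


def fold_difference(k_warm=K_37C, k_cold=K_20C):
    """How many times faster deamination runs at the warmer temperature."""
    return k_warm / k_cold


print(f"{fold_difference():.1f}-fold faster at 37 °C than at 20 °C")
```

The ratio rounds to the 20.6-fold figure given in the quote, so a 17-degree rise in body temperature speeds up the chemistry by more than an order of magnitude.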
Sit back and take it in. The SRP, which elegantly solves a core problem for life and helps nudge into existence the eukaryotic cell plan, also happens to have an RNA component that can moonlight as a transposon – the Alu element. The Alu element played an important role in primate and human evolution, helping to reformat the genome by, in part, recruiting transcription factors and surveillance proteins to different regions. This reformatting function was, in turn, facilitated by cytosine deamination, which in turn was facilitated by the methylation of cytosines that comes with gene jumping, all in the context of a warm-blooded body that enhances the rate of cytosine deamination. I guess we’re all just lucky that all these puzzle pieces fell together like that.
Well, there is more to say about the Alu element. For example, we should explore why it is that Alu elements still can bind to the SRP proteins (do you know that Alu function depends on these proteins?). And we should explore some uncanny similarities between the Alu elements in primates and the B1 elements in rodents. But let’s take a break from all that Alu investigatin’ for the moment.
I’ve been talking about Alu elements for weeks now, so I was going to try to change the topic. But alas, I can’t stop myself. Here is some more Alu Fun for those similarly intrigued by the manner in which these nifty reformatting devices can facilitate evolution.
First, here is a decent video that outlines the basics of Alu retrotransposition.
Second, remember how it has become clear that the genome has a three-dimensional architecture?
MORE ALU ON THE BRAIN
RNA editing, DNA recoding and the evolution of human cognition.
Mattick JS, Mehler MF.
Trends Neurosci. 2008 May;31(5):227-33.
RNA editing appears to be the major mechanism by which environmental signals overwrite encoded genetic information to modify gene function and regulation, particularly in the brain. We suggest that the predominance of Alu elements in the human genome is the result of their evolutionary co-adaptation as a modular substrate for RNA editing, driven by selection for higher-order cognitive function. We show that RNA editing alters transcripts from loci encoding proteins involved in neural cell identity, maturation and function, as well as in DNA repair, implying a role for RNA editing not only in neural transmission and network plasticity but also in brain development, and suggesting that communication of productive changes back to the genome might constitute the molecular basis of long-term memory and higher-order cognition.
Luckily, I ran across a web article that borrows heavily from M&M’s article. I would encourage you to read the whole thing (if you can’t get your hands on M&M’s article, that is). Here are some excerpts:
Two classes of coincidental events stand out in the evolution of primates, the end result of which is to greatly expand the diversity of transcripts and proteins and to build the complex regulatory architecture required for human intellectual capacity. The first is the dramatic increase in RNA editing, a process that systematically alters the genetic messages transcribed from the genome, creating new coding and non-coding RNAs, and hence new proteins as well as RNAi (interfering RNA) species that regulate networks of genes. The second is the expansion of primate-specific Alu retrotransposons, which multiply through RNA intermediates that are reverse-transcribed and inserted into the genome. It so happens that the increase in RNA editing in primates occurs almost entirely within primate-specific Alu elements.
RNA editing occurs in all taxonomic groups of organisms, but increases dramatically in vertebrates, mammals and primates, with humans exhibiting the highest levels of edited and multiply-edited transcripts. RNA editing occurs in most, if not all, tissues, but is particularly active in the nervous system, where it targets transcripts encoding proteins involved in fast neural transmission, such as ion channels and ligand-gated receptors [1, 2]. These species-specific alterations have profound importance for normal nervous system function.
A to I editing is much more abundant in humans than in mice, and over 90 percent of this increased editing occurs in Alu elements in mainly noncoding regions of RNAs, i.e., in untranslated regions (UTRs) of mRNAs, in introns and intergenic transcripts.
All of this fits nicely with the use of Alu elements as a means of nudging the evolution of a human-like brain. I’ll probably talk about it all in more detail sometime in the future. But then things start to get really cool, if more speculative:
Learning and memory in the brain is similar to the immune response in many ways. A key feature of the immune system is the alteration of DNA sequence in the genome to generate receptor diversity, in part catalyzed by the APOBEC family of cytidine deaminases that can catalyze cytosine to uracil (C to U) and cytosine to thymine (C to T) editing of RNA and DNA.
The possibility exists that DNA recoding – rewriting genome DNA – is a central feature of both the immune and nervous systems. DNA recoding may be involved at the level of establishing neuronal identity and neuronal connectivity during development, learning and brain regeneration. And it appears that the brain, like the immune system, also changes according to experience.
Mattick and Mehler suggest that the potential recoding of DNA in nerve cells (and similarly in immune cells) might be primarily a mechanism whereby productive or learned changes induced by RNA editing are rewritten back to DNA via RNA-directed DNA repair. (See the latest model of RNA-directed recoding of DNA proposed for the immune system by Ted Steele at Australian National University, Canberra.) This effectively fixes the altered genetic message once a particular neural circuitry and epigenetic state has been established.
The suggestion that there might be communication of RNA-encoded information back to the genome at the epigenetic and genetic levels would also potentially explain the surprising observation that diverse RNA species and associated regulatory signals are not only trafficked to the periphery of the nerve cell, but might also undergo retro-transport back to the nucleus. There is increasing evidence for retrograde transport of RNAs, including small RNAs, to the nucleus in a broad range of organisms, as well as for RNA informational exchange between cells through ‘exosomes’, specific RNA receptors and derivation of presynaptic RNA from surrounding glial cells.
BTW, if abiogenesis did occur and did involve an RNA World, a human-like brain was in the cards. Y’just can’t run from the Bunny!