We have already seen that most of the universal small subunit ribosomal proteins have alternative functions. If ribosomal proteins can be used as a vehicle for front-loading, given that a designer can count on the ribosome being perpetuated far into the future with minimal changes, why not also use the ribosomal RNA (rRNA) itself?
rRNA forms the functional part of the ribosome where, with the help of the ribosomal proteins, it folds into a complex 3D structure that interacts with the messenger RNA (mRNA) and transfer RNAs to carry out the core processes of protein synthesis. While rRNA, which is synthesized by RNA polymerase I, is typically the end product, natural genetic engineering processes could copy and transplant rRNA sequence so that it came under the control of an RNA polymerase II promoter. The rRNA sequence would then suddenly find itself being transcribed as mRNA and thus translated into a protein.
A clever front-loader might encode proteins-for-the-future in the rRNA sequence itself. In other words, while rRNA sequence is not normally used to code for proteins, it could be used to store code for some proteins. Of course, the coding potential is limited, as rRNA sequence plays a crucial, conserved role in the process of protein synthesis. The ability to code amino acid sequence would thus be limited by the sequence needed for the rRNA to carry out its function. Nevertheless, the opportunity for some degree of front-loading exists.
With this in mind, I decided to take an unusual approach and search for protein sequence encoded in rRNA. Let’s begin with the following sequence:
ACTAGTTACGCGACCCCCGAGCGGTCGGCGTCCCCCAACTTCTTAGAGG GACAAGTGGCGTTCAGCCACCCGAGATTGAGCAATAACAGGTCTGTGAT GCCCTTAGATGTCCGGGGCTGCACGCGCGCTACACTGACTGGCTCAGCGT GTGCCTACCCTGCGCCGGCAGGCGCGGGTAACCCGTTGAACCCCATTCGT GATGGGGATCGGGGATTGCAATTATTCCCCATGAACGAGGAATTCCCAGT AAGTGCGGGTCATAAGCTTGCGTTGATTAAGTCCCTGCCCTTTGTACACA CCGCCCGTCGCTACTACCGATTGGATGGTTTAGTGAGGCCCTCGGATCGG CCCCGCCGGGG
This sequence is from mouse 18S rRNA, the core RNA component of the small ribosomal subunit that binds all the moonlighting ribosomal proteins we discussed earlier. It corresponds to positions 1402-1759 of the 18S rRNA and shows more than 80% sequence identity with the corresponding rRNA sequence from various sponges and protozoa.
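To make the translation step concrete, here is a minimal Python sketch that reads the first 42 bases of the sequence above in forward frame 1 using the standard genetic code. The function name `translate` and the compact codon-table construction are illustrative choices, not the tool actually used for the search; a BLAST-style search performs this translation internally.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=1):
    """Translate a DNA-alphabet sequence in forward frame 1, 2 or 3.

    Whitespace is ignored, U is treated as T, and any codon containing
    an unrecognized character is reported as 'X'.
    """
    seq = "".join(dna.split()).upper().replace("U", "T")
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 42 bases of the mouse 18S rRNA fragment shown above.
fragment = "ACTAGTTACGCGACCCCCGAGCGGTCGGCGTCCCCCAACTTC"
print(translate(fragment))  # TSYATPERSASPNF
```

Passing the full sequence to the same function gives the complete frame 1 reading; the spaces in the sequence as printed above are stripped before translation.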
What happens if we were to translate this sequence (recall, rRNA sequence is not normally translated) and use that amino acid sequence to probe a database of proteins? Well, we find this:
>dbj|BAE89989.1| unnamed protein product [Macaca fascicularis]
Length=130

 Score = 208 bits (529), Expect = 2e-52
 Identities = 118/119 (99%), Positives = 118/119 (99%), Gaps = 0/119 (0%)
 Frame = +1

Query  1    TSYATPERSASPNFLEGQVAFSHPRLSNNRSVMPLDVRGCTRATLTgsacaypapagagn  180
            TSYATPERSASPNFLEGQVAFSHPRLSNNRSVMPLDVRGCTRATLTGSACAYP PAGAGN
Sbjct  3    TSYATPERSASPNFLEGQVAFSHPRLSNNRSVMPLDVRGCTRATLTGSACAYPTPAGAGN  62

Query  181  pLNPIRDGDRGLQLFPMNEEFPVSAGHKLALIKSLPFVHTARRYYRLDGLVRPSDRPRR  357
            PLNPIRDGDRGLQLFPMNEEFPVSAGHKLALIKSLPFVHTARRYYRLDGLVRPSDRPRR
Sbjct  63   PLNPIRDGDRGLQLFPMNEEFPVSAGHKLALIKSLPFVHTARRYYRLDGLVRPSDRPRR  121
Whoa! A chunk of rRNA sequence from mouse 18S rRNA appears to code for an unnamed protein product in this little fella, the crab-eating macaque (Macaca fascicularis).
But it is not just this monkey. This same rRNA sequence also encodes a “conserved hypothetical protein” from a wide range of eukaryotic organisms (with E values below 1e-10), including:
So let’s look more closely at the Macaca protein. The amino acid sequence comes from a translated cDNA (cDNA is DNA reverse-transcribed from RNA, usually protein-coding mRNA). Here is how this cDNA is described:
Macaca fascicularis brain cDNA clone: QflA-20247, similar to human stathmin-like 2 (STMN2), mRNA, RefSeq: NM_007029.2
Similar to STMN2?! Here’s the expression profile of STMN2:
If you squint hard enough at the x-axis, you’ll see that STMN2 is expressed only in the nervous system.
In fact, this page describes STMN2 as follows:
Superior cervical ganglion-10 protein
Short name=Protein SCG10
May play a role in neuronal differentiation, and in modulating membrane interaction with the cytoskeleton during neurite outgrowth.
STMN2 is one version of stathmin protein, which is described as follows:
Involved in the regulation of the microtubule (MT) filament system by destabilizing microtubules. Prevents assembly and promotes disassembly of microtubules. Phosphorylation at Ser-16 may be required for axon formation during neurogenesis. Involved in the control of the learned and innate fear.
So let’s use a program known as CLUSTALW to align STMN2, the unnamed protein from Macaca, and stathmin from Macaca:
Stathmin_[Macaca_fasciculari  MASS----------------------------------DIQVKELEKRAS
STMN2_[Homo_sapiens]          MAKTAMAYKEKMKELSMLSLICSCFYPEPRNINIYTYDDMEVKQINKRAS
unnamed_protein[Macaca_fasci  MLTS-------------------------------------YATPERSAS
                              * .:                                         :: **

Stathmin_[Macaca_fasciculari  GQAFELILSPRSKESVPEFPLSPPKKKDLSLEEIQKKLEAAEERRKSHEA
STMN2_[Homo_sapiens]          GQAFELILKPPSPISEAPRTLASPKKKDLSLEEIQKKLEAAEERRKSQEA
unnamed_protein[Macaca_fasci  PNFLEGQVAFSHPRLSNNRSVMP-----LDVRGCTRATLTGSACAYPTPA
                               : :*  :           .: .     *.:.   :   :..    .  *

Stathmin_[Macaca_fasciculari  EVLKQLAEKREHEKEVLQKAIEENNNFSKMAEEKLTHKMEANKENREAQM
STMN2_[Homo_sapiens]          QVLKQLAEKREHEREVLQKALEENNNFSKMAEEKLILKMEQIKENREANL
unnamed_protein[Macaca_fasci  GAGNPLNPIRDGDRGLQLFPMNEEFPVSAGHKLALIKSLPFVHTAR----
                               . : *   *: :: :   .::*:  .*   :  *  .:   :  *

Stathmin_[Macaca_fasciculari  AAKLERLREKDKHIEEVRKNKESKDPADETEAD
STMN2_[Homo_sapiens]          AAIIERLQEKERHAAEVRRNKELQVELSG----
unnamed_protein[Macaca_fasci  --RYYRLDGLVRPSDRPRRGRPTALAER-----
                                   **    :   . *:.:
The positions marked by * have the same amino acid, the positions marked by : have very similar amino acids, and the positions marked by . have somewhat similar amino acids.
Since roughly 30% of the positions appear to contain the same or highly similar amino acids, it is plausible that the unnamed Macaca protein is homologous to stathmin. And if this is the case, a brain protein is effectively encoded by a portion of the 18S rRNA sequence.
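For anyone who wants to check the "roughly 30%" figure, here is a small Python sketch that tallies the conservation marks from the CLUSTALW output. The denominator is an assumption on my part: dividing the identical (*) plus highly similar (:) columns by the 130-residue length of the unnamed protein (the "Length=130" in the BLAST report) is one reading that reproduces the figure. Dividing by the number of alignment columns, gaps included, would give a lower percentage.

```python
# Conservation lines copied from the CLUSTALW output, one string per
# alignment block. Only the symbols are counted, so spacing does not matter.
CONSENSUS_BLOCKS = [
    "* .: :: **",
    ": :* : .: . *.:. : :.. . *",
    ". : * *: :: : .::*: .* : * .: : *",
    "** : . *:.:",
]

# Assumption: normalize by the length of the unnamed Macaca protein,
# taken from "Length=130" in the BLAST report.
UNNAMED_PROTEIN_LENGTH = 130


def count_marks(blocks):
    """Count identical (*), highly similar (:) and weakly similar (.) columns."""
    marks = "".join(blocks)
    return {symbol: marks.count(symbol) for symbol in "*:."}


counts = count_marks(CONSENSUS_BLOCKS)
conserved = counts["*"] + counts[":"]
fraction = conserved / UNNAMED_PROTEIN_LENGTH

print(counts)             # {'*': 15, ':': 24, '.': 14}
print(f"{fraction:.0%}")  # 30%
```

Only 15 of those 39 columns are strictly identical in all three sequences; the rest are conservative substitutions.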
The hypothesis of front-loading evolution led me to predict that rRNA sequence may actually contain code for the formation of proteins. When I translated a portion of mouse 18S rRNA sequence and used that translated sequence to probe protein databases, an unnamed protein from Macaca, along with a conserved hypothetical protein from various distantly related eukaryotes, was retrieved. This protein might be homologous to stathmin, a protein that regulates microtubule assembly in neurons.