To better appreciate the teleological echoes of some recent research on the REST protein and its binding sites, let’s first take some time to summarize the main points from James Shapiro’s review paper, “A 21st century view of evolution: genome system architecture, repetitive DNA, and natural genetic engineering” (Gene 345 (2005) 91–100).
Shapiro begins by outlining the perspective of “DNA as a data storage medium” and then develops an informatic metaphor for the genome:
Our current understanding of how coding sequence expression (data file access) and all these other processes operate is based upon the definition of cis-acting signals as part of the operon and replicon theories in the early 1960s (Jacob and Monod, 1961; Jacob et al., 1963). These cis-acting signals are fundamentally different from any classical definition of a gene. They serve to format coding sequences and genome architecture in the same way that generic bit strings format the encoded information in electronic data storage media and guide the computational hardware to the right data files and indicate the appropriate routines to apply. Cis-acting signals in the genome similarly direct cellular hardware to form functional nucleoprotein complexes to carry out tasks such as transcription, replication, DNA distribution to daughter cells, and homology-dependent and homology-independent recombination (Shapiro, 2002a). Since they are generic and work at many locations, cis-acting signals belong to the repetitive component of the genome (Shapiro and Sternberg, 2005).
By applying an informatic perspective, we can appreciate the functional relevance and interconnections of genome features which have proved difficult to understand within the linear conceptual framework of classical genetics. Extending the informatic metaphor, it is possible to argue that genomes each have a characteristic “system architecture,” in much the same way that different computer systems do (Shapiro, 1999; Shapiro and Sternberg, 2005).
The cis-acting signals are chunks of DNA that are usually (although not always) near the coding sequence of a gene (the nucleotide sequence that codes for the amino acid sequence). The function of these signals is to attract DNA-binding proteins, and their partners, to regulate the expression of the coding sequence. A coding sequence can be silenced, or it can be expressed at varying levels. Thus, the cis-acting signals are control elements that allow the genes (coding for proteins) to be responsive to the internal and external environment of the cell.
The major determinants of genome system architecture are the repetitive elements in the genome, such as tandemly arrayed repeats at centromeres (Choo, 2001), telomere repeats that permit the replication of chromosome ends (Blackburn, 2001), and dispersed repeats that contain many signals for transcription, chromatin organization, and nuclear localization (Jordan et al., 2003; Shapiro and Sternberg, 2005).
Shapiro further notes:
It has been evident for a long time that repetitive DNA is a more discriminating indicator of hereditary relationships than coding sequences…. In other words, the repetitive component of the genome is far more taxonomically specific than coding sequences. This conclusion is consistent with a key role for repetitive DNA in evolutionary diversification.
Recall that I just pointed out that the coding sequences of the mouse and human are mostly the same – “It’s almost as if you could take a mouse genome, reformat it, and make a human.” It is the pattern of these cis-acting elements that may be the most crucial feature in specifying the phenotypes that the blind watchmaker will see. In other words, these control elements will play a key role in facilitating evolution.
Next: Genomes and cellular computation