Here is the abstract of a 2006 study by Juan A. Ranea, Antonio Sillero, Janet Thornton, and Christine A. Orengo (J Mol Evol 63:513–525):
By exploiting three-dimensional structure comparison, which is more sensitive than conventional sequence-based methods for detecting remote homology, we have identified a set of 140 ancestral protein domains using very restrictive criteria to minimize the potential error introduced by horizontal gene transfer. These domains are highly likely to have been present in the Last Universal Common Ancestor (LUCA) based on their universality in almost all of 114 completed prokaryotic (Bacteria and Archaea) and eukaryotic genomes. Functional analysis of these ancestral domains reveals a genetically complex LUCA with practically all the essential functional systems present in extant organisms, supporting the theory that life achieved its modern cellular status much before the main kingdom separation (Doolittle 2000). In addition, we have calculated different estimations of the genetic and functional versatility of all the superfamilies and functional groups in the prokaryote subsample. These estimations reveal that some ancestral superfamilies have been more versatile than others during evolution allowing more genetic and functional variation. Furthermore, the differences in genetic versatility between protein families are more attributable to their functional nature rather than the time that they have been evolving. These differences in tolerance to mutation suggest that some protein families have eroded their phylogenetic signal faster than others, hiding in many cases, their ancestral origin and suggesting that the calculation of 140 ancestral domains is probably an underestimate.
The discussion and data concerning the differences in genetic versatility between protein families are quite interesting and worth discussing later, but let’s stay focused on our hunt for LUCA, a “genetically complex” group of organisms “with practically all the essential functional systems present in extant organisms.”
Here is part of the conclusion of the study:
The assignment of completed genome sequences to CATH structural domain superfamilies has provided a sensitive method to derive a more realistic distribution of superfamilies in distant species. From this annotation we know that the LUCA, or the primitive community that constituted this entity, was functionally and genetically complex (Table 1, Fig. 1, Supplementary Table 3), supporting the theory that life achieved its modern cellular status long before the separation of the three kingdoms (Doolittle 2000). Contrary to analyses based purely on sequence conservation and universal ubiquity throughout all species, which suggested a simple LUCA with translation and few other genes (Koonin 2003), with the application of a more sensitive method to detect remote homology, we can affirm that the LUCA held representatives in practically all the essential functional niches currently present in extant organisms, with a metabolic complexity similar to translation in terms of domain variety. The criteria applied to select ancestral superfamilies are stringent in order to ensure a confident sample of ancestral representatives. The selected 140 ancestral domains are analogous to spots in a “connect-the-dots” picture, suggesting the presence of other hidden partners in the LUCA’s functional composition. Likewise, the true genetic and functional content of the LUCA has, with all probability, been underestimated.
Even if the ancestral domain set in the LUCA was much larger than the set considered here, the functional analysis of this selected sample reveals that the LUCA comprised functions for (i) replication, transcription, and translation; (ii) the use of glucose and other sugars; (iii) the assimilation of amino acids and nucleosides/bases; (iv) the synthesis of ATP both by substrate level phosphorylation and through redox reactions coupled to membranes; (v) signal transduction and gene regulation; (vi) protein modification; (vii) protein signal recognition, transport, and secretion; (viii) protein folding assistance; and (ix) cell division.
Thus, by scanning for shared protein domains, the researchers determined that LUCA was remarkably complex. The final paragraph of their paper also contains a useful metaphor:
The metaphor arising from this analysis is of a LUCA comprised of “soft” and “hard” parts just as human bodies are made of bones and tissues. When anthropologists discovered ancient bones they did not surmise our ancestors’ appearance as skeletons. They imagined bones surrounded by muscles and other tissues more susceptible to decomposition during time.
In other words, we can think of the 140 universal domains as several of LUCA’s bones, newly unearthed. But it certainly seems plausible that LUCA contained much more than these domains. Why? Do not forget that LUCA probably exhibited PICERAS, as I explained earlier.