A Subtle Front-loading Pattern

Cytosolic beta-glucosidase is an enzyme found in your liver that plays a role in the metabolism of sugar.  It belongs to one of the many families of glycosyl hydrolases.  I thought I would use this enzyme to show you something subtle, but interesting, that adds plausibility to the hypothesis of front-loading.

This liver enzyme is 469 amino acids in length.  So let’s use the human amino acid sequence to probe databases for homologs in other animals.

Below is a phylogenetic tree for animals (highlighted in the boxes).

I was able to find clear homologs of the human protein in mouse, arthropods, and nematodes.  Nothing remarkable here, as carbohydrate metabolism is a core feature of animal cells.

But what if we look near the base of the tree, among the simple creatures highlighted in the yellow boxes?  I was unable to find a homolog of this liver enzyme among sponges, placozoa (not shown in the above figure), ctenophores, and Hydra (an example from cnidarians).  In other words, the enzyme is missing from these more ancient lineages, suggesting it arose sometime after these groups split away from the lineage that led to all the creatures in the pink, green, and blue boxes.  No big deal.

But it turns out that one member of cnidaria – the sea anemone Nematostella – does have a clear homolog of the human gene, with the same amino acid at 44% of the positions.  So it would seem the enzyme is missing from the sponges, ctenophores, Hydra, and placozoa because those lineages lost the enzyme.
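
For readers who want to see what a number like that 44% means, here is a minimal sketch of a percent-identity calculation.  It assumes the two sequences have already been aligned (gaps written as "-") and counts only the columns where both sequences have a residue.  Alignment tools differ in what they use as the denominator, so treat this as one reasonable convention rather than the exact method behind the figures in this post.  The two short fragments are made up for illustration and are not real beta-glucosidase sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns where both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical, made-up fragments (NOT real enzyme sequences).
human_like = "MAFPAGFGWA"
other_like = "MSFP-GFLWS"

print(f"{percent_identity(human_like, other_like):.1f}% identical")
```

In this toy pair, 6 of the 9 gap-free columns match.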

So what if we dive deeper and search among the fungi?  Here, we will find lots of homologs to the human enzyme, so it must be very useful in the fungal way of life. Now let’s expand our view to include the entire eukaryotic tree as shown below.

You can see the animal branch highlighted in the red rectangle inside the purple box in the lower right.  The branch above it, which includes the ascomycetes, is the fungi.  Since animals and fungi both have this enzyme, it would appear it originated where the two lineages converge (the red circle).  Supporting this view, amoebozoa don’t have a homolog of this enzyme.  It is interesting to note that choanoflagellates also lack a homolog, indicating that they too lost it.

But what if we looked outside the purple box among all the various other protozoa?  When we do this, we find sporadic homologs.  There is one in Oomycetes (the black box in the lower pink corner):

In this case, 39% of the positions have identical amino acids when compared to the human protein.

We find it in one kinetoplastid (the black box in the upper right yellow):

In this case, 28% of the positions have identical amino acids when compared to the human protein.

And we find it in a very simple species of green algae (black box in upper left green):

In this case, 36% of the positions have identical amino acids when compared to the human protein.

And finally, it is found in many land plants (black box in upper left green), where about 40% of the positions have identical amino acids when compared to the human protein.

So it would seem that this particular glycosyl hydrolase was present in the last common ancestor of all eukaryotes (the green circle in the middle) and has been lost many times over during evolution.
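
The reasoning above can be sketched as a small presence/absence exercise.  The sketch assumes the gene arose once and can only be lost afterward, which rules out lateral gene transfer.  Under that assumption the origin sits at the most recent common ancestor of every lineage that has the gene, and each gene-free subtree below that point counts as one loss.  The tree here is a simplified, hypothetical topology built from the groups named in this post; it is not the figure itself, and it lumps all animals into a single tip.

```python
def leaves(node):
    """Return the set of tip names under a node (tips are plain strings)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def origin_clade(node, present):
    """Smallest subtree that contains every tip where the gene is present."""
    if isinstance(node, str):
        return node
    for child in node:
        if present <= leaves(child):
            return origin_clade(child, present)
    return node


def count_losses(node, present):
    """Count maximal subtrees with no gene-bearing tip (one loss each)."""
    if not (leaves(node) & present):
        return 1
    if isinstance(node, str):
        return 0
    return sum(count_losses(child, present) for child in node)


# Hypothetical, simplified topology (an assumption for illustration only).
TREE = (
    ("amoebozoa", ("fungi", ("choanoflagellates", "animals"))),
    (("land_plants", "green_algae"), ("oomycetes", "kinetoplastids")),
)

all_hits = {"animals", "fungi", "land_plants", "green_algae",
            "oomycetes", "kinetoplastids"}
animal_fungal_hits = {"animals", "fungi"}

for label, present in [("all hits", all_hits),
                       ("animals and fungi only", animal_fungal_hits)]:
    clade = origin_clade(TREE, present)
    print(label, "-> origin spans:", sorted(leaves(clade)),
          "| losses:", count_losses(clade, present))
```

With all the hits included, the inferred origin is the root of this toy tree.  Drop the scattered hits outside animals and fungi and the same procedure places the origin at the animal-fungal ancestor instead.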

So what kind of pattern do we have here?  This enzyme is apparently important in complex animals, land plants, and fungi, given its common occurrence among these groups.  But it doesn’t seem all that important to simple animals and single-celled organisms, given its rarity among these groups.  Yet the enzyme apparently arose with the origin of eukarya itself given its widespread distribution.  In other words, it is ancient, but did not become important until long after it arose. And that’s a pattern that echoes front-loading.  What’s more, with so much extensive loss throughout evolution, it raises the specter of an ancestral eukaryotic state that was quite complex, perhaps more so than any cell alive today.

[BTW, there are two caveats here.  It is possible the sporadic appearance of the enzyme among protozoa could be attributed to lateral gene transfer.  Also, keep in mind we only have a handful of completed genomes to probe.]

8 responses to “A Subtle Front-loading Pattern”

  1. So let’s use the human amino acid sequence to probe databases for homologs in other animals.

    How do you know they are homologs?

    Does mere similarity now equal homology?

    That sounds subjective…

  2. OK you have identified polypeptides that have some degree of amino-acid sequence similarity.

    Is the function also similar?

  3. As usual I feel like I’m missing something, because I don’t see the implication of the ancestral state being more complex, Mike Gene.

    Assuming a homologous minimal function in the ancestral state, the probe for sequence homologs suggests (the ancestral) function only requires ~40% of the derived (human) encoded sequence.

    Simpler, huh?

  4. Joe,

    I know. You think these proteins are examples of common design and not homologs and that science should include this view. My interest is not to convince you of common descent; my interest is to explore ways in which common descent can be employed to carry out design objectives.

  5. Rock,

    It’s probably even less than 40%, because I’ll bet if we aligned all these sequences, it is not the same exact positions that are being retained. Nevertheless, there is some functional core of sequence that is preventing evolution from erasing their ancient relationships. But that’s not my point here.

    My point is in relation to the extensive loss of this particular enzyme. If it had been lost in only three more lineages, there would be no black boxes in that eukaryotic phylogenetic tree. So we would then infer this protein arose in the red circle and thus miss its ancient state (I’ll give another example in the next entry). So it makes you wonder just how many proteins are mistakenly believed to be more recent than they really are.

    Once we get our hands on hundreds of completed protozoan genomes, we’ll be able to make or refute this argument with more rigor. For now, I’m just raising the “specter.” Think of that ghost as a prediction from front-loading.

    In fact, let me add one more brief point that I’ll flesh out in more detail someday. Sequence analysis only gives us the tip of the iceberg, as it is known that structure is more conserved than sequence. Front-loading will become even more strongly supported when we begin to find structural homologs of “metazoan-specific” proteins among the various protozoa. For example, I predict this will happen with a protein known as p53.

  6. Mike:

    You think these proteins are examples of common design and not homologs and that science should include this view.

    Could be an example of convergence.

    And my position is that science needs to deal with reality- whatever that reality is.

    My interest is not to convince you of common descent; my interest is to explore ways in which common descent can be employed to carry out design objectives.

    Right- that is why I didn’t say anything about common design.

    However the fact remains “homolog” is a loaded word.

    Also targeted search evolutionary algorithms would best describe a front-loading scenario.

    See evolving inventions

    You don’t need the excess baggage of unused genetic material.

    You just need the coding and resources to build what you need- the targets.

    Different, yet similar, solutions would be very likely.

    It would all depend on what was available when that solution was produced.

    And BTW my position is that front-loading or directed evolution is the only way UCD is possible.

  7. my interest is to explore ways in which common descent can be employed to carry out design objectives.

    Do you have a list?

    For example, what is needed:

    1- Targets (those objectives)

    2- Some initial conditions

    3- Resources

    4- Algorithms that can, starting with those conditions and resources, achieve the objectives, i.e., reach the targets

    and so on.

    With this approach sequence similarity would be due to similar solutions for similar problems.
