Diana, Goddess of the Hunt — for Ancestors!
 
TMRCA — An Alternate View
The TMRCA [Time to Most Recent Common Ancestor] calculation is mathematically based on an average mutation rate derived from large samples of tested individuals.  The problem is that mutations do not occur at a constant frequency; they occur at random, and the frequency only appears to even out in large samples and deep time frames.  Therefore, in my opinion, TMRCA is nearly useless when applied to small samples (individual families) in genealogical time (under 25 generations).  I do not dispute its usefulness with large samples in paleoanthropological time frames.
The "average" is a descriptive statistic.  It's one of the statistics used as a shorthand way of describing the characteristics of a group.  For example, if you have a classroom of 4th grade students, you can describe their height by giving a list of their heights, which would be a valid, though clumsy, way to describe them.  We would probably find the average height a more efficient way to describe the group's height.  But please note — and this is the point — by calculating the average, what you have discovered here is not some underlying "law of the universe" that governs the height of school children.  It is simply a description of this group, not a predictor of other groups, not unless the other group is a carefully selected 4th grade class and the sample sizes are valid.  In other words, a statistical average is not useful as a predictor unless the samples are comparable and, especially, not unless the sample sizes are valid.  Oddly enough, people who clearly understand this basic requirement seem to abandon it when it comes to genealogical DNA research.

Mutations are random, which is the direct opposite of even.  You can take any sample of STR test data and calculate an "average" mutation rate, but this average is merely a description of that particular data set.  It is not a natural constant; that is, it is not some underlying law of the universe such as c, the speed of light.  It only appears to be a constant when applied to large data sets because only in large data sets does the randomness begin to even out.  And I say "begin" to even out because many different mutation rates have been derived, and the reason they don't agree is that the sample sizes still aren't large enough to produce a consistent result.  When applied to small data sets, the randomness of mutations becomes highly evident, and the TMRCAs based on a delusory "constant" mutation rate become useless.
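The small-sample effect described above can be illustrated with a short simulation.  This is only a sketch under stated assumptions: it supposes, purely for illustration, a fixed chance of 1 in 7 that a 67-marker haplotype picks up a mutation in any one generation (the rule of thumb quoted later in this article), allows at most one mutation per generation, and follows each lineage for 10 generations.  The function names and sample sizes are hypothetical choices, not part of any published calculator.

```python
import random

def count_mutations(generations, p_mutation, rng):
    """Count mutations along one father-to-son lineage (at most one per generation)."""
    return sum(1 for _ in range(generations) if rng.random() < p_mutation)

def observed_rate(n_lineages, generations, p_mutation, rng):
    """Average mutations per generation observed in one sample of lineages."""
    total = sum(count_mutations(generations, p_mutation, rng)
                for _ in range(n_lineages))
    return total / (n_lineages * generations)

def spread_of_rates(n_lineages, repeats=50, generations=10,
                    p_mutation=1 / 7, seed=1):
    """Lowest and highest observed rate across repeated samples of one size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rates = [observed_rate(n_lineages, generations, p_mutation, rng)
             for _ in range(repeats)]
    return min(rates), max(rates)

# Assumed sample sizes: 5 lineages (one family) vs. 2000 lineages (a database).
small = spread_of_rates(n_lineages=5)
large = spread_of_rates(n_lineages=2000)
print("5 lineages:    observed rate ranges from %.3f to %.3f" % small)
print("2000 lineages: observed rate ranges from %.3f to %.3f" % large)
```

Even though every simulated lineage shares the same underlying probability, the "average rate" computed from family-sized samples swings over a much wider range than the one computed from large samples.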

My favorite example is that of my first cousin, who tested my maternal grandfather's STRAUB line for me.  To simplify the example (we actually have two dozen members of this family tested), my cousin, who is modal for the family, matches person A at 67/67 and person B at 66/67.  The FTDNA-TiP calculator will tell you my cousin is more closely related to A than he is to B:
 
Generation    A (67/67)    B (66/67)
1             43.40%       18.66%
2             68.05%       39.86%
3             81.99%       57.93%
4             89.79%       71.61%
5             94.23%       81.32%
6             96.74%       87.94%
…             …            …
18            100%
22                         100%

My cousin's probability of being related to A reaches 100% at 18 generations and for B reaches 100% in 22 generations.  In other words, my cousin has a 100% chance of being related to both of them within genealogical time.  Because they have a paper connection to the same progenitor, we have just proven something we already knew, or at least believed — which is not to belittle having proved it.  It's the main reason for being tested.

The problem is, the TiP calculations show my cousin more closely related to A than to B, and it's a problem because B is his brother and A is his 6th cousin (once removed).  His brother just happens to bear a new mutation, one not found in any other member of the family.  (Thankfully, he has a good sense of humor because the family has now dubbed him "the mutant.")
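How unusual is a brother who fails to match exactly?  As a rough back-of-the-envelope sketch, assume the rule of thumb quoted later in this article (about one mutation every seven generations at 67 markers) and treat mutations as independent random events (a Poisson model, which is an assumption made here for illustration).  Two brothers are separated by two transmissions, one from the father to each son.

```python
import math

def prob_at_least_one_mutation(transmissions, mutations_per_generation=1 / 7):
    """Chance of one or more mutations over a number of father-to-son transmissions.

    Assumes mutations are independent random events (Poisson model) with the
    given average number per generation across the whole 67-marker haplotype.
    """
    expected = transmissions * mutations_per_generation
    return 1 - math.exp(-expected)

# Brothers: father -> son 1 and father -> son 2, so two transmissions.
p_brothers_differ = prob_at_least_one_mutation(transmissions=2)
print(f"Chance two brothers differ at 67 markers: {p_brothers_differ:.0%}")  # about 25%
```

Under these assumptions, roughly one pair of brothers in four would show a genetic distance of 1 or more, so a "mutant" brother is not a freak occurrence.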

So, what does the above example tell us?  It tells us not to use TMRCA as a precise measure of the degree of relationship between people in genealogical time frames.  It is useful to genealogists only to the extent that it tells you when something is totally inside or outside the realm of possibility.  It cannot help you reconstruct the family tree.  DNA test results should be used to support or debunk a paper pedigree, not to create a pedigree.
I manage six DNA projects, and I don't use the TiP calculator, except to answer my project members' questions relating to it (and to prepare this article).  When I began my first project in 2004, I relied on these two guides, compiled by FTDNA:
Interpreting Genetic Distance within Surname Projects:  12 Markers
Interpreting Genetic Distance within Surname Projects:  25 Markers

When FTDNA started offering 37 markers, I began using this guide:

Interpreting Genetic Distance within Surname Projects:  37 Markers

And when FTDNA started offering 67 markers, I began using this guide:

Interpreting Genetic Distance within Surname Projects:  67 Markers

Yes, I do realize these guides are based on TiP calculations, but the advantage of the guides is that they convey the imprecision of genetic distance as a measure of relatedness, rather than giving the false sense of precision conveyed by the TiP calculator.

However, I seldom use even these guides anymore…

Most people doing their genealogy and being STR tested for genealogical purposes descend from Europeans who emigrated from Europe in the last four centuries, mostly to the United States, some to Australia, southern Africa, and elsewhere.  Hopefully, now that the United States has a President of both European and African ancestry (he's a descendant of my STRAUB ancestor, by the way), we will see more Africans seeking their roots through DNA testing, too.  My point is that we "colonials," we "displaced persons," are the ones who've become disconnected from our roots and now seek them.  Europeans still living in Europe are not on this quest; they are largely still living where their roots lie, which is undoubtedly one reason they are so difficult to find and recruit for DNA testing.  Likewise, people who descend from recent immigrants usually know where they are from, so tend not to be tested, though the rest of us wish they would, to help those of us who don't know.  Thus, the sample of people being Y-DNA STR tested is composed largely of persons displaced from their origins 100 to 400 years ago, whose initial quest is to connect to their immigrant and, secondly, to "cross the pond."

What the above means for Y-DNA surname projects is that most of our members connect to an immigrant between 5 and 15 generations removed — and that most of us have yet to connect to our Old World ancestry, which will not go back more than another 5 or 10 generations, at most.  And further, what it means is 1) that TiP calculations cannot help you, except in the broadest way, and 2) that experience — empirical evidence — will.

My experience with my surname projects has been that a typical member testing 67 markers will have accumulated from zero to three mutations away from the modal haplotype of his immigrant ancestor's descendants.  Ranked from most to least frequent, the mutation counts are:  1, 0, 2, 3.  These frequencies are in keeping with the observation that, at 67 markers, you can expect roughly one mutation every seven generations.  The range of variation is large, however.  For example, I'm aware of one case where two mutations happened in a single generation, that is, two brothers with a genetic distance (GD) of 2.  That is the only such case I know of, but two mutations within three or four generations is not uncommon.
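The observed ranking can be compared with what a simple random model would predict.  The sketch below is an illustration only, under assumptions that are not part of the project data: mutations are treated as Poisson events averaging one per seven generations, and members are assumed to be spread evenly over 5 to 15 generations from the immigrant (the range given above).

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def mutation_count_distribution(generations, rate=1 / 7, max_count=3):
    """Average P(k mutations) over equally weighted generation depths.

    `generations` lists the assumed depths; `rate` is the assumed average
    number of mutations per generation across 67 markers.
    """
    depths = list(generations)
    return [sum(poisson_pmf(k, g * rate) for g in depths) / len(depths)
            for k in range(max_count + 1)]

# Assumption: members evenly spread over 5..15 generations from the immigrant.
dist = mutation_count_distribution(range(5, 16))
ranking = sorted(range(len(dist)), key=lambda k: dist[k], reverse=True)

for k, p in enumerate(dist):
    print(f"{k} mutations: {p:.1%}")
print("Most to least frequent:", ranking)
```

Under these assumptions the model ranks the counts in the same order as the project data (1, 0, 2, 3), while still leaving a noticeable share of members with four or more mutations.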

Note that these distances (0 to 3) are from the modal haplotype for the family.  The distance between two individual descendants can be twice that and still constitute a good match.  Compare these empirically derived figures to the 67-marker guide, and they are reasonably congruent, with the family's modal haplotype serving as the "in-betweener" connecting persons who would otherwise appear less related than they really are.
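The "twice that" point can be shown with a toy calculation.  The haplotypes below are invented for illustration (six markers rather than 67, with hypothetical member names), and genetic distance is computed as a simple sum of step differences per marker, which is a simplification of how testing companies actually count certain markers.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common value at each marker position across the family."""
    return [Counter(values).most_common(1)[0][0] for values in zip(*haplotypes)]

def genetic_distance(a, b):
    """Simplified GD: total of step differences, marker by marker."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical six-marker results for five descendants of one immigrant.
family = {
    "member_1": [13, 24, 14, 11, 11, 14],
    "member_2": [13, 24, 14, 11, 11, 14],
    "member_3": [13, 24, 14, 11, 11, 14],
    "member_X": [14, 25, 14, 11, 11, 14],  # two mutations on his own line
    "member_Y": [13, 24, 15, 12, 11, 14],  # two different mutations on his line
}

modal = modal_haplotype(family.values())
gd_x = genetic_distance(family["member_X"], modal)
gd_y = genetic_distance(family["member_Y"], modal)
gd_xy = genetic_distance(family["member_X"], family["member_Y"])
print("X to modal:", gd_x, "| Y to modal:", gd_y, "| X to Y:", gd_xy)
```

Because X and Y each mutated on different markers, each sits at a distance of 2 from the modal, yet they sit at a distance of 4 from each other.  Compared only with one another, they look less related than the modal haplotype shows them to be.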

The two best examples from my projects are the J-M67 CARRICOs and the I1-AS5 STRAUBs, though we have, in fact, crossed the pond with the STRAUBs.

While I'm gaining confidence in what constitutes a "good match" between descendants of the same immigrant, I still do not have the empirical data I need to be confident of what constitutes a match "across the pond" — or between families who descend from multiple immigrations of a family with deep roots.

In the case of the above CARRICOs, the paper evidence supports that they all descend from a single immigrant to Maryland in 1674 because, among other things, no other CARRICO immigrant is found before the 1900 census.  The STRAUBs also appear to descend from one Württemberg family, though possibly from more than just the one 1733 immigrant to Philadelphia.  However, it is equally clear that other, more common surnames represent multiple immigrations from the same family, and families whose histories go deeper than the typical surname adoption period in the 16th Century, going back as far as the 13th Century.  Those GDs range, so far, up to 6 from the modal, meaning as much as 12 between descendants.  However, as I said, I do not as yet have the empirical evidence I need to feel confident of these limits.  It would be worthy of publication for someone to compile the data we do have bearing on this matter.  In other words…

Instead of basing the meaning of genetic distance on a calculation so dependent on a constant mutation rate that isn't constant, base it directly on the data.  Both the pedigrees and the DNA test results are real, so use them to tell you what genetic distances really mean — which, of course, leads us to the issue of why having your members' lineages is so important and why I double-check, as best I can, the genealogy of my surname project members. 
