Genealogical Ponderings

the Professional Family History Blog

  1. Demystifying DNA 4: Autosomal DNA, Ancestry DNA and Family Finder tests



    In my series of posts about Demystifying DNA testing for use in family history this is probably the type of testing most people want to hear about: autosomal DNA testing. In other words, the tests used by Ancestry, 23 and Me, My Heritage, Family Tree DNA (and soon Living DNA) to find matches with close relatives, matches where the common ancestor is only 2-5 generations ago. Autosomal DNA testing is also known as a Family Finder test at Family Tree DNA and you may also find it described as “close cousin” testing.

    This is the type of testing taken by the highest number of people and the type of testing most likely to generate matches to your own known family names.

    Before we get into where to test and how to use your matches we need to look at the science some more to ensure we will get the most from the data. See the Introduction post for the general background to DNA.


    Autosomal DNA: The “Sciencey” Bit


    Remember from the first post in this series the image that showed your autosomal DNA? Here it is again:


    Human karyotype


    Ignore the Xs and Ys. We’ve talked about Y-DNA before and I’ll come onto X-DNA in a later post. Right now we are interested in the numbers 1 to 22: our 22 pairs of autosomal chromosomes.

    For each pair, one chromosome came from our father and one from our mother, in its entirety. So, for example, looking at Chromosome 1, perhaps the left hand chromosome came from our father and the right hand side from our mother.

    It is how we inherit autosomal DNA that makes it such a powerful tool for use with family history research.


    The Inheritance of Autosomal DNA


    The creation of eggs and sperm, in which DNA is passed from parent to child, occurs by a process called meiosis. During meiosis each chromosome is duplicated resulting in four copies of each chromosome, two paternal copies (in blue below) and two maternal copies (purple). DNA is then exchanged between the four copies, a process called recombination, which essentially mixes up the paternal and maternal DNA. Only one of the four chromosomes survives to be passed on in the egg or sperm. You don’t need to worry about the detail; the important part is that the two copies of each chromosome a parent has are mixed up so that a child receives a combination of both.


    Recombination (used with permission of


    Let’s take this a step further with an example: the descent of DNA from a couple, John and Mary. We will just look at one chromosome pair but the same principle applies to all 22 chromosome pairs. In all cases we will assume that the left chromosome came from the father and the right from the mother. So, John’s blue chromosome is from his father and the purple from his mother, and so on.


    Autosomal DNA descent – 1 generation


    John and Mary have two children: Thomas and Sarah. Each child gets 50% of the DNA from their father and 50% from their mother, but in different combinations. Thomas gets two sections of blue from John and two sections of purple. Sarah gets the top half of the blue and the bottom half of the purple. They have each inherited DNA from their father but they have not inherited exactly the same DNA. The way in which they inherit from Mary is also different from one another. On average siblings share about 50% of their DNA but there is a range, as we will discuss further later.

    Next we look at the case where Thomas and Sarah have children of their own:


    Autosomal DNA descent – 2 generations


    Here we see that Robert’s paternal chromosome (the one on the left) is a combination of the two chromosomes for Thomas. He has some sections, or segments, of DNA from both of John’s chromosomes, and some from both of Mary’s chromosomes.

    Robert’s cousin Elizabeth also has segments of DNA from all four of John and Mary’s chromosomes but Robert’s and Elizabeth’s DNA are different from one another. On average first cousins share 12.5% of their DNA.

    There’s a really important point to note here. A grandchild CANNOT inherit DNA from his grandparent that was not passed from grandparent to parent. Look back to the example above. Robert cannot have the top portion of the purple chromosome of John’s, because John did not pass it to Thomas. Likewise, Elizabeth cannot have the bottom section of the blue chromosome, because John did not pass it to Sarah.

    We have started to talk about the percentage of DNA you receive from different ancestors. The further back in time you go, the less DNA you share with your ancestors on average:


    Percentage of autosomal DNA shared on average with ancestors
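
    The halving behind those averages can be sketched in a few lines of Python. This is only an illustration of the averages: the function name is my own, and because recombination is random the real amount inherited from any one ancestor will vary around these figures.

```python
def average_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA shared with ONE ancestor.

    generations_back = 1 for a parent, 2 for a grandparent, and so on.
    This is the expected (average) value only, not what any one person inherits.
    """
    return 0.5 ** generations_back


for generations_back in range(1, 8):
    ancestors = 2 ** generations_back  # number of ancestors in that generation
    share = average_share(generations_back)
    print(f"{generations_back} generation(s) back: "
          f"{ancestors} ancestors, {share:.2%} from each on average")
```

    By seven generations back the average contribution from each ancestor is already under 1%, which is why some distant ancestors can drop out of your genetic family tree altogether.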


    In fact, the amount you share with your distant ancestors eventually becomes so small that there is a chance that you will not share any DNA at all. This is the second important point: You do not inherit autosomal DNA from every one of your ancestors. Whilst all of your ancestors are included in your genealogical family tree, even if some of them are yet to be identified, not all of your ancestors are included on what we call your genetic family tree. The genetic family tree is shown below, DNA is only shared with those ancestors shaded in grey:


    The GENETIC family tree (used with permission of


    This could potentially be very frustrating if your aim was to find links to one of your great x 4 grandparents shown in white above. However, remember, the process of recombination is different each time. You may not have inherited DNA from that particular ancestor but your siblings, aunts and uncles etc may well have done. This is why it is always worth testing as many family members as you can.


    Uses of Autosomal DNA


    The primary use of autosomal DNA is for finding connections with those descended from common ancestors in recent generations, your close cousins.

    Potential uses of this type of testing include:

    • Confirming your family history research carried out so far using traditional research techniques
    • Expanding your family tree by connecting to those with whom you share DNA
    • Finding the answer to a particular problem or breaking down a brick wall
    • Use by adoptees searching for birth relatives (particularly powerful in combination with Y-DNA or mtDNA testing, as discussed in earlier blog posts).


    Who can take a test?


    Both males and females can take autosomal DNA tests and will find cousins in the same way.


    The Data


    So how do the tests work? When we talked about Y-DNA and mtDNA tests we talked about the raw data or comparing the raw data to reference standards. When your autosomal DNA is analysed data is collected at around 600,000-700,000 SNPs or positions. Below is a short extract from my Family Finder test results at Family Tree DNA. The entire spreadsheet contains 708,093 rows of data.


    Short extract of autosomal DNA data


    Rather than compare raw data the commercial companies do the data crunching for us using matching algorithms. Rather than look at the data in the form above we are presented with a list of matches. Here’s an example, taken from the Ancestry website:


    Ancestry DNA matches


    You can see that there are a range of relationships assigned to my matches. In fact the top three matches are my father, my brother and my uncle. We have talked about the fact that on average you share about 50% of your DNA with each parent, and will share about 50% with a sibling. The average amounts of DNA shared with some of your other likely living relatives are shown in the table below:


    Amount of autosomal DNA shared on average with living relatives (from the ISOGG website)


    Don’t worry too much about what a cM is at this stage, just think of it as an amount of DNA. For each of the matches above you can click for more detail, for example my predicted third cousin match looks like this:


    Finding amount of shared DNA for an Ancestry match


    Clicking on the little “i” gives you the actual numbers. You can see here that Ancestry’s calculations have given me this match as a third cousin. If we look at the table above we can see that 147cM sits somewhere between a second and third cousin. In fact, I know this to be my second cousin once removed. As you can imagine there is a range for each relationship and the above table simplifies things considerably. If you want to look at things in more detail I suggest using this tool, which was developed using actual data from known DNA matches (click on the image for a larger version):


    Calculating relationships (used with permission of


    Even better than this though, there is now an interactive version of the Shared cM Project, where you can type in your result and see the most likely relationships. For my 147cM match (again, click the image for a larger version):


    Possible relationships with a shared match of 147cM


    If we ignore the half relationships for now for simplicity, we can see that the known 2C1R (second cousin once removed) sits nicely in the middle of the possible relationships. You can try this for any of your matches.
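
    If you would like to see where the averages for full cousins come from, here is a rough sketch. It assumes a total of about 6,800 cM across both copies of the autosomes (so that a parent and child share about 3,400 cM); that total, and the function names, are my assumptions for illustration, and each company counts slightly differently.

```python
TOTAL_CM = 6800.0  # assumed total; each testing company's figure differs slightly


def expected_fraction(degree: int, removed: int = 0) -> float:
    """Average fraction of DNA shared by full cousins.

    degree = 1 for first cousins, 2 for second cousins, and so on.
    removed = number of generations "removed" (0 if none).
    """
    return 0.5 ** (2 * degree + 1 + removed)


def expected_cm(degree: int, removed: int = 0) -> float:
    """Average shared DNA in cM for the same relationship."""
    return TOTAL_CM * expected_fraction(degree, removed)


observed = 147.0  # the match discussed above
candidates = {"2C": (2, 0), "2C1R": (2, 1), "3C": (3, 0)}
for label, (degree, removed) in candidates.items():
    print(f"{label}: about {expected_cm(degree, removed):.1f} cM on average")
print(f"Observed match: {observed} cM")
```

    These are averages only. Real matches scatter widely around them, which is why a tool built from actual reported matches is the better guide for a single result.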

    Incidentally, if you get confused about second cousins, cousins once removed etc there is some useful information on Wikipedia and the following chart is also useful:


    Explaining cousin relationships


    Investigating matches


    Once you have discovered some matches to your DNA data the next step is to start to work out how they connect to your family tree. They may have a family tree uploaded themselves and you may see familiar names. You may feel sure you know where there is a link and either need to work on your tree or theirs to bring the two together.

    The ease with which you can link to your connections will depend on how extensive the research is so far by both of you and how distant the proposed relationship.

    Depending on where you live in the world your matches at the various websites will look very different. The majority of those who have tested are still based in the US so you would expect US testers to have more close matches in their results. For example, a second cousin shares a common great grandparent. I think most of us are comfortable with our family history research back to this point. There’s a fair chance we identified most, if not all of our second cousins and if we haven’t, it would not be too onerous a task. Personally I have 5 first cousins and 25 second cousins, you may have more.

    Unfortunately I do not have any matches on any of the websites that are second or even third cousin matches (apart from people I have had tested). All of my matches are around fourth cousins or more distant relationships.

    So, let’s look at the likelihood of being able to identify where a fourth cousin match fits into your family tree. A fourth cousin shares one set of your great x 3 grandparents with you, but you have 16 sets of great x 3 grandparents! Let’s say each couple between then and now has had an average of two children. Starting at the most recent generations that means my mother and father have one sibling each (my aunts or uncles). If they both have two children I will have 4 first cousins altogether. Are you with me so far? Building this up gives us the following figures:
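
    The same sum can be written as a short Python sketch, using only the assumption stated above that every couple has the same number of children. The function name is mine and the model is deliberately simple.

```python
def cousin_count(degree: int, children_per_couple: int = 2) -> int:
    """Number of cousins of a given degree in the simple model.

    degree = 1 for first cousins, 2 for second cousins, and so on.
    """
    # Couples at the shared generation: 2 sets of grandparents for
    # first cousins, 4 sets of great grandparents for second cousins...
    ancestral_couples = 2 ** degree
    # At each of those couples, one child is my own ancestor; the others
    # start the cousin lines.
    other_children = children_per_couple - 1
    # Each cousin line then multiplies over the remaining generations.
    return ancestral_couples * other_children * children_per_couple ** degree


for degree, name in enumerate(["first", "second", "third", "fourth"], start=1):
    print(f"{name} cousins: {cousin_count(degree)}")
```

    Changing children_per_couple to 3 shows how quickly the numbers grow when families are even slightly larger.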



    These are just figures based on easy to calculate assumptions. If we factor in studies of population change, two separate pieces of research suggest we could have on average 940 or 1572 fourth cousins (for more information visit the ISOGG website). In my own research I’ve identified 5 first cousins, 25 second cousins, 34 third cousins and 55 fourth cousins. I feel I have some way to go!

    So if you, like me, only have more distant cousin matches you need to come up with some clever strategies to focus your efforts appropriately.


    Testing other relatives


    One of the tools offered by Ancestry, Family Tree DNA, 23 and Me AND My Heritage is the ability to view matches “in common with” another match. I have tested my father and by looking at those in common with him I can be confident that these are probably* matches on my paternal side.

    You can narrow this down even more. I mentioned my second cousin once removed earlier. He is descended from the set of my father’s great grandparents that I am particularly interested in at the moment. As we have 4 sets of great grandparents, any matches that match both my father AND this cousin can be assumed to be probably linked to only one small subsection of my family tree. This is powerful stuff!
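
    If you keep your match lists in a spreadsheet, the “in common with” idea is simply the overlap between lists. Here is a minimal sketch with made-up match names; the websites do this for you, so the function and data are purely illustrative.

```python
def in_common(*match_lists):
    """Return the matches that appear in every one of the lists given."""
    sets = [set(matches) for matches in match_lists]
    return set.intersection(*sets)


# Hypothetical match lists, not real data.
my_matches = {"Match A", "Match B", "Match C", "Match D", "Match E"}
father_matches = {"Match A", "Match B", "Match C", "Match F"}
cousin_matches = {"Match B", "Match C", "Match G"}

# Probably on my paternal side.
paternal_side = in_common(my_matches, father_matches)

# Probably linked to the one set of great grandparents the cousin shares.
narrowed = in_common(my_matches, father_matches, cousin_matches)

print(sorted(paternal_side))
print(sorted(narrowed))
```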

    The potential for different scenarios here is endless.

    * WARNING: This approach only works if your family tree is straightforward. You may be related to a match in more than one way or your parents may even be related to one another. All it takes is for one marriage of first or second cousins way back in time to completely complicate your genetic family tree.


    Analyse the data in more detail


    All discussion so far has looked at just the amount of DNA we share with our matches. However, remember our discussion about how DNA is inherited. If we can start to build up a picture of where in our DNA we have particular matches we begin to develop new powers!

    Although Ancestry has by far the greatest number of testers it is the only one of the big companies offering DNA matches with autosomal DNA that does not offer a chromosome browser.

    The image below shows the chromosomes on which a match occurs between my father and his second cousin. Assuming no complications in the family tree, this is DNA that can only have come from one set of my father’s great grandparents.


    Chromosome browser showing the match between second cousins (My Heritage)


    As you test more and more relatives and identify more and more matches with known connections through traditional research you can build upon this. In fact, you can begin to work out which parts of each chromosome came from each ancestor. Think of the potential for breaking down brick walls.

    The complicating factor is that we have two of each chromosome. If my father and his second cousin have a match on a chromosome with a third person in the same area as their match with each other does it mean that all three are related? Not necessarily. This is more easily demonstrated with an image. In the image below we are comparing the DNA on a single chromosome, let’s say chromosome 5. Each tester has two copies of the chromosome, one paternal (P) and one maternal (M).


    The comparison of DNA between matches on the same chromosome


    Brian and David are related on their maternal chromosomes (in blue). Both match to Adam in the same position. However, the data is different. Brian and Adam match on their paternal sides (in red), David and Adam match each other on David’s paternal side but on Adam’s maternal side (in green). So Brian is related to David, Brian is related to Adam and David is related to Adam but they are not all related to one another from the same ancestors. Are you still with me?


    This is called triangulation: we look at the match, as above, and when we bring in a third person we check to see whether they all match in the same location on the same chromosome with the same data. (It is highly unlikely we will have exactly the same data on both chromosomes for a length sufficient to be considered a match).
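
    The Brian, David and Adam example can be written out as a small check. This sketch assumes we already know which copy (P or M) each match sits on, as in the diagram; real results from the testing companies are not labelled this way, so the side labels, names and positions here are all hypothetical.

```python
from typing import NamedTuple, Sequence


class Match(NamedTuple):
    person_a: str
    side_a: str      # "P" (paternal) or "M" (maternal) copy for person_a
    person_b: str
    side_b: str      # "P" or "M" copy for person_b
    chromosome: int
    start: int       # start of the matching segment
    end: int         # end of the matching segment


def triangulates(matches: Sequence[Match]) -> bool:
    """True if all matches overlap AND each person is matching on one copy only."""
    # All matches must be on the same chromosome.
    if len({m.chromosome for m in matches}) != 1:
        return False
    # There must be a region common to every match.
    if max(m.start for m in matches) >= min(m.end for m in matches):
        return False
    # Each person must be using the same copy in all of their matches.
    sides = {}
    for m in matches:
        for person, side in ((m.person_a, m.side_a), (m.person_b, m.side_b)):
            if sides.setdefault(person, side) != side:
                return False
    return True


# The example above: same position, but different copies are involved.
example = [
    Match("Brian", "M", "David", "M", 5, 20, 60),
    Match("Brian", "P", "Adam", "P", 5, 25, 55),
    Match("David", "P", "Adam", "M", 5, 30, 65),
]
print(triangulates(example))  # False: Brian matches on both M and P

# A case where all three share the same segment on consistent copies.
consistent = [
    Match("Father", "P", "Cousin", "M", 7, 10, 80),
    Match("Father", "P", "Third person", "P", 7, 30, 70),
    Match("Cousin", "M", "Third person", "P", 7, 25, 90),
]
print(triangulates(consistent))  # True
```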

    Here is an example. This is the data from my father and his second cousin again but now with data added in from a third individual. Here the lady in question matches both individuals on chromosome 7 but there is an area of overlap indicating that this data all matches and is therefore on the same side for both my father and his second cousin:


    Chromosome browser showing a triangulated segment on chromosome 7


    In Summary


    This is only an introduction to autosomal DNA testing, to give you a flavour of what can be achieved. There is much more to add and many tools and external websites that can be used to look at the data and matches in more detail.

    If you want to learn more, I am pleased to announce that I will be running a four week online course, titled Demystifying DNA for Family Historians, for Pharos Teaching and Tutoring in 2019. More details may be found HERE.


    The Testing Companies

    Autosomal tests are available at all of the big 5 DNA companies:

    Ancestry DNA:

    Offers matches and amount of shared DNA but gives no chromosome data. Data can be downloaded for upload elsewhere.

    Family Tree DNA, 23andMe and My Heritage:

    All three offer matches, amount of shared DNA and chromosome data. Each has a chromosome browser to examine data in more detail. Data can be downloaded for upload elsewhere.

    Family Tree DNA and My Heritage offer free upload of data from other companies (though you will have to pay to use all of the tools available).

    Living DNA:

    To date Living DNA has focused on providing estimates of ethnicity and its selling point is that it provides a more detailed breakdown for those with UK heritage than other companies. Data can be downloaded for upload elsewhere.

    Matches are coming soon and uploads of data from other companies are accepted.


  2. Demystifying DNA 3: mtDNA testing



    This is the third in my series of posts attempting to Demystify DNA testing for family historians. If you would like an overview of the many types of DNA test, do see the Introduction post. In my last blog we looked at Y-DNA testing in detail: when to use Y-DNA, who can test, where to test and how to interpret the results.

    This month we move onto mitochondrial DNA, or mtDNA, testing.


    Uses of mtDNA


    Mitochondrial DNA is to maternal research what Y-DNA testing is to paternal research as illustrated by the following schematic:


    The ancestors that may be traced using Y-DNA (blue) versus mtDNA (pink)


    mtDNA is specific to the matrilineal line, looking at your mother, her mother, her mother’s mother and so on. It does not include ALL of your mother’s ancestors, just those highlighted above.

    Just like Y-DNA, mtDNA is passed largely unchanged from one generation to another, enabling its use for tracing maternal ancient origins. You can also, in theory, use mtDNA to support your genealogy research. However this is complicated by the fact that the surname changes at every generation, making traditional research more challenging, and the way in which mtDNA mutates.

    Every now and again a mutation or copying error occurs with mtDNA, moving from one generation to the next. When I use the term “mutation” here I simply mean a change in DNA, no implication of anything to do with health. The main difference between the inheritance of mtDNA compared to Y-DNA is that the mutation rate of mtDNA is slower. A mtDNA match could share a common ancestor with you in recent generations or hundreds or even thousands of years ago.

    mtDNA is therefore not as useful for testing speculatively to find matches, as it is less likely that a mtDNA match will share with you a common ancestor in a genealogically relevant timeframe. The beauty of a match on mtDNA is the fact that you will know what small section of your family tree the match is connected to. Where mtDNA is particularly useful is for confirming suspected relationships. It is a very powerful test for comparing your own data with a suspected match to see if you are indeed related to the same maternal ancestor. This approach can equally be applied to adoption cases as to the situation where you have two candidates for your maternal great grandmother.


    Who can take a test?


    Contrary to popular belief, mtDNA tests can actually be taken by both males and females.

    mtDNA passes from a mother to her children, the difference being that only the females then pass this mtDNA on.  This is illustrated more clearly by the following schematic:


    Schematic showing the path of descent of mtDNA


    A is the great granddaughter of B. The diagram shows all descendants of B, outlined in blue or pink, depending on gender. Spouses are shown in black for clarity. The filled pink shapes indicate the path of descent of mtDNA. Remember, mtDNA is passed from a mother to her children but only her daughters will pass it on to the next generation. B had three children, but only her two daughters passed her mtDNA to the next generation, and so on.

    If A is unable to take the mtDNA test herself, you can see she has a number of options, assuming that the above represents only a bloodline. Her brother, C, could take the mtDNA test, her first cousin D, or even her second cousins, E and F. All have an unbroken line of female descent from B and all have the same mtDNA. This is an important point to note if you are looking to identify close living relative matches, say in adoption cases: a match could equally be a mother, sibling, aunt, cousin or grandparent, all of whom are descended from the same maternal ancestor, B.


    Types of Test


    Mitochondrial DNA is a circle of DNA, consisting of 16,569 base pairs. See the first post in this series, the Introduction, for an explanation of the terminology. Mitochondrial DNA consists of the following regions:


    mitochondrial DNA


    The area shown in white represents the hyper variable control regions (HVR1 & HVR2). These are the areas of the mtDNA known to mutate more quickly. They are therefore more likely to differ from one individual to another, unless they are closely related. The coding region undergoes changes less frequently.

    The first mtDNA tests analysed DNA in the HVR1 and HVR2 control regions only. Later mtDNA tests included both the HVR1 & HVR2 regions and the coding region. Some companies use SNP testing. Remember from the piece on Y-DNA testing, a Single Nucleotide Polymorphism, or SNP, is a point along the DNA molecule known to differ from one individual to another – a point at which a mutation has occurred at some point in time. SNP (pronounced “snip”) testing analyses which nucleotide is found at many individual locations or SNPs.

    Also available for mtDNA are sequence tests. Rather than look at individual SNPs all base pairs are analysed in the region of interest. Early tests just looked at the base pairs in the HVR1 or HVR1 and HVR2 regions. Now it is possible to obtain a full sequence test, which analyses all 16,569 base pairs. In much the same way as a higher number of markers on a Y-DNA STR test gives you better data for comparison with others, more accurate mtDNA data is found with a full sequence test.

    If we imagine the ring of DNA opened out flat then a visual representation of the difference is:


    Graphical representation of Sequence vs SNP testing


    The Data


    When we looked at the Y-DNA STR tests we looked directly at the number of repeats at STR markers or the identity of bases at particular locations. mtDNA data analysis is different. Here we compare how each individual differs from reference standards. The first produced was the Cambridge Reference Sequence (CRS), now superseded by the revised Cambridge Reference Sequence (rCRS), based on a European who had haplogroup H. A second standard, the Reconstructed Sapiens Reference Sequence (RSRS), was produced more recently and was an attempt to compare mtDNA against a reference with an older haplogroup, closer to Mitochondrial Eve (see below for more on haplogroups). The details of the two standards are not appropriate here; more information can be found at the ISOGG website. It is, however, important to know which standard has been used by your testing company of choice if you are to compare results with those obtained elsewhere.

    Family Tree DNA supplies results against both reference standards. The images below show the (truncated) results of my own mtDNA test against the rCRS at Family Tree DNA.


    mtDNA results against the rCRS standard


    The results are actually reported in two ways, just to confuse you! In this case there are no differences to the standard in the HVR1 region. In the HVR2 region five differences are shown. The traditional way of reporting these is to give the position number, followed by the letter of the base that you have compared to the original. So at position 152 I have C instead of the base of the rCRS. The second set of data (the lower table labelled “Revised Cambridge Reference Standard”) actually shows this more simply. It shows you that there should be a T at position 152 but I have a C.

    The addition of a “.1” indicates an addition at this position. In fact I have two additional Cs at position 309. Again this is more clearly seen in the bottom table for the rCRS results: there are no bases at 309.1 but I have two Cs. If a base is missing at a particular position it would be marked e.g. 309-, known as a deletion.

    Now let’s turn our attention to the RSRS results, again my own (truncated) data:


    mtDNA results against the RSRS standard


    There are some differences against the reference standard in the HVR1 region here. This is to be expected: The reference for the rCRS was in haplogroup H, as am I, whereas the reference for RSRS is based on older haplogroups. Here differences are marked:

    <reference base> POSITION NUMBER <your result>

    so you can readily compare the base in the reference standard with your own. For the RSRS results there are also extra mutations and missing mutations. These refer to differences from what is expected for my haplogroup compared to the RSRS.
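
    If you ever want to read these codes into a spreadsheet or script, the notation described above can be pulled apart with a short Python sketch. The pattern below only covers the simple forms discussed here (152C, T152C, 309.1C and 309-); the names are my own, and real results can contain other codes that this does not handle.

```python
import re
from typing import NamedTuple, Optional


class Difference(NamedTuple):
    reference_base: Optional[str]   # only present in the RSRS-style form
    position: int
    insertion_index: Optional[int]  # the number after the dot, e.g. 1 in 309.1C
    observed: str                   # your base, or "-" for a deletion
    kind: str                       # "substitution", "insertion" or "deletion"


# Optional reference base, position, optional ".n", then your result.
PATTERN = re.compile(r"^([ACGT])?(\d+)(?:\.(\d+))?([ACGT-])$")


def parse_difference(text: str) -> Difference:
    """Split one reported difference into its parts."""
    found = PATTERN.match(text.strip().upper())
    if found is None:
        raise ValueError(f"Unrecognised difference: {text!r}")
    reference, position, insertion, observed = found.groups()
    if observed == "-":
        kind = "deletion"
    elif insertion is not None:
        kind = "insertion"
    else:
        kind = "substitution"
    return Difference(
        reference,
        int(position),
        int(insertion) if insertion is not None else None,
        observed,
        kind,
    )


for code in ["152C", "T152C", "309.1C", "309-"]:
    print(code, "->", parse_difference(code))
```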

    My current matches on mtDNA are shown below:


    mtDNA matches at Family Tree DNA


    As you can see, I don’t yet have any matches at a genetic distance of zero. A genetic distance of 1 means that there is a difference in my data compared to the other test taker’s data at one position, whether it be a different base, an addition or a missing mutation compared to their results.
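
    As a rough illustration of the idea, genetic distance can be thought of as counting the positions at which two lists of differences disagree. The sketch below uses rCRS-style codes such as 152C and entirely made-up results; it is a simplification of my own, and the testing companies apply their own rules when they calculate the figure.

```python
def as_positions(differences):
    """Turn codes like '152C' or '309.1C' into {position: result}."""
    return {code[:-1]: code[-1] for code in differences}


def genetic_distance(mine, theirs) -> int:
    """Count the positions at which the two sets of results differ."""
    a = as_positions(mine)
    b = as_positions(theirs)
    return sum(1 for position in a.keys() | b.keys()
               if a.get(position) != b.get(position))


# Hypothetical results, not taken from any real test.
mine = ["152C", "263G", "309.1C"]
theirs = ["152C", "263G", "309.1C", "16519C"]

print(genetic_distance(mine, mine))    # identical results
print(genetic_distance(mine, theirs))  # one extra difference
```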

    With Y-DNA we could calculate a reasonable estimate of the time to Most Recent Common Ancestor (MRCA) as Y-DNA mutations happen at a regular rate and there is some level of confidence in predictability. As I said earlier, with mtDNA the mutation rates are much slower and there is much greater range. The following table is taken from the Family Tree DNA website. Even if I had a match with a genetic distance of zero there’s only a 50% likelihood that person and I share an ancestor within 5 generations. It’s more likely that the common ancestor is somewhere within the last 5-22 generations.


    MRCA estimates for mtDNA (Family Tree DNA)


    mtDNA haplogroup


    What I find interesting is knowledge of my mtDNA haplogroup. Just as there is a haplogroup tree for Y-DNA, there is an equivalent mtDNA haplogroup tree, as all females are descended from mitochondrial Eve. An individual’s mtDNA haplogroup is their location in the human mtDNA haplogroup tree. Everyone fits on this tree, some branches dating far further back in time than those derived from more recent mutations. A simple graphic is shown below but there are many branches, or subclades, within each haplogroup.


    mtDNA haplotree (Wikipedia)


    Each haplogroup is connected to a particular time and place and more information on where the haplogroups originated can be found here: mtDNA haplogroups. My own haplogroup is H. This is a predominantly European haplogroup as I would expect and does not reveal anything exciting about my own family history. However, for those with a family story that 3 x great grandmother was a local Indian girl that 3 x great grandfather met while he worked in British India, discovering the haplogroup can be very important.


    The Testing Companies


    Family Tree DNA:

    I’m only considering the main five DNA testing companies in this series of blogs, to keep things simple. Of these, only Family Tree DNA currently offers separate mtDNA tests. Both HVR1 / HVR2 (mtDNA Plus) and full sequence (mtFull Sequence) tests are available.

    Whilst the other DNA companies in “the big five” do not offer separate mtDNA tests, some do provide the mtDNA haplogroup as part of their single combined DNA test:

    Living DNA:

    Living DNA’s test results include measurement of roughly 4,700 positions on the mtDNA genome to define the haplogroup*.

    23 and Me:

    The 23 and Me DNA test results include measurement of 2737 mtDNA single nucleotide polymorphisms (SNPs) to define the haplogroup*.

    * Data source: ISOGG wiki, MtDNA testing comparison chart.



    Be careful with haplogroups – Y-DNA and mtDNA lettering conventions do not relate to one another. The Y-DNA haplogroup is an indication of paternal ancient origins, the mtDNA haplogroup an indication of maternal ancient origins. A man has both, a woman has only a mtDNA haplogroup.

    As with all DNA tests, the number of matches you get with mtDNA testing will depend on who else has tested. If you have no matches to start with: be patient.

    With any type of DNA test, the results obtained form only part of the analysis. DNA testing does not answer questions alone: it must always be assessed along with other information and documentary evidence.

    Next Up


    The next post will focus on the most popular type of DNA testing now: autosomal DNA, the type of test offered by Ancestry, My Heritage and Family Tree DNA (the Family Finder test) to find matches to close living relatives.




  3. Demystifying DNA 2: Y-DNA tests


    In my last blog we looked at the science behind DNA testing and the different types of test available.

    Here we start to look at each type of DNA testing in more detail.


    Uses of Y-DNA

    Y-DNA is passed from father to son unchanged for many generations and this makes it a powerful tool for assessing your paternal line: a match on a Y-DNA test can only lead you up one part of your family tree.

    Y-DNA is often used by those running surname studies as, in principle, the descent of the male line is the same as the descent of the surname. Y-DNA can therefore be used to assess the likelihood of all bearers of a particular surname arising from the same single individual, no matter how far back in time this individual lived. There are single surname DNA projects through the commercial testing sites and a number of One-Name Studies (ONS) also operate DNA projects.

    However, human nature results in a number of scenarios where this hypothesis falls down. The most common issue is the bearer of a surname being found to be illegitimate: Mr Postlethwaite was actually a plain old Mr Brown. In DNA circles this is referred to as a “non-paternity event” or NPE. There are a number of other reasons a surname may be assumed: unofficial adoption, taking on a stepfather’s surname and so on.

    Y-DNA can also be used to identify an unknown father. The caveat here is that the results may point to a particular male line but not necessarily an individual within that line. Just using Y-DNA testing could point to Mr A being the father, but equally his brother, his paternal cousin or his uncle. Just as we said last time, the DNA test is one source of information that can be used to aid genealogical research. It needs to be used in the context of known information and documentary research, in this case, who was in the right place at the right time?


    Who can take a test?

    Y-DNA tests can only be taken by males. However, the power of the Y chromosome is that it is unchanged over many generations. If you have no brothers you can ask a cousin to test, or an uncle, or even a second or third cousin, so long as they are from an unbroken line of males from your common ancestor.

    Schematic showing the path of descent of Y-DNA

    The schematic above shows all descendants of a single couple, marked in blue or pink, depending on gender. Spouses are shown in black for clarity. Solid blue boxes indicate the path of descent of Y-DNA. Our test taker is shown at the bottom right in solid pink, a female family historian interested in her maternal grandfather’s line. Unfortunately, her grandfather died some years ago. At first it seems there is no way that she can find a male to test for Y-DNA. Her mother is one of two sisters and her grandfather also had no brothers. The beauty of Y-DNA is that we can keep moving backwards. A generation further back, our family historian’s great grandfather had one sister and one brother. If we look at the brother’s line we can see that he had two sons. One died without children but the other had a son, who also had a son. Assuming one of these individuals is still alive, we have found a test candidate whose Y-DNA should be equivalent to that of our family historian’s grandfather. This approach does assume that all are the blood line of the uppermost male. A non-paternity event is a possibility and you should always test more than one candidate from more than one line if you can.
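    For those who keep their tree in a spreadsheet or database, the search described above can be sketched in a few lines of Python. This is only an illustration: the names are hypothetical stand-ins for the boxes in the schematic, who is marked as living is an assumption, and the code takes every recorded father at face value (it cannot detect a non-paternity event).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Person:
    sex: str                      # "M" or "F"
    father: Optional[str] = None  # key of the father, if he is in the tree
    living: bool = False


def male_line_root(tree: Dict[str, Person], name: str) -> str:
    """Follow father links upwards to the earliest known male-line ancestor."""
    while tree[name].father is not None:
        name = tree[name].father
    return name


def y_dna_candidates(tree: Dict[str, Person], target: str) -> List[str]:
    """Living males whose recorded male line leads back to the same root as `target`."""
    root = male_line_root(tree, target)
    return sorted(
        name
        for name, person in tree.items()
        if person.living
        and person.sex == "M"
        and name != target
        and male_line_root(tree, name) == root
    )


# Hypothetical names; the "living" flags are assumptions for the example.
tree = {
    "founder": Person("M"),
    "great_grandfather": Person("M", "founder"),
    "great_grandfathers_sister": Person("F", "founder"),
    "great_grandfathers_brother": Person("M", "founder"),
    "grandfather": Person("M", "great_grandfather"),
    "mother": Person("F", "grandfather", living=True),
    "aunt": Person("F", "grandfather", living=True),
    "test_taker": Person("F", living=True),  # her father is a spouse outside this tree
    "brothers_son_1": Person("M", "great_grandfathers_brother"),  # died without children
    "brothers_son_2": Person("M", "great_grandfathers_brother"),
    "brothers_grandson": Person("M", "brothers_son_2", living=True),
    "brothers_great_grandson": Person("M", "brothers_grandson", living=True),
}

print(y_dna_candidates(tree, "grandfather"))
```

    The two names printed are the son and grandson on the brother’s line, the same candidates found by eye in the schematic.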


    The Data

    There are two different methods of analysing and comparing Y-DNA. STR or Short Tandem Repeat testing is useful for comparing the relationships between individuals in a genealogically relevant timeframe. What do we mean by genealogically relevant? Simply, the timeframe over which we are probably going to be able to support any findings with documentary evidence. SNP or Single Nucleotide Polymorphism testing is used to define a person’s haplogroup and investigate their ancient origins.


    STR Testing: haplotypes

    Short Tandem Repeats (STRs) are positions along the DNA molecule where the same sequence of nucleotides or bases is characteristically repeated a number of times, e.g. AGTCAGTCAGTCAGTCAGTC. Each STR marker is named, typically in the format DYS391 (where D = DNA, Y = Y chromosome and S = (unique) segment). The number of times the sequence is repeated at each marker is counted and the results of the Y-DNA test take the format shown below:

    This set of results is an individual’s haplotype. When purchasing a Y-DNA test you will see numbers Y-DNA37, Y-DNA67 and Y-DNA111. These are the number of STR markers tested or haplotype resolution. The example above tested at 12 markers only. Whilst early tests could only look at 12 markers, 37, 67 and 111 are now more common as the technology has developed, with some tests looking at even greater numbers of markers. You can still compare results with someone who tested with a different number of markers: a test at 37 markers looks at the same 12 markers as a 12 marker test, plus an additional 25, and so on.

    The results of the STR tests are compared with those of others on the commercial websites to see if they are matches. If all results match the genetic distance is zero. In the example above, if another set of results was compared and all were the same except the result for DYS391 was 9 instead of 10, this would be termed a genetic distance of one. If DYS391 was 9 and DYS426 was 13 this would be a genetic distance of 3, i.e. the difference of 1 at DYS391 plus the 2 on DYS426.
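    The arithmetic can be written out as a short Python sketch. It simply adds up the differences in repeat counts over the markers both people have tested, which is the calculation described above; the testing companies may treat certain markers differently, so take it as an illustration rather than their exact method. The starting value of 11 for DYS426 is an assumption (chosen so that a result of 13 differs by 2), and DYS_EXTRA is a made-up marker name.

```python
from typing import Dict

Haplotype = Dict[str, int]  # marker name -> number of repeats


def genetic_distance(a: Haplotype, b: Haplotype) -> int:
    """Sum of repeat-count differences over the markers both people tested."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


reference = {"DYS391": 10, "DYS426": 11}    # DYS426 = 11 is assumed for illustration
one_step = {"DYS391": 9, "DYS426": 11}
three_steps = {"DYS391": 9, "DYS426": 13}
# Someone who tested more markers can still be compared on the markers in common.
more_markers = {"DYS391": 10, "DYS426": 11, "DYS_EXTRA": 14}  # hypothetical marker

print(genetic_distance(reference, reference))     # 0
print(genetic_distance(reference, one_step))      # 1
print(genetic_distance(reference, three_steps))   # 3
print(genetic_distance(reference, more_markers))  # 0
```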

    Caution: If you have an exact match at 12 or 25 markers this does not necessarily mean you are closely related. A comparison of 67 markers is looking in more detail at the DNA and could reveal that there are actually large differences in the results. In the example below, there appear to be four exact matches when testing at 25 markers. However, the same four individuals also took Y-DNA67 tests. When these are considered, the genetic distance turns out to be anything between 4 and 6.

    STR matches at 25 markers, taken from Family Tree DNA website

    STR matches at 67 markers, taken from Family Tree DNA website

    But what does this mean in terms of how closely related you are to someone? Are these Cummings families all related if there are some differences in the STR results? Whilst Y-DNA largely passes down unchanged from one generation to the next, occasionally there is an error or mutation in the replication process and a difference will occur. As these changes occur at a roughly known average rate, knowledge of the mutation rate of STR markers can be used to predict the time to most recent common ancestor (MRCA). Not all STR markers mutate at the same rate, so a genetic distance of, say, 4 will not always equate to the same time to MRCA.
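    To give a feel for how a mutation rate turns into a time estimate, here is a deliberately simplified Python sketch. It is not the method used by any testing company: it assumes a single average mutation rate for every marker (the 0.002 per marker per generation used here is a placeholder, not a published figure), treats the genetic distance as a simple count of mutations, and considers every number of generations up to a chosen maximum equally likely beforehand. The percentages it produces will not match those on a testing website.

```python
import math
from typing import List


def poisson(k: int, lam: float) -> float:
    """Probability of exactly k events when lam are expected on average."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mrca_cumulative(distance: int, markers: int,
                    rate: float = 0.002, max_generations: int = 100) -> List[float]:
    """cumulative[g - 1] = estimated chance the MRCA lived within g generations."""
    likelihoods = [
        # Two lines of descent, each g generations long, across all markers.
        poisson(distance, 2 * g * markers * rate)
        for g in range(1, max_generations + 1)
    ]
    total = sum(likelihoods)
    cumulative, running = [], 0.0
    for value in likelihoods:
        running += value / total
        cumulative.append(running)
    return cumulative


exact_match = mrca_cumulative(distance=0, markers=37)
four_apart = mrca_cumulative(distance=4, markers=37)
print(f"Within 8 generations: {exact_match[7]:.0%} vs {four_apart[7]:.0%}")
```

    Even with these crude assumptions the pattern is the one we expect: an exact match makes a recent common ancestor far more likely than a genetic distance of 4 does.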

    This sounds complicated but is simplified by the tools available to us online. The Family Tree DNA website allows you to get an idea how far back your common ancestor lived by clicking on the orange “TiP” (“Time Predictor”) icon. If we click on the “TiP” icon for J Cummings in the 25 marker test example above we get the following:

    Likely relationships assessed on 25 marker match data

    This indicates that J Cummings and the test subject have an 85% chance of sharing a common ancestor in the past 8 generations. However, when 67 markers are compared, this changes:

    Likely relationships assessed on 67 marker match data

    Now there is only a 51% chance they shared a common ancestor over the last 8 generations, but an 82% chance they shared an ancestor within 12 generations.

    A more generic way of assessing genetic distance, without bringing in the difference in mutation rates, is as follows:

    Y-DNA genetic distances from FTDNA website

    So our J Cummings is likely to be related to the test taker within a genealogically relevant timeframe, but the other individuals are probably connected further back in time. To get the best from a Y-DNA test, always test the highest number of markers you can sensibly afford.

    The results of STR testing, in the format of number of repeats per STR marker, are called an individual’s haplotype, as in the table above. However, the information regarding the known rate of mutation in the different STR markers can also be used to estimate an individual’s haplogroup.

    An individual’s haplogroup is their location in the human Y-DNA haplogroup tree. Everyone fits on this tree, some branches dating far further back in time than those derived from more recent mutations. A simple graphic is shown below but there are many branches, or subclades, within each haplogroup. Haplogroups are further defined with SNP testing, as described below.

    The Y-haplotree (Wikipedia)


    SNP testing: defining haplogroups

    A Single Nucleotide Polymorphism (SNP) is a point along the DNA molecule known to differ from one individual to another, that is, a point at which a mutation has occurred at some time in the past. Rather than testing areas of repeating nucleotide sequence, as STR testing does, this type of testing analyses which nucleotide is found at many individual locations or SNPs (pronounced “snips”).

    Graphical representation of STR vs SNP testing

    SNP testing is primarily used to improve upon the estimate from STR testing and so define an individual’s haplogroup; it can also be used to assess ancient origins. As more SNP mutations occurred, more branches formed in the haplotree (above), each defined by one or more SNPs. A more detailed version of the Y-DNA haplogroup tree can be found on the website of the ISOGG (International Society of Genetic Genealogy).

    If an individual has a mutation at a particular SNP he moves down to the relevant branch; if not, he stays where he is on the tree. Y-DNA haplogroups used to be written in the form R1a1a1b2a2a1 but as more research is conducted more and more branches are discovered. Now the name of the haplogroup is shortened to the letter from the major haplogroup branch, followed by the final SNP at which a mutation was detected. Care needs to be taken with comparing haplogroups, as one tester may have conducted SNP testing to a deeper level than another (i.e. they may appear different but could actually be from the same higher branch).
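    The point about testing depth is easiest to see with the older long-form names, where each extra letter or number is one step further down the tree. The Python sketch below is a hypothetical helper, not a tool offered by any testing company: it splits two long-form names into steps, finds the branch they share, and reports whether one result could simply be a shallower version of the other. It assumes both names come from the same version of the tree, since the long-form names were revised as new branches were discovered.

```python
import re
from typing import List


def branch_steps(name: str) -> List[str]:
    """Split a long-form name such as 'R1a1a' into its steps: R, 1, a, 1, a."""
    return re.findall(r"[A-Z]+|[0-9]+|[a-z]+", name)


def common_branch(a: str, b: str) -> str:
    """The deepest branch that both long-form names sit beneath."""
    shared = []
    for step_a, step_b in zip(branch_steps(a), branch_steps(b)):
        if step_a != step_b:
            break
        shared.append(step_a)
    return "".join(shared)


def may_be_same_branch(a: str, b: str) -> bool:
    """True if one name is just a shallower report of the other."""
    return common_branch(a, b) in (a, b)


print(may_be_same_branch("R1a1a", "R1a1a1b2a2a1"))  # True: same line, tested to different depths
print(may_be_same_branch("R1a1a", "R1b1a"))         # False: the lines split at R1
print(common_branch("R1a1a", "R1b1a"))              # R1
```

    Splitting the name into steps, rather than comparing the text directly, avoids treating a branch numbered 10 as if it sat beneath a branch numbered 1.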

    In the example above for Cummings Y-DNA all four individuals have had their haplogroups defined with extensive SNP testing, the “Big Y” test at Family Tree DNA. We can see that all individuals are within haplogroup R, but not the same branch. SNP testing can therefore be used to complement the STR results: we would expect J Cummings and K R Cummings to be more closely related to one another than to the others tested (the test taker is also R-YP983).

    The haplogroup can also be used to assess the ancient origins of an individual. For example, the haplogroup R originated in Central Asia. More information on the origin of haplogroups can be found here.


    The Testing Companies

    STR testing:

    Only Family Tree DNA currently offers a separate Y-DNA test where STR results can be compared with other users. Your own STR data can also be downloaded for further analysis elsewhere.

    Haplogroup / SNP testing:

    Haplogroup and SNP testing is also available from Family Tree DNA.

    Whilst other DNA companies do not offer a separate Y-DNA test examining STRs, some do provide the Y-haplogroup as part of their single combined DNA test:

    Living DNA:

    Living DNA’s test results for males include measurement of roughly 20,000 SNPs on Y-DNA to define the haplogroup.

    23 and Me:

    The 23 and Me DNA test results for males include measurement of “hundreds of Y-chromosome single nucleotide polymorphisms (SNPs)” to define the haplogroup.



    Using DNA testing to solve genealogical problems is dependent on the databases of test results. If no one else from your paternal line has yet tested, you won’t get any matches. In addition, all of the commercial companies’ databases currently contain more results from people in the US than from anywhere else. This is changing as more and more people test, from all over the world. The test results are powerful but you may have to be patient.


    A plea from me

    Do you have any COWLING ancestors from England? I have registered a Cowling One Name Study and am interested in expanding this to incorporate a DNA Project. If you are a male Cowling descended from Cowlings in England, particularly those from Cambridgeshire, Yorkshire and Cornwall, and would be interested in taking part, do please get in touch. Similarly, if you have Cowling relatives and are just interested in the One Name Study, do get in touch too. I would love to hear from you.


    Next Up

    The next post will focus on mtDNA (mitochondrial DNA), the DNA we can use to assess ancient origins on the female line.



