In my last blog we looked at the science behind DNA testing and the different types of test available.
Here we start to look at each type of DNA testing in more detail.
Uses of Y-DNA
Y-DNA is passed from father to son unchanged for many generations and this makes it a powerful tool for assessing your paternal line: a match on a Y-DNA test can only lead you up one part of your family tree.
Y-DNA is often used by those running surname studies as, in principle, the descent of the male line is the same as the descent of the surname. Y-DNA can therefore be used to assess the likelihood of all bearers of a particular surname arising from the same single individual, no matter how far back in time this individual lived. There are single surname DNA projects through the commercial testing sites and a number of One-Name Studies (ONS) also operate DNA projects.
However, human nature results in a number of scenarios where this hypothesis falls down. The most common issue is the bearer of a surname being found to be illegitimate: Mr Postlethwaite was actually a plain old Mr Brown. In DNA circles this is referred to as a “non-paternity event” or NPE. There are a number of other reasons a surname may be assumed: unofficial adoption, taking on a stepfather’s surname and so on.
Y-DNA can also be used to identify an unknown father. The caveat here is that the results may point to a particular male line but not necessarily to an individual within that line. Y-DNA testing alone could point to Mr A being the father, but equally to his brother, his paternal cousin or his uncle. Just as we said last time, the DNA test is one source of information that can be used to aid genealogical research. It needs to be used in the context of known information and documentary research: in this case, who was in the right place at the right time?
Who can take a test?
Y-DNA tests can only be taken by males. However, the power of the Y chromosome is that it is passed down largely unchanged over many generations. If you have no brothers you can ask a cousin to test, or an uncle, or even a second or third cousin, so long as they descend through an unbroken line of males from your common ancestor.
The schematic above shows all descendants of a single couple, marked in blue or pink depending on gender. Spouses are shown in black for clarity. Solid blue boxes indicate the path of descent of Y-DNA. Our test taker is shown at the bottom right in solid pink, a female family historian interested in her maternal grandfather’s line. Unfortunately, her grandfather died some years ago. At first it seems there is no way that she can find a male to test for Y-DNA: her mother is one of two sisters and her grandfather had no brothers. The beauty of Y-DNA is that we can keep moving backwards. A generation further back, our family historian’s great grandfather had one sister and one brother. If we look at the brother’s line we can see that he had two sons. One died without children but the other had a son, who also had a son. Assuming one of these individuals is still alive, we have found a test candidate whose Y-DNA is equivalent to that of our family historian’s grandfather. This approach does assume that all are the blood line of the uppermost male. A non-paternity event is a possibility and you should always test more than one candidate from more than one line if you can.
There are two different methods of analysing and comparing Y-DNA. STR or Short Tandem Repeat testing is useful for comparing the relationships between individuals in a genealogically relevant timeframe. What do we mean by genealogically relevant? Simply, the timeframe over which we are probably going to be able to support any findings with documentary evidence. SNP or Single Nucleotide Polymorphism testing is used to define a person’s haplogroup and investigate their ancient origins.
STR Testing: haplotypes
Short Tandem Repeats (STRs) are positions along the DNA molecule where the same sequence of nucleotides or bases is characteristically repeated a number of times, e.g. AGTCAGTCAGTCAGTCAGTC. Each STR marker is named, typically in the format DYS391 (where D = DNA, Y = Y chromosome and S = (unique) segment). The number of times the sequence is repeated at each marker is counted and the results of the Y-DNA test take the format shown below:
This set of results is an individual’s haplotype. When purchasing a Y-DNA test you will see the names Y-DNA37, Y-DNA67 and Y-DNA111. The number refers to how many STR markers are tested, also known as the haplotype resolution. The example above tested at 12 markers only. Whilst early tests could only look at 12 markers, 37, 67 and 111 are now more common as the technology has developed, with some tests looking at even greater numbers of markers. You can still compare results with someone who tested with a different number of markers: a test at 37 markers looks at the same 12 markers as a 12 marker test, plus an additional 25, and so on.
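As a rough illustration of what counting repeats at a single marker means, the following Python sketch finds the longest unbroken run of a repeated sequence. The function name and the flanking bases in the second example are invented for illustration, and real laboratory analysis is far more involved than this.

```python
import re


def count_tandem_repeats(sequence, motif):
    """Return the length, in repeats, of the longest unbroken run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# The repeated sequence quoted earlier in this post: AGTC five times over.
print(count_tandem_repeats("AGTCAGTCAGTCAGTCAGTC", "AGTC"))  # 5

# Hypothetical flanking bases either side of a shorter run.
print(count_tandem_repeats("TTAGTCAGTCGG", "AGTC"))  # 2
```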
The results of the STR tests are compared with those of others on the commercial websites to see if they are matches. If all results match the genetic distance is zero. In the example above, if another set of results was compared and all were the same except the result for DYS391 was 9 instead of 10, this would be termed a genetic distance of one. If DYS391 was 9 and DYS426 was 13 this would be a genetic distance of 3, i.e. the difference of 1 at DYS391 plus the 2 on DYS426.
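The arithmetic described above can be sketched in a few lines of Python. The marker values are made up for illustration, apart from DYS391 = 10, which comes from the example; DYS426 is assumed to be 11 so that a result of 13 gives a difference of 2. This simple sum of differences follows the description in this post and is not necessarily the exact method any testing company uses.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum the absolute differences in repeat counts over markers both people tested."""
    shared_markers = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared_markers)


# Hypothetical haplotype: only DYS391 = 10 is taken from the example in the text.
test_taker = {"DYS393": 13, "DYS390": 24, "DYS391": 10, "DYS426": 11}

exact_match = dict(test_taker)
one_step = {**test_taker, "DYS391": 9}
three_steps = {**test_taker, "DYS391": 9, "DYS426": 13}

print(genetic_distance(test_taker, exact_match))  # 0
print(genetic_distance(test_taker, one_step))     # 1
print(genetic_distance(test_taker, three_steps))  # 3
```

Because only the markers both people share are compared, the same function can compare a result from a smaller test with one from a larger test.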
Caution: If you have an exact match at 12 or 25 markers this does not necessarily mean you are closely related. A comparison of 67 markers is looking in more detail at the DNA and could reveal that there are actually large differences in the results. In the example below, there appear to be four exact matches when testing at 25 markers. However, the same four individuals also took Y-DNA67 tests. When these are considered, the genetic distance turns out to be anything between 4 and 6.
But what does this mean in terms of how closely related you are to someone? Are these Cummings families all related if there are some differences in the STR results? Whilst Y-DNA largely passes down unchanged from one generation to the next, occasionally there is an error or mutation in the replication process and a difference will occur. Because these changes occur at a roughly known average rate, knowledge of the mutation rate of STR markers can be used to predict the time to the most recent common ancestor (MRCA). Not all STR markers mutate at the same rate, so a genetic distance of, say, 4 will not always equate to the same time to MRCA.
This sounds complicated but is simplified by the tools available to us online. The Family Tree DNA website allows you to get an idea how far back your common ancestor lived by clicking on the orange “TiP” (“Time Predictor”) icon. If we click on the “TiP” icon for J Cummings in the 25 marker test example above we get the following:
This indicates that J Cummings and the test subject have an 85% chance of sharing a common ancestor in the past 8 generations. However, when 67 markers are compared, this changes:
Now there is only a 51% chance they shared a common ancestor over the last 8 generations, but an 82% chance they shared an ancestor within 12 generations.
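To give a feel for the kind of calculation that lies behind predictions like these, here is a deliberately simplified Python sketch. It is not Family Tree DNA’s TiP method, which is not described in this post. It assumes that every marker mutates at the same average rate (the 0.004 per marker per generation used below is an illustrative assumption), treats the genetic distance as the number of mutations that have occurred, and starts from the assumption that every number of generations up to a cut-off is equally likely. The numbers it produces should not be expected to match the TiP figures.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected on average."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def cumulative_mrca_probability(distance, markers, mutation_rate=0.004,
                                max_generations=200):
    """Return a list where item g-1 estimates P(common ancestor within g generations).

    Two lines descending from a common ancestor g generations ago involve
    2 * g father-to-son transmissions, each with `markers` chances to mutate.
    """
    weights = [
        poisson_pmf(distance, 2 * g * mutation_rate * markers)
        for g in range(1, max_generations + 1)
    ]
    total = sum(weights)
    cumulative, running = [], 0.0
    for weight in weights:
        running += weight
        cumulative.append(running / total)
    return cumulative


# Exact match at 25 markers versus a genetic distance of 4 at 67 markers.
for distance, markers in [(0, 25), (4, 67)]:
    result = cumulative_mrca_probability(distance, markers)
    print(distance, markers, round(result[7], 2), round(result[11], 2))
```

Even in this crude model, an exact match at 25 markers gives a higher chance of a common ancestor within 8 generations than a genetic distance of 4 at 67 markers, which is the same pattern seen in the two reports.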
A more generic way of assessing genetic distance, without bringing in the difference in mutation rates, is as follows:
So our J Cummings is likely to be related to the test taker within a genealogically relevant timeframe, but the other individuals are probably connected further back in time. To get the best from a Y-DNA test, always test the highest number of markers you can sensibly afford to.
The results of STR testing, in the format of number of repeats per STR marker, are called an individual’s haplotype, as in the table above. However, STR results can also be used to estimate an individual’s haplogroup.
An individual’s haplogroup is their location in the human Y-DNA haplogroup tree. Everyone fits on this tree, some branches dating far further back in time than those derived from more recent mutations. A simple graphic is shown below but there are many branches, or subclades, within each haplogroup. Haplogroups are further defined with SNP testing, as described below.
SNP testing: defining haplogroups
A Single Nucleotide Polymorphism (SNP) is a point along the DNA molecule known to differ from one individual to another, that is, a point at which a mutation has occurred at some time in the past. Rather than testing areas of repeating nucleotide sequence as STR testing does, this type of testing analyses which nucleotide is found at many individual locations or SNPs (pronounced “snips”).
SNP testing is primarily used to improve upon the estimate from STR testing and so define an individual’s haplogroup; it can also be used to assess ancient origins. As more SNP mutations occurred, more branches formed in the haplotree (above), each defined by one or more SNPs. A more detailed version of the Y-DNA haplogroup tree can be found on the website of the ISOGG (International Society of Genetic Genealogy).
If an individual has a mutation at a particular SNP he moves down to the relevant branch; if not, he stays where he is on the tree. Y-DNA haplogroups used to be written in the form R1a1a1b2a2a1 but as more research is conducted more and more branches are discovered. Now the name of the haplogroup is shortened to the letter of the major haplogroup branch, followed by the final SNP at which a mutation was detected. Care needs to be taken when comparing haplogroups, as one tester may have conducted SNP testing to a deeper level than another (i.e. they may appear different but could actually be from the same higher branch).
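The naming convention and the caution about comparing haplogroups can be illustrated with a short Python sketch. The SNP names used here (SNP1, SNP2 and so on) are placeholders rather than real SNPs, and the two function names are invented for this example. Each person’s result is written as a path from the major haplogroup letter down through the SNPs at which a mutation was detected.

```python
def short_name(path):
    """Return the short form: major haplogroup letter plus the final SNP detected."""
    return path[0] if len(path) == 1 else f"{path[0]}-{path[-1]}"


def could_share_branch(path_a, path_b):
    """True if one path is simply the start of the other, so the results do not conflict."""
    shorter, longer = sorted((path_a, path_b), key=len)
    return longer[:len(shorter)] == shorter


# Placeholder paths, not real SNP names.
deep_tester = ["R", "SNP1", "SNP2", "SNP3"]
shallow_tester = ["R", "SNP1"]
other_branch = ["R", "SNP1", "SNP9"]

print(short_name(deep_tester))     # R-SNP3
print(short_name(shallow_tester))  # R-SNP1
print(could_share_branch(deep_tester, shallow_tester))  # True
print(could_share_branch(deep_tester, other_branch))    # False
```

The first two testers have different short names, yet the shallower result is fully consistent with the deeper one; only the third sits on a genuinely different branch.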
In the example above for Cummings Y-DNA all four individuals have had their haplogroups defined with extensive SNP testing, the “Big Y” test at Family Tree DNA. We can see that all individuals are within haplogroup R, but not on the same branch. SNP testing can therefore be used to complement the STR results: we would expect J Cummings and K R Cummings to be more closely related to one another than to the others tested (the test taker is also R-YP983).
The haplogroup can also be used to assess the ancient origins of an individual. For example, the haplogroup R originated in Central Asia. More information on the origin of haplogroups can be found here.
The Testing Companies
Only Family Tree DNA currently offers a separate Y-DNA test where STR results can be compared with other users. Your own STR data can also be downloaded for further analysis elsewhere.
Haplogroup / SNP testing:
Haplogroup and SNP testing is also available from Family Tree DNA.
Whilst other DNA companies do not offer a separate Y-DNA test examining STRs, some do provide the Y-haplogroup as part of their single combined DNA test:
Living DNA’s test results for males include measurement of roughly 20,000 SNPs on Y-DNA to define the haplogroup.
23andMe:
The 23andMe DNA test results for males include measurement of “hundreds of Y-chromosome single nucleotide polymorphisms (SNPs)” to define the haplogroup.
Using DNA testing to solve genealogical problems is dependent on the databases of test results. If no one else from your paternal line has yet tested, you won’t get any matches. In addition, all of the commercial companies’ databases currently contain more results from people in the US than from anywhere else. This is changing as more and more people test, from all over the world. The test results are powerful but you may have to be patient.
A plea from me
Do you have any COWLING ancestors from England? I have registered a Cowling One Name Study and am interested in expanding this to incorporate a DNA Project. If you are a male Cowling descended from Cowlings in England, particularly those from Cambridgeshire, Yorkshire and Cornwall, and would be interested in taking part, do please get in touch. Similarly, if you have Cowling relatives and are just interested in the One Name Study, do get in touch too. I would love to hear from you.
The next post will focus on mtDNA (mitochondrial DNA), the DNA we can use to assess ancient origins on the female line.