Showing posts with label beginners. Show all posts

Wednesday, May 16, 2018

Making the Most of Your DNA Matches

One of the more frustrating aspects of AncestryDNA is how few people have a family tree available, and when they do, it's often private or so small you might think you can't get any use out of it. Of course, I would encourage everyone to contact their DNA matches with private trees and politely ask for an invite, and I would also encourage people to contact their matches who have no trees, as they might know enough about their ancestry to make a connection between you even if they didn't add it to a tree. But oftentimes, people don't respond to our messages, or they decline our invite request. Dead end after dead end, right? Well, there are a few ways around these dilemmas. Although some are a little specific to AncestryDNA, they can often be utilized with other companies too.

1. Look for a family tree, even if one isn't attached.
When you open the match details page, if there is a family tree available but not attached to the DNA test, it will have a drop down menu where you can select the tree to preview (shown above and left). In the screenshot above, it shows how initially, it looks like this DNA match has no family tree, but they do have one unattached to their DNA results. Selecting it from the drop down menu brings up a preview. It's a small tree, but enough to identify our most recent common ancestor, since their grandfather was the brother of my great grandmother.

This one you do need to be careful with because while sometimes, people simply forget to attach their tree to their DNA test, it's also possible that the family tree doesn't belong to the person whose test you match (or the tree may belong to that person but they are not the "home person" for the tree, as is automatically selected). For example, one of my close cousins has taken the test, but his wife is managing it. His wife has started her family tree, but not his, and I only know this because I know them well enough to know whose tree it is. To anyone else who doesn't know them, they could mistake the wife's tree for his own. In this case, there is a good reason the tree wasn't attached to the test. So definitely look for those unattached family trees, but don't make too many assumptions about them.

Don't dismiss a tree like this!
2. Build a tree from their shrubs.
Don't dismiss trees that seem too small to make any use of. As long as they have deceased ancestors in their tree (whose details are therefore public) you can do what genealogists do best: research! Build on that tiny shrub of a tree, researching further back than the tree owner did until you find your common ancestor.

In the example above/left, you might look at this family tree and think there is not enough information to find the most recent common ancestor, but you'd be wrong. This person's father is a descendant of my 4th great grandparents John Hendricks Godshalk and Barbara Kratz. How do I know? Because I took this tiny tree and I researched the ancestors until I connected it to my own tree.

3. Build downwards on your own tree.
Research all the descendants of your known ancestors, as far down as you can. It really helps when you're trying to make a connection with a small 'shrub' of a tree such as discussed above. You won't have to research your match's tree back very far if you've already done the work on your own tree.

This is especially useful for trees with endogamy - for example, I have a branch of Mennonites on my tree and after tracing many other descendant lines of my ancestors, it quickly became clear there are a number of surnames that are strongly associated with the colonial Mennonites who settled in Pennsylvania, especially when more than one appears in a tree. So if I see names in someone's tree like Oberholtzer, Funk, Detweiler, Bergey, etc, even though none of these are my ancestors, I immediately know they are likely from my Mennonite branch just from seeing the surnames. In fact, in the screenshot above the match's father's name was Detwiler, immediately suggesting I should follow that side back until it linked to my own tree, and it did. Even on branches without endogamy, it can still be useful, just not as immediately apparent.

Notes always showing in list allows me to quickly see which ancestors I share with matches I have in common with someone
4. Look at your Shared Matches.
If there really is no tree whatsoever you can make use of, and the person won't respond to your messages, all you can do is look at the DNA matches you have in common with each other. If any of them are matches you've already determined your shared ancestry with, then it's possible this match is also descended from the same branch. If more than one are descended from the same branch, then it's very likely this person is too. The more shared matches who descend from the same branch or ancestor, the more likely the person with no tree does too.

This process can be sped up greatly by using a Chrome extension called MedBetterDNA. It has the option to "always show notes", which means any notes you make on a DNA match will show up in the list of matches, including the list of Shared Matches. In other words, every time you identify the shared ancestor of a DNA match, make a note of that ancestor in the notes section; then every time that match is a Shared Match with someone else who doesn't have a tree, you will know it without having to open up each additional match's details. See the screenshot example above. I cannot stress enough how much more efficient this has made my workflow.
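If you keep your notes somewhere you can copy them from, the "more shared matches from the same branch, the more likely" reasoning can be sketched as a simple tally. This is only an illustration: the match names and the `shared_match_notes` dictionary are made up, and nothing here reads from AncestryDNA or MedBetterDNA.

```python
from collections import Counter

# Hypothetical notes for the Shared Matches of one treeless match:
# match name -> the branch/ancestor couple noted for them (None if not identified yet)
shared_match_notes = {
    "Cousin A": "Godshalk/Kratz",
    "Cousin B": "Godshalk/Kratz",
    "Cousin C": None,
    "Cousin D": "Other branch",
}

def tally_branches(notes):
    """Count how many already-identified shared matches fall on each branch."""
    return Counter(branch for branch in notes.values() if branch)

tally = tally_branches(shared_match_notes)
likely_branch, count = tally.most_common(1)[0]
print(f"Most common branch: {likely_branch} ({count} shared matches)")
```

A higher count only makes a branch more likely; it doesn't prove the connection, so treat the result as a lead to research rather than an answer.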

5. Use the Search option for private trees.
It's frustrating to see all those private trees, especially when the owner doesn't respond. But you can get an idea of what surnames are in their tree by using the search option. That doesn't mean your shared ancestor is definitely from that surname, but it is especially useful for private trees you have a Shared Ancestor Hint with. Knowing you do have a shared ancestor with that match makes it much more likely a shared surname is the source of that ancestor. This method is a little tedious though, since you have to randomly search for surnames from your tree and hope you get a hit for the match you're looking for, but you should theoretically get there eventually if there is a Shared Ancestor Hint. However, be aware that the search function isn't hugely reliable and often misses people who definitely have a surname you're searching for in their tree. I think it's a site indexing issue. So it doesn't always work, but when it does, it's helpful. It is also useful in combination with the above tip (a surname search result plus Shared Matches who are confirmed from the same branch as that surname is very good evidence your Shared Ancestor Hint is from that branch).

6. Test other family members.
Testing family members, especially parents, is beneficial because you can at least see which of your matches also match those family members, and therefore which side or branch of your tree the shared ancestor is likely from. No tree? Won't respond to messages? No shared cousins who have been identified yet? Well, at least I can see whether they match my mom, dad, paternal grandfather, or any of my known, close cousins on either side who have tested.

Be aware that the Shared Matches feature only includes high confidence (or higher) matches who are estimated 4th cousins or closer, but if you manage any of your family members' kits, you can see which matches you have in common at any level/degree by opening that match's profile. In the example above, you'll see my dad (Jim) matches Agnes and two other kits she manages, even though they do not meet the criteria of "Shared Matches". So when I look at Agnes or her other kits in my match list, it won't show my dad as a shared match to them, even though you can see here by opening Agnes' profile, they are a match to my dad. So not only testing other family members, but also getting permission to manage their tests, is very beneficial for at least figuring out which side/branch someone is connected to.

7. Search the internet for your DNA match
This one may seem a little intrusive to some, but the data is public and it's out there, so why not make use of it? There are certain websites like familytreenow.com, truepeoplesearch.com, and pipl.com where you can search for people by their real names, or sometimes by a username. Ancestry.com and FamilySearch.org have some public records of living people too. Even just a Google search can yield results; some people use their real names on AncestryDNA - so search for it. Sometimes, you can find them on Facebook or find other contact details. Sometimes, you can find out their parents' names, and from there, build a tree and connect it to your own. I know these sites can be controversial to some who feel they are a violation of privacy, but they are using public data and not violating any laws. If you are concerned, you can request your information be removed from these sites.

Even when people use anonymous usernames, sometimes they post on Ancestry's message boards with info on their tree and you can find them by Googling the username. Sometimes they use the same username on other websites and you can get in touch with them that way.

It is important to remember that not everyone has as great an interest in genealogy and DNA as we do. Many (perhaps even most) people take the test only for the ethnicity report and may never return to the site after seeing it. Others might be adopted and not know anything about their biological ancestry, and in some cases, there are individuals who might have died after taking the test and not given any family members access to their account. There are many reasonable explanations for why people don't respond to our messages, so try not to get too frustrated by it. Focus and work with what you have, and don't let the rest get to you or you'll drive yourself crazy!

Wednesday, January 3, 2018

The Importance of Primary Sources

This death certificate is only a primary source for the death info. His parents' names are actually wrong, and his birth year is in conflict with records from when he was alive, suggesting the informant, his son, may have gotten it wrong.
I often see people asking about which source is better for a certain fact or event and this is a good time to address the differences between a primary source and a secondary source. A primary source is a document which is recorded at the time of the event it's detailing. A secondary source is one that is detailing an event that occurred in the past, and therefore may be more likely to be incorrect. A primary or secondary source can also be a person, regarding whether or not that person was alive/witness to the event in question. So to understand the reliability of a record, we have to understand what it's a primary source for, and what it's not. Here's a quick rundown:
  • Birth records are the only primary source for a birth. This may include a birth announcement in the newspaper, but the further back you go, the less likely this becomes. Equally, the further back you go, the less likely that civil vital records were kept. Delayed birth certificates aren't a primary source, but may be the only record of a birth in existence. Also keep in mind that some places would fine individuals for reporting a birth too late, which means they may have lied about the birth date to avoid being fined.
  • Baptism records are only a primary source for the baptism, not the birth. However, if the baptism occurred only a few days after the birth, that's pretty much as good as a primary source for the birth too (if it recorded the birth date - do not assume the baptism and birth date are the same if both aren't recorded). Especially if there's no birth record in existence, a baptism record is likely as good as it's going to get. However, if the baptism took place years after the birth, maybe even months, that is not a primary source for the birth because enough time has passed for the actual birth details to have been remembered incorrectly.
  • Marriage records are only a primary source for the marriage. Particularly if the parents of the bride or groom were deceased, you can't be sure their names are correct. Be careful not to mistake a marriage bann, engagement announcement, or marriage license for the actual marriage.
  • Death records are only a primary source for the death. If it includes an address where the deceased was living at the time of death, then it's also a primary source for that. But it's a secondary source for the birth date and location, both because the document is normally recorded many years after the birth (unless it's an infant death), and because the informant for the death record is often someone who wasn't even alive or present at the time of the deceased's birth. It's also not a primary source for the parents' names or birth locations; it's very common for those details to be incorrect. Death records are usually a good source for the burial location, even though they are recorded before the burial takes place, and therefore that info theoretically could change before it happens.
  • Obituaries are generally considered a type of death record and therefore can be considered a primary source for the death if they are published within a few days of the death, as is typical, excepting potential printing errors, of course (e.g. the informant may have provided the correct death information, but the typist misprinted it).
  • Gravestones aren't really a primary source for anything! At the most, they may be a primary source for the location of the burial, but I have seen gravestones erected for people before their death, who then actually wind up buried elsewhere. However, when this happens, there's usually a lack of at least a death date on the gravestone. It's also not a primary source for the date of the burial, since gravestones don't normally have the burial date listed on them. You might think it's a primary source for the death date, but gravestones often aren't created for weeks or even months after the death, plenty of time for people to remember the exact date incorrectly. 
  • Cemetery/burial records are only a primary source for the burial information. Unlike gravestones, these usually include the interment date and wouldn't exist unless the deceased was actually buried there.
  • Census records are only a primary source for data that was current at the time the census was taken, such as: residence, occupation, citizenship, literacy, etc. All other data that occurred prior to the census - birth/age, marriage, immigration, etc is secondary. Additionally, even things like the occupational data may be subject to the knowledge of the informant and could be incorrect. Also don't forget that in the US, pre-1880 censuses did not record relationships to the head of the household. While you can often surmise relationships based on the order in which people are listed, ages, and names, you can't be sure about them without other supporting documents to confirm. 
  • Family bibles may or may not be a primary source for any or all of the data within, depending on when each item of information was recorded and who recorded it. Unfortunately, there's generally no way to know for sure when the data was recorded, or who by. You can sometimes get an idea based on the handwriting and/or different types of pens used at different times. For example, you might note the birth info was recorded at a different time from the death info. But this still doesn't assure they were recorded at the time of those events. They could have each been recorded years after the fact, whenever the author (and we may not even know who the author was) got around to it.
  • Wills and Probates can contain a lot of valuable and reliable information, like the names of someone's children, the details of their estate/property, etc. But even though they are related to the death fact, they typically don't contain a date or location of the death, let alone a cause of death. Don't mistake the will or probate dates for the death date, but you can usually get a time frame for the death date - sometime after the will date, and before it was probated.
  • Lineage books are a secondary source for everything in them, since they are written after all the events took place. However, many lineage books use primary sources for at least some of their information. That doesn't necessarily mean the entire book is reliable, but the particular data coming from primary sources should be. Not all authors note their sources, but many do.
A gravestone with no dates - this person was actually buried in a different cemetery (I believe his parents erected gravestones for their children in the family plot, but some of them wound up choosing other cemeteries). This is not typical in my experience.
Naturally, we do not always manage to find a primary source for each bit of information and that doesn't mean we can't use secondary sources. Even primary sources can be wrong sometimes, they are just much less likely to be so. We just have to work with what we have, and what exists, and understand what is more or less likely to be accurate. Having data from a secondary source doesn't mean we can't put that data or that source in our tree, it just means we should keep looking for better or additional resources to help confirm or deny it. Family trees are forever a work in progress and no one should assume that once a piece of information is put into a tree, it means you're confident it's accurate. The sources you cite in your tree should speak for themselves as to their reliability.

Judging which secondary source is more reliable for what type of conflicting data can be difficult, and we have to weigh when, how, and by whom the data was recorded/provided. You may think it makes the most sense to go with a birth year that you find on most of the records for an individual, but what if all those records are from later in his life, or even after his death? A record from his childhood, closer to when he was born, when his parents (who were there for the birth) were still alive and one of them may have been the informant, may actually prove to be the more reliable source. Of course you can never know for sure, so it's also best to put all recorded facts in your tree as alternate data, but you still have to pick a default/preferred one. Hopefully, this has given you some things to consider when choosing a default/preferred fact to go with, and given you a good understanding of primary and secondary documents.

Tuesday, September 19, 2017

A Gedmatch Admixture Guide: Parts 3 and 4

Continuing on from Parts 1 and 2 where I covered the different projects and calculators available for Admixture Proportions and what Oracle is and how to read it, I've had some requests to cover the other viewing options available like Admixture Proportions by Chromosome and Chromosome Painting. So that's what I'll be covering in Parts 3 and 4. For Part 5 on Spreadsheets, click here.

Part 3 - Admixture Proportions by Chromosome

How to find it: From your Gedmatch home page, under "Analyze your data" and then "DNA raw data", choose the option for "Admixture (Heritage)" like you did in Part 1, but this time you're going to select "Admixture Proportions by Chromosome" from the bullet list. Be sure to select a project and then calculator and put in your kit number like normal. I would go with whatever calculator you found reflected your known ancestry best. If you haven't read Part 1 yet, you should do so first.

Admixture Proportions by Chromosome shows you your admixture proportions as broken down by individual chromosome; or, in other words, what percentages of each chromosome are most commonly found in which populations/ethnicity. This gives you a much more detailed view of where your DNA is most commonly found.

Admixture proportions (or ethnicity percentages) broken down by chromosome
So with Eurogenes K13, it shows my chromosome 1 is 28.1% North Atlantic, 15.7% Baltic, 27.7% West Mediterranean, 16.9% West Asian, 10.9% East Mediterranean, and 1.1% Amerindian. This option can often show results in populations that don't show up in a normal Admixture Proportions calculator. However, always keep in mind small percentages may just be from "noise" - like a false positive. I have no Native American ancestry so the 1.1% Amerindian probably doesn't mean anything. You'll also note how I get some North Atlantic results, in varying amounts, on every single one of my chromosomes.

My Eurogenes K13 results
In my normal K13 results, I got 39.03% in North Atlantic, so this is just breaking that average of 39.03% down by chromosome. If you add up all the percentages for one population and divide it by 22 (number of chromosomes) you'll get your overall average for that population. You may note it's a little off from what the admixture calculator originally gave you - for example my average for North Atlantic when each chromosome is added up and divided by 22 is 38.89%, not the original 39.03%. I am not sure why that is, but it's such a small difference I'm not going to worry about it too much. If someone has more information on this discrepancy, please comment below!
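One possible explanation, and this is only my assumption rather than anything Gedmatch documents, is that chromosomes are not all the same size. If the overall figure is effectively weighted by how many SNPs were evaluated on each chromosome, then a simple "add up and divide by 22" average will land slightly off. The numbers below are invented purely to show the effect with three chromosomes.

```python
# Hypothetical figures for three chromosomes:
# (percent in one population, number of SNPs evaluated on that chromosome)
chromosomes = [
    (40.0, 9000),
    (30.0, 6000),
    (50.0, 3000),
]

# Simple average: every chromosome counts equally, regardless of size
simple_mean = sum(pct for pct, _ in chromosomes) / len(chromosomes)

# Weighted average: bigger chromosomes (more SNPs) count for more
total_snps = sum(snps for _, snps in chromosomes)
weighted_mean = sum(pct * snps for pct, snps in chromosomes) / total_snps

print(f"simple mean:   {simple_mean:.2f}%")
print(f"weighted mean: {weighted_mean:.2f}%")
```

With these made-up numbers the two averages differ by under two percentage points, which is the same kind of small gap as the 38.89% vs 39.03% above.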

At the bottom it says "Number of SNPs eval" - this is just how many of your SNPs were used for the evaluation.

It doesn't show which particular segments each percentage is found on though, but that brings us to the next options.

Part 4 - Chromosome Painting and Reduced Size

How to find it: Same as above, but select "Chromosome Painting" or "Chromosome Painting - Reduced Size" from the bullet list instead.


Chromosome Painting is a visual representation of your admixture proportions not only by chromosome but by segments of each chromosome. The different colors show which segments of each chromosome were most similar to which populations. When there are overlapping colors on the same segment, it means that segment is found in more than one population. The higher the spike, the stronger the match to that population. So segments where there are solid blocks of one color are more solidly found in only that population. Above is just a small portion of one of my chromosomes (7, I believe), as an example of the various populations that will show up for any given segment.

You'll note there are numbers along the bottom of each chromosome - these mark the position along the chromosome in millions of base pairs. As a rough rule of thumb, one centiMorgan averages about one million base pairs, though the actual ratio varies across the genome. So if you have a segment painted with a certain color stretching from "10M" to "20M", for example, that's 10 million base pairs, or very roughly 10 cMs. Don't get too excited if you see colors for some unexpected populations - small segments could just be noise.

Chromosome painting reduced size
The reduced size option just condenses it so it's easier to view on a single screen. After viewing the full size, you'll quickly see just how cumbersome it is to get an overview, so the reduced size is ideal for that. The full size is better for examining particular portions. They don't label each chromosome but they are listed chromosome 1 to 22, from left to right. They are also rotated so the start of each chromosome is at the bottom.

You may notice in either the full or reduced size that similar populations (though it's more noticeable in full), or neighboring regions, often spike and dip almost in unison with each other. This is because neighboring regions tend to share a lot of DNA and be genetically similar so when you see this, what you're seeing is that these portions of your DNA may be somewhat indistinguishable among two or more groups. This is important in understanding that not all DNA can be narrowed down to the more specific areas or countries that so many people wish it could, not with any reliability. It also illustrates why you might get results in a region that you have no known ancestry in when it neighbors a region you do have ancestry in.

23andMe's chromosome painting
If you tested with 23andMe, you may be somewhat familiar with chromosome painting already. 23andMe's option for it is a little more straightforward. It doesn't have all the spikes and dips, just solid blocks showing which segments were put into which groups (shown left). However, it does show the two sides of each chromosome whereas Gedmatch doesn't seem to do this. Although in some ways, Gedmatch's painting is more detailed, it is essentially the same concept, just a slightly different approach.

As another example, below is also a graphic from 23andMe - it's not a part of your results from this company, it's just showing, in part, how they determine ethnicity. Their example uses the more detailed type of chromosome painting found at Gedmatch, and it is labelled to show the probability of each ancestry on one side with increasing percentages of likelihood. It can be found in their guide article on ancestry composition. Gedmatch's chromosome painting can be read the same way (i.e., the higher the peak, the higher the probability of that segment being from that population).


Disclaimer: Please note I am not a professional in the genetics industry, and it is difficult to find information particularly on some of the more advanced admixture tools on gedmatch. This is how I have come to understand the results and tools through my own experiences and research, but please, if someone more knowledgeable can correct me if I've misunderstood something, or can fill in some gaps, let me know by commenting below.

Thursday, April 6, 2017

Finally! A Gedmatch Admixture Guide!

Update: I think perhaps I was not clear enough when originally writing this that ethnicity/admixture is only an estimate or interpretation of your DNA; it is NOT an exact science and different interpretations often yield wildly different results. It's usually accurate on a continental level, but sub-continental regions generally share too much DNA to always be able to reliably tell them apart. That is why most of Gedmatch's calculators often cover broad areas. The more specific an area is narrowed down to, the more speculative the results are. Plus, different sample groups and different algorithms will always produce different results and there is no one option that is always going to be more reliable than any other. While the ethnicity reports can be fun and interesting to explore, which is why I wrote this guide, they really should not be taken literally. You should not attempt to use them to definitively prove your ethnic origins (on a sub-continental level), or exact amounts of any given ethnic origin, or a specific geographic path your ancestors might have taken over time, or especially to confirm a specific ancestor's identity. If that is what you're looking to do (particularly the latter), you are better off working with your DNA matches (if you have not opted into matching, you should seriously consider it, since that is where the true value of the test lies). Additionally, be aware that Gedmatch's admixture calculators haven't been updated in years (though all of this still applies to other companies who have provided updates more recently). With all that in mind, I hope this guide is useful in helping people understand the different interpretations of their DNA available on Gedmatch. Have fun, but remember, don't take it too seriously.

For those unaware, Gedmatch.com is a website where you can upload your raw DNA data for further analysis and matching with people from other companies who have also uploaded their data.

Parts 3 and 4 on Admixture Proportions by Chromosome and Chromosome Painting now available.

Part 5 on Spreadsheets is now available.

Part 1 - Admixture Proportions

Introduction
Despite all the help articles available on Gedmatch.com, none of them really offer a comprehensive guide to understanding the admixture calculators for newbies. Most of them are guides on understanding DNA in general, or how to upload your data, or using the one-to-many or one-to-one tools. In fact, there is a very good beginner's guide to the matching side of things found here. But the most common questions I see about Gedmatch are “which admixture calculator do I use?” and “what do the results mean?” There is a Gedmatch wiki page on admixture, and there is Kitty Cooper's slide presentation, but I don’t think they really answer all the questions most people are looking for, especially regarding Oracle. Even Googling the topic only turns up spotty results from forums and blogs, nothing that really lays it all out. Since no one else has done it, here is my attempt. Please keep in mind I am no expert and have no formal education in genetics; this is just the knowledge I’ve gathered over the years from various sources as a result of trying to understand my own DNA results.

Admixture is a scientific term for the ethnicity percentages you received from a DNA company like Ancestry.com, FamilyTreeDNA, 23andMe, or MyHeritage. It’s important to understand that each admixture project on Gedmatch is created by a different person, mostly academics. Note that most of the admixture results will include some basic info on the calculator, either on the results page, or through a link from the creator. However, the info provided may still be technical and difficult to understand for the average person, because they were primarily created for academic purposes. This is an attempt to translate some of that info into something more understandable to the average user. I apologize that this guide favors info on European backgrounds, but that is simply what I’m most familiar with, being a European descendant myself.

Be aware that it’s common practice in DNA admixtures to refer to populations from prehistoric times as “ancient”, even though this is a bit of a misnomer. In historical terms, ancient history marks the beginning of recorded history, but here, “ancient” generally refers to the time before written history, prehistory. Some time periods might be specified as “neolithic”, or “paleo/paleolithic” etc.

Select a project from the drop down menu, leaving the other options as they are, then click "continue"
Step 1: Pick a project.
There are 7 projects to choose from in the Admixture (Heritage) tool (found under "Analyze your data" and "DNA raw data"), but what are they? What do they mean? Which one should you pick? Here’s a basic breakdown:

(Note: below the projects drop down menu there are options like "Admixture Proportions (with link to Oracle)" and "Chromosome Painting", etc. Don't mess with those for now, just stick with the top default option, Admixture Proportions (with link to Oracle), as that is what this guide will cover.)

  1. MDLP
This is a global calculator and attempts to break your results down into different parts of the world. It’s good as an overview, but if, for example, you already know you’re European, it’s probably unnecessary. It’s also heavy on ancient groups. The blog for this project is found here: http://magnusducatus.blogspot.com/

  2. Eurogenes
As the name suggests, this is primarily for people with European backgrounds. While it does have populations outside Europe, there are usually more sub-continental regions for Europe than any other continent. I highly recommend this as the go-to project for people with sole European ancestry. The blog for this project is found here: http://bga101.blogspot.com.au/

  3. Dodecad
This project says it focuses primarily on Eurasians, but most of the calculators are geared more towards Asian and African ancestry than European. It’s not ideal for Europeans, but may be useful for people with mixed ancestry. The blog for this project can be found here: http://dodecad.blogspot.com/

  4. HarappaWorld
This calculator is primarily for people with South Asian ancestry. The blog for this project can be found here: http://www.harappadna.org/

  1. Ethiohelix
This is an African based project, though it does have options for people with mixed backgrounds (but always including African). There is no Native American in this project at all. The blog for this project is found here: http://ethiohelix.blogspot.com/

  1. puntDNAL
This is primarily a project on ancient DNA. There is no website, but questions and comments about should be directed to Abdullahi Warsame at puntdnalking@gmail.com

  1. GedrosiaDNA
This project focuses primarily Eurasian (especially Indian and Asian) and ancient DNA. There is no website, but for further questions, please contact the creator at Dilawerkh4@gmail.com


(Image: once you've selected a project, you need to enter your kit number and then select a specific calculator.)
Step 2: Pick a calculator.
You’ll find that for each project, there are often several calculators to choose from. How to choose? What do they mean? What are the differences? Well, for starters, the numbers following a ‘K’ indicate how many populations (or regions/categories) that calculator includes. So for example, Eurogenes EUtest V2 K15 has 15 populations. So choose one depending on how many regions you want to break your results down into. Keep in mind the more populations and therefore the more specific the regions are, the more speculative the results will be.
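If you ever want to sort or compare calculators by how many populations they have, the 'K' number can be pulled out of the name automatically. This is just a small sketch in Python. The function name is my own invention, and it relies only on the naming convention described above, which not every calculator follows.

```python
import re

def population_count(calculator_name):
    """Return the number following 'K' in a calculator name, or None if there is none."""
    match = re.search(r"\bK(\d+)", calculator_name)
    return int(match.group(1)) if match else None

# Calculator names taken from the lists in this guide.
names = ["Eurogenes EUtest V2 K15", "MDLP K23b", "Eurogenes K13", "MDLP World22"]
for name in names:
    print(name, "->", population_count(name))
```

Note that a name like MDLP World22 has no 'K' in it, so the sketch returns None for it, even though that calculator has 22 populations. For names like that, you have to read the count from the description instead.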

Don't forget to put in your kit number - if you've forgotten it, go back to the home page and copy it.

Certain other tests may be specific to deeper, more ancient (prehistoric) ancestry, like Hunter-Gatherer vs Farmer. Any abbreviation that starts with ‘A’ probably stands for ‘ancient’, but I will post a comprehensive terminology list at the end of this guide. These calculators for ancient DNA aren’t very useful if you’re just looking for an opinion on your more recent ethnicity results.

Other calculators might be specific to certain types of ancestry. For example, Eurogenes’ Jtest is specific to Ashkenazi Jewish ancestry. There’s no need to run this test if you don’t have any Jewish ancestry. In fact, you might get false results in Ashkenazi if you run this calculator and have no Jewish ancestry.

(Note: ignore the option below the calculator drop down menu, this is for data collection purposes. If all 4 of your grandparents are from the same ethnic group and you want your DNA to be a part of the sample groups they use to create these calculators and determine populations, then go ahead and fill it out. Otherwise, you can ignore it.)

Here’s a more detailed breakdown of each calculator. I've also created a spreadsheet listing the populations included for each calculator, along with my recommendations for good calculators to use based on your ancestry or what you're looking for.

MDLP
  • MDLP K11 Modern - 11 global populations including ancient
  • MDLP K16 Modern - 16 global populations including ancient and modern - results page includes full population descriptions
  • MDLP K23b - 23 global populations including ancient
  • MDLP World22 - 22 global populations including ancient, full details including maps of what areas each category covers are found here - there are several Native American categories so this may be ideal for Native American ancestry
  • MDLP World - 12 global populations, probably the original MDLP calculator

(Image: some population maps for Eurogenes ANE)
Eurogenes
  • Eurogenes K13 - 13 global populations, mostly European. Creator made this the default as it “seems to hit the spot for most people” with European background. Details here
  • Eurogenes EUtest V2 K15 - 15 global populations, mostly European, also a popular option. Details including regional maps for each category found here
  • Eurogenes ANE K7 - 7 populations, Ancient North Eurasian, meaning this looks at ancient DNA mostly in Europe, Western Asia, and Africa. Details found here and some maps available here
  • Eurogenes K9b - 9 global populations, approximates Geno 2.0 analysis
  • Eurogenes K9 - 9 global populations, map available here (population descriptions no longer available)
  • Eurogenes K10 - 10 global populations, map available here (population descriptions no longer available)
  • Eurogenes K11 - 11 global populations, map available here (population descriptions no longer available)
  • Eurogenes K12 - 12 global populations. North European ancestry is said to do well with this calculator. Map available here (population descriptions no longer available)
  • Eurogenes K12b - 12 global populations, excluding Native American (Amerindian), map available here (population descriptions no longer available)
  • Eurogenes K36 - 36 global populations, mostly European. This is the most detailed breakdown for Europeans, but that also makes it highly speculative. Details found here and maps available here - there's also an interesting application that will map out your personal K36 results
  • Eurogenes Hunter-Gatherer vs Farmer - 12 ancient Hunter-Gatherer vs Farmer populations. Map available here
  • Jtest - Jewish Ashkenazi, 14 global populations but mostly European, this is essentially the EUtest with an Ashkenazi category. Details including maps are here
  • EUtest - 13 global populations, mostly European minus Jewish Ashkenazi. Details including maps are here

(Image: some population maps for Eurogenes K36)
(Image: some Dodecad K12b population maps)

Dodecad
  • Dodecad V3 - 12 populations, mostly Asian and African, 2 European, no Native American. More info
  • Africa9 - 9 populations, all African except one European (no Asian, no Native American). More info
  • World9 - 9 global populations, not specific to any continent so good as an overview regardless of your ancestry. More info
  • Dodecad K7b - 7 populations, mostly Asian, 2 European, 1 African, no Native American. More info
  • Dodecad K12b - 12 populations, mostly Asian, 3 African, 2 Middle East, 2 European, no Native American. More info and population maps

HarappaWorld
  • HarappaWorld only has one calculator and as explained above, it’s primarily for South Asian ancestry. It does include some European, African, and Native American populations, but its focus is on South Asian: Indians, Pakistanis, Bangladeshis and Sri Lankans.

Ethiohelix
  • EthioHelix K10 + French - 10 populations, 9 African, one “French” which acts as a European population. This is really only useful/accurate for people with mixed African and European ancestry. Maps available here
  • EthioHelix K10 + Japanese - 10 populations, 9 African, one “Japanese” which acts as an Asian population. Only useful for people with a mix of African and Asian ancestry. Maps
  • EthioHelix K10 + Palestinian - 10 populations, 9 African, one “Palestinian” which acts as a Middle Eastern population. Only useful for people with a mix of African and Middle Eastern ancestry. Maps
  • EthioHelix K10 Africa Only - 10 strictly African populations, nothing else. Do not use if you have no African ancestry as results won’t be accurate. Maps

puntDNAL
  • puntDNAL K10 Ancient - 10 ancient populations, incorporates Caucasus HG as well as Early Neolithic Farmers and Western European HG.
  • puntDNAL K12 Ancient - 12 populations, utilizing ancient oracle, more info provided on results page
  • puntDNAL K12 Modern - 12 populations utilizing modern oracle, more info provided on results page
  • puntDNAL K13 Global - 13 modern populations, focuses primarily on Asia (6 Asian populations, 3 African, 2 European, 1 Oceania, 1 Native American). From the creator: "The impetus in creating this calculator was the release of the Southeast Asian study, which inspired me to create a calculator that included a Southeast Asian component and give my Southeast and Northeast asian people a more accurate calculator for their ancestry." Population details
  • puntDNAL K15 - 15 populations, focuses primarily on Africa (particularly East Africa), but also includes some West Asia, and Europe. More info
  • puntDNAL K8 African only - 8 populations, as the name suggests, it’s strictly an African calculator

GedrosiaDNA
  • (Removed) Eurasia K9 ASI - 9 populations, modeled around the ancient Ancestral South Indian component, no Native American. More info on population descriptions
  • (Removed) Eurasia K10 CHG - 10 ancient populations, modeled on Caucasus Hunter Gatherers, more info on population descriptions
  • (Removed) Eurasia K11 CHG-NAF - 11 ancient populations, modeled on Caucasus Hunter Gatherers and Neolithic Anatolian Farmers, more info on population descriptions
  • Gedrosia K3 - 3 populations, Eastern Eurasian, Western Eurasian, and Sub-Saharan African. More details
  • (Removed) Gedrosia K15 - 15 populations with a focus on the Indian subcontinent. Population descriptions
  • (Removed) Eurasia K14 - 14 populations, using the same Neolithic and Bronze Age source data as the K14 Neolithic calculator, plus some modern populations
  • (Removed) Eurasia K14 Neolithic - 14 global populations, focus is on ancient Neolithic and Bronze Age genomes from across Eurasia. Population descriptions
  • Gedrosia K12 - 12 populations, designed for individuals of predominantly South Asian and West Asian ancestry for inferring gedrosian Balochi admixture. No Native American. More info
  • (Removed) Gedrosia K11 - 11 populations with a focus on Kalash Indo European peoples of Pakistan. Population descriptions
  • Ancient Eurasia K6 - 6 ancient populations, primarily Europe, Asia, and in between, 1 African, no Native American. Further descriptions are available on results page.
  • (Removed) Near East Neolithic K13 - 13 ancient populations, with a focus on the Near East. Details provided on results page.


Step 3: Understanding the results: A Terminology Guide
A list of populations you might see and a brief description. I did not include some of the most self-explanatory ones. Some that I have listed might still be obvious to some people, but I’ve seen others ask about them on occasion. If there isn’t one listed here, you might learn a lot by just googling it. There is also a good abbreviation guide here: https://isogg.org/wiki/Abbreviations
Keep in mind different calculators may use different terms to refer to the same region or population.

  • Amerindian or Amerind - Native American (ie, American Indian meshed into one word)
  • Anatolian - mostly Turkey
  • Ancestral Altaic - Asia (excluding South), and Eastern Europe
  • ANE - Ancient North Eurasian
  • Archaic African - broad category for prehistoric Africans
  • Archaic Human - broad category for prehistoric humans around 500,000 years ago
  • ASE - Ancient/Ancestral South Eurasian
  • Ashkenazi - Ashkenazi Jewish of central/eastern Europe (not the same as Sephardic Jewish)
  • ASI - Ancient/Ancestral South Indian
  • Australian - aboriginals of Australia
  • Australoid - “people indigenous to Southeast Asia, South Asia, Australia, Melanesia, Polynesia, Micronesia, and historically parts of East Asia.” (Wikipedia)
  • Austronesian - “relating to or denoting a family of languages spoken in an area extending from Madagascar in the west to the Pacific islands in the east.” (Google)
  • Baloch - people of Iranian Plateau and Arabian Peninsula (primarily the Middle East)
  • Baltic - regions surrounding the Baltic sea
  • Bantu - Central and south Africa
  • Basal - Basal Eurasian?
  • Beringian - areas surrounding the Bering Strait (Eastern Russia and Alaska)
  • Biaka - aka Aka, “nomadic Mbenga pygmy people who live in southwestern Central African Republic and the Brazzaville region of the Republic of the Congo” (Wikipedia)
  • Caucasian/Caucasus - people of the Caucasus region, the border between Europe and Asia in between the Black sea and the Caspian Sea
  • CHG - Caucasus Hunter Gatherers
  • EHG - Eastern Hunter-Gatherer
  • ENF - Early Neolithic Farmer
  • Fennoscandian - Scandinavia and Finland
  • Gedrosia - Modern day Makran (semi-desert coastal strip in Balochistan, in Pakistan and Iran, along the coast of the Persian Gulf and the Gulf of Oman)
  • Khoisan - Southern Africa
  • Mbuti - “one of several indigenous pygmy groups in the Congo region of Africa” (Wikipedia)
  • Melanesian - “a subregion of Oceania (and occasionally Australasia) extending from the western end of the Pacific Ocean to the Arafura Sea, and eastward to Fiji.” (Wikipedia)
  • Mesoamerican - Native American in Mexico, Central and South America
  • NAF - Neolithic Anatolian Farmer
  • Oceanian - Aboriginals of the Pacific Ocean islands (may include Australia depending on calculator)
  • Omotic - Southwest Ethiopia
  • Papuan - New Guinea and surrounding islands
  • Pastoralist - Sheep or cattle farmer
  • Pygmy - “certain peoples of very short stature in equatorial Africa and parts of Southeast Asia.” (Google)
  • San - Bushmen of southern Africa
  • SEA - South East Asian
  • SSA - Sub-Saharan African
  • Steppe - “ancient North Eurasian hunter-gatherers' heritage, which was subsequently shown to have an influence in later eastern hunter-gatherers and to have spread into Europe via an incursion of Steppe herders” (MDLP K16)
  • Tungus-Altaic - Northeast China and Siberia
  • WHG - Western Hunter-Gatherer
  • WHG-UHG - Western Hunter-Gatherer/Unknown Hunter-Gatherer
  • Volga-Ural - Part of Russia (central)


Conclusion
Which project and calculator you go with greatly depends on your known ancestry. I know all this info is probably still a little overwhelming even with (or perhaps because of!) this guide. If you’re of European descent, and a newcomer to Gedmatch, and you just want a second opinion on your ethnicity results from any of the Big 3 companies (Big 4 now maybe, with MyHeritage joining the bandwagon), I’d recommend Eurogenes K13 or K15. Personally, I tend to prefer K15, because there are maps available showing specifically what regions are covered by which populations. Certainly, you can play around with any of the other Eurogenes calculators too (except Jtest if you’re not Jewish). Most of the other projects and calculators are geared more towards ancient DNA, other continents, or mixed ancestry. You may find an unbiased global calculator in some of the other projects, but it’s probably not going to provide the breakdown of Europe you’re looking for.

If you’re looking for an ancient calculator, I again tend to stick to one of Eurogenes’ (HG vs F, or ANE), but MDLP has some good options too. There are also a couple in puntDNAL which I don’t think have a bias towards any one type of ancestry.

If you’re African, Asian, or of mixed heritage, there are a number of options to choose from, but I unfortunately can’t recommend any over any others. Most global calculators will include Amerindian (I have noted when one doesn’t), but MDLP World22 seems to have the most categories for Native Americans and may be ideal for that.

I was surprised to realize Eurogenes' Jtest is the only one that offers an Ashkenazi (or other Jewish) category, so if you're Jewish, it looks like this is your only option. However, it should be noted that there are many Jewish populations typically included in Oracle/Oracle 4 (see below for more details), not just for Jtest but for any other given calculator too. For your reference, I created a spreadsheet that shows which calculators have what Jewish populations available in Oracle/Oracle 4.

If you're adopted and don't know what your ethnic background is, it's important to remember that there's never going to be one defining ethnicity or admixture report that tells you "this is your ancestry" with total accuracy. However, I understand the desire to know where you came from, so what I'd recommend is gathering as many reports as you can (within reason - if you're obviously white, there's no sense running an African-only calculator) and comparing them in a spreadsheet like I've done here. It will help you spot any consistencies, or see what populations show up most frequently in the highest numbers.
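If you'd rather do that comparison in a script than a spreadsheet, here is a rough sketch in Python. The calculator names, population labels, and percentages are all made up for illustration, and the 5% cutoff is just an arbitrary choice of mine to ignore trace amounts.

```python
from collections import defaultdict

def summarize(reports, threshold=5.0):
    """For each population label, count how many reports show it at or above
    the threshold, and average its percentage across those reports."""
    seen = defaultdict(list)
    for populations in reports.values():
        for label, percent in populations.items():
            if percent >= threshold:
                seen[label].append(percent)
    summary = [(label, len(vals), sum(vals) / len(vals)) for label, vals in seen.items()]
    # Most frequently reported first, then highest average percentage.
    summary.sort(key=lambda row: (-row[1], -row[2]))
    return summary

# Hypothetical results: calculator name -> {population label: percentage}.
reports = {
    "Calculator A": {"Baltic": 25.0, "West Mediterranean": 15.0, "Siberian": 1.2},
    "Calculator B": {"Baltic": 12.0, "West Mediterranean": 20.0},
    "Calculator C": {"Baltic": 20.0, "East Asian": 6.0},
}

for label, count, mean in summarize(reports):
    print(f"{label}: in {count} reports, average {mean:.1f}%")
```

Since different calculators may use different terms for the same region, you would need to match up the labels by hand before running something like this, otherwise the same region gets counted under two names.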

It is frustrating that maps, or at least population descriptions, aren’t available for every calculator, but this is a free service, after all. It’s actually pretty amazing all the work the project creators do to provide this for free.



Part 2 - Oracle
Say what now?

Introduction
The second most common questions I see about Gedmatch are about Oracle. What is it? What do the results mean? Oracle is an attempt to pinpoint your origins to a more specific population or region. You'll find many are narrowed down as specifically as regions within countries, or specific religious groups. There are two options: Oracle and Oracle 4. You will find buttons for them listed under your admixture results. Note that not all admixture calculators have Oracle available, so pick a calculator that both suits your background and offers Oracle. There is a third button which just says "Spreadsheet", but this is covered later. There is also a good explanation for this from Roots & Recombinant DNA.

Oracle
Oracle will list your admixture results, then something called Single Population sharing, and finally Mixed Mode Population Sharing.

  • Single Population Sharing attempts to pinpoint a specific, single population that your DNA most closely matches, with a list of the top 20. The distance will tell you how closely you match each group, so the smaller the distance number is, the more closely you match. It is assuming your ancestors all came from the same area/population (so if they didn't, this is probably not ideal for you and the results may not make sense).
  • Mixed Mode Population Sharing will show you your top 20 combinations of two specific populations, in order of how closely you match each combination. It is assuming your ancestors came from only two locations/populations (though not necessarily split 50/50). Again, the distance will tell you how closely you match this combo of populations, while the percentage will tell you how much of your DNA matched which population.
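To give a feel for what a "distance" might be doing behind the scenes, here is a minimal sketch in Python. It assumes the distance is a plain Euclidean distance between your admixture percentages and a reference population's percentages, which this guide does not confirm, so treat it as an illustration of the idea rather than Gedmatch's actual calculation. The population names, labels, and numbers are all hypothetical.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two admixture profiles (label -> percent)."""
    labels = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in labels))

def single_population(sample, references, top=20):
    """Rank reference populations by distance, smallest (closest) first."""
    return sorted((distance(sample, ref), name) for name, ref in references.items())[:top]

def mixed_mode(sample, references, top=20):
    """For every pair of populations, try each whole-percent split and keep the best."""
    results = []
    for (name_a, a), (name_b, b) in combinations(references.items(), 2):
        labels = set(a) | set(b)
        best = None
        for pct_a in range(1, 100):
            blend = {k: (pct_a * a.get(k, 0.0) + (100 - pct_a) * b.get(k, 0.0)) / 100
                     for k in labels}
            d = distance(sample, blend)
            if best is None or d < best[0]:
                best = (d, pct_a, name_a, 100 - pct_a, name_b)
        results.append(best)
    return sorted(results)[:top]

# Hypothetical admixture results and reference populations.
sample = {"North": 60.0, "South": 40.0}
references = {
    "Pop A": {"North": 100.0, "South": 0.0},
    "Pop B": {"North": 0.0, "South": 100.0},
    "Pop C": {"North": 50.0, "South": 50.0},
}

for d, name in single_population(sample, references):
    print(f"{name} @ {d:.2f}")
for d, pct_a, name_a, pct_b, name_b in mixed_mode(sample, references):
    print(f"{pct_a}% {name_a} + {pct_b}% {name_b} @ {d:.2f}")
```

In this made-up example, no single population is an exact match, but some two-population mixes fit perfectly (distance 0). That is the same reason Mixed Mode can make more sense than Single Population for people whose ancestors came from two different places.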

Oracle 4
Oracle 4 is essentially the same as Oracle, except it expands on it by providing combinations of 3 and 4 specific populations. The single and double combinations can be different from original Oracle though, so don’t bypass Oracle thinking you’ll get that and more with Oracle 4. It may be best to examine both, depending on your ancestry.

  • Using 1 population approximation works the same as Single Population Sharing in Oracle, but I’ve noticed the results are sometimes different, so they’re obviously using a slightly different calculation. Reading the results works the same though: they are showing you a list of specific populations you most closely match, with the distance showing you just how closely you match. Again, this is intended for people whose ancestors all come from the same population.
  • Using 2 population approximation also works similarly to Mixed Mode Population Sharing but you'll notice that the percentages are always 50/50. That's because it's assuming that you have one parent from one population, and the other from another, so you would be 50/50. If that's not the case, this is not ideal for you. For some reason this only lists your top 1 result instead of the top 20. Again, the distance tells you how closely you matched this combo of populations.
  • Using 3 population approximation works the same as 2, but with a combination of 3 populations instead. So it's assuming you have one parent from one population and on your other side, you have a grandparent from a different population, and the other grandparent from a third population. This is why one population will be 50%, and the other two are 25%. It only lists one result. You know what the distance means by now.
  • Using 4 population approximation uses a combination of 4 specific populations you most closely match and lists your top 20 combos. This was designed for people who have 4 grandparents from 4 different places but it can sometimes also work well if most of your ancestry is mainly from 4 different places/populations (because it does not include percentages).
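The fixed percentages above (50/50, then 50/25/25) make more sense if you think of each approximation as filling four grandparent "slots" of 25% each. Here is a sketch of that idea in Python. It is only my own illustration of the reasoning, again assuming a Euclidean distance, and not how Gedmatch actually implements Oracle 4. All names and numbers are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement

# How the four 25% grandparent slots are shared out for each approximation.
PATTERNS = {1: [4], 2: [2, 2], 3: [2, 1, 1], 4: [1, 1, 1, 1]}

def distance(a, b):
    """Euclidean distance between two admixture profiles (label -> percent)."""
    labels = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in labels))

def approximate(sample, references, n_populations, top=20):
    """Rank combinations of n populations, each grandparent slot counting for 25%."""
    wanted = PATTERNS[n_populations]
    results = []
    for grandparents in combinations_with_replacement(sorted(references), 4):
        counts = Counter(grandparents)
        if sorted(counts.values(), reverse=True) != wanted:
            continue  # skip splits such as 75/25 that the guide does not describe
        labels = set().union(*(references[name] for name in counts))
        blend = {k: sum(n * references[name].get(k, 0.0) for name, n in counts.items()) / 4
                 for k in labels}
        mix = " + ".join(f"{25 * n}% {name}" for name, n in counts.most_common())
        results.append((distance(sample, blend), mix))
    return sorted(results)[:top]

# Hypothetical admixture results and reference populations.
sample = {"North": 50.0, "South": 25.0, "East": 25.0}
references = {
    "Pop A": {"North": 100.0},
    "Pop B": {"South": 100.0},
    "Pop C": {"East": 100.0},
}

for n in (1, 2, 3):
    for d, mix in approximate(sample, references, n, top=1):
        print(f"{n} population(s): {mix} @ {d:.2f}")
```

With this made-up sample, the 3 population approximation is the one that fits best, because the sample really is one half from one population and a quarter each from two others.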

Conclusion
Be aware that the results from Oracle and Oracle 4 will vary depending on what admixture calculator you used, which is why they are found on the admixture results page, and not as a separate calculator. Also keep in mind the results are speculative, but I have found they do often make some sense, and in some cases, can be remarkably accurate. Check out my blog post on a deeper analysis of my Oracle results here. However, if you do not fit the scenarios described of having parents or grandparents from one location, the results may not be reliable for you.

A lot of ancient populations in Oracle will likely have unfamiliar names, but there's a good map showing where many of the samples for the ancient populations came from available here.

If you feel like you've got a good handle on this, continue on to Parts 3 and 4, Admixture Proportions by Chromosome and Chromosome Painting.