Wednesday, March 7, 2018

A Gedmatch Admixture Guide: Part 5: Spreadsheets

Also see Parts 1 and 2 on Admixture and Oracle, and Parts 3 and 4 on Admixture Proportions by Chromosome and Chromosome Painting.

Previously, I posted a link to Roots & Recombination's article on Gedmatch's Spreadsheets, so I didn't go into it myself when I was detailing how Gedmatch's admixture tools work. However, I still see people asking questions about it, so I'm going to cover it after all. Not that Dixon's explanation isn't good, but I know for me, it didn't fully click until I realized what I'm about to show you.

To find the Spreadsheet option, run your kit number through your desired admixture calculator (see Part 1 if you need help with this). Under the buttons for Oracle and Oracle 4, there will be a button for Spreadsheet.

Eurogenes K13 Spreadsheet

Firstly, to clarify this up front, the Spreadsheets are not your personal results. If you look at the Spreadsheet for the same calculator with different Gedmatch kits, they will all be exactly the same. Above is a portion from the Eurogenes K13 Spreadsheet; compare it with your own and you'll see it's the same.

So what are they? Basically, the Spreadsheets show you what the more specific Oracle populations look like when run through a particular admixture calculator. So using Eurogenes K13 as an example, the first row for Abhkasian (a population from a small area in the Caucasus mountains) is showing you that when run through Eurogenes K13, the Abhkasian population got 1.64% in North Atlantic, 4.62% Baltic, 9.81% West Mediterranean, 54.30% West Asian, 22.78% East Mediterranean, etc. What this means is that if you were of full Abhkasian descent, you might expect to get admixture results like this.

To illustrate this, note (below) how if you add up the numbers in a single row, they add up to 100% (give or take 0.01-0.03%, as is usual even for your own results, which is probably just due to rounding up or down the individual percentages). So it's showing the admixture results of specific populations as though they were Gedmatch kits.

Spreadsheet for Eurogenes K13 showing sums of all populations
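
If you'd rather check the row arithmetic yourself than trust my screenshot, here is a minimal Python sketch. It only uses the five Abhkasian values quoted earlier in this post, so it can't reach 100% on its own; it just shows how much is left over for the categories I didn't quote. The function names and the 0.05 tolerance are my own choices for illustration, not anything Gedmatch provides.

```python
# The five Abhkasian values quoted in this post from the Eurogenes K13
# Spreadsheet. The remaining K13 categories are not quoted here, so they
# are left out rather than guessed.
abhkasian_quoted = {
    "North Atlantic": 1.64,
    "Baltic": 4.62,
    "West Mediterranean": 9.81,
    "West Asian": 54.30,
    "East Mediterranean": 22.78,
}


def row_total(row):
    """Add up the percentages in one Spreadsheet row (rounded to 2 places)."""
    return round(sum(row.values()), 2)


def sums_to_100(row, tolerance=0.05):
    """Check whether a complete row reaches 100% within a rounding tolerance.

    The tolerance is an assumption; the rows I looked at were off by
    0.01-0.03 at most.
    """
    return abs(sum(row.values()) - 100.0) <= tolerance


quoted_total = row_total(abhkasian_quoted)
remaining = round(100.0 - quoted_total, 2)

print(f"Quoted categories add up to {quoted_total}%")
print(f"Left over for the categories not quoted: {remaining}%")
```

To test a full row, copy every category for one population from the Spreadsheet into a dictionary like the one above and pass it to `sums_to_100`.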

It also shows how mixed or how exclusive a certain population's DNA is. You might expect an Italian to get results primarily in East and West Mediterranean, and indeed, most of the Italian populations (East Italian, West Sicilian, Italian Abruzzo, Tuscan, Sardinian, South Italian, North Italian, even Italian Jewish) do get high results in those categories (see below). But notice how they also get high results in North Atlantic, meaning that even people from the southernmost areas of Europe still share a lot of DNA with the northern part of Europe (at least in this calculator). Even East Sicilians are getting 16.46% in North Atlantic.

Italian populations in Eurogenes K13 Spreadsheet

Not only can this explain some unexpected admixture results, but it can also explain unexpected Oracle results. Previously I've talked about how my Eurogenes K13 Oracle 4 results match me to a lot of Jewish populations, namely Kurdish Jewish and some Iranian Jewish. I have no known Jewish ancestry and don't get any Jewish results from any of the big DNA companies. But if I look (below) at Kurdish Jewish and Iranian Jewish in the K13 Spreadsheet, I see they have the highest results in East Mediterranean, which is expected, but also 8-10% in West Mediterranean, which is the same category my Italian ancestry peaks in, so perhaps there is some shared DNA there and K13's Oracle is picking up on that in my case.

Population admixtures for unexpected Oracle results 

It's difficult to find a population that gets more than 90% in one category (shown below), but one of the ones that does is Karitiana, a group of Native Americans in Brazil. This population gets 99.62% in (unsurprisingly) Amerindian, and only trace amounts, less than 1%, in other categories. The Dai population (a Chinese group) is another, getting 90.46%, again unsurprisingly, in East Asian. And another example is the Papuans (indigenous peoples of Papua New Guinea), getting 94.59% in Oceanian. None of this is surprising, as these are all populations which would be expected to be fairly endogamous to begin with, knowing their histories.

The only populations to get 90+% in one category
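
For anyone who wants to run the same search on a Spreadsheet they've copied out, here is a rough Python sketch of the filter. The rows below are partial and hold only the numbers quoted in this post, and the function name and the 90% threshold are my own illustrative choices. Abhkasian is included as a contrast, since its highest quoted category is well under 90%.

```python
# Partial rows built only from values quoted in this post. A real row
# would list every category in the calculator.
quoted_rows = {
    "Abhkasian": {
        "North Atlantic": 1.64,
        "Baltic": 4.62,
        "West Mediterranean": 9.81,
        "West Asian": 54.30,
        "East Mediterranean": 22.78,
    },
    "Karitiana": {"Amerindian": 99.62},
    "Dai": {"East Asian": 90.46},
    "Papuan": {"Oceanian": 94.59},
}


def dominant_populations(rows, threshold=90.0):
    """Return populations whose largest category is above the threshold."""
    result = {}
    for population, row in rows.items():
        # Find the category with the highest percentage in this row.
        top_category = max(row, key=row.get)
        if row[top_category] > threshold:
            result[population] = (top_category, row[top_category])
    return result


for population, (category, percent) in dominant_populations(quoted_rows).items():
    print(f"{population}: {percent}% {category}")
```

Lowering the threshold (say, to 50) is a quick way to see which populations are dominated by one category without being nearly exclusive to it.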

So while this tool gives us some very interesting information about the makeup of each population, and may help provide some insight into how and why your results turned up the way they did, the Spreadsheets are not your personal results.

UPDATE: To help better visualize the Gedmatch Population Spreadsheets, I've started creating some bar charts. I'm a visual person, so I find these charts quicker and easier to make sense of than looking at a bunch of numbers. They are interactive so you can hover over sections to get details. If people find this useful, I'll keep adding them.

Eurogenes K13 Population Spreadsheet Chart
Eurogenes K13 Reverse Chart
Eurogenes EUtest V2 K15 Population Spreadsheet Chart
Eurogenes EUtest V2 K15 Reverse Chart

5 comments:

  1. This post, including the previous parts, is fabulous! One question I have with regard to admixtures: shouldn't there be some sort of time stamp on them? For example, my Dutch ancestors came to the US in 1640, but started in Eastern Europe in the 1300s (if my research is correct). Depending on the model, I could show up as any of these populations, correct?

    1. It probably partly depends on how much your Eastern European ancestors mixed with the Dutch populations. But also, since admixtures usually work by comparing your DNA with modern populations, it can be hard to pinpoint a timeline it represents in history. Most companies advertise an approximate time period - like 23andMe says their ancestry composition is representative of "at least 500 years ago", but that "at least" means it could be from much further back, perhaps depending on the individual. AncestryDNA advertises "about 1,000 years" so your results there could theoretically predate even your 14th century Eastern European ancestors. Because it's not a precise science, I don't know if the creators of all of Gedmatch's admixture calculators make predictions on time periods. I have seen the creator of Eurogenes say that K15's Oracle 4 might be better for more recent ancestry, while regular Oracle appears more ancient.

      However, unless you have Native American ancestry, you won't show up with any US populations (not in Gedmatch).

  2. What are the spreadsheets for if you choose to run the calculator for chromosome percentage by spreadsheets? That is how I get some Native American to show. The only Native I get running an admix calculator is 0.14 Mesoamerican and 0.14 Melanesian. I have one Native American allele on the yourdna Native American SNP test.

    1. Spreadsheets are not your personal results.
