Wednesday, September 16, 2020

Why All the Scotland?

Since AncestryDNA's latest update introduced Scotland as its own population, separate from Ireland and separate from England, lots of people are getting unexpectedly high results in Scotland. Even people with no known Scottish ancestry are getting significant percentages in that category. And of course, everyone is asking "why?"

For once, Ancestry addressed this honestly by explaining that people native to the British Isles have a lot of genetic overlap and can be difficult to tell apart, highlighting the fact that this is still just an estimate or interpretation of our DNA and should not be taken too literally.

But Scotland also has a lot of genetic overlap with mainland Europe, and I wanted to share some data and visuals that help illustrate all this. Firstly, although they haven't added the link for it yet, if you pull up the "full history" of the Scotland category (add "/ethnicity/Scotland/history" to the URL after the long ID number), you'll see it lists all the surrounding areas included in "Scotland" (screenshot above):

Primarily located in: Scotland, Northern Ireland
Also found in: Belgium, Channel Islands, England, Faroe Islands, France, Iceland, Ireland, Isle of Man, Luxembourg, Wales


That's a big area for this category to cover, and it makes the title of solely "Scotland" seem a little misleading. So is the map, which, apart from Brittany, half of Northern Ireland, and a sliver of Northern England, isn't covering any of the other locations listed here. Brittany, the seemingly rogue area in France that is included in the Scotland map, might seem out of place, but it actually makes a lot of sense. Brittany, as the name suggests, is actually heavily Celtic. In the 5th century, Celtic Britons fled the Anglo-Saxon invasion of Britain and went to what is now Brittany, France. In fact, people there still speak a Celtic language called Breton, which is related to Scottish Gaelic (though it is closer to Welsh and Cornish). But Scotland and France were often allies throughout history (united by their shared enemy, the English), so it wouldn't be unusual to see genetic similarities to other parts of France too.

And there's more.

I reference the PCA chart in the ethnicity white paper a lot, and there's a reason for that. It shows us upfront just how much genetic overlap there is among different regions. The latest PCA chart (shown right) is the most detailed yet, including a breakdown of countries that are lumped into bigger regions in our results.

It can be a little difficult to tell some of the icons apart, so I actually overlaid some colored blobs to show the overlapping regions. Even that can be difficult to tell apart because the overlap is so significant for the British Isles alone. This is why the rest of the British Isles is included in the "also found in" details.


The light blue blob is Ireland, dark blue is Scotland, red is Wales, and dark grey is England. Scotland, Wales, and England in particular are almost indistinguishable, and Ireland still has significant overlap with them. So it's hardly surprising if your breakdown of the British Isles isn't exactly what you'd expect.

And Scotland has some noteworthy overlap with a lot of mainland Europe too, not all of which is included in the "also found in" details. According to the PCA chart, European countries that have overlap with Scotland include Germany, France, Denmark, the Netherlands, Norway, and even Sweden.

It's difficult to even see which countries are included because there's so much overlap.

So basically, if you have ancestry from any of these regions, whether they're listed in the "also found in" details or shown overlapping in the PCA chart, it could theoretically be turning up in your Scotland results. The final inclusive list should be more like:

Northern Ireland
England
Wales
Ireland
Netherlands
Denmark
Norway
Germany
Luxembourg
France
Belgium
Channel Islands
Faroe Islands
Iceland
Isle of Man
Sweden

That's all of the British Isles, and the majority of Scandinavia and Northwest Europe.
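For anyone who wants to see how that combined list comes together, it is simply the union of the places Ancestry lists for the category and the countries that overlap Scotland on the PCA chart. Here is a minimal Python sketch of that bookkeeping. The names are copied by hand from this post, the variable and function names are made up for illustration, and none of it reflects how AncestryDNA's algorithm actually works.

```python
# Place names copied by hand from this post. This is NOT AncestryDNA data or
# code; it only reproduces the list-combining step described above.
primarily_located_in = {"Scotland", "Northern Ireland"}
also_found_in = {
    "Belgium", "Channel Islands", "England", "Faroe Islands", "France",
    "Iceland", "Ireland", "Isle of Man", "Luxembourg", "Wales",
}
# Countries that overlap the Scotland samples on the PCA chart, as read off by eye.
pca_overlap = {"Germany", "France", "Denmark", "Netherlands", "Norway", "Sweden"}


def inclusive_regions(primary, also_found, pca):
    """Union of all three sources, minus Scotland itself (hypothetical helper)."""
    return (primary | also_found | pca) - {"Scotland"}


combined = inclusive_regions(primarily_located_in, also_found_in, pca_overlap)

# Places that only show up on the PCA chart, not in Ancestry's own details.
pca_only = pca_overlap - also_found_in - primarily_located_in

for place in sorted(combined):
    print(place)
```

The `pca_only` set picks out the places that appear on the chart but not in Ancestry's "also found in" details, which are the ones most easily overlooked.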

Granted, AncestryDNA's algorithm may have been able to weed out the likelihood of some of those areas showing up under Scotland (I know they remove PCA outliers), and perhaps that's why not all of these areas are listed in the full details, but that's not necessarily foolproof, so I would still keep in mind that all of these places have some genetic overlap with Scottish samples. 

The PCA chart is very enlightening and anytime you have a question about DNA ethnicity and unexpected results, this chart might be able to answer it. AncestryDNA aren't always very forthcoming about the fact that Europe is so genetically mixed and neighboring regions often share too much DNA to accurately tell them apart, but the PCA chart doesn't lie (though you can generally exclude extreme outliers). I just wish they'd release ones for other parts of the world too, and some for areas where continents mix. For example, I'd love to be able to see how much genetic overlap Southern Italy might have with the Middle East and Northern Africa. I'd also like to see what populations Ashkenazi Jews most closely match (at one point, they were on the European PCA chart, but due to the fact that they were so dissimilar to any other group in Europe, they were obviously removed - I'd love to see if perhaps they are more closely related to Middle Eastern samples than European ones). And of course, not everyone is white and it'd be great if AncestryDNA provided as much background data about other parts of the world as they do with Europe. Providing PCA charts for them would be a great start.

Additionally, AncestryDNA used to have a chart that showed the average admixture for their samples (for people native to each region). For example, it showed us that the average person from what was then the "Italy/Greece" region could expect to get about 10% of their results in the Middle East or Caucasus. It was highly informative in illustrating how genetically mixed some areas are (and also how distinct other populations can be). I have begged AncestryDNA support multiple times to make this data available again, but they refuse. I think they don't want to "confuse" customers too much, but in my experience, the less information you give people, the more confused they'll be. The constant questions about this that I see on social platforms prove it.
