Wednesday, May 16, 2018

Making the Most of Your DNA Matches

One of the more frustrating aspects of AncestryDNA is how few people have a family tree available, and when they do, it's often private or a tree so small you might think you can't get any use out of it. Of course, I would encourage everyone to contact their DNA matches with private trees and politely ask for an invite, and I would also encourage people to contact their matches who have no trees, as they might know enough about their ancestry to make a connection between you even if they didn't add it to a tree. But oftentimes, people don't respond to our messages, or they decline our invite request. Dead end after dead end, right? Well, there are a few ways around these dilemmas. Although some are a little specific to AncestryDNA, they can often be utilized with other companies too.

1. Look for a family tree, even if one isn't attached.
When you open the match details page, if there is a family tree available but not attached to the DNA test, it will have a drop down menu where you can select the tree to preview (shown above and left). In the screenshot above, it shows how initially, it looks like this DNA match has no family tree, but they do have one unattached to their DNA results. Selecting it from the drop down menu brings up a preview. It's a small tree, but enough to identify our most recent common ancestor, since their grandfather was the brother of my great grandmother.

This one you do need to be careful with because while sometimes, people simply forget to attach their tree to their DNA test, it's also possible that the family tree doesn't belong to the person whose test you match (or the tree may belong to that person but they are not the "home person" for the tree, as is automatically selected). For example, one of my close cousins has taken the test, but his wife is managing it. His wife has started her family tree, but not his, and I only know this because I know them well enough to know whose tree it is. To anyone else who doesn't know them, they could mistake the wife's tree for his own. In this case, there is a good reason the tree wasn't attached to the test. So definitely look for those unattached family trees, but don't make too many assumptions about them.

Don't dismiss a tree like this!
2. Build a tree from their shrubs.
Don't dismiss trees that seem too small to make any use of. As long as they have deceased ancestors in their tree (whose details are therefore public) you can do what genealogists do best: research! Build on that tiny shrub of a tree, researching further back than the tree owner did until you find your common ancestor.

In the example above/left, you might look at this family tree and think there is not enough information to find the most recent common ancestor, but you'd be wrong. This person's father is a descendant of my 4th great grandparents John Hendricks Godshalk and Barbara Kratz. How do I know? Because I took this tiny tree and I researched the ancestors until I connected it to my own tree.

3. Build downwards on your own tree.
Research all the descendants of your known ancestors, as far down as you can. It really helps when you're trying to make a connection with a small 'shrub' of a tree such as discussed above. You won't have to research your match's tree back very far if you've already done the work on your own tree.

This is especially useful for trees with endogamy - for example, I have a branch of Mennonites on my tree and after tracing many other descendant lines of my ancestors, it quickly became clear there are a number of surnames that are strongly associated with the colonial Mennonites who settled in Pennsylvania, especially when more than one appears in a tree. So if I see names in someone's tree like Oberholtzer, Funk, Detweiler, Bergey, etc, even though none of these are my ancestors, I immediately know they are likely from my Mennonite branch just from seeing the surnames. In fact, in the screenshot above the match's father's name was Detwiler, immediately suggesting I should follow that side back until it linked to my own tree, and it did. Even on branches without endogamy, it can still be useful, just not as immediately apparent.

Notes always showing in list allows me to quickly see which ancestors I share with matches I have in common with someone
4. Look at your Shared Matches.
If there really is no tree whatsoever you can make use of, and the person won't respond to your messages, all you can do is look at the DNA matches you have in common with each other. If any of them are matches you've already determined your shared ancestry with, then it's possible this match is also descended from the same branch. If more than one are descended from the same branch, then it's very likely this person is too. The more shared matches who descend from the same branch or ancestor, the more likely the person with no tree does too.

This process can be sped up greatly by using a Chrome extension called MedBetterDNA. It has the option to "always show notes", which means any notes you make on a DNA match will show up in the list of matches, including the list of Shared Matches. In other words, every time you identify the shared ancestor of a DNA match, make a note of that ancestor in the notes section. Then every time that match is a Shared Match with someone else who doesn't have a tree, you will know it without having to open up additional matches' details. See the screenshot example above. I can't stress enough how much more efficient this has made my workflow.
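For anyone who likes to see the logic written out, here is a minimal sketch of the tallying idea: count how many of a treeless match's Shared Matches already carry a note for the same ancestors. The notes list is made up for illustration, and the function name `likely_branches` is my own, not anything from AncestryDNA or MedBetterDNA.

```python
from collections import Counter

def likely_branches(shared_match_notes):
    """Tally the ancestor noted on each identified shared match, most common first."""
    # Shared matches with no note yet (None or an empty string) are skipped
    counts = Counter(note for note in shared_match_notes if note)
    return counts.most_common()

# Hypothetical notes on the Shared Matches of one match who has no tree
notes = [
    "Godshalk/Kratz",
    None,
    "Godshalk/Kratz",
    "Fallows",
    "Godshalk/Kratz",
    "",
]

for branch, count in likely_branches(notes):
    print(f"{branch}: {count} shared match(es)")
```

The branch at the top of the tally is only the most likely candidate, not proof, but the more shared matches that point to it, the stronger the case.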

5. Use the Search option for private trees.
It's frustrating to see all those private trees, especially when the owner doesn't respond. But you can get an idea of what surnames are in their tree by using the search option. That doesn't mean your shared ancestor is definitely from that surname, but it is especially useful for private trees you have a Shared Ancestor Hint with. Knowing you do have a shared ancestor with that match makes it much more likely a shared surname is the source of that ancestor. This method is a little tedious though, since you have to randomly search for surnames from your tree and hope you get a hit for the match you're looking for, but you should theoretically get there eventually if there is a Shared Ancestor Hint. However, be aware that the search function isn't hugely reliable and often misses people who definitely have a surname you're searching for in their tree. I think it's a site indexing issue. So it doesn't always work, but when it does, it's helpful. It is also useful in combination with the above tip (a surname search result plus Shared Matches who are confirmed from the same branch as that surname is very good evidence your Shared Ancestor Hint is from that branch).

6. Test other family members.
Testing family members, especially parents, is beneficial because you can at least see which of your matches also match those family members, and therefore which side or branch of your tree the shared ancestor is likely from. No tree? Won't respond to messages? No shared cousins who have been identified yet? Well, at least I can see whether they match my mom, dad, paternal grandfather, or any of my known, close cousins on either side who have tested.

Be aware that the Shared Matches feature only includes high confidence (or higher) matches who are estimated 4th cousins or closer, but if you manage any of your family members' kits, you can see which matches you have in common at any level/degree by opening that match's profile. In the example above, you'll see my dad (Jim) matches Agnes and two other kits she manages, even though they do not meet the criteria of "Shared Matches". So when I look at Agnes or her other kits in my match list, it won't show my dad as a shared match to them, even though you can see here by opening Agnes' profile, they are a match to my dad. So not only testing other family members, but getting permission to manage their test is also very beneficial to at least figuring out which side/branch someone is connected to.

7. Search the internet for your DNA match.
This one may seem a little intrusive to some, but the data is public and it's out there, so why not make use of it? There are certain websites like familytreenow.com, truepeoplesearch.com, and pipl.com where you can search for people by their real names, or sometimes by a username. Ancestry.com and FamilySearch.org have some public records of living people too. Even just a Google search can yield results; some people use their real names on AncestryDNA - so search for it. Sometimes, you can find them on Facebook or find other contact details. Sometimes, you can find out their parents' names, and from there, build a tree and connect it to your own. I know these sites can be controversial to some who feel they are a violation of privacy, but they are using public data and not violating any laws. If you are concerned, you can request your information be removed from these sites.

Even when people use anonymous usernames, sometimes they post on Ancestry's message boards with info on their tree and you can find them by Googling the username. Sometimes they use the same username on other websites and you can get in touch with them that way.

It is important to remember that not everyone has as great an interest in genealogy and DNA as we do. Many (perhaps even most) people take the test only for the ethnicity report and may never return to the site after seeing it. Others might be adopted and not know anything about their biological ancestry, and in some cases, there are individuals who might have died after taking the test and not given any family members access to their account. There are many reasonable explanations for why people don't respond to our messages, so try not to get too frustrated by it. Focus and work with what you have, and don't let the rest get to you or you'll drive yourself crazy!

Tuesday, April 17, 2018

Which DNA Company is the "Best" for Ethnicity?

It frequently gets asked which DNA company is the "best", especially based on the ethnicity report alone. It's important to know that the ethnicity report is only ever an estimate, and they can vary greatly among the different companies, but which one is more accurate can also depend on the individual. ISOGG rate 23andMe the highest for ethnicity accuracy, and Nat Geo the lowest, but they don't include LivingDNA in that comparison, and I know from social media, not everyone feels the same way about each company. So I was curious to see what the majority would say if given a survey (if there even is a majority).

Well, here it is. If you've tested with even one of the companies included in the survey (23andMe, AncestryDNA, FamilyTreeDNA, LivingDNA, MyHeritage, and Nat Geo's Geno 2.0) please consider contributing your findings. It will only take a few moments (there are a max of only 13 quick questions, fewer if you haven't tested with every company - it merely asks "have you tested with this company?" and if you answer yes, it asks how accurate you felt the results were): "Best" DNA Company for Ethnicity Survey

Results will be posted once there's enough data collected.

Tuesday, April 3, 2018

23andMe's New Sub-Regions

My new sub-regions from 23andMe
Recently, 23andMe rolled out 120+ new regions in their ethnicity report (Ancestry Composition), but they are actually sub-regions that don't include a percentage (they also aren't included in Chromosome Painting). They are calculated much the same way Genetic Communities at AncestryDNA are, which begs for a comparison.

My initial feelings on 23andMe's new sub-regions are that although they have fewer of them than AncestryDNA's 300+ Genetic Communities, it does seem as though one is more likely to get sub-regions at 23andMe than they would be to get GC's from AncestryDNA. 23andMe correctly identified that my "British & Irish" results are actually from the UK, and my Scandinavian results are from Norway. I also have a sub-region of "Italy" under my existing "Italian" results (see left) - that probably sounds rather obvious, but when you look at the list of all sub-regions, you see that there's also an available sub-region of Malta listed under "Italian" - so once again, they've correctly identified my Italian ancestry and not mistaken it for Maltese.

No European GC's at AncestryDNA
Meanwhile, over at AncestryDNA, I have zero Genetic Communities in Europe (I have one for Pennsylvania Settlers though) - see the screenshot to the right. My dad does get one for Southern Italy because he's half Italian, but no such luck for me. AncestryDNA offer 13 GC's in Great Britain, 17 in Scandinavia, and 14 in Europe South, but I get nada for any of them. 23andMe offer a measly 2 sub-regions under British & Irish (UK and Ireland), and only 4 in Scandinavia, but since I actually got sub-region results, I can't complain. AncestryDNA may have more sub-regions, but if there are fewer people getting results in them, then they aren't as useful. 23andMe have certainly just raised the bar a little bit.

It is a little bit of a shame 23andMe weren't able to identify my German ancestry, separate from France and other sub-regions in this group. So far, LivingDNA were the only ones to accurately accomplish this, and it was with percentages.

If you click on "See all tested populations" at the bottom of your 23andMe Ancestry Composition, you'll be able to see that each sub-region, although having no percentage, does show how strongly you match that group with a 5 dot system (shown below). The more dots, the more strongly you match that population. Only if you have 2 or more dots does the group show up on your Ancestry Composition page, but when you click on "See all" you may find you match additional groups with only 1 dot. For example, I have 1 dot for Sweden, but I have no Swedish ancestry and because it's only 1 dot, it doesn't show on my Ancestry Composition unless I click "See all". My existing sub-regions for Italy, United Kingdom, and Norway each have 2 dots, which is why they all show up on my Ancestry Composition page.

Dots showing the strength of my connection to these groups
You may note that none of your 23andMe percentages have changed. That's because the new regions don't include a percentage. They are calculated differently from the ethnic percentages and use a different reference database. Also, don't assume that having results in a sub-region means they are saying the entire percentage from the parent region is coming from that sub-region. In my case, it's true because I know my family history, but for example, if I also had Irish heritage, the results wouldn't be saying all 17.2% British & Irish is coming from the UK. Some of it could also be coming from Ireland, I just didn't get results for that. I don't actually have Irish ancestry that's not Scots-Irish though.

My previous 23andMe results for comparison
You may have also noticed that the names of a few populations have changed. This is simply to better reflect the areas they cover, it does not mean the data has changed. Instead of "Middle Eastern" it is now "Western Asian" and "North African" is now "North African & Arabian". What was "Central & South African" is now being called "African Hunter Gatherer" (I'm not entirely sure that's a better description for the newcomers to DNA). Also, "Oceanian" is now called "Melanesian". What was originally "Mongolian" is now "Manchurian & Mongolian", and "Yakut" is now "Siberian". Additionally, they appear to have removed the parent categories that once showed the cumulative percentages of some sub-continental regions. For example, it used to group my Northwest European results together - so added up (British & Irish, French & German, Scandinavian, and Broadly NW European) it was 63.3% (shown left). That hasn't changed, if you add up those groups, it's still the same percentage, they are simply no longer showing it so I have to add them up myself. Not a huge loss, but a bit of a shame that I can no longer easily see the divide between my North and South European DNA (which has always been very distinctive).

Here's a complete list of the new sub-regions:

Original 23andMe's populations for comparison
  • European
    • Italian
      • Italy, Malta
    • French & German
      • Austria, Belgium, France, Germany, Luxembourg, Netherlands, Switzerland
    • British & Irish
      • United Kingdom, Ireland
    • Scandinavian
      • Norway, Sweden, Denmark, Iceland
    • Iberian
      • Portugal, Spain
    • Sardinian
    • Balkan
      • Albania, Bosnia and Herzegovina, Bulgaria, Croatia, Greece, Macedonia, Moldova, Montenegro, Romania, Serbia
    • Finnish
    • Eastern European
      • Belarus, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Poland, Russia, Slovakia, Slovenia, Ukraine
    • Ashkenazi Jewish
    • Broadly Northwestern European
    • Broadly Southern European
    • Broadly European
  • Western Asian & North African (formerly Middle Eastern & North African)
    • North African & Arabian (formerly North African)
      • Algeria, Bahrain, Egypt, Jordan, Kuwait, Libya, Morocco, Saudi Arabia, Tunisia, United Arab Emirates, Yemen
    • Western Asian (formerly Middle Eastern)
      • Armenia, Azerbaijan, Cyprus, Georgia, Iran, Iraq, Lebanon, Syria, Turkey, Uzbekistan
    • Broadly Western Asian & North African
  • Sub-Saharan African
    • West African
      • Cabo Verde, Cameroon, Ghana, Liberia, Nigeria
    • East African
      • Eritrea, Ethiopia, Kenya, Somalia, Sudan
    • African Hunter-Gatherer (formerly Central & South African)
    • Broadly Sub-Saharan African
  • South Asian
    • Broadly South Asian
      • Afghanistan, Bangladesh, India, Mauritius, Nepal, Pakistan, Sri Lanka
  • East Asian & Native American
    • Japanese
    • Korean
      • North Korea, South Korea
    • Siberian (formerly Yakut)
    • Manchurian & Mongolian (formerly Mongolian)
      • Kazakhstan, Kyrgyzstan, Mongolia
    • Chinese
      • Hong Kong, Mainland China, Taiwan
    • Southeast Asian
      • Cambodia, Guam, Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand, Vietnam
    • Native American
      • Argentina, Aruba, Belize, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, Dominican Republic, Ecuador, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Puerto Rico, Uruguay, Venezuela
    • Broadly East Asian
    • Broadly East Asian & Native American
  • Melanesian (formerly Oceanian)
    • Broadly Melanesian
      • American Samoa, Fiji, Samoa, Tonga

You can also view a list of populations available from each DNA company here and see how 23andMe compares with other companies.

Sunday, March 18, 2018

Dating Old Photographs: Example #2 (Crayon Enlargement)

This is one I found in an antique store recently. It is actually a crayon enlargement, which was an enlarged copy made of a cabinet card or carte de visite. The trouble with enlargements of such photos was that the results often weren't very clear, so the artist or photographer would then enhance it by drawing over it with other materials like charcoal or pastels. Sometimes, such as in this case, it would be colorized, but that was not always the case; there are many black and white crayon enlargements. Either way, the effect of the enhancement typically makes the image look almost more like a drawing or painting than a photograph, but at the same time, certain elements appear too realistic to not be a photograph. That's because it's basically both.

Known date of colorized enlargement: 1882. Estimated date of original photograph: about 1880-1882.

I didn't purchase this one and didn't have a measuring tape with me, but it was probably something like 11x14, though they can be even larger. Attached to the back, there was a torn piece of paper with the date as October 2, 1882. Although the date of this one is known, the Library of Congress says crayon enlargements were available from the 1860's to the 1920's. There is a second date written along the side of the paper, November 14, 1882 - most likely, this is the date it was finished, which means it took a little over a month to complete. The paper, as you can see below, includes information about what colors should be used. Much of it is missing, but it looks like the beard was described as "light brown but very grey". Something else, perhaps his hair, was "very brown and not so grey". I'm guessing the "(very?) black but very grey" was meant to be his eye color. Unfortunately, the price is not included, though there is a space for it on the form, and there's no company name surviving.

Excuse my fingers, I had to hold down a torn and curling piece of the paper
The enlargement may be from 1882, but the original photograph could theoretically be from almost any time before that. They were typically made from cartes de visite (1860s-1870s) and cabinet cards (1870s-1890s) so that means we're not looking at anything before the 1860's. Looking at the clothing and hair, men's fashion tended to evolve much more slowly and subtly than women's fashion but there are a few things to take note of. The facial hair style of a mustache connected to the muttonchops with a clean shaven chin (known as friendly muttonchops, face shelf, or hulihee) was popularized by Ambrose Burnside, and is where the term "sideburns" comes from. At the time Burnside introduced the style in the early 1860s, it was considered unusual, so it probably didn't become popular until a little later, and examples of it can be seen into the 1880s. That doesn't narrow down the original photo much more, probably around 1865-1882. The thing that really helps narrow it down is the fact that the man in the crayon enlargement is wearing a high buttoned, single breasted waistcoat with this style of lapels, which (according to History in the Making) was most popular around 1880-1910. That strongly suggests the original was taken not long before the enlargement was made in 1882.
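The narrowing down above is really just finding where several date ranges overlap. As a rough sketch (the function name `intersect_ranges` and the way I've labeled each clue are my own, and the ranges are the approximate ones discussed above, capped at the 1882 enlargement date):

```python
def intersect_ranges(ranges):
    """Return the (start, end) overlap of inclusive year ranges, or None if they don't all overlap."""
    ranges = list(ranges)
    start = max(r[0] for r in ranges)  # latest "earliest possible" year
    end = min(r[1] for r in ranges)    # earliest "latest possible" year
    return (start, end) if start <= end else None

# Approximate ranges taken from the clues discussed above
clues = {
    "photo format (carte de visite or cabinet card), no later than the enlargement": (1860, 1882),
    "facial hair style": (1865, 1882),
    "waistcoat style": (1880, 1910),
}

print(intersect_ranges(clues.values()))  # (1880, 1882)
```

Each clue on its own leaves a wide window, but the overlap of all three is what gives the estimate of about 1880-1882.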

Wednesday, March 7, 2018

A Gedmatch Admixture Guide: Part 5: Spreadsheets

Also see Parts 1 and 2 on Admixture and Oracle, and Parts 3 and 4 on Admixture Proportions by Chromosome and Chromosome Painting.

Previously, I posted a link to Roots & Recombination's article on Gedmatch's Spreadsheets so I didn't go into it myself when I was detailing how Gedmatch's admixture tools work. However, I've been seeing some people still have questions so I'm going to cover it after all. Not that Dixon's explanation isn't good, but I know for me, it didn't fully click until I realized what I'm about to show you.

To find the Spreadsheet option, run your kit number through your desired admixture calculator (see Part 1 if you need help with this), and under the buttons for Oracle and Oracle 4, there will be a button for Spreadsheet.

Eurogenes K13 Spreadsheet

Firstly, to clarify this up front, the Spreadsheets are not your personal results. If you look at the Spreadsheet for the same calculator with different Gedmatch kits, they will all be exactly the same. Above is a portion from Eurogenes K13 Spreadsheet - compare it with your own, you'll see it's the same.

So what are they? Basically, the Spreadsheets are showing you what the more specific Oracle populations would look like when run through any particular admixture calculator. So using Eurogenes K13 as an example, the first row for Abhkasian (a small area in the Caucasus mountains) is showing you that when run through Eurogenes K13, the Abhkasian population got 1.64% in North Atlantic, 4.62% Baltic, 9.81% West Mediterranean, 54.30% West Asian, 22.78% East Mediterranean, etc. What this means is that if you were of full Abhkasian descent, you might expect to get admixture results like this.

To illustrate this, note (below) how if you add up the numbers in a single row, they add up to 100% (give or take 0.01-0.03%, as is usual even for your own results, which is probably just due to rounding up or down the individual percentages). So it's showing the admixture results of specific populations as though they were Gedmatch kits.
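If you export a Spreadsheet and want to check the row totals yourself, a small sketch like this would do it. The row below is a made-up example, not real Gedmatch values, and the 0.05 tolerance is my own assumption to allow for the rounding mentioned above.

```python
def row_total_ok(row, tolerance=0.05):
    """Return (is_close_to_100, total) for one population's admixture percentages."""
    total = sum(row.values())
    return abs(total - 100.0) <= tolerance, round(total, 2)

# Hypothetical row for illustration only (not taken from any calculator)
example_row = {
    "Component A": 54.30,
    "Component B": 22.78,
    "Component C": 13.11,
    "Component D": 9.80,
}

ok, total = row_total_ok(example_row)
print(f"Row total: {total}% (within tolerance: {ok})")
```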

Spreadsheet for Eurogenes K13 showing sums of all populations

It also shows either how mixed or how exclusive a certain population's DNA is. You might expect an Italian to get results primarily in East and West Mediterranean, and indeed, most of the Italian populations (East Italian, West Sicilian, Italian Abruzzo, Tuscan, Sardinian, South Italian, North Italian, even Italian Jewish) do get high results in those categories (see below). But notice how they also get high results in North Atlantic, meaning that even people from the southernmost areas of Europe still share a lot of DNA with the Northern part of Europe (at least in this calculator). Even East Sicilians are getting 16.46% in North Atlantic.

Italian populations in Eurogenes K13 Spreadsheet

Not only can this explain some unexpected admixture results, but it can also explain unexpected Oracle results. Previously I've talked about how Eurogenes K13 Oracle 4 results match me to a lot of Jewish populations, namely Kurdish Jewish and some Iranian Jewish. I have no known Jewish ancestry and don't get any Jewish results from any of the big DNA companies. But if I look (below) at Kurdish Jewish and Iranian Jewish in the K13 Spreadsheet, I see they have the highest results in East Mediterranean, which is expected, but also 8-10% in West Mediterranean, which is the same category my Italian ancestry peaks in, so perhaps there is some shared DNA there and K13's Oracle is picking up on that in my case.

Population admixtures for unexpected Oracle results 

It's difficult to find a population that gets more than 90% in one category (shown below), but one of the ones that does is Karitiana, a group of Native Americans in Brazil. This population gets 99.62% in (unsurprisingly) Amerindian, and only trace amounts (less than 1%) in other categories. The Dai population (a Chinese group) is another, getting 90.46%, again unsurprisingly, in East Asian. And another example is the Papuans (indigenous peoples of Papua New Guinea) getting 94.59% in Oceanian. None of this is surprising, as these are all populations which would be expected to be fairly endogamous to begin with, knowing their histories.

The only populations to get 90+% in one category
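Picking out the populations that sit at 90% or more in a single category is easy to do by hand, but here's a minimal sketch of how it could be done on an exported Spreadsheet. The rows below are partial, containing only the K13 percentages quoted in this post, and the function name `dominant_populations` is my own.

```python
def dominant_populations(spreadsheet, threshold=90.0):
    """Return {population: (category, percent)} for rows whose largest category meets the threshold."""
    result = {}
    for population, row in spreadsheet.items():
        # Find the category with the highest percentage in this row
        category, percent = max(row.items(), key=lambda item: item[1])
        if percent >= threshold:
            result[population] = (category, percent)
    return result

# Partial rows: only the percentages quoted in this post, not the full Spreadsheet
spreadsheet = {
    "Karitiana": {"Amerindian": 99.62},
    "Dai": {"East Asian": 90.46},
    "Papuan": {"Oceanian": 94.59},
    "Abhkasian": {
        "North Atlantic": 1.64,
        "Baltic": 4.62,
        "West Mediterranean": 9.81,
        "West Asian": 54.30,
        "East Mediterranean": 22.78,
    },
}

for population, (category, percent) in dominant_populations(spreadsheet).items():
    print(f"{population}: {percent}% {category}")
```

Lowering the threshold shows how quickly the more mixed populations start to appear.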

So while this tool gives us some very interesting information into the make up of each population, and may help provide some insight into how and why your results turned up the way they did, the Spreadsheets are not your personal results.

UPDATE: To help better visualize the Gedmatch Population Spreadsheets, I've started creating some bar charts. I'm a visual person, so I find these charts quicker and easier to make sense of than looking at a bunch of numbers. They are interactive so you can hover over sections to get details. If people find this useful, I'll keep adding them.

Eurogenes K13 Population Spreadsheet Chart
Eurogenes K13 Reverse Chart
Eurogenes EUtest V2 K15 Population Spreadsheet Chart
Eurogenes EUtest V2 K15 Reverse Chart

Saturday, February 24, 2018

Dating Old Photographs: Example #1

I have so many old photographs in my family's collection, many of whom are unknown, or at least the dates are unknown. Previously, I gave some tips on how I've narrowed down when a photo was likely taken, but I'd like to share the multitude of photos I have as examples. I'll start with this portrait of an unknown woman from my family's collection. I believe her to possibly be a family friend of my ancestors, most probably the Fallows or Godshalls, given the location and time period.

My estimate: 1896-1899

With this one, the first thing I did was look up the photographer at these addresses. Louis Baul had a studio at 56 North 8th Street and also 1937 Germantown Ave, Philadelphia, during the years 1889 to 1908. This narrows it down a little bit, but that's still a 20 year period. To narrow it down further, we need to look at the materials used, as well as the clothing and hair.

The mount used is very ornate, and textured. According to Phototree.com, these became popular in the late 1890s, which fits within the photographer's time frame. Additionally, according to the fashion dating guide at the University of Vermont, the puffy shoulders you see here, particularly the size and shape, are indicative of the mid to late 1890s. The hair is also typical of the mid to late 1890s, as women began to grow out and flatten the frizzy bangs which were popular in the 1880s and early 1890s, and parting their softer waves in the center.

Lastly, the color of the photo is important too. In earlier decades, cartes de visite and cabinet cards were printed on sepia like paper and card, with brownish tones to them. It wasn't until the 1890s when true black and white photos became available. This one may be a touch brownish: when I grayscale it completely in Photoshop, there's a notable difference. However, that could be attributed to age. In comparison with older cabinet cards, this is not sepia.

So everything is consistent with being from the 1890s, most probably from the late 1890s.

Monday, February 19, 2018

LivingDNA Review

LivingDNA are a British DNA company providing an ethnicity report (autosomal DNA), a Y-DNA haplogroup (if you're male), and an mtDNA haplogroup, for $159 (sales as low as $89 are periodic though). It does not include matching with other testers, although the company says this will be coming in the future for autosomal DNA (I suspect they're trying to build up their database of testers first). They do offer a way to upload your raw data from other companies for free. However, it's well hidden and hard to find on their site (you can access it here), you won't get your results until August 2018, and it's unclear what the results will include.

As a British company, they have focused greatly on British DNA and offer more breakdown for this region than any other company at the moment. They also offer the most breakdown for Europe, the Middle East, Native America, and parts of Asia, but they are oddly lacking in any Jewish populations, and their breakdown for Africa and Oceania is fairly average. You can compare their breakdown of populations to other companies here.

But just how accurate are these more specific breakdowns? It's important to remember all DNA ethnicity reports are only an estimate, and in my experience, the more specific the regions are, the more speculative it is. It's difficult to say just how accurate the specifics at LivingDNA are. Of the known locations my British branches have come from, they include: Lancashire, Kent, Scotland, Hertfordshire, Essex, and Suffolk. However, there are probably other locations I don't know about, plus, DNA can go back further than my tree. My LivingDNA results within Great Britain include the following (also shown on map below):
My regions of Britain and Ireland from LivingDNA
  • South England 8%
  • East Anglia 6.6%
  • Northumbria 6.2%
  • Southeast England 3.8%
  • Central England 3.6%
  • South Central England 2.7%
  • Lincolnshire 2.7%
  • North Yorkshire 2.7%
  • Devon 1.5%
  • Northwest Scotland 1.5%
  • South Wales 1.5%

This is not representative of Lancashire, but it does cover my other known regions, and then some. Unfortunately, Lancashire is my most recent English branch (immigrated in the mid 1800s), so you'd think I'd have more of that than anything, whereas the other areas are from colonial times. Again, it's difficult to say how accurate this may be given that DNA can be more representative of about 1000 years ago, while my tree has only been researched back a few hundred years. Additionally, given the small percentages, it's entirely possible some of these are just attributed to noise (like a false positive).

What is very consistent with my family tree is that the only result in Ireland I get is actually a part of Northwest Scotland (Scots-Irish). Despite having a couple "Mc's" in my tree, they are all Scots-Irish, not Irish. Also, the total amount of 40.6% in Great Britain & Ireland is very consistent with my known ancestry. From what I can estimate, my tree is approximately 35% British. Other reviews have been saying that LivingDNA tends to overestimate their total British results, so I was pleasantly surprised to see mine were fairly accurate.

What about the rest of Europe? Here are the results:
  • Europe (South) 30.2%
    • North Italy 17.3%
    • Tuscany 10.4%
    • Aegean 2.5%
  • Europe (North and West) 27.8%
    • Germanic 17.1%
    • Scandinavia 10.6%
  • Europe (East) 1.4% (on "Standard" setting, this is unassigned)
    • East Balkans 1.4%
My Europe South regions from LivingDNA

A total of 30.2% in Southern Europe is somewhat consistent with my tree (I had one Italian grandparent, so 25% on paper), but interestingly it's in almost exact agreement with most other companies. AncestryDNA says 31%, FamilyTreeDNA says 33%, and 23andMe says 29.5%. MyHeritage are the only outliers with 41.6% (which is one of the reasons I feel MyHeritage are the worst for ethnicity). However, LivingDNA's breakdown of that total is not really consistent with my tree. Most of my Italian branches have been researched back to the 1700s, and they are all from Southern Italy or Sicily, primarily three towns: Monteroduni, Sulmona, and Polizzi Generosa. LivingDNA has my results mainly in upper and mid Italy. You could possibly argue that Monteroduni and Sulmona are right on the border of the region they are calling "Tuscany" (the middle portion in pink on the map above/right), but Polizzi Generosa (Sicily) is certainly not highlighted at all. Granted, the southern tip of Italy is highlighted as a part of the "Aegean" region, but I only get 2.5% in this category. Population charts (example below) frequently show how North Italy and South Italy are genetically very different, so for my largest results in Italy to be in North Italy when my Italian ancestry is from Southern Italy just doesn't seem right. The entire Italian side of my family are dark haired, dark eyed, with olive toned skin. We are definitely Southern, and that is disappointingly not shown in LivingDNA's results.
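To make the "outlier" point concrete, here is a minimal Python sketch that lines up the Southern Europe totals quoted above and measures how far each one sits from the middle of the pack. Using the median as the yardstick is my own choice for illustration, not anything the companies do.

```python
from statistics import median

# Southern Europe totals quoted in the paragraph above (percent).
southern_europe = {
    "LivingDNA": 30.2,
    "AncestryDNA": 31.0,
    "FamilyTreeDNA": 33.0,
    "23andMe": 29.5,
    "MyHeritage": 41.6,
}

# The median is not pulled around by a single extreme value.
middle = median(southern_europe.values())

# Distance of each estimate from the median, in percentage points.
gaps = {
    name: round(abs(value - middle), 1)
    for name, value in southern_europe.items()
}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:14s} {southern_europe[name]:5.1f}%  ({gap} points from median)")

# The company whose estimate is furthest from the others.
furthest = max(gaps, key=gaps.get)
print("Furthest from the median:", furthest)
```

Four of the five estimates land within about two points of each other, while MyHeritage sits more than ten points away, which is what I mean by calling it the outlier.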

Population chart from AncestryDNA - the closer the dots, the more genetically similar (note the dots for Italy show two groups, the larger one is northern Italy, the smaller one is southern Italy, showing how genetically different they are)
Next we have a total of 27.8% in North & West Europe, with 17.1% Germanic and 10.6% Scandinavia. This could just be a coincidence, but if not, then a big congratulations is in order to LivingDNA, because they are pretty much the first company to accurately tell my British, Germanic, and Scandinavian DNA apart from one another. Every other company jumps from one extreme to another, or plays it safe by lumping a large portion of my DNA into a "broadly" Northwest European category, unable to break it down further (23andMe). According to my tree, I should be about 25% Germanic (Western Europe) and 12.5% Norwegian (Scandinavian). At other companies, Western Europe ranges from 0% to 17.9%, and Scandinavia ranges from 0% to 12.3%. While the upper ends of these ranges seem on par with LivingDNA, it is always at the expense of the other group (i.e., 12.3% in Scandinavia means 0% in Western Europe). If you're interested, you can see my complete results from all different companies here (although I did not include the sub-regions of Britain, there were too many). It's a shame Germany and Scandinavia can't be broken down further like Great Britain or even Italy are, but hopefully that will change in the future. I'll look forward to seeing how accurate it may be. I also note that LivingDNA was able to accurately tell Germany apart from France, something no other company has even attempted to do.
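For anyone wondering where "on paper" percentages like 25% and 12.5% come from, here is a rough Python sketch of the arithmetic: each generation back halves the expected share. The function name `paper_share` is made up for this example, and the assumption that 12.5% corresponds to a single great-grandparent is only one possible pedigree. Real inheritance is random beyond your parents, so these are expectations, not guarantees.

```python
def paper_share(generations_back: int, ancestors: int = 1) -> float:
    """Expected percentage inherited from `ancestors` people who are
    `generations_back` generations above you (1 = parents, 2 = grandparents)."""
    return 100 * ancestors / 2 ** generations_back


print(paper_share(2))      # one grandparent
print(paper_share(3))      # one great-grandparent
print(paper_share(3, 2))   # two great-grandparents, same as one grandparent

# Paper estimates from my tree vs. LivingDNA's results, as quoted in this review.
comparison = {
    "Southern Europe (Italian)": (25.0, 30.2),
    "Germanic": (25.0, 17.1),
    "Scandinavian": (12.5, 10.6),
    "Great Britain & Ireland": (35.0, 40.6),
}

for region, (paper, reported) in comparison.items():
    difference = round(reported - paper, 1)
    print(f"{region:26s} paper {paper:5.1f}%  reported {reported:5.1f}%  ({difference:+.1f})")
```

The differences are all within single digits, which is why I consider the regional level fairly accurate for me.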

Lastly, we have the tiny 1.4% East Europe, which they're placing more specifically in East Balkans (although the map coverage is the same for both). I have no known Eastern European or Balkans ancestry, but it's worth noting that in "Standard" mode, this 1.4% becomes "unassigned". So they are obviously unsure about this, and therefore it's likely just noise.

Similar to 23andMe, LivingDNA provides several levels of speculation or specification for your ethnicity results. There are three modes: Complete, Standard, and Cautious. Complete attempts to identify any "unassigned" DNA found in Standard mode. There was very little difference for me, which is why I used Complete mode here. As I mentioned, there was the 1.4% unassigned which got put in Europe East, and then there was 3% unassigned under Great Britain and Ireland which got put into the 1.5% Devon and 1.5% Northwest Scotland. Cautious mode groups regions more broadly (see below). Within each mode, there is an option to view results on a Global scale, Regional, or Sub-Regional. At Global, I'm 100% European on every mode. This is a little bit contrary to other companies, which often give me at least trace amounts of Middle East, North Africa, or South Asia. 

My results in Cautious mode
In Cautious mode, these are my Regional/Sub-regional results (also shown on map to the right):
  • Great Britain and Ireland 40.6%
    • Southeast England-related ancestry 18.2%
    • North Yorkshire-related ancestry 11.7%
    • East Anglia 6.6%
    • South Wales-related ancestry 1.2%
    • Great Britain and Ireland (unassigned) 3%
  • Northwestern Europe-related ancestry 27.8%
  • Pannonian Cluster-related ancestry 19.8%
  • South Italy-related ancestry 10.4%
  • Europe (unassigned) 1.4%

It's interesting to note that in Cautious mode, there is a 10.4% in "South Italy-related ancestry". It's not a very high amount, but it's interesting that it swapped from North Italy to South Italy for some reason. Meanwhile, my Scandinavian results have strangely disappeared completely. The map above is showing how some areas are found in more than one category. So the grayish blob over Germany is gray because it's in both "Northwestern Europe" and "Pannonian Cluster". Likewise, the brown parts of Britain are brown because they are in both "Great Britain & Ireland" and "Northwestern Europe". These results are more comparable with how other companies group their categories. That doesn't necessarily make it more accurate, just more broad.

My mtDNA haplogroup migration map from LivingDNA
As for the Y and mtDNA haplogroups, I am female so I have no Y haplogroup, and my mtDNA haplogroup is consistent with 23andMe and FTDNA's Full Sequence test: T2b. No revelations there. It includes a written history of the haplogroup, a coverage map, showing countries where your haplogroup is most commonly found, a migration map showing the route your haplogroup took out of Africa, and finally a Phylogenetic tree showing how your haplogroup descends from Mitochondrial Eve (or Y Chromosomal Adam). In comparison, 23andMe only offers the written history, the migration map, and the Phylogenetic tree, no coverage/frequency map. Also noteworthy, while 23andMe and LivingDNA include roughly the same amount of mtDNA raw data (23andMe includes 4,318 mtDNA SNPs, while LivingDNA includes more than 4,000), LivingDNA includes significantly more Y-DNA SNPs (roughly 20,000 to 23andMe's 3,733). Of course, neither of them include mtDNA or Y-DNA matches, so if that's what you're looking for, you'd have to take FTDNA's dedicated tests.

LivingDNA also provides a very detailed, interactive display of your results to share with others. Here's mine. While other companies often provide a similar way of sharing your results, none that I've seen have been quite this detailed or interactive. Does it share too much? LivingDNA also allows you to control what you share by giving you the option to remove elements or widgets.

I was hesitant to test with LivingDNA. Given their lack of DNA matching and the higher price tag, I felt like what you got wasn't worth that much money. Then it was on super sale over Christmas, so I decided to take the plunge. I am pleased with the ethnicity report - at regional level, it's been the most accurate for me so far, but the sub-regional results need some work. Particularly if you already know your haplogroups, I wouldn't pay full price for this test, but I do think it's worth exploring, especially if they add DNA matches in the future.

Wednesday, January 17, 2018

Gencove Review

Some of the apps Gencove offer
UPDATE: Gencove no longer accept uploads from other companies.

Gencove sells DNA tests for $59.99, but they also offer a free upload of your raw DNA data if you tested elsewhere. With the free upload, you get all the same options you do if you tested with them, namely an ethnicity report and matching with DNA relatives.

They also offer "apps", some from third parties. There is even a Promethease app for $10, though it would be cheaper and probably easier to just upload directly to Promethease (see a full review of Promethease here). There's also an app for GenePlaza, which was also previously reviewed here. Though the app is free, GenePlaza's reports each cost a small fee. Clicking on the GenePlaza app in Gencove merely takes you to GenePlaza's website. The other apps are free too, but they aren't particularly useful. They include:
  • Discover your microbiome - Bacteria and viruses that live in your mouth
  • My Genome - Info about your genomic data
  • Sleep - Are you a morning or evening person?
  • iobio.io - Compare your genome to ClinVar
  • YouGenomics India - Help improve genomics for South Asia
  • Gencove Mobile App - Compare results with friends on iOS or Android
  • Open Humans - Contribute to research and citizen science

When I tried Microbiome it simply said "Microbiome not available" with no explanation as to why, so that was totally useless.

My Genome is just that - it's where you find your raw DNA data. You can download your raw data, you can view which apps on Gencove you've given permission to access your data, and you can view and manage your consent to participate in research.

The Sleep app is interesting but the results claimed I'm a morning person, which I have never been. The app asks you a few questions about your sleep habits before showing your results and it did note "It seems that the genetic score and questionnaire results don’t match - an interesting outlier! That's probably because the genetics of sleep is not very well understood yet."

iobio
The iobio app loads your DNA to gene.iobio.io, which is a somewhat technical app that will tell you if you have any variants of certain medically related genes. For example, it includes a report on BRCA1 and BRCA2, which are genes associated with breast and ovarian cancer. Despite the technical looking nature of the site, it will tell you, in plain English, whether or not you carry any variants of the genes included in the report. Hovering over each gene will pop up a brief summary of what it is associated with. Most of them are likely somewhat rare, since I had no variants for any of them. There are 59 in total: PTEN, BRCA1, BRCA2, TP53, STK11, MLH1, MSH2, MSH6, PMS2, APC, MUTYH, VHL, MEN1, RET, RB1, SDHD, SDHAF2, SDHC, SDHB, TSC1, TSC2, WT1, NF2, COL3A1, FBN1, TGFBR1, TGFBR2, SMAD3, ACTA2, MYH11, MYBPC3, MYH7, TNNT2, TNNI3, TPM1, MYL3, ACTC1, PRKAG2, GLA, MYL2, LMNA, RYR2, PKP2, DSP, DSC2, TMEM43, DSG2, KCNQ1, KCNH2, SCN5A, LDLR, APOB, PCSK9, RYR1, CACNA1S, ATP7B, BMPR1A, SMAD4, OTC. If you have reason to check on any of these and want a quick, free way to do it, this is a good option. It can also be accessed independently of Gencove by going to http://iobio.io. However, that site is not very user friendly and I couldn't find a way to upload my data, so going through Gencove may actually be the better option.

YouGenomics India, recently renamed "Genavli Biotech", is a research project for South Asia, attempting to improve ethnicity reports for people with South Asian ancestry. Naturally, it wouldn't be useful for anyone who is not South Asian but if you are, you should look into this. As far as I can tell, Gencove's app simply links to the YouGenomics India website.

The Gencove Mobile App doesn't really offer anything that the website doesn't, apart from some surveys which I presume are for research purposes. It allows you to see your ethnicity report and the unavailable microbiome report, and connect with or invite your friends. That's about it.

The Open Humans app merely takes you to openhumans.org, which is an open research project. Gencove does not load your data there, so there's really no need to go through Gencove if you wish to participate in this project.

Gencove's populations for their ethnicity report
Most people will likely be most interested in the ethnicity report. There are 26 populations available, some of them are broad, large regions, while others only cover a small region. They include: Northern and Central Europe, Northern Italy, Northern British Isles, Southwestern Europe, Middle East, Eastern Mediterranean, Bengal, Central Africa, Eastern Africa, Northern Africa, Central Indian subcontinent, Southern Indian subcontinent, Oceania, Southeast Asia, Northeast Asia, Anatolia-Caucasus-Iranian Plateau, Central Asia, East Asia, North-central Asia, Northeast Europe, Scandinavia, Finland, Southern Africa, Western Africa, Ashkenazi Jewish, Americas. A map showing what these populations cover is shown left/above.

My personal results were not particularly accurate, although I did note that if I added together all my results in populations probably associated with my Italian ancestry versus those from my Northwest European ancestry, the numbers were consistent with what most other companies say. Here are my Gencove results:

My Gencove ethnicity report
48% Northern and Central Europe
21% Northern Italy
15% Northern British Isles
7% Southwestern Europe
6% Middle East
3% Eastern Mediterranean

My Italian ancestry is southern, not northern, but if you add up the results for Northern Italy, Southwestern Europe, Middle East, and Eastern Mediterranean, you get 37%, which is very similar to the 36% from AncestryDNA and 38% from FTDNA. While my results in more specific regions may be all over the place across different companies, the divide between northern Europe and southern seems very distinct with me so when an ethnicity report is consistent with that, I know there's at least some reliability to it. 
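The adding-up I'm describing can be sketched in a few lines of Python. Which populations count as "southern" is my own judgment call based on my tree, not a grouping Gencove provides.

```python
# My Gencove percentages, as listed above.
gencove = {
    "Northern and Central Europe": 48,
    "Northern Italy": 21,
    "Northern British Isles": 15,
    "Southwestern Europe": 7,
    "Middle East": 6,
    "Eastern Mediterranean": 3,
}

# Populations I am attributing to my Italian side (my own grouping).
southern_groups = [
    "Northern Italy",
    "Southwestern Europe",
    "Middle East",
    "Eastern Mediterranean",
]

southern_total = sum(gencove[name] for name in southern_groups)
northern_total = sum(gencove.values()) - southern_total

print(f"Southern total: {southern_total}%")
print(f"Northern total: {northern_total}%")

# Southern totals quoted above for other companies, for comparison.
others = {"AncestryDNA": 36, "FTDNA": 38}
for company, value in others.items():
    print(f"{company}: {value}% (differs from Gencove by {abs(value - southern_total)} point)")
```

Grouped this way, Gencove's southern total sits right between the other two companies, even though none of the individual labels match my tree.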

Lastly, Gencove offer the "Relative Radar" which finds people you share DNA with. Unfortunately, there must not be very many testers/uploaders in their database because it found none for me so all I can say about it is that it seems to use a visual display, plotting relatives who share more DNA with you closer to your profile icon.

Conclusion: Since it's free, there's really no harm in checking out Gencove (unless you have concerns about research participation). Because some of their "apps" simply link to other sites, it looks like they offer more than they really do. The ethnicity report, sleep app, and iobio data were the only really interesting or useful options, but even with those, don't expect too much. I definitely wouldn't pay $59.99 to test with them. Although the low price point in comparison with other testing companies may be appealing, you would get more for your money by testing with AncestryDNA or 23andMe and then uploading to Gencove for free.

Monday, January 15, 2018

GPS Origins Upload Review

There are several different DNA tests available from HomeDNA, but this is a review of their "GPS Origins Algorithm" test, which allows you to upload your raw DNA data from another company for an ethnic origins analysis. It costs $39 and as far as I can tell, it gives pretty much the same results you'd get if you'd bought their "GPS Origins Ancestry Test" for $199. It does not include DNA matching with other testers or any kind of health report.

Migration routes seem a little misleading, and generic
I found the results to be not very accurate, and even a little bit misleading. The information talks a lot about prehistoric and ancient origins, even though an autosomal SNP test doesn't generally go back that far. Other info says the test results "begin over 1,000 years ago", which is consistent with other autosomal ethnicity tests, which usually state the results represent about 500-1000 years ago. There's also a lot of information about Y-DNA and mtDNA, which makes it sound like the results include them, even though they don't. It even has a map showing "migration routes", one in red and one in blue, which makes them look maternal and paternal. Not only is that impossible since I am female and don't have Y-DNA, but there is also a note: "Although both Migration Patterns represent your Maternal and Paternal DNA routes, we do not differentiate which route is maternal and which is paternal." So why make them blue and red? And why are there only two "routes"? I have several European origins, not only according to my tree, but also according to this ethnicity report. While GPS Origins do offer a Y test and an mtDNA test, they are not included in the autosomal DNA test.

Info on my Sardinian migration route when clicking on one of the map markers (above)
The migration routes are suspicious because they don't offer much explanation as to how they determined these routes. From what I can tell, it seems like the "routes" are just generic info tossed up for people who get results in one part of Europe or another. For example, I got 11.7% in Sardinia, and my "blue" route shows it starting in Sardinia and moving north through Italy. Basically, the routes just seem to be providing generic info on the common migration pattern of Sardinians. The other "red" route appears to be from either my Fennoscandia results or Western Siberia. Again, I'm not very clear on how they chose these two routes, or why there's only two of them, apart from the fact that they seem to be trying to make them look like paternal and maternal results when they're not.

GPS Origins claims the migration routes include "precision targeting—sometimes down to the village or town", which is a gross exaggeration. The map markers may look precise, but when you read the info attached to them, you can see it's not that specific.

The results in Sardinia, Fennoscandia, and Western Siberia are a little off to begin with. I'm getting 21.5% in Fennoscandia, 11.7% in Sardinia, and 11.6% in Western Siberia. I do have Scandinavian roots in Norway, but it should only be about 12.5%. The Sardinia results are probably coming from my Italian ancestry, but peaking in Sardinia doesn't seem right. As for Western Siberia, I have no ancestry there at all.

My ethnicity percentages
My full results are:

#1 Fennoscandia 21.5%
Origin: Peaks in the Iceland and Norway and declines in Finland, England, and France

#2 Sardinia 11.7%
Origin: Peaks in Sardinia and declines in weaker in Italy, Greece, Albania, and The Balkans

#3 Western Siberia 11.6%
Origin: Peaks in Krasnoyarsk Krai and declines towards east Russia

#4 Orkney Islands 11.5%
Origin: Peaks in the Orkney islands and declines in England, France, Germany, Belarus, and Poland

#5 Southern France 9.9%
Origin: Peaks in south France and declines in north France, England, Orkney islands, and Scandinavia

Why are they talking about Y and mtDNA in my autosomal DNA results?
#6 Basque Country 9.3%
Origin: Peaks in France and Spain Basque regions and declines in Spain, France, and Germany

#7 Tuva 8.1%
Origin: Peaks in south Siberia (Russians: Tuvinian) and declines in North Mongolia

#8 Southeastern India 5.4%
Origin: Endemic to south eastern india with residues in Pakistan

#9 Northern India 5%
Origin: Peaks in North India (Dharkars, Kanjars) and declines in Pakistan

#10 The Southern Levant 3%
Origin: This gene pool is localized to Israel with residues in Syria

#11 Arabia 1.6%
Origin: Peaks in Saudi Arabia and Yemen and declines in Israel, Jordan, Iraq, and Egypt

#12 Northwestern Africa 1.1%
Origin: Peaks in Algeria and declines in Morocco and Tunisia

#13 Western South America 0.3%
Origin: Peaks in Peru, Mexico, and North America and declines in Eastern Russia

You can compare this with my results from other companies here.

The Legal Genealogist says GPS Origins is one not to bother with, and based on this I would agree. I definitely would discourage anyone from buying a test with them, considering it costs $199 compared to the more popular DNA companies charging $99 or less. You're basically paying twice the amount to get inferior ethnicity reports and no DNA matching with other testers. As for the $39 upload, it may be worth it for curiosity's sake, but don't expect much from it. I found it very disappointing.

Here's another thorough review from another blogger.

Wednesday, January 3, 2018

The Importance of Primary Sources

This death certificate is only a primary source for the death info. His parents' names are actually wrong, and his birth year is in conflict with records from when he was alive, suggesting the informant, his son, may have gotten it wrong.
I often see people asking about which source is better for a certain fact or event and this is a good time to address the differences between a primary source and a secondary source. A primary source is a document which is recorded at the time of the event it's detailing. A secondary source is one that is detailing an event that occurred in the past, and therefore may be more likely to be incorrect. A primary or secondary source can also be a person, regarding whether or not that person was alive/witness to the event in question. So to understand the reliability of a record, we have to understand what it's a primary source for, and what it's not. Here's a quick rundown:
  • Birth records are the only primary source for a birth. This may include a birth announcement in the newspaper, but the further back you go, the less likely this becomes. Equally, the further back you go, the less likely that civil vital records were kept. Delayed birth certificates aren't a primary source, but may be the only record of a birth in existence. Also keep in mind that some places would fine individuals for reporting a birth too late, which means they may have lied about the birth date to avoid being fined.
  • Baptism records are only a primary source for the baptism, not the birth. However, if the baptism occurred only a few days after the birth, that's pretty much as good as a primary source for the birth too (if it recorded the birth date - do not assume the baptism and birth date are the same if both aren't recorded). Especially if there's no birth record in existence, a baptism record is likely as good as it's going to get. However, if the baptism took place years after the birth, maybe even months, that is not a primary source for the birth because enough time has passed for the actual birth details to have been remembered incorrectly.
  • Marriage records are only a primary source for the marriage. Particularly if the parents of the bride or groom were deceased, you can't be sure their names are correct. Be careful not to mistake marriage banns, an engagement announcement, or a marriage license for the actual marriage.
  • Death records are only a primary source for the death. If it includes an address where the deceased was living at the time of death, then it's also a primary source for that. But it's a secondary source for the birth date and location, both because the document is normally recorded many years after the birth (unless it's an infant death), and because the informant for the death record is often someone who wasn't even alive or present at the time of the deceased's birth. It's also not a primary source for the parents' names or birth locations; it's very common for those details to be incorrect. Death records are usually a good source for the burial location, even though they are recorded before the burial takes place, and therefore that info theoretically could change before it happens.
  • Obituaries are generally considered a type of death record and therefore can be considered a primary source for the death if they are published within a few days of the death, as is typical. Excepting potential printing errors, of course (i.e. the informant may have provided the correct death information, but the typist misprinted it).
  • Gravestones aren't really a primary source for anything! At the most, they may be a primary source for the location of the burial, but I have seen gravestones erected for people before their death, who then actually wind up buried elsewhere. However, when this happens, there's usually a lack of at least a death date on the gravestone. It's also not a primary source for the date of the burial, since gravestones don't normally have the burial date listed on them. You might think it's a primary source for the death date, but gravestones often aren't created for weeks or even months after the death, plenty of time for people to remember the exact date incorrectly. 
  • Cemetery/burial records are only a primary source for the burial information. Unlike gravestones, these usually include the interment date and wouldn't exist unless the deceased was actually buried there.
  • Census records are only a primary source for data that was current at the time the census was taken, such as: residence, occupation, citizenship, literacy, etc. All other data that occurred prior to the census - birth/age, marriage, immigration, etc. - is secondary. Additionally, even things like the occupational data may be subject to the knowledge of the informant and could be incorrect. Also don't forget that in the US, pre-1880 censuses did not record relationships to the head of the household. While you can often surmise relationships based on the order in which people are listed, ages, and names, you can't be sure about them without other supporting documents to confirm.
  • Family bibles may or may not be a primary source for any or all of the data within, depending on when each item of information was recorded and who recorded it. Unfortunately, there's generally no way to know for sure when the data was recorded, or who by. You can sometimes get an idea based on the handwriting and/or different types of pens used at different times. For example, you might note the birth info was recorded at a different time from the death info. But this still doesn't assure they were recorded at the time of those events. They could have each been recorded years after the fact, whenever the author (and we may not even know who the author was) got around to it.
  • Wills and Probates can contain a lot of valuable and reliable information, like the names of someone's children, the details of their estate/property, etc. But even though they are related to the death fact, they typically don't contain a date or location of the death, let alone a cause of death. Don't mistake the will or probate dates for the death date, but you can usually get a time frame for the death date - sometime after the will date, and before it was probated.
  • Lineage books are a secondary source for everything in them, since they are written after all the events took place. However, many lineage books use primary sources for at least some of their information. That doesn't necessarily mean the entire book is reliable, but the particular data coming from primary sources should be. Not all authors note their sources, but many do.
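As a quick reference, the rundown above can be boiled down to a small lookup table. This is only a simplified sketch in Python: the names `PRIMARY_FOR` and `source_kind` are made up for this example, and the table drops all the caveats in the list (a baptism a few days after the birth, an obituary printed late, and so on). Family bibles and wills are left out on purpose, since whether they count depends on when and by whom each entry was written.

```python
# Which facts each record type is a primary source for, per the rundown above.
# Simplified: the caveats described in the list are not modeled here.
PRIMARY_FOR = {
    "birth record": {"birth"},
    "baptism record": {"baptism"},
    "marriage record": {"marriage"},
    "death record": {"death", "residence at death"},
    "obituary": {"death"},
    "gravestone": set(),  # not really a primary source for anything
    "burial record": {"burial"},
    "census": {"residence", "occupation", "citizenship", "literacy"},
    "lineage book": set(),  # written after all the events took place
}


def source_kind(record_type: str, fact: str) -> str:
    """Return 'primary', 'secondary', or 'unknown' for a record type and fact."""
    primary_facts = PRIMARY_FOR.get(record_type)
    if primary_facts is None:
        # Record types that need case-by-case judgment, e.g. family bibles.
        return "unknown"
    return "primary" if fact in primary_facts else "secondary"


print(source_kind("death record", "death"))
print(source_kind("death record", "birth"))
print(source_kind("census", "residence"))
print(source_kind("family bible", "birth"))
```

A table like this is only a starting point; the informant and the timing of each individual record still matter more than the record type alone.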
A gravestone with no dates - this person was actually buried in a different cemetery (I believe his parents erected gravestones for their children in the family plot, but some of them wound up choosing other cemeteries). This is not typical in my experience.
Naturally, we do not always manage to find a primary source for each bit of information and that doesn't mean we can't use secondary sources. Even primary sources can be wrong sometimes, they are just much less likely to be so. We just have to work with what we have, and what exists, and understand what is more or less likely to be accurate. Having data from a secondary source doesn't mean we can't put that data or that source in our tree, it just means we should keep looking for better or additional resources to help confirm or deny it. Family trees are forever a work in progress and no one should assume that once a piece of information is put into a tree, it means you're confident it's accurate. The sources you cite in your tree should speak for themselves as to their reliability.

Judging which secondary source is more reliable for what type of conflicting data can be difficult, and we have to weigh when, how, and by whom the data was recorded/provided. You may think it makes the most sense to go with a birth year that you find on most of the records for an individual, but what if all those records are from later in his life, or even after his death? A record from his childhood may actually prove to be the more reliable source, since it was made closer to when he was born, and his parents, who were there for the birth, were still alive and one of them may have been the informant. Of course you can never know for sure, so it's also best to put all recorded facts in your tree as alternate data, but you still have to pick a default/preferred one. Hopefully, this has given you some things to consider when choosing a default/preferred fact to go with, and given you a good understanding of primary and secondary documents.
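Here is one way that weighing could be sketched in Python. Everything in it is hypothetical: the records, the years, and the scoring rule (informants who could have witnessed the birth first, then records made closest to the event) are all invented for illustration, and real cases need more judgment than a sort order can capture.

```python
from dataclasses import dataclass


@dataclass
class Record:
    source: str
    record_year: int
    birth_year_claimed: int
    informant_witnessed_birth: bool  # e.g. a parent was likely the informant


def reliability_key(record: Record) -> tuple:
    """Lower sorts first: likely witnesses first, then smallest gap in years
    between the claimed birth and when the record was made."""
    years_after_event = record.record_year - record.birth_year_claimed
    return (not record.informant_witnessed_birth, years_after_event)


# Hypothetical records for one man: three later records agree on 1852,
# but a childhood census, when his parents were alive, suggests 1850.
records = [
    Record("1900 census (hypothetical)", 1900, 1852, False),
    Record("death certificate (hypothetical)", 1921, 1852, False),
    Record("gravestone (hypothetical)", 1922, 1852, False),
    Record("1860 census (hypothetical)", 1860, 1850, True),
]

ranked = sorted(records, key=reliability_key)
preferred = ranked[0]

print("Preferred birth year:", preferred.birth_year_claimed, "from", preferred.source)
print("Alternates to keep in the tree:")
for record in ranked[1:]:
    print("  ", record.birth_year_claimed, "from", record.source)
```

Note that the majority of records say 1852, yet the sketch prefers 1850, which is exactly the situation described above. The other years still belong in the tree as alternate facts.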