Showing posts with label thrulines. Show all posts

Tuesday, September 7, 2021

ThruLines is not the enemy

I see a lot of skepticism out there about ThruLines, and some of it is warranted, because it is based on family trees, which can have errors that get copied multiple times. But that doesn't mean you should dismiss ThruLines entirely. There are ways to get reliable use out of it, and not just by finding records that confirm its suggestions. There are ways to use DNA to find biological relatives or break down brick walls in your tree even when there are no written records of the lineage, and ThruLines is just one tool that can help you do this.


It's basically a matter of probabilities. The more people you match who are descended from multiple siblings of your ancestor, especially when those descendants mostly match each other to form a cluster, the less likely it becomes that it's an error. When the matches mostly match each other, you know they are all related and descended from the same branch/ancestor - you just need to identify which branch/ancestor, which is where trees and ThruLines come in. For the trees/ThruLines to be wrong, the tree would have to be wrong for every sibling those matches descend from, so the more siblings you match descendants of, the more likely the trees are accurate. If you match 20 people (who mostly match each other too) descended from 5 siblings of your ancestor, what are the chances there's been an error in the trees for each of those 5 siblings, plus your own ancestor? Extremely small. In the example above, there are 41 matches descended from 8 siblings of Elizabeth Mertz, so for this all to be wrong, there would have to be 9 different errors. This amount of evidence is really very strong, and I can probably confirm this family now.
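To put rough numbers on that reasoning, here is a minimal sketch in Python. It assumes every line has the same chance of being wrong in the trees and that the errors are independent of each other. That is a simplification, since copied tree errors are not independent, and the 20% error rate is purely an illustrative guess, not a measured figure.

```python
def chance_all_lines_wrong(num_sibling_lines: int, error_rate: float) -> float:
    """Chance that every sibling line AND your own line are wrong in the trees.

    Assumes each line is wrong independently with the same probability,
    which is a simplification for illustration only.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    # The siblings' lines plus your own ancestor's line must all be wrong.
    lines_that_must_be_wrong = num_sibling_lines + 1
    return error_rate ** lines_that_must_be_wrong


# Hypothetical 20% chance that any single line is wrong in the trees.
for siblings in (1, 5, 8):
    p = chance_all_lines_wrong(siblings, 0.20)
    print(f"{siblings} sibling line(s): {p:.7f}")
```

The exact numbers don't matter much. The point is that the chance shrinks very quickly with each extra sibling line you match descendants of.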

Even assuming there's only one error and those siblings are indeed siblings to each other, but your ancestor is the lone error, and not actually their sibling, what are the chances you would match that many people from a certain family, if you weren't related to that family somehow? Using the example above again, what are the chances I match 41 people descended from those 8 siblings, if Elizabeth Mertz is not one of their siblings? Again, it's very unlikely - and the only way this would be possible is if there was a lot of endogamy involved, but even so, it would still be pointing you towards a specific population you're likely descended from (and matching surnames from the same endogamous population means you're probably related to that specific family somehow), so you don't want to dismiss it.

Granted, it doesn't confirm who exactly the parents of those siblings are, only that they are indeed siblings. For that, you'd have to go up another generation and do the same thing - look for people descended from siblings of the alleged father and mother. In the example above, it doesn't really confirm that Phillip Mertz is the father of Elizabeth and all her siblings, only that they are siblings from the same parent(s), whoever that may be. But for now, it's probably safe to add Phillip Mertz at least as a placeholder until more research can be done (it really is okay to add speculative data to your tree as long as you know it's speculative!).

In the example below, you can see how this ThruLines doesn't confirm descent from Benjamin Butler - the 6 DNA matches are descendants of children of David Butler, so this really doesn't confirm this potential ancestor at all.

And there are other limitations, mainly the fact that the Shared Matches tool (which is the only way to confirm if matches match each other and form a cluster) only includes estimated 4th cousins or closer (20+ cM). AncestryDNA really needs to provide something more comprehensive. They say it's limited to 20+ cM because it would tax the server too much if they expanded it to include all matches. But at the very least, they could expand it to 15+ cM segments, which have a 100% chance of being identical by descent. That would still exclude most matches (8-15 cM) and therefore not be as taxing on the server, but include all matches that have a 100% chance of being IBD, which would make ThruLines so much more useful and reliable. At the moment, they are excluding hundreds, even thousands of IBD matches from the Shared Matches tool, which is extremely debilitating. Alternatively, they could offer another tool that would be less taxing on the server - a simple one-to-one comparison. Pop in two match usernames, and it would tell us whether those two matches match each other or not. Very simple, not very taxing, but it would get the job done.

Even so, it's still possible to get reliable usage out of ThruLines. Remember, ThruLines is only automating a process that people used to manually do (and still do when the relationship exceeds ThruLines' 5th great grandparent limit). If it weren't possible to use DNA to confirm relationships when there is no written record of it available, what use would DNA be, and how do you think all these NPEs are being discovered? While it's true that you do have to watch out for tree errors being replicated in ThruLines, if you understand how DNA and ThruLines work, there is useful data you can get out of it. Too often, I see people who seem to completely dismiss ThruLines, as though it's not reliable at all, but you're only hindering your own research by thinking that.

Friday, August 21, 2020

Major Breakthrough with DNA

I think I finally broke through the biggest brick wall on my tree. I had forever been stuck at my 3rd great grandmother, Emma Elizabeth Sherwood (left), who married William Henry Mills. Despite having found her maiden name, I could never find her parents or any record of her before her marriage. She was born about 1838 in New York, and there were a lot of girls with the same or similar name in New York around that time. I'd tried to research by elimination, but I was still left with too many options that could have been her. And DNA? I made some efforts but it was really difficult with a fairly common surname like Sherwood. I never got anywhere promising.

Until now. I decided to work on some closer DNA matches that I hadn't been able to identify before. I randomly picked one from my mom's side who had several shared matches with people confirmed from my Mills branch. This match, we'll call him 11B, had a small family tree added, enough that I could build on it. Although that is supposed to be ThruLines' job, it doesn't always catch everything. I started digging and before long, I found that 11B's 2nd great grandmother was Orannah Sherwood b. 1841 in New York.

I instantly thought she could be a sister of my Emma Elizabeth Sherwood. Right surname, born only about 3 years apart in the same state. Plus, I know this DNA match 11B is somehow connected to my Mills branch and Emma Sherwood married William Henry Mills. But I tried not to get my hopes up too high, because Sherwood is a common name, and lots of people lived in New York in the late 1830s/early 1840s. 11B could be connected to my Mills branch in some other undiscovered way entirely. More research was needed, so I researched the other branches of 11B's tree and found no other connection to my tree, let alone to my Mills branch.

I then found Orannah, fortunately not a super common given name, in NY in the 1850 census and guess what? She had a sister named Emily E Sherwood b. abt. 1837.

The 1850 census showing the Sherwood family with Emily/Emma


Things are looking much more promising. Granted, Emily was supposedly born in Indiana according to the 1850 census, not New York, but that could be wrong. Or it could be right and she never knew it. Her older sister Louisa also seems to have been born in Indiana in 1835, and then her younger brother Homer was born back in NY in 1839, so the family could have been in Indiana for only a few years and Emily/Emma may not have remembered it and just assumed since she grew up in NY that that's where she was born. It's strange for us today with all our documentation to think that someone didn't actually know where they were truly born, but it happened a lot in history.

Another smaller piece of evidence is the fact that the 1850 census tells us Emily's father, Nathan, was born in New York, which is consistent with later records of Emma saying her father was born in New York too. Unfortunately, it's not as consistent for her mother: later records say she was born in either New York or New Jersey, while the 1850 census for Annis O, the presumed mother of Emily, says she was born in Vermont.

Here's the craziest bit, though, and it's a real testament to why you shouldn't just outright dismiss family stories. Once upon a time, my grandmother was doing genealogy research and left behind a wealth of information, though she rarely cited her sources. Much of what she wrote down was word-of-mouth info from cousins she tracked down and wrote to. In her handwritten info, she claimed that William Henry Mills (Emma Sherwood's husband) had a sister named Belinda who married a man with the surname Beals. Turns out, William did have a sister named Blendena, which was obviously misremembered as Belinda, but her only married name was Church, not Beals. None of William's other sisters or relatives married anyone named Beals either, so I was really scratching my head over where this name came from and considering that maybe it was totally fictitious, even though about 90% of my grandmother's info I've proven to be accurate, and the remaining 10% has turned out to hold some kernel of truth, with only some of the details being wrong.

Well, guess who did have a sister whose married name was Beals? Emily Sherwood! Her older sister Louisa married Silvanus Beals in 1855 in Indiana. And note how this is the same sister who was supposedly born in Indiana? The family probably had some kind of connection to Indiana.

I even managed to explain how Emma and Louisa wound up marrying in different states in the same year. Louisa's husband, Silvanus Beals, apparently was living in the same county that Emma married William Henry Mills in, Wyandot County, Ohio. That links Silvanus, and therefore potentially also Louisa, to the same place Emma was married. Additionally, Silvanus' obituary says he worked for a railroad company as a young man, the same industry that William Henry Mills spent his life in. Perhaps they worked together before they met their wives, or maybe Louisa introduced Emma to William through her fiance or vice versa. There clearly appears to be a connection there.

The evidence is starting to really pile up, but is it all just a coincidence? How could I know for sure this was the right family, given the slight difference in the given name, Emma vs Emily, and the difference in her birth place as well as her mother's birth place? 

Firstly, I started researching Emily, not Emma, as though she was a different person. If I could find her on later records as having married someone else, not William Henry Mills, or never married at all, that would disprove the theory that they were the same person. I didn't find anything like that, but of course that doesn't confirm they were the same person, it only means that's still a possibility.

I also found Emily in the FamilySearch tree as Emma, which is apparently coming from a book, "Descendants to the eighth generation of Thomas Sherwood (1586-1655) of Fairfield, Connecticut Vol 2", which was published in 1985. It's obviously very much a secondary source (and really doesn't contain much info), but it certainly suggests Emily's name could have actually been Emma. It's not a stretch.

But what I really wanted was to find more DNA matches descended from this family. I was hesitant to put this family into my tree because it meant putting a lot of speculative data in my tree, but I did it because I wanted to see if ThruLines would find more descendants. And after a few days, the matches came rolling in! 7 so far, and they will only continue to grow as my tree grows. Unfortunately, this family has been a little difficult to research, so it's been a struggle, but worth it. 

ThruLines showing 5 out of 7 DNA matches from the Sherwood family so far


It appears that Nathan probably died sometime between 1853 and 1855, and Annis in either 1854 or 1855, because their last child was born 27 Mar 1854. As a result, the children were split up and scattered, sent to live with other families. In 1855, we know that Louisa got married in Indiana, and Emily/Emma, assuming they are the same person, was married in Wyandot County, Ohio. They may have been living with family in those areas. Also in 1855, Orannah was sent to live with the family of her future husband, Charles C Baxter. Their brother, Homer, was an apprentice living with a seemingly unrelated family in a different part of NY on the 1855 NY State census. Another brother, Dwight, was adopted by another member of the Baxter family, who was fortunately neighbors with the ones who took Orannah in, so at least these siblings got to be near one another. The youngest brother, Frank, was actually born in March 1854 and adopted as an infant by Francis Postel and Sarah Baxter (Sarah being the sister of Orannah's husband, yet another connection to the Baxter family) before the 1855 NY census, supporting the theory that Nathan and Annis died around that time.

I am still working on researching the other children, but I'm having difficulty and I think it's because they were all split up after their parents' deaths. If I'm having difficulty researching them, others probably are as well, and indeed, when I look for these people in other trees, there are usually dead ends. If no one has these people well researched in their trees, ThruLines doesn't have much to follow. So it's not necessarily because I'm on the wrong path. There's just no established path yet for ThruLines to pick up on, and it's kind of exciting to be working on something no one else has done much work on yet. Of course, the downside to that is how difficult it is.

Additionally, when I look at my Shared Matches with the confirmed matches descended from Nathan and Annis, I find most of them don't have any tree added at all, and among those that do, most of them are tiny. Another hindrance of ThruLines. All I can do is build my own tree as much as possible down descendant lines and see if they eventually link up with more trees. For now, this is an excellent start, and I'm thrilled to finally have found Emma's family!

Tuesday, January 14, 2020

How Copying Errors Can Really Screw Up ThruLines

I want to illustrate how ThruLines is only as reliable as the family trees it's using, and how even if you appear to have a few DNA matches linking you both to the same ancestor, that doesn't make it accurate. As we know, it's common for the inexperienced genealogist (of which there are many) to blindly copy data from other trees without verifying it. All it takes is a handful of people copying the same error for ThruLines to make a wrong connection that might seem accurate because there's more than one DNA match.

My dad (JB, shown above) is a DNA match with two people, RJ and RH (you can ignore MM in the screenshot above, that's a legitimate connection I've verified). ThruLines suggests that they are both connected to my dad via his ancestor, Giovanni Biello. It does not include Giovanni's wife, so these are allegedly half cousins, but to my knowledge, Giovanni was only married once. I was open minded to the idea I might have missed another marriage, but then I noticed something else.

RJ and RH both descend from someone named Denizi Biello, b. Feb 1862 (according to the 1900 US Census), supposedly the son of my ancestor Giovanni Biello, b. 13 Jul 1847. This would mean Giovanni was only 14 when Denizi was born, which might be biologically possible but it's extremely unlikely. Men didn't marry until they were old enough to support a family, which they would not be at 14, and if they had children out of wedlock, the child was often left at the church as a foundling (meaning the father was not identified). So something about this just didn't sound right to me.
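This kind of sanity check is easy to sketch in Python. The snippet below is only an illustration using the dates quoted above. The minimum age of 16 is an arbitrary cut-off I picked for the example, not a genealogical rule, and since only "Feb 1862" is known for Denizi, the day of the month is a placeholder.

```python
from datetime import date


def age_on(born: date, on: date) -> int:
    """Whole years between a birth date and a later date."""
    had_birthday = (on.month, on.day) >= (born.month, born.day)
    return on.year - born.year - (0 if had_birthday else 1)


def plausible_father(father_born: date, child_born: date, min_age: int = 16) -> bool:
    """Flag father/child links where the father looks too young.

    min_age is an arbitrary cut-off for illustration, not a genealogical rule.
    """
    return age_on(father_born, child_born) >= min_age


giovanni_born = date(1847, 7, 13)
denizi_born = date(1862, 2, 1)  # only "Feb 1862" is known; the day is a placeholder

print("Father's age at birth:", age_on(giovanni_born, denizi_born))
print("Plausible father?", plausible_father(giovanni_born, denizi_born))
```

A failed check like this doesn't prove the link is wrong, but it's a good prompt to go looking for the original records.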

Yet, 4 people had added Denizi as the son of this Giovanni to their tree. And in their defense, Denizi's death certificate does indeed say his father was Giovanni Biello and his mother Domenica Scioli.

Denizi's, or Dionisio Biello's civil birth record
Digging through the civil records of Monteroduni, Italy, where both men were born, I found Denizi's birth record, which finally held the answer. For starters, his original name was Dionisio Biello, and he was actually born several years before 1862, on February 2, 1856. But his parents' names were correct: Giovanni Biello and Domenica Scioli. Only, Giovanni was not born anywhere near 1847. He was recorded as aged 38, which would make him born about 1818. It also said Giovanni was the son of the late Dionisio, whereas "my" Giovanni Biello was the son of Lorenzo.

So, two completely different men, as I suspected. I do not know who originally made the wrong assumption that Denizi was the son of "my" Giovanni, but then 3 other people copied the error, ThruLines picked up on it, and then found two people descended from Denizi. Since there were only 4 trees with the same error, I contacted all of them to let them know - hopefully they'll make the change and the incorrect ThruLines will disappear, but imagine if this error had been copied by more than 3 people! It probably wouldn't have been worth sending a message to each one since many likely wouldn't get the message or make the correction.

Granted, I have to admit that Monteroduni is a small town in rural Italy where there is a ton of endogamy and cousin marriages. I'm pretty sure everyone with ancestry in Monteroduni is related to everyone else there in some way, but it's still important to figure out the correct relationships whenever possible. Biello is actually a very rare surname and given that and the same location, our Biello lines probably intersect at some point, it's just a question of whether it's before or after civil records began in 1809. I'll certainly keep looking.

Monday, January 6, 2020

ThruLines vs DNA Circles

The most common question I see about ThruLines is whether it uses DNA, or family trees, or both, and there seems to be a good deal of misunderstanding and confusion about ThruLines, especially in regards to how it compares to the now retired DNA Circles. The best thing I can say is that ThruLines does things with trees that DNA Circles didn't do, and DNA Circles did things with DNA that ThruLines doesn't do.

I'll start with what DNA Circles used to do.

A screenshot from Ancestry's blog of how DNA Circles showed everyone in the group shared DNA with each other
DNA Circles would first look for a group of people who mostly all shared DNA with each other. So let's say you shared DNA with A, B, and C and on top of that, A, B, and C all shared DNA with each other too. In addition, you did not share DNA with another person called D, but D did share DNA with A, B, and C, and so was included in the group. So not everyone in the group had to match every single other person, but had to match enough people in the group to justify including them. Once this DNA group (or circle) was established, the system would then look for a common ancestor among your trees - and only among the trees of the people in the group. Once the common ancestor was identified, people in the group who didn't have this ancestor in their tree yet (or didn't even have a tree to begin with) would get DNA Circles in the form of "New Ancestor Discoveries". Within the tools of DNA Circles, you could see who all was in the group and who shared DNA (and who didn't) with whoever else (see above right). (Note: I think the minimum for a Circle to be created was 7 people not too closely related to each other, not 5 like the example I'm using, but I reduced it for the sake of ease).
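Here is a rough Python sketch of that "match enough of the group" idea, using the you/A/B/C/D example. The 60% threshold and the function name are made up for illustration. This is not Ancestry's actual rule or code, just a way to picture the logic.

```python
# Who shares DNA with whom, using the example above (pairs are unordered).
shares_dna = {
    frozenset(pair)
    for pair in [
        ("you", "A"), ("you", "B"), ("you", "C"),
        ("A", "B"), ("A", "C"), ("B", "C"),
        ("D", "A"), ("D", "B"), ("D", "C"),
    ]
}


def belongs_in_group(person, group, min_fraction=0.6):
    """True if `person` matches at least `min_fraction` of the other members.

    min_fraction is a made-up threshold for illustration, not Ancestry's rule.
    """
    others = [m for m in group if m != person]
    hits = sum(frozenset((person, m)) in shares_dna for m in others)
    return hits / len(others) >= min_fraction


group = ["you", "A", "B", "C", "D"]
for person in group:
    print(person, belongs_in_group(person, group))
```

Note that you and D both stay in the group even though you don't share DNA with each other, because you each match enough of the other members.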

So, DNA Circles was primarily looking at the shared group DNA and only using trees to identify the source of all that shared DNA (which is actually much like how Genetic Communities work too, but I digress).

ThruLines works in almost the opposite way.

ThruLines is only showing these matches descend from the same ancestor based on trees - it does not tell you whether they share DNA with each other or not
ThruLines looks for a common ancestor between you and an individual DNA match by looking at the entire database of (searchable) family trees. This is something DNA Circles didn't do, because it only looked at the trees of those in the Circle. So ThruLines is taking match A and looking for a common ancestor by trying to compile all the data from available trees (not just the tree of you and match A). It does this for each DNA match individually, so it then separately finds match B and C are also supposedly descended from the same ancestor and it groups A, B, and C based on what the trees say is their shared ancestor, regardless of whether these matches also share DNA with each other or not (and it doesn't even tell you if they do or not). ThruLines does not check to see if A, B, and C also share DNA with each other like DNA Circles did, and it doesn't even involve D because you don't match D. You can sometimes check to see if they share DNA with each other yourself by using the Shared Matches feature, but remember only estimated 4th cousins or closer are included in the Shared Matches list. If you share less than 20 cM with A, B, and C, you can't see whether they share DNA with each other or not.
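To make the contrast concrete, here is a small Python sketch of grouping matches purely by what the trees say, and then working out which of those matches you could actually check with Shared Matches. The match names, cM values, and "Ancestor X" are all hypothetical, and the functions are my own illustration, not how Ancestry implements anything. The only figure taken from above is the 20 cM cut-off.

```python
from collections import defaultdict

SHARED_MATCHES_MIN_CM = 20  # the Shared Matches cut-off described above

# Hypothetical matches: (name, cM shared with you, ancestor the trees point to).
matches = [
    ("A", 46, "Ancestor X"),
    ("B", 18, "Ancestor X"),
    ("C", 12, "Ancestor X"),
]


def group_by_tree_ancestor(rows):
    """Group matches only by the ancestor named in trees (no DNA cross-check)."""
    groups = defaultdict(list)
    for name, _cm, ancestor in rows:
        groups[ancestor].append(name)
    return dict(groups)


def checkable_with_shared_matches(rows, min_cm=SHARED_MATCHES_MIN_CM):
    """Names of matches at or above the cut-off, whose shared matches you can view."""
    return [name for name, cm, _ancestor in rows if cm >= min_cm]


print(group_by_tree_ancestor(matches))
print(checkable_with_shared_matches(matches))
```

In this made-up case all three matches get grouped under the same ancestor, but only one of them is large enough for you to see whether the others share DNA with it.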

What does this mean? It means that beyond the fact that ThruLines is only looking for tree connections with your DNA matches, DNA really isn't a part of ThruLines. The groupings are not based on DNA like DNA Circles were because ThruLines doesn't even know, much less show you if the people in the group share DNA with the others in the group or not. Knowing which of your matches also match each other is important for establishing a connection to an ancestor because remember, family trees are fallible and you can't rely on them alone, especially when there's also no known paper trail to confirm it. You need those DNA groups/circles to tie those alleged descendants together and confirm what the trees say, but ThruLines doesn't do this.

Another screenshot from Ancestry's blog showing the now retired New Ancestor Discoveries.
Another thing ThruLines doesn't do, because it's primarily working off trees instead of primarily working off DNA like Circles did, is provide you with Circles, or New Ancestor Discoveries, even if you don't have a tree attached to your test results (shown right). Because ThruLines is working off trees, it needs a starting point in your tree to connect it to other people's trees. Without at least a basic tree with your parents in it (and ideally they want at least 4 generations), you will not get any ThruLines at all. But because DNA Circles was firstly looking at groups of shared DNA, it didn't matter what your tree said or whether you even had a tree, it could tell you that you fit into that established DNA group based purely on DNA. While ThruLines does include "Potential Ancestors" not yet in your tree, this is again based on what trees say and you will not get any if you don't have a tree attached to your results. This is a drawback for adoptees and people looking for unknown biological family since they have no biological tree to add.

Hopefully that helps clarify the main differences between these two tools. They are/were both useful in their own ways, but they are different and understanding their differences is important so you're not making assumptions about ThruLines and letting it lead you astray.