Genealogical Musings: tips

Monday, October 14, 2024

Using AI to Transcribe and Translate Documents

Being able to upload an image of a document to AI and have it instantly and accurately transcribe difficult-to-read handwriting, or translate a foreign language from equally difficult-to-read handwriting, is an exciting idea, but in practice it does not work well. Specifically, the biggest problem I find is that when it can't read something, it decides to just... make something up that seems in keeping with the type of document it is. Clearly, AI is smart enough to recognize what kind of document it is and supply the type of information you'd expect from that type of record, but it is apparently not smart enough to realize why the information needs to be accurate and not fabricated.

Take for example this will of Henry Rorer (above). I uploaded it to ChatGPT and asked it to transcribe it. This is what it gave me:

In the name of God Amen the twenty-second Day of January in the year of our Lord one thousand seven hundred and eighty. I Henry Rorer of Bristol Township in the County of Philadelphia Yeoman being weak in body but of perfect mind and memory thanks be given unto God, therefore this my last Will and Testament. That is to say, I first and principally I resign my soul to God that gave it and my body to the Earth to be buried in a decent and Christian like manner at the discretion of my Executors herein after named and touching such worldly Estate wherewith it hath pleased God to bless me in this life. I give and dispose of the same in the following manner, and form.
Imprimis. I give and bequeath unto Mary Willard one Chest that she is to choose out of all my Chests and Boxes to be paid and delivered her by my Executors hereafter named.
Item. I give and bequeath unto Margaret my dearly beloved Wife all my Estate both real and personal during her widowhood, excepting such legacies as I shall hereafter mention in this my last will and testament and if in case my Wife should die or marry, then it is my will that my Estate shall be sold and divided equally amongst my children or the survivors of them share and share alike, and if either of my Children should die under age or without lawful issue that then my Estate shall be equally divided amongst the survivors of them, and I do hereby nominate and appoint my beloved Wife and my Friend Henry Rorer and George Rorer Executors of this my last Will and Testament, and I do hereby utterly revoke and disannul all former Wills and Testaments by me heretofore made Ratifying and Confirming this and no other to be my last Will and Testament.
In witness whereof I have hereunto set my hand and Seal the day and year above written. Signed Sealed published pronounced and declared by the said Henry Rorer as his last Will and Testament in the presence of us.
Phillip Hagey
Peter Shall
Henry Rorer (Seal)

At first glance, it looks great. But when you actually compare it to the original document, it quickly becomes clear some of it was just completely made up. Firstly, note that the date is slightly off. It was actually the 29th day of January, not the 22nd. It's understandable a single word might be misread, but it goes downhill from there. Where the transcription reads "thanks be given unto God, therefore this my last Will and Testament" the record actually says "do make and ordain this my last Will and Testament." The transcription then goes on to add in an entire sentence that isn't included in the record: "That is to say, I first and principally I resign my soul to God that gave it and my body to the Earth to be buried in a decent and Christian like manner at the discretion of my Executors herein after named..."

Although not accurate, it's at least not giving us any misleading information. The paragraph following it, however, is fabricating people and property never mentioned in the original. The transcription says:

Imprimis. I give and bequeath unto Mary Willard one Chest that she is to choose out of all my Chests and Boxes to be paid and delivered her by my Executors hereafter named.

The actual document says:

Imprimis It is my Will and I do order that in the first place all my just Debts & funeral charges be paid and satisfied.

ChatGPT did not even attempt to get it right. It's as though it ignored even the words it could make out in favor of a sentence that makes sense instead of just leaving blank spaces where it isn't able to read certain words. So the entire sentence was changed just to produce something that wasn't missing words. This could be really disastrous to your research if you're not paying attention. It fabricated a name where there was no person mentioned, making it look like Henry Rorer willed one Mary Willard her choice of a chest or box when there was no such person ever mentioned in his will and no mention of any chests or boxes either.
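If you're comfortable with a little Python, one way to catch this kind of fabrication is to type out even a short passage yourself and have the computer line it up word by word against what the AI gave you. This is only a rough sketch using Python's standard difflib module; the function name flag_differences is made up for illustration, and the two sample strings are simply the Imprimis lines quoted above.

```python
import difflib


def flag_differences(ai_text, manual_text):
    """Return (AI wording, your wording) pairs wherever the two readings disagree."""
    ai_words = ai_text.split()
    manual_words = manual_text.split()
    # Compare word lists rather than characters so whole phrases are flagged.
    matcher = difflib.SequenceMatcher(None, ai_words, manual_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ai_words[i1:i2]), " ".join(manual_words[j1:j2]))
            )
    return differences


# The two versions of the Imprimis line quoted in this post.
ai_line = (
    "Imprimis. I give and bequeath unto Mary Willard one Chest that she is to "
    "choose out of all my Chests and Boxes to be paid and delivered her by my "
    "Executors hereafter named."
)
manual_line = (
    "Imprimis It is my Will and I do order that in the first place all my just "
    "Debts & funeral charges be paid and satisfied."
)

for ai_part, manual_part in flag_differences(ai_line, manual_line):
    print(f"AI: {ai_part!r}  |  document: {manual_part!r}")
```

It won't read the handwriting for you, but any stretch where the AI's wording differs from what you can actually see on the page gets listed, which makes invented names like Mary Willard hard to miss.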

Additionally, the transcription looks surprisingly short. Even though the will is only about one page long, it seems like more than what the transcription included, and indeed, the transcription is missing a lot of information from the rest of the document, including many people who were of importance to Henry Rorer.

The same thing essentially happens when you ask ChatGPT to translate a handwritten document in a foreign language. Wherever it struggles to read the handwriting, including names, occupations, ages, etc., it simply makes up random data to replace it. It changes names and other data just to produce a complete translation, whether accurate or not.

So here is your warning. AI is not there yet. Maybe there are more reliable tools out there, but I imagine they aren't free. Whatever FamilySearch is using to transcribe records in their experimental Full Text Search (some kind of OCR for handwriting) seems to work much better than ChatGPT. But not all records are on FamilySearch, let alone covered by their Full Text Search.

As ever, the genealogist mantra remains true: you have to do the work yourself and verify everything.

Wednesday, August 10, 2022

Understanding Marriage Bonds

It's really important that we understand what certain records are before we try to extract information from them. In your research, you may periodically come across marriage bonds, which are not to be confused with marriage licenses or marriage banns. Recently, I noticed someone had mistaken a marriage bond for a marriage license and was wrongly assuming that the high amount of money quoted on the marriage bond (500 pounds, or $1,000, depending on the time period) was a marriage license fee. As a result, they thought only wealthy people could afford to get married at all. Here's an example below from Brunswick County, North Carolina, 1856.

This was admittedly a lot of money for that time period. In 1856, $1,000 would be the equivalent of about $36,000 today. Imagine paying a marriage license fee of $36,000! Some people would have nothing left for the actual wedding ceremony and reception, and some people wouldn't be able to afford it at all. Imagine not being able to afford to get married!

But that wasn't actually the case, especially at a time in history when it was really important for people to be married when they had kids. There's no reason the government would put such high restrictions on getting married that it became impossible or undesirable for the average citizen. They would only be encouraging people to have children out of wedlock, and that doesn't make sense for the time period.

So what was this $1,000? A marriage bond was like a legal promise to marry, and the fee helped assure that there would be no legal impediments to the marriage, because you and your co-signers only had to pay the fee if it was discovered that there was some sort of legal impediment to the marriage. One example of this is when the bride or groom was actually still married to someone else and not free to marry again. Since that didn't happen very often, most of the time that $1,000 never needed to be paid. Another example might be if the bride or groom was legally too young to marry without their parent's or guardian's approval, which they didn't have. If they lied about their age but then the truth was discovered so they couldn't legally marry, that fee would have to be paid.

Notice the terminology on the record:

"...now in case it should not hereafter appear that there is any lawful cause or impediment to obstruct such marriage, then the above obligation is to be void..."

So if there's no reason they can't be married, then the obligation to pay the fee is void (they don't have to pay the fee).

This was to help prevent things like bigamy or young people lying about their age, because the bride or groom wasn't likely to sign an agreement saying they owed $1,000 if it turned out they were already married to someone else or too young to marry. Some people might have done it, thinking no one would find out, but most people would definitely think twice about committing bigamy or lying about their age if it meant paying a $1,000 fine, equivalent to $36,000 today.

Doesn't it make a lot more sense that the government would impose a high fee if it was discovered someone was trying to or did commit bigamy, than the government trying to discourage and even prevent some people from getting married at all? This is why it's so important to understand records instead of making our own assumptions about what they mean. If your assumption doesn't seem to make much sense, there's probably a more logical explanation. If you don't know what something means, usually a quick Google will help, and in the case of more niche or nuanced topics, there are typically experienced researchers who can help in various genealogy communities. There are plenty of communities on Facebook, Reddit, WikiTree, FamilySearch, etc.

Sunday, February 27, 2022

What exactly is the AGBI and how do we use it?

By now, you've probably occasionally come across a source known as AGBI, or American Genealogical-Biographical Index, and maybe you've even attached it to your tree because it comes up as a hint for your ancestor, and everyone else has attached it to the same ancestor, and you don't want to miss out, right? But the details are usually vague. What is it even referencing, and how do you know the records are for the right person?

The AGBI on Ancestry is basically an index of an index. It's referencing a big series of books that indexes tons of sources on early Americans. I don't know why Ancestry's index doesn't include all the data included in the book's index, but it doesn't. So to find the original source, you first have to look up the AGBI book index. You can find the books in a number of places online. I usually use the one at FamilySearch because it's free and accessible from home (it's not a restricted collection) - scroll all the way down (past the listings that say off-site storage).

Ancestry's index should include a volume and page number. Weirdly, the books don't include page numbers, but that's okay, because they're in alphabetical order. So simply open the volume you're looking for, and then find the name you're researching in alphabetical order. There will likely be several entries for the name you're looking for, but you can usually tell which one you need from the location and/or time period included in Ancestry's index. Even so, the AGBI books can seem as vague as Ancestry's index, and sometimes it takes some understanding and/or Googling to work out what an entry is referencing.

For example, "Pa. Archives" is not a reference to the Pennsylvania State Archives, it's a reference to another series of books that includes primary records from early Pennsylvania called the Pennsylvania Archives - there should be a series number, a volume number, and a page number. The Pennsylvania Archives are also available online at various sites, Google Books, Archive.org, Ancestry, FamilySearch, Fold3, etc.

Another example is a source just called "Transcript" - this is a reference to the Boston Evening Transcript, a newspaper that ran a genealogy column from 1906-1941, including details on ancestors not exclusive to Boston or Massachusetts. Obviously, it's very much a secondary source, so I'd be careful with it, but it's available from Newspapers.com covering the years 1848-1914, and at American Ancestors covering 1911-1941 (select the Boston Evening Transcript from "Database").

You'll also see references to Revolutionary War Rolls and Pensions; those are fairly self-explanatory. There are also states listed with "Heads Fams", which refers to the names of the Heads of Families listed on the 1790 census. Since the 1790 census is already widely available online and probably already attached to your tree when appropriate, this isn't a very useful citation anymore. There are lots of other sources included in the AGBI, but usually they are self-explanatory, or you can find out what they mean with a little bit of Googling.

It's really important that you find the original record the AGBI is referencing, because the index is so vague that there's really no way to know from it whether it's for the person you're researching or not. You may often find that once you look up the original record, it's not actually a reference to the person you're researching after all. Researchers on Ancestry probably just attached it to their tree because the name and perhaps location and/or time period fit, without looking into it further, likely because they simply don't know how to find the original. But particularly with common names, you can't assume that means it's for the correct individual, and now you don't have to be one of those people.

Wednesday, January 5, 2022

How Far Back Can We Go?

How far back can we research our family tree? It's a question that comes up periodically, especially from beginners who are sometimes overwhelmed with finding other people's trees going back very far. In practice, the answer will vary greatly depending on your tree. One branch might dead end in the 18th century, another might go back to the 16th century, and another yet might link to royalty and date back to Charlemagne (8th-9th century). But how far back is it plausible or realistic? At what point exactly do all these trees that date back to ancient times, mythical figures, Adam and Eve, etc become impossible?

In general, the simplest answer for European research that has no known connection to royalty or nobility is that the 1500s is the end of the line. Like I say, not all your branches will likely even go back that far. Many times, the trail simply runs cold well before that point. For example, if you're American, you may never be able to find the specific origins of an immigrant ancestor. But if you're lucky, you may find a few branches here and there that go back to the 1500s.

Why the 1500s? Because that's when parish records began to be mandated in Europe. England was among the first to do so. In 1534, England separated from the Catholic Church and formed the Church of England, a Protestant church, all so that Henry VIII could divorce his wife and marry his mistress. A mere 4 years later, in 1538, England required that their brand new church begin keeping parish records of baptisms, marriages, and deaths/burials. Around the same time, in 1540, the Lutheran Church also started requiring that parish records be kept throughout their rapidly expanding churches in central Europe. In 1563, at the Council of Trent, the Catholic Church ordered that parishes keep baptism records, and not long after, other orders in various countries required records of marriages and deaths/burials. This meant that most churches in Catholic or Protestant Europe were expected to keep parish records from that point onward.

However, not all churches began adhering to these requirements right away. Many were slow to start keeping records, so depending on the location, you may not find parish records going back quite this far. In England, only 14.8% of parishes were keeping records by 1555, and that had risen to 54% by 1600. Most parishes in Italy didn't start keeping records until about 1595, but at the same time, a few Italian parishes (namely Palermo and Firenze) had taken it upon themselves to keep records long before the Catholic Church mandated it, sometimes going back as far as the 14th century! In France, general compliance wasn't until around the mid-1600s, and most Reformed churches were keeping records by 1650 as well. So in many places, you may only be able to go back to the 1600s.

Additionally, even when records were kept from this early on, not all have survived to today. Many were damaged, lost, or destroyed through natural disasters or war, or simply deteriorated over time. Some from the 16th century may have survived but there might be large gaps, making it impossible to connect the dots.

So, how far back parish records go, and whether they've survived to today or not, really depends on the specific location in Europe, but in general, it's safe to say the 1500s are the furthest plausible cut-off point. Unless a branch has genuine links to royalty or nobility (and there are a lot of false links out there, so be careful), or you're one of those rare exceptions with parish records going back to the 14th century, a tree extending beyond the 1500s is probably not accurate or reliable.

That doesn't mean every tree going back to the 1500s is reliable though, just that you would have to look more deeply to determine that. As I mentioned earlier, in some cases, your trail may dead end with your immigrant ancestor. If you can't find the specific origins of your immigrant European ancestor, it doesn't even matter how far back European parish records might go. And just because parish records may go back this far doesn't necessarily mean you can use them to reliably trace your lineage. Parish records are notoriously vague, often containing so little information that it's impossible to say for sure if the records you're looking at are for the person you're researching, especially when you only have access to an index and not the original documents (which is common for early parish records like this). All it takes is more than one person with the same name born around the same time and location to completely throw their identity into question. Or one ancestor moves across the country with no record of it, and you have no idea where to find them. Records can be so scarce, it's safe to say that if you're not descended from a somewhat notable lineage that was better documented, like wealthy land owners, merchants, or people holding some sort of position like a sheriff (not necessarily nobility), there's a very good chance you'll never be able to reliably research back as far as the 1500s, even if the parish records exist.

Now, I keep saying "unless you have a genuine connection to royalty or nobility". So what if you do? Despite the amount of false links out there to royalty, some of them are genuine, and in those cases, it is possible (likely, even) to go back much further than the 1500s. Most royal and noble lines are well documented even before parish records were kept, because their titles were inherited, so documenting their lineages, especially male lineages, was very important. How far back they go depends entirely on the lineage, but many royal lines go back to Charlemagne, who ruled much of Western Europe in the late 8th century and early 9th century. Charlemagne's ancestry has also been traced back to his 5th great grandfather, a 6th century nobleman named Ansbert, whose wife, Blithilde (or alternate spellings), has been claimed to be the daughter of Chlothar I, but this is highly debatable. Ansbert is generally considered the end of the royal line, and not all lines will go back that far.

As you can see, even royal lines only go back to about the 6th century at the most, so proving European descent from BC is just not possible. There are many theories out there, but none are proven. Any tree that goes back to BC is highly speculative at best. That's not to say the family trees of people who lived in ancient history can not be traced within Antiquity, just that there's a known genealogical gap in between Antiquity and the Middle Ages.

Sources:

FamilySearch Wikis:

Medieval Lands - A prosopography of medieval European noble and royal families (using primary sources)
Descent from Antiquity
Parish register

Thursday, December 30, 2021

Antenati's New Site Design - is it Actually Better?

Not too long ago, the Italian Archives website, Antenati, or "Ancestors Portal" got a face lift. At first, everyone raved about what an improvement it was, and admittedly, the ability to find and navigate to the records you're looking for has been a great improvement. Unfortunately, it has come at the cost of the Archives no longer supplying an inherent way to download full resolution images, which means we can't save copies of the records for our personal reference. We can take a screenshot, but to get the whole document, it will be too small to read. And if we zoom in to take a screenshot, we won't get the full document.

There is a way around this - but it's basically a hack, and who knows if it will remain available forever. It's also complicated and includes several steps that involve digging through raw code. But if you're brave enough, here's how to do it:

Step 1: Navigate to the image you wish to download, and click the icon with 3 horizontal lines located in the upper left corner of the image viewer window (see screenshot below, the icon is highlighted in yellow).

Step 2: This will bring up a side bar on the left with information. Note the page number listed here (highlighted in yellow in screenshot below), because you'll need that later.

Step 3: Scroll down the side bar to the bottom where you'll see a link just below where it says "IIIF manifest". Click the link (highlighted in yellow in the screenshot below).

Step 4: Here's where it gets tricky. The link opens a page with a bunch of code (technically JSON, not HTML). Different browsers seem to display it differently - if you're lucky, it will be organized with nested lines and different colors, making it easier to find what you're looking for, and the URLs will be clickable links. If you're unlucky like me, you'll see a big long block of text/coding with no links, no colors (shown below). What you're looking for first is the page number you took note of in step 2. In the code, it will say "label":"pag. 31" (or whichever page number you're looking for). If you're having trouble finding it, you can use your browser's "Find" or "Find in Page" option to search for it (the screenshot below shows the page number 31 highlighted because I searched for it).

Step 5: Look just above your page number in the code for a URL that looks like this: https://iiif-antenati.san.beniculturali.it/iiif/2/wrZgxjz/full/full/0/default.jpg (URL is highlighted in grey and shows relation to the page number in the screenshot below) - the part that says "wrZgxjz" in my URL will be different for you. That's okay, that's what you want. That's the specific image code you're looking for. Copy and paste the whole URL (or click on it if it's clickable) into a new browser tab.

Step 6: If you're unlucky like me and the URL you copied and pasted includes duplicate slashes so you're getting a "Page not found" result, remove the duplicate slashes. The URL should look like this: https://iiif-antenati.san.beniculturali.it/iiif/2/wrZgxjz/full/full/0/default.jpg, not like this: https:\/\/iiif-antenati.san.beniculturali.it\/iiif\/2\/wrZgxjz\/full\/full\/0\/default.jpg or like this: https://iiif-antenati.san.beniculturali.it//iiif//2//wrZgxjz//full//full//0//default.jpg. If it's annoying to delete all those extra slashes every time, you can always just bookmark the proper URL and then just copy and paste the image code into the URL.

Step 7: Once you get the correct image to load, you can right click it and save the full resolution image.
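For anyone who'd rather not hunt through the code by eye, steps 4 to 6 can be roughly sketched in Python. Treat this as a sketch under assumptions, not a tested downloader: I haven't confirmed the exact layout of Antenati's manifest, so the code simply looks for whichever JSON object carries the page label and collects any URL inside it ending in /full/full/0/default.jpg. The function names and the tiny sample manifest are invented for illustration, and you would paste in the real manifest text from the "IIIF manifest" link yourself.

```python
import json
import re

IMAGE_SUFFIX = "/full/full/0/default.jpg"


def clean_iiif_url(raw_url):
    """Fix escaped (backslash-slash) and doubled slashes, as in step 6."""
    url = raw_url.replace("\\/", "/")
    scheme, separator, rest = url.partition("://")
    if not separator:
        return url
    # Keep the double slash after "https:" but collapse any others.
    return scheme + "://" + re.sub(r"/{2,}", "/", rest)


def find_image_urls(manifest_text, page_label):
    """Return full-size image URLs found inside the object labelled page_label.

    Assumption: the URL sits inside the same JSON object as the label.
    """
    manifest = json.loads(manifest_text)  # JSON parsing already undoes \/ escapes
    found = []

    def collect_urls(node):
        if isinstance(node, dict):
            for value in node.values():
                collect_urls(value)
        elif isinstance(node, list):
            for value in node:
                collect_urls(value)
        elif isinstance(node, str) and node.endswith(IMAGE_SUFFIX):
            url = clean_iiif_url(node)
            if url not in found:
                found.append(url)

    def walk(node):
        if isinstance(node, dict):
            if node.get("label") == page_label:
                collect_urls(node)
                return
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(manifest)
    return found


# Made-up, heavily shortened stand-in for a real manifest.
sample_manifest = r'''
{"sequences": [{"canvases": [
  {"images": [{"resource": {"@id":
    "https:\/\/iiif-antenati.san.beniculturali.it\/iiif\/2\/wrZgxjz\/full\/full\/0\/default.jpg"}}],
   "label": "pag. 31"}
]}]}
'''

print(find_image_urls(sample_manifest, "pag. 31"))
```

The clean_iiif_url helper is useful on its own too: paste in a URL copied from the page and it strips out the extra slashes described in step 6, so you can open the result directly in a new tab.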

Although the new site might be faster and easier to navigate, the inability to save crucial documents (which you'd think was the entire purpose of the site) is a huge step backwards. This hack is cumbersome, but for now, it's the only option. Good luck.

Sunday, September 19, 2021

How to Group Your DNA Matches to Help Break Down Brick Walls

How do you break down a brick wall with DNA? It's what everyone wants to know - after all, what is the point of getting a DNA test if the ethnicity report is unreliable? Everyone says the true value of the test is in your DNA matches, but how do you utilize them to actually be useful in your research? To break down brick walls? To do what paper research couldn't?

This sort of ties in with my instructions on how to find unknown biological ancestors with DNA, though that was targeted more at NPE or adoption situations. However, the same basic process and workflow can be applied to breaking down brick walls. In the past, I've detailed specific cases where I've used DNA to break down a brick wall, but some of them are a little unique - every situation might be a little different, and therefore might require a bit of a different process. But here's the basics.

In my post about finding unknown biological ancestors, in Step 1, it says, "Look for your closest DNA match that you can't identify as being from another known branch of your tree."

But wait - how do we even get to the point of finding a match you can't identify? You do that by identifying and grouping as many matches as you can. This is how my workflow goes. It works best for me, and your mileage may vary, but in my experience, this is how most people do it in some way or form. Some maybe use a spreadsheet and the "Leeds Method", but ultimately, it's just a matter of grouping your matches by what branch of your tree they belong to, and since AncestryDNA has a built-in grouping tool, I find that works best for me.

Grouping your matches.

Step 1: Create a group for each "branch" of your tree. Which branches? I recommend a group for each of your sixteen 2nd great grandparents, unless any of those 2nd great grandparents were from the same specific location, or endogamous population, because they will be difficult to tell apart. For example, my 2nd great grandparents who both came from the same tiny town in Italy called Monteroduni got grouped together because I have no other branches from there, and since the town is so endogamous, it would be difficult to always tell them apart. So I just have one group for "Monteroduni". Don't group by broader locations, like country. I did that by grouping my other 2nd great grandparents together because they were both from Norway, but now I regret that because they came from totally different parts of Norway, so there's no endogamy between them. So although I recommend a group for each 2nd great grandparent, depending on your ancestry, you may want to sometimes group them differently.

Having 16 groups does mean that you will fill up a lot of your available groups. AncestryDNA only allows you a maximum of 24, so you will only have 8 groups left to do with whatever you want. So like I say, you may want to group them differently, but this is what worked best for me.

Step 2: Start at the top of your match list and work your way down. Do you recognize your top match? Or can you see from their tree (if they have one) what ancestor you share? Is there a ThruLines/common ancestor hint for them that you can verify? If you already know the match or can identify how you're related to them, mark the branch you share by adding them to a group you've created for that branch. Do not assume a shared surname alone is the source of your shared DNA; it must be an actual common ancestor.

You may also want to add a note of your common ancestors, so you can see who they are more easily, and also so you know there are identified common ancestors (though I also have a group for MRCA - matches that have an identified most recent common ancestor).

My top matches are all my Italian cousins; you can see how I've grouped them and added our MRCA to notes.

Step 3: Do the same for the next match, and the next - keep going until you can't identify a match. When that happens, look at your Shared Matches with that person. Are any of them the people you've already identified with a common ancestor? If so, they are likely also from the same branch (especially if there's more than one match they share from the same ancestor/branch), so add them to that same group.

I don't know my MRCA with Bettye because she hasn't added a tree, but I can tell she's from my Smith branch because she matches several people who are confirmed Smith descendants.

If they have a tree, even a tiny one, build on it until you can find the connection to the branch you know they are likely from (focus on lines that come from the same/nearby location). If you can't find a common ancestor, that's okay, leave them in that group and you can come back to them another time.

Step 4: Keep doing this, ideally for all your estimated 4th cousins and closer (20+ cM). That's a lot, I know (I currently have 1,048 matches that share 20+ cM with me). It takes time, it's a lot of work, but in the end you'll wind up with 3 types of matches: those with identified common ancestors, those who likely come from an identified branch, and those where you have no clue how you're related, not even a potential branch.
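If you keep your own notes outside of AncestryDNA (say, a list you've typed up by hand), the logic of Step 3 can be sketched in a few lines of Python. Everything here is hypothetical: the function name suggest_branch, the way the data is laid out, and the cousin names are all invented for illustration, and the rule of requiring at least two confirmed shared matches is just my reading of "especially if there's more than one match they share from the same branch".

```python
from collections import Counter


def suggest_branch(match_name, shared_matches, confirmed_branches, min_shared=2):
    """Suggest a likely branch for a match you can't identify yet.

    shared_matches: dict of match name -> set of names on your Shared Matches list
    confirmed_branches: dict of already-identified match name -> branch label
    Returns a branch label, or None when the evidence is too thin or is tied.
    """
    votes = Counter(
        confirmed_branches[name]
        for name in shared_matches.get(match_name, set())
        if name in confirmed_branches
    )
    ranked = votes.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # two branches tie, so don't guess
    branch, count = ranked[0]
    return branch if count >= min_shared else None


# Hypothetical notes: three confirmed Smith cousins, one confirmed Mills cousin.
confirmed_branches = {
    "Cousin A": "Smith",
    "Cousin B": "Smith",
    "Cousin C": "Smith",
    "Cousin D": "Mills",
}
shared_matches = {
    "Bettye": {"Cousin A", "Cousin B", "Cousin C"},
    "Mystery match": {"Cousin D"},
}

print(suggest_branch("Bettye", shared_matches, confirmed_branches))
print(suggest_branch("Mystery match", shared_matches, confirmed_branches))
```

A match like "Mystery match", with only one confirmed shared match, is deliberately left ungrouped here. That's the cautious choice, and you can lower min_shared to 1 if you'd rather have a tentative group to come back to.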

What to do with these groups?

This is where there will be some overlap in my instructions on finding an unknown biological ancestor. Look at the closest match that you haven't even been able to group into a certain likely branch (or a common ancestor). Even if they don't have a tree, that's okay - look at your Shared Matches with them and open any match that has a viable tree. Compare the trees - do any of them share an ancestor with each other that you don't recognize? If so, research that ancestor and build a tree for them, you may find it links up with yours somehow, maybe even by breaking down a brick wall, or that it leads to an NPE - when someone's parent(s) is/are not their biological parent(s).

Additionally, you can look at your closest match that you haven't identified a common ancestor with, but you have grouped them into a likely branch. If they have a tree, again, build on it, and keep researching until you can find a connection. See my case example of Emma Elizabeth Sherwood.

This method of grouping your matches to single out the ones you can't identify at all can help lead you to some enlightening revelations, but they tend to be rather random. You don't know what you're going to find, you don't know which brick wall it might break down. Even the matches you can group into a likely branch but you're still searching for the common ancestor might surprise you - in my example of Emma Elizabeth Sherwood (above), I knew the match was related to my Mills branch (Emma's husband), but I had no idea it would finally break down the Sherwood brick wall that had been blocking me for 12 years.

Other methods.

There are other methods of breaking down a brick wall with DNA, ones that are more targeted for a specific brick wall, but they heavily rely on the surname you're looking for not being a very common one. You basically just search your matches' trees for the surname you're looking for, and then compare the trees of the matches in the results, looking for a common ancestor among them. It can work well when the name isn't common, because it's likely most of the matches in the results will be the ones you're looking for. But the more common the name is, the more matches there will be in the results that aren't related to the branch you're looking for. That's why this never worked with Emma Elizabeth Sherwood (in my above example): Sherwood was too common a surname. I only found her family by using the more random grouping method and not knowing where an unknown match would lead me.

The surname search method would be much more effective if AncestryDNA would offer a very simple feature: the option to search for a surname within a specific location. At the moment, you can search for a surname or location, but not a surname in a location. So you can search for Smith OR Christian County, Kentucky, and you can search for them both at the same time, but it will include results for matches' trees that have either the surname Smith, OR the location Christian County, Kentucky. And even if the tree includes both, it's not necessarily for the same branch or ancestor. It might be their Jones branch that's from Christian County, Kentucky, while their Smith branch is from Pennsylvania. For common surnames, we need a way to narrow it down, and the best way to do that is by looking for surnames within a specific location. At the moment, we can only do that manually by searching for a surname, and going through each match in the results to see for ourselves if that branch is from the right location. If so, then we can look for a specific common ancestor. It's very time consuming, and the more common the surname is, the less realistic it is to go through all those matches manually, yet there's a very simple way to make it easier, if AncestryDNA would just listen to their customers.
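To show the difference between the two kinds of search, here is a minimal sketch. It assumes you have copied people from your matches' trees into a small list yourself; the `TreePerson` structure and both function names are hypothetical, and nothing here reflects how AncestryDNA's search actually works internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreePerson:
    match: str      # which DNA match's tree this person is in
    surname: str
    places: tuple   # locations attached to this person


def _has(person, surname, place):
    """Return (surname matches, place matches) for one person."""
    surname_ok = person.surname.lower() == surname.lower()
    place_ok = place.lower() in (p.lower() for p in person.places)
    return surname_ok, place_ok


def surname_or_location(people, surname, place):
    """Search as described above: a tree qualifies if any person
    has the surname OR any person has the location."""
    return sorted({p.match for p in people if any(_has(p, surname, place))})


def surname_in_location(people, surname, place):
    """The wished-for search: the SAME person must have both the
    surname and the location."""
    return sorted({p.match for p in people if all(_has(p, surname, place))})


# Made-up trees: Match A has a Pennsylvania Smith branch and a Kentucky
# Jones branch, while Match B has a Smith from Christian County.
people = [
    TreePerson("Match A", "Smith", ("Pennsylvania",)),
    TreePerson("Match A", "Jones", ("Christian County, Kentucky",)),
    TreePerson("Match B", "Smith", ("Christian County, Kentucky",)),
    TreePerson("Match C", "Brown", ("Ohio",)),
]

print(surname_or_location(people, "Smith", "Christian County, Kentucky"))
print(surname_in_location(people, "Smith", "Christian County, Kentucky"))
```

The first search returns both Match A and Match B, even though Match A's Smiths were never in Kentucky. The second returns only Match B, which is the kind of narrowing that currently has to be done by hand.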

The surname search works a lot better if it's not a common surname. I successfully used this method with the surname Deaves, and also a suspected maiden name of Brannin.

You can also search by just location, but this only really works if your ancestors are from a very small, unique town, especially where there's endogamy. In my above example about my 2nd great grandparents who came from a tiny Italian town called Monteroduni, it's safe to say that the town is so small and endogamous that anyone who has ancestry from Monteroduni is probably related. Certainly, for any DNA match of mine that has ancestry from Monteroduni, it's safe to say that's very probably how we are related. So I can very easily search my matches' trees for the location of Monteroduni and even if I can't find a common ancestor between us, that's most likely where our common ancestors were from. Brick walls are difficult with endogamy though, so that might be the most I'll ever be able to determine. Searching by location may not break down any brick walls in your tree, but it does help you identify and sort your matches into groups/branches, which can help you find other unknown matches that may lead to a brick wall.

Like I say, sometimes breaking down a brick wall with DNA can be unique to the situation. Sometimes you have to think about what you're looking for, and consider the best way to come at the problem. But this should give you the basics to get you started. Feel free to share your success stories!

Friday, January 8, 2021

Small but Significant Changes at Ancestry.com

Ancestry is rolling out some new tweaks to their website that have everyone in a tizzy, and I don't really know why, because in many ways they seem like an improvement to me.

One of the changes was the removal of the clickable alphabet at the top of our List of All People. It allowed us to jump to surnames that start with any letter by clicking on the letter. I know the removal of this seems like a negative, but it's really not. What remains are two name search fields, one for first name, the other for last name. They were always there, and they always offered the ability to do what the alphabet list offered too, which I imagine is why the alphabet was removed - there is no point in having two different tools that do the exact same thing. You can still jump to surnames starting with any letter by simply putting that letter into the surname search field. But even better than that, the search fields offer way more versatility than the alphabet did, because you can also do the same for the first name field (shown above), and you can use more than one letter, so you can quickly bring up all "Mc" or "Mac" surnames, for example (shown below). I believe this was always an option, a lot of people just apparently didn't realize it.

The bigger change is in hints: clicking on a hint now brings up a side bar to preview the hint (shown below) instead of loading a whole new page. People are complaining that it requires more clicks to confirm and attach the hint now, but that's just not true. In the past, you clicked on the hint, which loaded a new page where you could review the record, then you clicked "yes" to the hint being correct, which loaded the page that allowed you to edit the data you were adding, then you clicked "save to your tree" and you were done. That's 3 clicks.

It's the same now. You click on the hint, but instead of it loading a new page, the side bar pops up. Instantly, I felt this was an improvement because you haven't left the person's profile page, so you can still fully view and compare all their data, and sources, etc. You can click on Facts, Gallery, etc and the side bar remains up, allowing for a full comparison (shown below). In the past, the only way to do this was to right click the hint and open it in a new tab, which you can still do, but now there's no need. I always did this, because I generally want to refer back to the profile while checking a hint. Now, finally, I don't have to open a new tab, which is going to make my workflow much more efficient.

Even if you didn't open the hint in a new tab in the past like me, the number of clicks is still the same. After clicking on the hint and the side bar popping up, you click "yes" to the hint being correct and it loads the page where you can edit the data you're adding, and then you click "save to your tree" just like before. That's 3 clicks.

So I'm really not sure what the fuss is all about. The changes either won't slow your workflow, or they will actually improve it. Give it a chance, you might find it works better.

Friday, November 27, 2020

Which DNA Company Should I Test With?

I did a guide for this a few years ago, but it's already kind of out of date, so let's look over the options again, especially since all the holiday sales are starting to happen. The main question when asking which DNA test/company to go with, is what are your reasons for testing? Instead of detailing each company, I'm going to answer the four main reasons people want to take a DNA test:

1. I'm a genealogy hobbyist and want to use DNA as an additional research tool.

AncestryDNA have the biggest database of testers, and because they are a genealogy website, they are the most likely to have DNA matches with family trees (which is the best way to get the most usage out of your DNA matches). Particularly, if you already subscribe there or have a tree there, it's easiest to have all your work in one place, including DNA. Even if you don't have an Ancestry.com subscription, you'll still benefit from testing at the biggest autosomal DNA database (you will be able to contact your DNA matches even without a subscription, and you can add a tree for free too).

Additionally, because AncestryDNA don't accept raw DNA data from other companies, but other companies (like MyHeritage and FamilyTreeDNA) do accept raw DNA data from AncestryDNA, it's ideal to test with AncestryDNA and then upload your raw DNA data to sites like MyHeritage and FamilyTreeDNA (they have free uploads, but there's a small fee to unlock your full results). You'll get the most out of your money this way, and have access to several databases.

MyHeritage are best for foreign DNA matches, particularly from certain places where MyHeritage is popular (for example, I have lots of DNA matches living in Germany, but only a few from Italy, despite having more recent ancestry from Italy). They also make it easy to find/sort by foreign matches, whereas other companies don't. You may choose to test with MyHeritage for this reason, especially if you already have a subscription/tree there, but again, be aware that you can upload an AncestryDNA test to MyHeritage, but not vice versa. (Right: a screenshot of my number of matches from various countries at MyHeritage).

23andMe are not ideal for genealogy, since they don't host shareable family trees, and they are not a genealogy website. They also cap your DNA match list at about 1,500 people (in comparison, most people at AncestryDNA get about 20,000+ DNA matches), unless you upgrade to a monthly subscription which still only expands it to 4,000 matches (the subscription also includes some additional health report benefits). Some people might cite 23andMe's inclusion of haplogroups in their reports as a reason to test there, but haplogroups generally aren't useful to recent genealogy. Sharing a haplogroup usually just means sharing a most recent common ancestor (on the patrilineal or matrilineal lines) from thousands of years ago, which long pre-dates recorded genealogy.

FamilyTreeDNA do allow you to upload a gedcom, but their database is small and since you can upload your raw DNA data, it makes more sense to test elsewhere and then upload to FTDNA if desired.

2. I want health reports.

23andMe are best for health results. They have the most useful health reports, and while other companies like AncestryDNA and MyHeritage have added a few "traits" or health reports, they are very minimal and not as useful or extensive as 23andMe's. (Right: an example of 23andMe's Health Predisposition report - their health reports also include Carrier Status, Wellness reports, Traits, etc).

Whatever company you test with, uploading to Promethease.org for a small fee will provide the most extensive health reports, though it is not super user friendly (and they do not offer testing, it's strictly an upload site). If you're willing to deal with the learning curve, testing at AncestryDNA and uploading to Promethease is a good option for those who want the test for both genealogy and health reasons. Otherwise, you'll have to prioritize one over the other because there's no testing company that's ideal for both.

Also be aware that if you have a specific health report in mind, you might want to consider a test more specific to it. For example, for reports on your genetic predisposition of cancer, I would recommend a more comprehensive test like Color.

3. I'm looking for an unknown biological parent/relative (like in the case of adoption).

First test with AncestryDNA, since they have the biggest database of testers and host family trees. Then upload your raw DNA data to MyHeritage and FamilyTreeDNA for small fees to unlock your full results. You can also upload to Gedmatch for free (but Gedmatch isn't a testing company, just a place to upload, so I won't mention them much in this article).

If your budget allows, also test at 23andMe (because like AncestryDNA, they do not accept uploads, so you have to test with them to be on their database). Although they aren't ideal for genealogy, which may make it difficult to make use of your DNA matches, when looking for unknown biological relatives, you want to maximize your chances of finding the closest DNA relative possible, and that means putting yourself on every database available.

If you are male, and looking for a biological father, or paternal grandfather, you should also consider taking a Y-DNA test at FamilyTreeDNA. It is more expensive than an autosomal DNA test, and there's no assurance that Y-DNA results will be useful because it depends on who else has tested, but when it is useful, it can really help, especially in combination with your autosomal DNA matches. Because Y-DNA follows the patrilineal line, it's essentially linked to biological surnames. So excluding other NPEs (non-paternity events) or Y matches whose most recent common ancestor pre-dates the development of surnames, your Y matches' surname should theoretically tell you your biological surname. That doesn't always happen, because again, it depends who has tested. But when it does, you can then take that surname and search your autosomal DNA matches' trees for it, which should then point you to a most recent common ancestor.

4. I want to learn more about my ethnic ancestry!

I would strongly discourage taking the test purely for the ethnicity percentages. I know they have great appeal, I know they seem like a quick, easy, and not too expensive way to learn more about your ancestral background, but the fact is, and I can't stress this enough, they are only estimates or interpretations of your DNA and are not particularly reliable. Different companies will likely give you different results, and every company periodically updates their ethnicity reports, which generally changes them, sometimes quite drastically. There is no one company that has the most reliable ethnicity percentages for everyone - which one is more consistent with your personal family tree really depends on the individual, and that could always change with the company's next update.

That said, there are elements of the ethnicity report that can be more reliable. On a continental level (European vs Sub-Saharan vs East Asian vs Native American, etc), the percentages are generally much more reliable, so if you're of mixed race, the report might be enlightening. But the more specific the regional or sub-continental percentage breakdown is, the more speculative it becomes, with only some exceptions in populations with high levels of endogamy (like Ashkenazi Jewish, or certain islander populations). So while it may be tempting to go with the company that breaks the percentages down into the most specific nations, keep in mind that this will likely make it less reliable.

Ethnicity percentages are fun to explore, but you can't take them very literally. It's better to view them on a broader scale, covering bigger areas, but of course that's not what most people want. 23andMe's percentages have categories like "Broadly Northwest European" which covers a large area, and therefore is more reliable, but then people complain it's not specific enough.

You may notice I keep specifying ethnicity percentages, or percentage breakdown. That's because some companies offer sub-regional reports that don't include percentages because they are calculated a different way. At AncestryDNA, they are called Genetic Communities, and unlike the percentages, positive results in Genetic Communities tend to be very specific to small areas, and highly accurate. Not getting results in a GC doesn't mean you don't have ancestry there though, you generally need significant ancestry from a specific area to get results in a GC. When you do get GC results, you can be 99% sure you have ancestry from that area, you just won't know how much because there's no percentage. 23andMe have similar sub-regional results with no percentages, but in my experience, they are not as reliable as AncestryDNA's Genetic Communities.

Conclusion

In short, here are my recommendations:

For genealogy - AncestryDNA

For foreign matches - MyHeritage (or test at AncestryDNA and upload to MyHeritage for the best value).

For health reports - 23andMe

For unknown biological family - AncestryDNA, plus uploading to other companies, and if budget allows, also testing at 23andMe.

For ethnicity - if this is your only reason for testing, please reconsider. If you really insist, then I'd recommend either AncestryDNA or 23andMe, for the same reasons I've detailed above: you can upload raw DNA data from AncestryDNA and 23andMe to MyHeritage and FamilyTreeDNA (for additional ethnicity results), but not vice versa. If your interests lean more towards health, go with 23andMe. If you think you may develop an interest in genealogy or family history at any point in the future, go with AncestryDNA.

Thursday, October 22, 2020

23andMe: Worse and Worse

It's never been a secret that I feel 23andMe is the worst DNA option of the 4 main companies when it comes to using it for genealogical purposes. While they do seem to still have the most reliable ethnicity percentages, and they offer the easiest way to get health reports that may actually be useful, when it comes to using our DNA matches for genealogy research, 23andMe are an epic fail, and over the years it has just become worse and worse. Between not hosting family trees/gedcom uploads, and capping our match list more and more, it's hardly surprising I've gotten very little use out of it and now it's only gotten worse.

Years ago, back when I originally tested, they hosted uploaded gedcoms (family trees). Anyone who has done DNA based tree research knows this is essential to getting use out of your DNA matches. But not long after, 23andMe obviously decided this was a waste of their server space, but they at least attempted to provide an alternative. They did a deal with MyHeritage (long before MyHeritage got involved in DNA themselves), where gedcoms at 23andMe could be moved to MyHeritage, and a link to your MyHeritage tree would automatically appear in your 23andMe profile. Unfortunately, this didn't last long because at MyHeritage, you have to subscribe to view other people's trees, and probably a lot of 23andMe users weren't going to subscribe just for that reason. So it quickly became apparent that this was rather useless for most people. And of course, MyHeritage eventually began to sell their own DNA test, so they didn't want to be associated with any other DNA company at that point.

For a while, 23andMe simply didn't host any trees at all. They did offer a spot in your profile to paste a link to an off-site tree. But most people didn't bother, and just like at MyHeritage, viewing trees at Ancestry.com also requires a subscription (though they now have a sharing option, they didn't at the time). So unless your tree was available somewhere for free, this was still useless, which is why most people didn't bother. It seemed like 23andMe had abandoned any pretense they ever had at being genealogically useful.

Recently, they did trial an option where you could link your FamilySearch tree to your 23andMe account. This finally seemed like a great solution - it's free, and it's integrated, not just a link to an off-site tree, but something you could view at 23andMe. Sadly, not many people participated in the beta trial, and after months of beta testing, instead of officially adding it as a feature, it disappeared without a word from the company (something that happens a lot). I don't know if it's because not many people tested it out so they thought it wouldn't get used, or if it was something else, but one day it was just gone, so once again we're left with nothing.

Granted, they have recently added a tree feature that lets you add your ancestors and DNA matches to it, which helps visualize how you are related to some of your closest matches. But it only goes back to 2nd great grandparents (3rd cousins), and more importantly, this is for your own private usage only, no one else can see it. If no one else can see it, no one else can make any use of your tree for genealogical purposes. So this is not really what we actually need.

I did also notice they are advertising a "free quote for a genetic genealogy research package offered by Legacy Tree" which I assume includes a family tree. But not only does that cost a lot of money, it's totally unnecessary if you've already built your own tree. And even if you have a tree built at Legacy Tree, it's not integrated into 23andMe.

If that's not disappointing enough, let's talk about our match list, called "DNA Relatives". 23andMe has always capped our match list. At one point, it was capped at 1,000, then they upped it to 2,000, which was great. And more than that, they offered ways to search for and find other people you shared DNA with, that you could connect with and add to your match list. But over time, they gradually removed those features, making it harder and harder to expand your match list. Of course, your match list still expanded as more people tested - it's not like people got bumped off the end of the list as new ones came in. Apparently, 23andMe have decided that these essential matches are taking up too much server space and have quietly reduced our match list to just 1,500 people.

In comparison, I have over 22,000 matches at AncestryDNA, and that's not just because more people have tested there, it's because AncestryDNA's matching threshold is 8 cM. At 23andMe, capping my list at 1,500 people (actually 1,454 for me, whereas previously I had over 1,800) means my most distant matches share 20 cM with me. I regularly point this out, but shared segments of 15+ cM have a 100% chance of being identical by descent. That means 23andMe are excluding thousands and thousands of matches that have a 100% chance of being identical by descent. It's always been a real bummer, and in some ways I'm not sure that losing a mere 400-500 matches is that big of a deal since I never got much use out of 23andMe's matches anyway, thanks to their lack of hosting shareable trees/gedcoms. But here's the worst part about the new changes at 23andMe...

They are offering an option to expand your match list to 4,500... great, right?! Except it's going to cost you. Firstly, if you haven't tested on the V5 chip and/or haven't paid to include Health reports, you'll have to upgrade your test. The expanded service only applies to people with an Ancestry+Health V5 test (because it includes extra health reports too, not just the extended match list, and that requires the raw data in the V5 chip). If you tested previously on an old chip, you can upgrade to V5 Ancestry+Health for $99 (normally $199). If you're already on V5 but don't have Health reports, the upgrade to Health will cost $125.

And on top of that, you will have to pay a yearly subscription of $29. While that is not a huge amount of money, no other DNA company requires a subscription to access extra DNA matches. Especially when you consider that even the expanded match list you have to pay extra for is only a small fraction of what you'd get at AncestryDNA for no extra cost, this offer seems of poor value, unless of course you're actually after the extra health options that come with it, that AncestryDNA doesn't even offer.

What that tells us, is that just like always, 23andMe are really more about the health and ethnicity side of DNA testing, whereas AncestryDNA are geared more towards genealogy. That's not surprising, since Ancestry.com are, after all, a genealogy website, whereas 23andMe are not. But it still means that for us genealogists, 23andMe is not the ideal company to test with.

For more info, see 23andMe's page on their "23andMe+ Experience".

Friday, September 4, 2020

AncestryDNA's Inconsistent cM Totals

Edit: See bottom of article for update.

For several years now, because both of my parents took the DNA test, I have noticed certain DNA matches who share more DNA with me than with one of my parents (usually my mom) and none with the other. In most cases, it's only a difference of less than about 5 cM, which is usually small enough that I figure it's nominal and doesn't matter. But I also have many matches where the difference is 10 cM or greater, which is harder to ignore. The greatest difference I've come across so far has been 20 cM. And I know I'm not the only one, I've talked to a lot of other people who have noticed the same.

Recently, AncestryDNA added to the very little amount of DNA matching data they provide, the ability to see the longest shared segment with a match. This has been enlightening, because as many people have already noticed, there are some cases where the longest shared segment is greater than the total amount of shared DNA. Naturally, this isn't genetically possible, and it's left many people confused. AncestryDNA tried to provide an explanation for it:

"In some cases, the length of the longest shared segment is greater than the total length of shared DNA. This is because we adjust the length of shared DNA to reflect DNA that is most likely shared from a recent ancestor. Sometimes, DNA can be shared for reasons other than recent ancestry, such as when two people share the same ethnicity or are from the same regions."

They are trying to keep it simple, but unfortunately I think it serves only to confuse most people even more. Here's what this means.

AncestryDNA have a program called Timber that removes shared segments it believes are not identical by descent (ie, the shared DNA is not coming from an ancestor within a genealogical time frame, but rather from a shared ethnic background). What AncestryDNA's explanation is saying is that they are applying Timber to the total shared DNA, but not to the longest segment. This explains the reason for the inconsistency between the totals and the longest segment, but not the logic or reasoning behind the bizarre choice to apply it to one and not the other. If you find this frustrating, you're not the only one.

What does this mean for the inconsistent shared totals with a match between parent and child? Well, I've noticed that often, when the totals are inconsistent, so is the total and the longest segment, and this tells me the same Timber action that's removing segments from the totals but not the longest segment is probably what is causing the inconsistent totals between parents and children.

Take for example, this DNA match "RB":

RB shares 39 cM across 2 segments with me, longest segment 47 cM
RB shares 19 cM across 2 segments with my mom, longest segment 47 cM

So, my mom and I both actually share one 47 cM segment with RB, but Timber has removed a chunk in the middle of that (making 2 smaller segments). Generally, that's not necessarily a bad thing if that chunk isn't identical by descent, but for some inexplicable reason, Timber took a larger chunk from the shared DNA with my mom than with me. That shouldn't be happening, because it's the same segment, so it should be removing the same amount from each. Instead, it's taking the same shared 47 cM segment and removing 28 cM from one person but only 8 cM from the other, and that doesn't make sense, and doesn't exactly instill much confidence in Timber and its reliability.
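The arithmetic behind those numbers is simple enough to check in a few lines. This is just a worked example using the RB figures above, and it assumes (as I do in this case) that the longest segment is the only DNA shared with the match before Timber is applied. The function name is my own, not anything from AncestryDNA.

```python
def removed_by_timber(longest_segment_cm, weighted_total_cm):
    """Estimate how many cM Timber removed, ASSUMING the longest segment
    is the only segment shared before Timber was applied."""
    return max(0, longest_segment_cm - weighted_total_cm)


longest = 47      # longest shared segment with RB, same for both of us
me_total = 39     # my total shared with RB after Timber
mom_total = 19    # my mom's total shared with RB after Timber

removed_me = removed_by_timber(longest, me_total)
removed_mom = removed_by_timber(longest, mom_total)

print("Removed from my match:", removed_me, "cM")
print("Removed from my mom's match:", removed_mom, "cM")
print("Difference:", removed_mom - removed_me, "cM")
```

If a match shares more than one segment before Timber is applied, this subtraction will underestimate what was removed, so treat it as a rough check only.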

My theory on why this is happening is that it may have to do with endogamy. Most of the matches I've noticed with this problem are on my mom's side, particularly from endogamous branches. Granted, my dad has some endogamous branches too, but my mom has a fairly recent Mennonite branch, who are highly endogamous, and many of these matches are from that branch. I don't know whether endogamy is maybe messing with Timber, or Timber is trying to remove endogamous segments, but whatever it's doing, it shouldn't be doing it so inconsistently, and frankly, I can't believe this issue has gone on for so long unresolved (except it's Ancestry, so I can believe it).

Edit (24 Sep 2020): Recently, AncestryDNA added to the DNA data they provide the "unweighted shared DNA" total - which is the amount of DNA you share with a match before Timber is applied. You can find it by clicking on either the longest segment data or the shared total for more information. This means the inconsistencies between the total and the longest segment make more sense, and so do the matches where I share more than my parent does, but I fear it's only going to cause more questions about what an unweighted total is, why there are two totals, why they are sometimes so drastically different, and which total do we rely on? Theoretically, we should be able to rely more on the weighted (Timber) total, but since I don't trust Timber, there is no easy answer to the last question.

But at least I can now see the original total with matches, which unsurprisingly is now much more consistent with the original total they share with my parent. There are a couple that still have a discrepancy of 6 cM or less, but that's somewhat nominal, I suppose.

Tuesday, February 18, 2020

FamilySearch's Unindexed Images

Recently, FamilySearch made an update to their website in an attempt to draw more attention to the wealth of unindexed records in their catalog, all available for free. The records available by using the search or even the collections list are a drop in the bucket compared to their vast catalog. You have always been able to access the catalog by clicking on "Search" and then "Catalog" from the drop down menu. Although it's readily available, it generally does not get used by people who don't know what it is or how to use it. Because the images are not indexed, you can't search them by name or other details, you have to manually browse the images. To find the right collection, you have to search by location, collection title or author, keyword, subject, or, if you know it, film number (because the catalog used to be for looking up film rolls you could order). It's usually best to search by location, but this also requires knowing what jurisdictional "level" records are held at. For example, probate records are usually held at county level, so if you're searching for probates in Bucks County, Pennsylvania, you have to search the location field for Bucks County, Pennsylvania. Looking under just Pennsylvania will not find collections cataloged at lower levels, like county or city.

FamilySearch Catalog

FamilySearch's answer to this was to create a new option under the "Search" menu at the top of the site called "Images". Here, they have tried to simplify a way to find unindexed collections by making the location search field the only option unless you click on "more" and again on "advanced", which allows you to also search by time period, record/collection type, film number, etc. But unfortunately, the results seem to be lacking a lot of existing collections and the ones it does include are organized in a very convoluted way.

In the catalog, if I search for Philadelphia, Pennsylvania, I get a list of record/collection types, which I can click on to see the individual collections and select any of them. Fairly straightforward. In the new "Images" search, I get a huge list of over 8,000 results, many of which seem to be from the same collection but for some inexplicable reason, are broken down into multiple results (it appears they are broken down by individual film roll number, even though the film number isn't included in the results list). This means, for example, there are dozens of listings of probate collections, sometimes even multiple listings for probates from the same year! How am I supposed to know which one to use? The screenshot below shows that if I'm looking for a Philadelphia probate record from 1913, there are multiple listings for it, and they aren't duplicates, they're different records. This is going to be far more confusing for people than the catalog ever was.

Of course, I can narrow down the results by using those more advanced search options, like adding a year and record type (1913, Probate), but that doesn't solve the problem of there being multiple results just for 1913 Philadelphia Probate records. In fact, there are 115 results! How on earth am I supposed to know which one to use? There is literally nothing distinguishing them from each other except sometimes the image count.

Maybe I just haven't gotten the hang of it yet, but so far, I haven't had any luck finding actual records or collections I know exist in the catalog with this new "Images" search option. As far as I can tell, it looks like they are not including collections that are only visible at a Family History Center or affiliate library, which is a huge portion of their catalog.

I do not understand the purpose or function of this new Images search. They now have 3 different ways to find records on their website (for some, it was confusing enough as it was to have 2 different ways), and none of them include their entire database of records. Honestly, I suggest you skip this and just use the catalog or search engine as usual.

Monday, February 17, 2020

More Colorizing

After trying MyHeritage's new colorizing tool and then giving colorizing myself a whirl in Photoshop, I finally managed to test out another automatic colorizing tool at ColouriseSG. At first, it didn't work, or maybe I just didn't wait long enough, but today it worked!

I used the same first image I did at MyHeritage, the one I then colored myself too. My first impression with just this one photo is that it's better than the one at MyHeritage, but the human touch is still best.

Although it still looks a little like they just added a sepia tone to it, I felt like the colors were a bit more realistic than MyHeritage's, and the eyes appeared less brown. They could arguably be gray.

They also made an attempt to add a touch of redness to the lips, but I'm not sure I love the effect; they look a little purplish.

Overall, the colorizing is better than MyHeritage's, but I was very disappointed by the fact that ColouriseSG made the photo I uploaded smaller and therefore lower quality. So if you use this tool, be prepared to sacrifice quality for coloring! For this reason, I decided to not even bother testing it with other images.

As always, if you want something done right, do it yourself.

Wednesday, February 12, 2020

MyHeritage's New Colorizing Photos Tool

Ever wanted to have your old black and white family photos colorized, but don't know how to do it yourself, and don't want to pay a professional an arm and a leg for it? Well, MyHeritage just launched a new free feature from DeOldify that will instantly colorize black and white photos. But how well does it work? I was a little skeptical and couldn't wait to test it out.

The photo I tested was just a simple portrait from about the 1880s. I was surprised how quickly it colorized, and I was pleased with how nice it looked, but I realized that it actually just looked like a sepia tone had been added to it. I don't think that was the intention, and the skin tones did have a more fleshy color, but everything else looked like it'd just been sepia toned. A little disappointing.

Additionally, you may not be able to see it very well, but this man's eyes were clearly light colored - blue, grey, hazel, etc. Something like that. But zooming in on his eyes shows the sepia/fleshy colors of the skin seem to have just been overlaid on his eyes, making them look brown, as if there was no attempt whatsoever to even color the eyes at all.

And while we're on the subject, they are beautiful eyes, aren't they? I've always thought this guy looks a bit like Leonardo DiCaprio.

To show you the difference between what a computer can do and what a human can do, here is my colorization of the same photo (including spot cleaning/restoration):

Back to MyHeritage. I then tried it with a group photo, thinking the multiple faces, garments, etc would add some variation to the possible colors. This was much more impressive:

Not bad for an automated system! Granted, the photo's highlights are a little blown out in places and some of the faces are blurred from too much movement, but the colorizing system handled it pretty well in spite of that.

What's even better is that this is a high resolution image I used. I was a little worried that such an advanced tool available for free would only accept low resolution images (maybe charging for high resolution), but this was a fairly high resolution image and it not only accepted it, it still only took a few seconds to generate a color version. Unfortunately, although it will accept high resolution images, there is a limit to how many photos you can colorize if you have a free account. They don't tell you this anywhere, but choose your photos carefully because you only get 10 of them, and deleting previous ones doesn't get you any more.

And the colorization still isn't perfect.

You may notice how it doesn't exactly take many risks or leaps with the colors it chooses. The men are in black suits, the women all seem to be in black and dark navy dresses, and the kids are all in white or neutral colors. You can probably understand why - I suppose they don't want a man's suit turning up bright red or something equally unrealistic for the era and gender. That's the downside to using a computer instead of a human who can distinguish these things and safely choose a greater variety of colors to apply.

Additionally, when I zoom in, there are areas that look like something almost resembling purple fringing, except not along high contrast edges. You can see these sorts of random purple splotches in the zoom-in below, particularly in her hair (pretty sure purple wasn't a trending hair color in 1880 Wisconsin) and sleeves. This is just a small area of the photo, but these purple spots turn up everywhere if you look closely enough.

There are also some areas of the image that the computer seems to have some difficulty coloring. You'll note above how her one shoulder does not appear colored, or at least seems to be a different color from the rest of her dress - more of a sepia tone again. You see it most prominently in the skirt behind this child below:

At first, I thought maybe it was due to a shading variation in the original that may have fooled the system into thinking the difference in the shading meant a difference in color, but that is not the case. You can see in the original, there is no shading variation.

I guess the tool just sometimes has difficulty identifying edges and items so when it's unsure, it seems to do this. It's understandable, I suppose - after all, what is required to accomplish this in mere seconds must be an incredibly complex algorithm and coding, and it's provided for free, so I can forgive it for not being perfect.

Lastly, you may have noticed MyHeritage put their logo in the bottom right corner of the colorized image, and a little paint palette icon in the lower left. To avoid these, I'd recommend adding a superficial border to your image where the logo and icon will show up, which you can then crop off later.
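If you want to work out the numbers for that border trick before opening an image editor, here is a minimal sketch in Python. It only does the arithmetic: how big the padded image will be, and which crop box removes the border again afterwards. The function names, the 150 pixel border, and the idea of padding all four sides equally are my own assumptions for illustration; I don't know how large MyHeritage's logo and icon actually are, so check that your border covers them.

```python
def padded_size(width, height, border):
    """Size of the image after adding `border` pixels on every side."""
    return width + 2 * border, height + 2 * border


def crop_box(width, height, border, scale=1.0):
    """Return (left, upper, right, lower) for trimming the border back off.

    `width` and `height` are the original photo's dimensions (hypothetical
    values in the example below). `scale` is an assumption to cover a tool
    that hands back a resized copy of what you uploaded.
    """
    scaled_border = round(border * scale)
    padded_w, padded_h = padded_size(width, height, border)
    out_w, out_h = round(padded_w * scale), round(padded_h * scale)
    return (scaled_border, scaled_border,
            out_w - scaled_border, out_h - scaled_border)


# Example: a 3000 x 2000 scan with an assumed 150 pixel border.
print(padded_size(3000, 2000, 150))
print(crop_box(3000, 2000, 150))
```

If you happen to use the Pillow library, `ImageOps.expand(img, border=150, fill="white")` adds a border like this, and `img.crop(box)` takes the same four-number box to remove it.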

I decided to try another photo (above), this time with more elements in it - horses, a house, etc to see if the same problems occurred, and they did. Once again, you can see all the clothing colors are very neutral. And again, you can see some weird rainbow-like discoloration at the top of the house.

And again, there were obviously some spots where the computer had difficulty colorizing or distinguishing between items - as you can see below, the hand on the shoulder looks like it either hadn't been colorized at all or it's blending in with the color of the other boy's jacket. Conveniently for the computer, it chose to "color" this boy's jacket grey!

So if you don't want any creepy dead hands like this, or your ancestors had blue eyes instead of brown, it's best to hire someone to do this for you instead of relying on an automated system. There are also Facebook groups with generous people who will colorize your photos for free, but be aware that Facebook doesn't easily support high resolution images like this does. This option from MyHeritage is still pretty impressive for what it is though, and if you're not bothered by the small problems that you might not even see very well when zoomed out, this will be an amazing tool for many people. At the very least, I enjoyed seeing some color in the faces of my ancestors and relatives, as it seems to make them come alive a little more.

I haven't checked it out yet but there's an alternate colorizing option found at ColouriseSG. It appears to be free.