Monday, October 14, 2024

Using AI to Transcribe and Translate Documents

Being able to upload an image of a document to AI and have it instantly and accurately transcribe the difficult to read handwriting or translate a foreign language from an equally difficult to read handwriting is an exciting idea, but in practice it does not work well. Specifically, the biggest problem I find with it is that when it can't read something, it decides to just... make something up that seems in keeping with the type of document it is. Clearly, AI is smart enough to recognize what kind of document it is and supply the type of information you'd expect from that type of record, but it is apparently not smart enough to realize why the information needs to be accurate and not fabricated.

Take for example this will of Henry Rorer (above, click to enlarge). I uploaded it to ChatGPT and asked it to transcribe it. This is what it gave me:

In the name of God Amen the twenty-second Day of January in the year of our Lord one thousand seven hundred and eighty. I Henry Rorer of Bristol Township in the County of Philadelphia Yeoman being weak in body but of perfect mind and memory thanks be given unto God, therefore this my last Will and Testament. That is to say, I first and principally I resign my soul to God that gave it and my body to the Earth to be buried in a decent and Christian like manner at the discretion of my Executors herein after named and touching such worldly Estate wherewith it hath pleased God to bless me in this life. I give and dispose of the same in the following manner, and form.

Imprimis. I give and bequeath unto Mary Willard one Chest that she is to choose out of all my Chests and Boxes to be paid and delivered her by my Executors hereafter named.

Item. I give and bequeath unto Margaret my dearly beloved Wife all my Estate both real and personal during her widowhood, excepting such legacies as I shall hereafter mention in this my last will and testament and if in case my Wife should die or marry, then it is my will that my Estate shall be sold and divided equally amongst my children or the survivors of them share and share alike, and if either of my Children should die under age or without lawful issue that then my Estate shall be equally divided amongst the survivors of them, and I do hereby nominate and appoint my beloved Wife and my Friend Henry Rorer and George Rorer Executors of this my last Will and Testament, and I do hereby utterly revoke and disannul all former Wills and Testaments by me heretofore made Ratifying and Confirming this and no other to be my last Will and Testament.

In witness whereof I have hereunto set my hand and Seal the day and year above written. Signed Sealed published pronounced and declared by the said Henry Rorer as his last Will and Testament in the presence of us.

Phillip Hagey
Peter Shall

Henry Rorer (Seal)

At first glance, it looks great. But when you actually compare it to the original document, it quickly becomes clear some of it was just completely made up. Firstly, note that the date is slightly off. It was actually the 29th day of January, not the 22nd. It's understandable a single word might be misread, but it goes downhill from there. Where the transcription reads "thanks be given unto God, therefore this my last Will and Testament" the record actually says "do make and ordain this my last Will and Testament." The transcription then goes on to add in an entire sentence that isn't included in the record: "That is to say, I first and principally I resign my soul to God that gave it and my body to the Earth to be buried in a decent and Christian like manner at the discretion of my Executors herein after named..."

Although not accurate, it's at least not giving us any misleading information. The paragraph following it, however, is fabricating people and property never mentioned in the original. The transcription says:

Imprimis. I give and bequeath unto Mary Willard one Chest that she is to choose out of all my Chests and Boxes to be paid and delivered her by my Executors hereafter named.

The actual document says:
Imprimis It is my Will and I do order that in the first place all my just Debts & funeral charges be paid and satisfied.

ChatGPT did not even attempt to get it right. It's as though it ignored even the words it could make out in favor of a sentence that makes sense instead of just leaving blank spaces where it isn't able to read certain words. So the entire sentence was changed just to produce something that wasn't missing words. This could be really disastrous to your research if you're not paying attention. It fabricated a name where there was no person mentioned, making it look like Henry Rorer willed one Mary Willard her choice of a chest or box when there was no such person ever mentioned in his will and no mention of any chests or boxes either.
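For anyone comfortable with a little code, here is a rough sketch of one way to flag passages like this automatically, using Python's standard difflib module to score how closely an AI transcription matches your own reading of the same passage. The function names and the idea of comparing word by word are my own assumptions for illustration, not anything ChatGPT or FamilySearch provides, and a low score only tells you where to look, not what the document actually says.

```python
import difflib

# The AI's version and my own reading of the same passage, both quoted above.
ai_text = (
    "Imprimis. I give and bequeath unto Mary Willard one Chest that she is "
    "to choose out of all my Chests and Boxes to be paid and delivered her "
    "by my Executors hereafter named."
)
my_reading = (
    "Imprimis It is my Will and I do order that in the first place all my "
    "just Debts & funeral charges be paid and satisfied."
)


def words(text):
    """Split into lowercase words, dropping surrounding punctuation."""
    return [w.strip(".,;:").lower() for w in text.split()]


def similarity(a, b):
    """Return a 0-to-1 score of how much word-level content two passages share."""
    return difflib.SequenceMatcher(None, words(a), words(b)).ratio()


score = similarity(ai_text, my_reading)
print(f"similarity: {score:.2f}")
```

A score near 1 means the two versions mostly agree, while a low score like the one this pair produces is a sign the passage needs to be checked against the original image by eye.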

Additionally, the transcription looks surprisingly short. Even though the will is only about one page in length, it seems like more than what the transcription included, and indeed, it is missing a lot of information from the rest of the document, including many people who are of importance to Henry Rorer.

The same thing essentially happens when you ask ChatGPT to translate a foreign handwritten document. Wherever it struggles to read the handwriting, including names, occupations, ages, etc., it simply makes up random data to replace it. It changes names and other data just to produce a complete transcription, whether accurate or not.

So here is your warning. AI is not there yet. Maybe there are more reliable ones out there, but I imagine they aren't free. Whatever handwriting OCR FamilySearch is using to transcribe records in their experimental Full Text Search seems to work much better than ChatGPT. But not all records are on FamilySearch, let alone under their Full Text Search.

As ever, the genealogist mantra remains true: you have to do the work yourself and verify everything.

Saturday, September 24, 2022

Eurogenes K13 Charts and Maps

I know for many people like myself who are visual people, seeing a map of where exactly each region covers can be really beneficial to understanding your Gedmatch Admixture results. There's already some official European maps available for Eurogenes EUtest V2 K15 from the Eurogenes blog, and you can sometimes find some unofficial ones for other calculators, but I haven't seen any for Eurogenes K13, so I gave it my best shot.

Anyone can make these maps with the right tools - the data is readily available from the Population Spreadsheets for each calculator. The difficult part is that the tool I used to create the maps, Google Spreadsheets Charts, only recognizes modern country names. So I had to categorize every specific population into a modern country. That wasn't easy, considering many countries included several populations (I simply averaged them) and many of the populations span several countries (I just put the data in all relevant countries). So it's safe to say these are very much unofficial maps, not endorsed by the Eurogenes creator. They are interactive, so hover over each region for the percentage. If anyone can recommend a better free mapping program, please let me know!
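To show what I mean by averaging and duplicating, here is a minimal sketch in Python. The population names, country names, and percentages are made up purely for illustration (they are not real values from the K13 spreadsheet), and the function name is my own.

```python
from collections import defaultdict

# Hypothetical percentages for one K13 component, for illustration only.
population_scores = {"Pop_A": 40.0, "Pop_B": 50.0, "Pop_C": 10.0}

# Hypothetical mapping of each sample population to modern countries.
# Pop_C spans two countries, so its value is placed in both.
population_countries = {
    "Pop_A": ["Country X"],
    "Pop_B": ["Country X"],
    "Pop_C": ["Country X", "Country Y"],
}


def country_averages(scores, countries):
    """Average the scores of every population assigned to each country."""
    buckets = defaultdict(list)
    for population, pct in scores.items():
        for country in countries[population]:
            buckets[country].append(pct)
    return {country: sum(vals) / len(vals) for country, vals in buckets.items()}


result = country_averages(population_scores, population_countries)
for country, pct in sorted(result.items()):
    print(f"{country}: {pct:.1f}%")
```

A simple average like this treats every population equally regardless of how many samples it has, which is one more reason to treat the maps as rough guides.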

K13 North Atlantic Map - essentially a "Northwest Europe" region primarily including British Isles, Scandinavia, and Germanic Europe, though as you can see, it also includes most of Europe to some degree.

K13 Baltic Map - primarily the Baltic States (though data is missing for Latvia) and surrounding areas, though again, you can see most of Europe is included to some degree.

K13 West Med Map - primarily areas that border the western portion of the Mediterranean Sea (both Europe and North Africa), also including the eastern portion of the Mediterranean area to a lesser degree.

K13 West Asian Map - peaks in the Caucasus region, includes surrounding areas (it does not really include all of Russia; there's just no way to break down the maps more).

K13 East Med Map - primarily areas bordering the eastern portion of the Mediterranean Sea (leaning more heavily to the North African and Middle Eastern areas). It appears to peak in Yemen, but that's due to the Yemen Jewish sample getting the highest results in this category.

K13 Red Sea Map - mainly areas bordering the Red Sea, though data in some African areas is missing. It peaks in the Arabian Peninsula and Horn of Africa, yet also includes all of North Africa to some degree.

K13 South Asian Map - peaks in India, Bangladesh, Nepal, and Pakistan, including surrounding areas in varying degrees.

K13 East Asian Map - coming soon.

K13 Siberian Map - coming soon.

K13 Amerindian Map - coming soon.

K13 Oceanian Map - coming soon.

K13 Northeast African Map - coming soon.

K13 Sub-Saharan Map - the area south of the Sahara Desert, peaking in West Africa and Bantu regions, but also covering parts of North Africa to a much lesser degree. Data for some areas is missing.

I also created charts showing what percentage each sample population got for each region, so you can get an idea of what each region includes even for the areas I haven't done maps for yet:

K13 Population Chart (by population)

Despite having done all this, I do want to clarify that Gedmatch's Admixture calculators have not been updated in many years, and the reference panels used for them are very small in comparison to the consumer testing companies, so you should definitely take the results with a large grain of salt.

Wednesday, August 10, 2022

Understanding Marriage Bonds

It's really important that we understand what certain records are before we try to extract information from them. In your research, you may periodically come across marriage bonds, which are not to be confused with a marriage license or marriage banns. Recently, I noticed someone had mistaken a marriage bond for a marriage license and was wrongly assuming that the high amount of money quoted on the marriage bond (500 pounds, or $1,000 depending on the time period) was a marriage license fee. As a result, they thought only wealthy people could afford to get married at all. Here's an example below from Brunswick County, North Carolina, 1856 (click to enlarge).


This was admittedly a lot of money for that time period. In 1856, $1,000 would be the equivalent of about $36,000 today. Imagine paying a marriage license fee of $36,000! Some people would have nothing left for the actual wedding ceremony and reception, and some people wouldn't be able to afford it at all. Imagine not being able to afford to get married!

But that wasn't actually the case, especially at a time in history when it was really important for people to be married when they had kids. There's no reason the government would put such high restrictions on getting married that the average citizen would find it impossible or undesirable. They would only be encouraging people to have children out of wedlock, and that doesn't make sense for the time period.

So what was this $1,000? A marriage bond was like a legal promise to marry, and the fee helped assure that there would be no legal impediments to the marriage, because you and your co-signers only had to pay the fee if it was discovered that there was some sort of legal impediment to the marriage. One example of this is when the bride or groom was actually still married to someone else and not free to marry again. Since that didn't happen very often, most of the time, that $1,000 never needed to be paid. Another example might be if the bride or groom was legally too young to marry without their parent's or guardian's approval, which they didn't have. If they lied about their age but then the truth was discovered so they couldn't legally marry, that fee would have to be paid.

Notice the terminology on the record:
"...now in case it should not hereafter appear that there is any lawful cause or impediment to obstruct such marriage, then the above obligation is to be void..."

So if there's no reason they can't be married, then the obligation to pay the fee is void (they don't have to pay the fee). 

This was to help prevent things like bigamy or young people lying about their age, because the bride or groom wouldn't be likely to sign an agreement saying they owed $1,000 if it turned out they were already married to someone else or too young to marry. Some people might have done it, thinking no one would find out, but most people would definitely think twice about committing bigamy or lying about their age if it meant paying a $1,000 fine, equivalent to $36,000 today.

Doesn't it make a lot more sense that the government would impose a high fee if it was discovered someone was trying to or did commit bigamy, than the government trying to discourage and even prevent some people from getting married at all? This is why it's so important to understand records instead of making our own assumptions about what they mean. If your assumption doesn't seem to make much sense, there's probably a more logical explanation. If you don't know what something means, usually a quick Google search will help, and in the case of more niche or nuanced topics, there are typically experienced researchers who can help in various genealogy communities. There are plenty of communities on Facebook, Reddit, Wikitree, FamilySearch, etc.

Wednesday, July 20, 2022

A Chromosome Painter Comparison

Recently AncestryDNA added yet another feature to their DNA tools, a Chromosome Painter. It shows us which portions of our chromosomes they have identified as coming from which regions. It's found under SideView because there's also a breakdown by Parent 1 and 2. AncestryDNA joins 23andMe and FamilyTreeDNA in offering this feature (leaving MyHeritage as the odd man out), so I decided it was time to compare them.

For me, it's easiest to analyze my Italian ancestry since it's genetically more distinct from the rest of my ancestry which is Northwest European. At Ancestry, it's mostly identified correctly as Southern Italy (22%), and some as Northern Italy (9%). At 23andMe, it's primarily put into Italy (23.6%), with a little bit in Greece/Balkans (1.6%), Cyprus (3.2%), and Anatolia (1.6%). A few other less than 1% results in various Southern Europe/West Asia areas add up to only 1.2%. FamilyTreeDNA isn't quite as accurate, but at least they get most of it in Southern Europe, with 28% in Greece/Balkans and only 8% in the Italian Peninsula. However, as you can see, the totals add up to approximately the same amounts at each company: 31% at AncestryDNA, 31.2% at 23andMe, and 36% at FTDNA. This is consistent with the fact that my paternal grandmother was Italian and since my paternal grandfather tested, I know I share 18-19% (depending on the company) with him, leaving 31-32% I obviously got from my Italian grandmother (totaling the 50% from my dad).
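For anyone who wants to check the arithmetic, here is a small Python sketch that adds up the Southern European/West Asian percentages quoted above for each company and compares them with what I'd expect from my Italian grandmother. All the numbers come straight from this post; the variable names and the "Other" label for the under-1% results are just my own shorthand.

```python
# Southern European / West Asian percentages reported by each company (from this post).
southern_shares = {
    "AncestryDNA": {"Southern Italy": 22.0, "Northern Italy": 9.0},
    "23andMe": {
        "Italy": 23.6,
        "Greece/Balkans": 1.6,
        "Cyprus": 3.2,
        "Anatolia": 1.6,
        "Other (each under 1%)": 1.2,
    },
    "FamilyTreeDNA": {"Greece/Balkans": 28.0, "Italian Peninsula": 8.0},
}

# Sum each company's regions, rounding to one decimal to avoid float noise.
totals = {
    company: round(sum(regions.values()), 1)
    for company, regions in southern_shares.items()
}

# I share 18-19% with my paternal grandfather, so the rest of the 50%
# from my dad should come from my Italian grandmother.
expected_range = (50 - 19, 50 - 18)

for company, total in totals.items():
    print(f"{company}: {total}%")
print(f"expected from Italian grandmother: {expected_range[0]}-{expected_range[1]}%")
```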

Knowing that the percentages are fairly consistent, I wanted to see if the individual segments identified in these regions would be consistent across all companies as well. Overall, there was reasonable consistency between 23andMe and AncestryDNA, but FTDNA was all over the place. Let's look at it chromosome by chromosome, at least on a few of them (I don't think I need to go over all 22 of them).

Chromosome 1

AncestryDNA shows almost the full length of one side of chromosome 1 is Southern Italian (above), apart from a small portion at the end. 23andMe shows the first and last portions of the chromosome as Italian (below, first), with the middle bit missing, but interestingly, it seems at least some of that middle bit is identified as Cypriot (below, second).



Obviously, there's some overlap there and it's saying they're on opposite sides, but there's no way either Italian or Cypriot is coming from my mom's side since she is 100% Northwest European - British, German, Norwegian. So although it may not align perfectly, it does seem to suggest nearly the full length is coming from Italy/Cyprus, which is mostly consistent with AncestryDNA.


Unfortunately, FTDNA isn't as consistent with the other two companies. As you can see (above), the Southern European (light blue) portions are much more broken up, although I suppose one side does seem to be mostly Southern European. The dark blue portions are Western European. FTDNA's chromosome painting doesn't offer any more breakdown than that and doesn't allow me to isolate the different regions in the visual.

Chromosome 2

On chromosome 2, AncestryDNA (below, first) and 23andMe (below, second) are almost exactly the same. They both put essentially the entire length of one side of the chromosome in Italy (Northern Italy at AncestryDNA). There's a tiny sliver at the end at 23andMe which they deemed Broadly NW European, but that's probably not a significant amount.



But here again, at FTDNA, the results are so inconsistent that it almost seems random (below).


Although one side has more Southern European (light blue) than the other, it's so broken up and looks so similar to chromosome 1, it just doesn't seem very reliable.

Chromosome 3

The results on chromosome 3 are exactly the same at AncestryDNA (below, first) and 23andMe (below, second), while FTDNA (below, third) is once again not as consistent.




I suppose FTDNA has a little more solid light blue than previous chromosomes, but it's not the full, unbroken length we see at AncestryDNA and 23andMe. That little sliver of green is Middle East.

Chromosome 4

This one is also very consistent between AncestryDNA and 23andMe, but for the opposite reason - both companies say no portion of either side of chromosome 4 comes from anywhere in Southern Europe or West Asia. Here we can analyze some of my Northwest European ancestry a little bit. 23andMe (below, second) says both sides of the chromosome are NW European, primarily from France/Germany (light blue), with smaller portions unable to narrow down and identified as Broadly NW European (grey/missing portions). At AncestryDNA (below, first), the entire length of one side is identified as Scottish (lime green), and the full length of the other side is categorized as Norwegian (light blue). This is extremely consistent with my known ancestry - my paternal grandfather was mostly German and Scottish, while my mom is part Norwegian. 23andMe only gives me a small percentage of Scandinavia though, with none of it on chromosome 4. None of this surprises me, since British, Germanic, and Scandinavian have a lot of genetic overlap and are difficult to tell apart, so who knows which company is right, but at least they both agree that both sides of chromosome 4 are NW European.



Not so much with FTDNA (below). Although they do identify most of both sides as Western Europe (dark blue), there are still portions of Southern Europe (light blue) seemingly randomly thrown in there.


At this point, it doesn't even seem worth carrying on comparing FTDNA. The rest of every chromosome is pretty much the same as what I've already shown here. Although the amounts of Western vs Southern Europe vary somewhat in vague keeping with the other two companies, the minimal variation is not worth going into a detailed comparison.

Chromosome 8

I want to skip ahead now to chromosome 8. Chromosomes 5, 6, and 7 are exactly the same at both AncestryDNA and 23andMe - both companies identified the exact same portions as either Italian or Southern European. Chromosome 8 is the first time we really see a significant difference in what the two companies report.



AncestryDNA (above, first) estimates that roughly the second half of one side of chromosome 8 is from Southern Italy (teal), while the first half is Scottish (lime green), and the other side is supposedly from Sweden/Denmark (pink). I don't have any ancestry from Sweden or Denmark, and AncestryDNA puts my combined Scandinavian percentage a little high, and my Germanic a little low, so I'm assuming it's probably coming from my German ancestry.

However, 23andMe (above, second) doesn't identify any Italian or Southern European (or West Asian, for that matter) on chromosome 8 at all. It estimates one side is entirely French/German (light blue), and the rest (grey) is mostly Broadly NW European with a small portion in Scandinavia.

So the portion AncestryDNA deems Italian, 23andMe says is Germanic.

Chromosomes 9-22

The rest of my chromosomes probably aren't worth going into visual detail, but here's a quick summary:

Chromosome 9 - AncestryDNA estimates the full length of one side is Northern Italian while 23andMe says only half of that side is Italian/Southern European.
Chromosome 10 - AncestryDNA claims about the first third of one side is Southern Italian, but 23andMe puts that portion (which is more like the first half of the chromosome) in Cyprus.
Chromosome 11 - AncestryDNA puts the full length of one side in Northern Italy, and 23andMe says most of that is Anatolian.
Chromosome 12 - AncestryDNA reports no Italian ancestry at all, but 23andMe says about half of one side is Italian.
Chromosome 13 - Again, nothing Italian from AncestryDNA and this time, 23andMe agrees (nothing from Southern Europe or West Asia).
Chromosome 14 - Ancestry estimates the full (tested) length of one side is Southern Italian. 23andMe says most of one side is either Italian or Arab/Egyptian/Levantine.
Chromosome 15 and 16 - Both companies agree the full (tested) length of one side is from Italy (specifically Southern Italy at AncestryDNA).
Chromosome 17 and 18 - Both companies agree there's no sign of Southern European or West Asian ancestry at all.
Chromosomes 19, 20, 21 - Both companies agree the full (tested) length of one side is from Italy (specifically Southern Italy at AncestryDNA).
Chromosome 22 - Both companies agree the full (tested) length of one side is from Italy (specifically Northern Italy at AncestryDNA).

Although there are some variations on a few chromosomes, overall I'd say AncestryDNA and 23andMe are very consistent with each other. FTDNA was so inconsistent I literally gave up comparing it.

Here it's worth noting that 23andMe includes ethnicity on the X chromosome, where neither AncestryDNA nor FTDNA does. To my knowledge, 23andMe is the only one to use the X chromosome for ethnicity, though admittedly I don't know about MyHeritage since they offer neither a white paper nor a chromosome painter. At 23andMe, it identifies one side of my X chromosome as French/German (my mom's side) and the other side as mostly Italian (dark blue) from my dad's Italian mother. The small portion at the end of that side is classed as Broadly Northwest European (lightest blue).

For the record, X-DNA makes up only about 5% of all your chromosomes. Some people point out that at 23andMe, a man's ethnicity report will include more DNA from his mother than his father because men only get X-DNA from their mother, not their father. Women get one X chromosome from their mother and one from their father, meaning it's still 50/50 just like with the autosomal chromosomes. Men, however, get one X chromosome from their mother and one Y chromosome from their father, but Y chromosomes aren't used for ethnicity (ever), so they will have slightly more DNA from their mother than their father on the ethnicity report. This is true, but it's worth noting that one X chromosome only amounts to about 2.5%, which is also within "noise" level amounts. So we're not talking about a significant or noteworthy difference.

Wednesday, April 13, 2022

More Ethnicity Updates from AncestryDNA

AncestryDNA is maintaining their annual ethnicity updates, and it's a little early this year. But it's a new kind of update - rather than the usual changes to either the reference panel, or algorithms, or both, this one introduces a new feature called SideView. It is essentially phasing our DNA with our DNA matches to determine which ethnicities come from one parent or the other. It also means adjustments to our individual percentages, which should theoretically be an improvement. Phasing is usually done with parents or other very close family members, so I was skeptical about AncestryDNA doing it with our more distant matches. Your parents don't have to have tested for this new feature to work, but I was hopeful that my parents having tested would make it more accurate.

I find the parental breakdown (shown above) is very reliable - at least, it's as reliable as it can be given how accurate (or not) each of my kits are to begin with. For example, it correctly identified that my Norwegian and Italian ancestry are from opposite sides of my tree, and that is true: Norwegian is on my mom's side, Italian is on my dad's side. But it puts all of my Germanic ancestry on my dad's side because my mom's results still don't include Germanic despite having a great grandfather of full German descent (dozens of DNA matches on this branch confirm there's no NPE) and several other German branches further back. 

Looking at my mom's parental breakdown, shown above (neither of her parents having tested), there is less reliability. That's partly due to the fact that her Norwegian ancestry is grossly exaggerated. She now gets a whopping 47% in Norway despite only having had one Norwegian (or Scandinavian) grandparent (so she should be about 25%; although it may vary, it shouldn't be more than about 36%). The majority of her Norwegian results do get put on one side, but that means there's not much room left for the other 25% on her mom's side that should be mostly English. Most of her English results get put on her other side, which isn't exactly wrong, since she does have some English ancestry on that side too. But her dad's side should be mostly Germanic, and again, she gets no results in Germanic. If the percentages were more reliable to begin with, the split up would be more reliable too.

My dad's parental breakdown is very accurate, probably partly because his father tested but also because there is more genetic distinction between his mom and dad's sides - his mom was Italian, his dad mostly German and some Scottish and English. The split up (shown above) correctly shows all his Italian (Southern and Northern even though his ancestry is only Southern) plus trace amounts in Cyprus and Levant (obviously coming from his Italian ancestry) on one side, equaling exactly 50%. On the other side it correctly places all the rest of his ethnicities, although they are not all accurate - he wrongly gets results in Scandinavia where he has no known ancestry.

My paternal grandfather's parental breakdown is surprisingly very consistent with his tree, considering neither of his parents tested. On his paternal side, he is German with some English. On his maternal side, he's German and Scottish, with some English. Although his percentages are overall off (too much English, not enough German), the split up is accurately reflected here: English on both sides, German on both sides (though barely), and Scottish on only one side.

My husband's parental breakdown (shown above) is also as accurate as possible given his percentage results and the fact that neither parent tested. It correctly identifies the majority of his Irish ancestry on one side and all of his English ancestry on the other side. His father was Irish, his mother was mostly English. He overall gets 40% in Ireland (a decrease from previous 47% which was much more accurate), and 36% is assigned to one side, his dad's side (shown below). His mother does have one Irish branch from much further back, which would amount to about 3%, and interestingly it puts 4% Ireland on his mom's side. Not bad. It then splits his Scottish results up more evenly on both sides - he does indeed have one Scottish 2nd great grandparent on his mother's side, so the Scottish portion being assigned to his father's side is obviously just due to the genetic overlap between Ireland and Scotland. His Scottish percentage is exaggerated to begin with: 22% when it should be more like 6% and probably no more than 12%, but interestingly the amount that is put on his mom's side is 9%, which is consistent with the Scottish 2nd great grandfather on his mom's side. Again, not bad, AncestryDNA, not bad. However, he has no Welsh or Norwegian ancestry, so those are obviously coming from genetic overlap with England.

So overall, the split ups among most of my kits were very reliable, but I can't say the percentages have benefited from the phasing. For example, my Scottish results wrongly shot up from 12% to 29% - based on my tree, the former is more accurate. And as mentioned, my mom is still lacking any Germanic results at all when she should be at least 12%, while her Norwegian results were already too high to begin with (43%) and just went up even more (47%). My dad's results didn't change by much, but he's now getting small percentages in incorrect regions that he didn't get before. In fact, most of my kits have seen this too - most of them now have small percentages in Ireland which they didn't have before. To my knowledge, all of my so-called "Irish" ancestors were actually Scots-Irish. So previous results were more accurate and the sudden appearance of Irish in results is disappointing (only because it's not accurate, not because there's anything wrong with being Irish, lol - obviously, my husband is half Irish).
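The "expected" percentages I keep quoting from my tree (about 25% for one grandparent, about 12% for one great grandparent, about 6% for one 2nd great grandparent) come from simply halving at each generation. Here's a minimal Python sketch of that rule of thumb; the function name is my own, and it only gives the average expectation, since the amount actually inherited from any ancestor beyond a parent varies.

```python
def expected_share(generations_back):
    """Average percentage of DNA expected from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great grandparent,
    4 = 2nd great grandparent, and so on. This is only the average;
    the share actually inherited varies from person to person.
    """
    return 100 / 2 ** generations_back


for label, generations in [
    ("grandparent", 2),
    ("great grandparent", 3),
    ("2nd great grandparent", 4),
]:
    print(f"one {label}: about {expected_share(generations)}%")
```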

Sunday, February 27, 2022

What exactly is the AGBI and how do we use it?

By now, you've probably occasionally come across a source known as AGBI, or American Genealogical Biographical Index, and maybe you've even attached it to your tree because it comes up as a hint for your ancestor, and everyone else has attached it to the same ancestor, and you don't want to miss out, right? But the details are usually vague. What is it even referencing, and how do you know the records are for the right person?

The AGBI on Ancestry is basically an index of an index. It's referencing a big series of books that indexes tons of sources on early Americans. I don't know why Ancestry's index doesn't include all the data included in the book's index, but it doesn't. So to find the original source, you first have to look up the AGBI book index. You can find the books in a number of places online. I usually use the one at FamilySearch because it's free and accessible from home (it's not a restricted collection) - scroll all the way down (past the listings that say off-site storage).

Ancestry's index should include a volume and page number. Weirdly, the books don't include page numbers, but that's okay, because they're in alphabetical order. So simply open the volume you're looking for, and then find the name you're researching in alphabetical order. There will likely be several entries for the name you're looking for, but you can usually tell which one you need from the location and/or time period included in Ancestry's index. Even so, the AGBI books can be seemingly as vague as Ancestry's index is, and sometimes it takes some understanding and/or Googling of what it's referencing.

For example, "Pa. Archives" is not a reference to the Pennsylvania State Archives, it's a reference to another series of books that includes primary records from early Pennsylvania called the Pennsylvania Archives - there should be a series number, a volume number, and a page number. The Pennsylvania Archives are also available online at various sites: Google Books, Archive.org, Ancestry, FamilySearch, Fold3, etc.

Another example is a source just called "Transcript" - this is a reference to the Boston Evening Transcript, a newspaper that ran a genealogy column from 1906-1941, including details on ancestors not exclusive to Boston or Massachusetts. Obviously, it's very much a secondary source, so I'd be careful with it, but it's available from Newspapers.com covering the years 1848-1914, and at American Ancestors covering 1911-1941 (select the Boston Evening Transcript from "Database").

You'll also see references to Revolutionary War Rolls and Pensions, which are fairly self-explanatory. There are also states with "Heads Fams", which refers to the names of the Heads of Families listed on the 1790 census. Since the 1790 census is already widely available online and probably already attached to your tree when appropriate, this isn't a very useful citation anymore. There are lots of other sources included in the AGBI, but usually they are self-explanatory, or you can find out what they mean with a little bit of Googling.

It's really important that you find the original record the AGBI is referencing, because the index is so vague that there's really no way to know from it whether it's for the person you're researching or not. You may often find that once you look up the original record, it's not actually a reference to the person you're researching after all. Researchers on Ancestry probably just attached it to their tree because the name and perhaps location and/or time period fit, without looking into it further because they simply don't know how to find the original. But particularly with common names, you can't assume that means it's for the correct individual, and now you don't have to be one of those people.

Tuesday, February 1, 2022

TellMeGen Review

New DNA companies with the option to upload raw DNA data from other companies keep popping up, and honestly, it's hard to keep track of them. But recently, I tested one called TellMeGen out of curiosity. They offer reports on disease risk, traits, wellness, ethnicity, and even offer matching with other testers, all for free. But you know the saying, "you get what you pay for"? That's a little bit true here.

I can't really complain about the health and traits reports, they are easy to understand but also include the technical data if you want to explore that. They include reports on a lot of common health issues people want to know about, like cancer and heart problems. They correctly identified me as probably lactose intolerant, and having decreased levels of vitamin D. There aren't many Monogenic Diseases included, but that may just be because I uploaded from another company, so the data may not be there for some reports. It's always best to test with the company when they offer their own kit, but I can't afford to be buying all the DNA tests available out there.

But what we're focusing on is the ethnicity report, and I have to say it was not very consistent with my known ancestry at all.



 French 43.7%
 Scandinavian 37.7%
 Turkish, Caucasian and Iranian 9.5%
 Bedouin 4%
 Egyptian, Levantine and Arab 3.2%
 Basque 1.1%
 Sardinian 0.5%
 Ashkenazi Jew 0.3%

The only location/population here that's accurate is Scandinavian. I do have Norwegian ancestry, but it is not this high - more like 12.5% (one great grandparent), and other companies usually peg it even lower than that, suggesting I may have gotten less than expected from my Norwegian great grandfather. I'm guessing that my inflated Scandinavian percentage includes my British ancestry, knowing there is genetic overlap between them.
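For reference, the 12.5% figure is just the average share expected from a single ancestor three generations back, since the number of ancestors doubles with each generation. Here is a quick sketch of that arithmetic. The function name is my own, and real inheritance varies around these averages, which is why a company can reasonably report less.

```python
def expected_share(generations_back):
    """Average percentage of DNA expected from one ancestor that many generations back."""
    return 100 / (2 ** generations_back)

# parent = 1 generation back, grandparent = 2, great grandparent = 3
great_grandparent_share = expected_share(3)
print(great_grandparent_share)  # 12.5
```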

I do have some very early colonial French Huguenot ancestry too, from the 1600s - but it amounts to less than 1% of my tree, so I do not consider it relevant to DNA ethnicity reports. Probably, the high amounts in France are coming from my neighboring Germanic ancestry.

Adding up the Middle Eastern results, I get 16.7%, which I can only imagine is coming from my Italian ancestry, though why it didn't come up Italian, I can't say. But even adding the Basque and Sardinian results in for 18.3%, it still doesn't add up to my expected amount of Italian ancestry, which I've detailed here many times as being about 32%. 
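If you want to check the math, here is a small sketch that adds up the categories. The percentages are copied from the report above, but the grouping into "Middle Eastern" and the variable names are my own choices for illustration, and the 32% is my rough expected Italian ancestry, not something the report provides.

```python
# Percentages copied from the TellMeGen ethnicity report.
report = {
    "French": 43.7,
    "Scandinavian": 37.7,
    "Turkish, Caucasian and Iranian": 9.5,
    "Bedouin": 4.0,
    "Egyptian, Levantine and Arab": 3.2,
    "Basque": 1.1,
    "Sardinian": 0.5,
    "Ashkenazi Jew": 0.3,
}

# My own grouping of which categories count as "Middle Eastern".
middle_eastern = ["Turkish, Caucasian and Iranian", "Bedouin", "Egyptian, Levantine and Arab"]
southern_extras = ["Basque", "Sardinian"]


def subtotal(names):
    """Add up the listed categories, rounded to one decimal like the report."""
    return round(sum(report[name] for name in names), 1)


middle_eastern_total = subtotal(middle_eastern)
with_extras_total = subtotal(middle_eastern + southern_extras)

expected_italian = 32.0  # my approximate expected Italian ancestry
shortfall = round(expected_italian - with_extras_total, 1)

print(middle_eastern_total)  # 16.7
print(with_extras_total)     # 18.3
print(shortfall)             # 13.7
```

So even with the most generous grouping, about 13.7 percentage points of my expected Italian ancestry are unaccounted for.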

Although the 0.3% Ashkenazi is small enough to just be noise, the Ashkenazi population is so endogamous that results in this category are normally very reliable, and should be. Getting any result at all in this population, when I have no known Jewish ancestry and get no results for it at any other company, is just another point against TellMeGen.

In short, my results simply do not make much sense. While it's not totally unreasonable to get some results in neighboring regions, this is a bit extreme, and if I have to jump through hoops to make sense of my results, it's not a reliable report.

Wednesday, January 5, 2022

How Far Back Can We Go?

How far back can we research our family tree? It's a question that comes up periodically, especially from beginners who are sometimes overwhelmed by finding other people's trees going back very far. In practice, the answer will vary greatly depending on your tree. One branch might dead end in the 18th century, another might go back to the 16th century, and yet another might link to royalty and date back to Charlemagne (8th-9th century). But how far back is plausible or realistic? At what point exactly do all these trees that date back to ancient times, mythical figures, Adam and Eve, etc. become impossible?

In general, the simplest answer for European research that has no known connection to royalty or nobility is that the 1500s is the end of the line. Like I say, not all your branches will likely even go back that far. Many times, the trail simply runs cold well before that point. For example, if you're American, you may never be able to find the specific origins of an immigrant ancestor. But if you're lucky, you may find a few branches here and there that go back to the 1500s.

Why the 1500s? Because that's when parish records began to be mandated in Europe. England was among the first to do so. In 1534, England separated from the Catholic Church and formed the Church of England, a Protestant church, all so that Henry VIII could divorce his wife and marry his mistress. A mere 4 years later, in 1538, England required that their brand new church begin keeping parish records of baptisms, marriages, and deaths/burials. Around the same time, in 1540, the Lutheran Church also started requiring parish records be kept throughout their rapidly expanding churches in central Europe. In 1563 at the Council of Trent, the Catholic Church ordered that parishes keep baptism records, and not long after, other orders in various countries required marriages and deaths/burials. This meant that most churches in Catholic or Protestant Europe were expected to keep parish records from that point onward.

However, not all churches began adhering to these requirements right away. Many were slow to start keeping records, so depending on the location, you may not find parish records going back quite this far. In England, only 14.8% of parishes were keeping records by 1555, and that had risen to 54% by 1600. Most parishes in Italy didn't start keeping records until about 1595, but at the same time, a few Italian parishes (namely Palermo and Firenze) had taken it upon themselves to keep records long before the Catholic Church mandated it, sometimes going back as far as the 14th century! In France, general compliance wasn't until around the mid-1600s, and most Reformed churches were keeping records by 1650 as well. So in many places, you may only be able to go back to the 1600s.

Additionally, even when records were kept from this early on, not all have survived to today. Many were damaged, lost, or destroyed through natural disasters or war, or simply deteriorated over time. Some from the 16th century may have survived but there might be large gaps, making it impossible to connect the dots.

So, how far back parish records go, and whether they've survived to today or not, really depends on the specific location in Europe, but in general, it's safe to say the 1500s are the furthest plausible cut off point. Unless a branch has genuine links to royalty or nobility (and there are a lot of false links out there, so be careful), or you're one of those rare exceptions with parish records going back to the 14th century, a tree extending beyond the 1500s is probably not accurate or reliable.

That doesn't mean every tree going back to the 1500s is reliable though, just that you would have to look more deeply to determine that. As I mentioned earlier, in some cases, your trail may dead end with your immigrant ancestor. If you can't find the specific origins of your immigrant European ancestor, it doesn't even matter how far back European parish records might go. And just because parish records may go back this far doesn't necessarily mean you can use them to reliably trace your lineage. Parish records are notoriously vague, containing very little information that can often make it impossible to say for sure if the records you're looking at are for the right person you're researching. Especially when you only have access to an index and not the original documents (which is common for early parish records like this). All it takes is more than one person with the same name born around the same time and location to completely throw their identity into question. Or one ancestor moving across the country with no record of it, and having no idea where to find them. Records can be so scarce, it's safe to say that if you're not descended from a somewhat notable lineage that was better documented, like wealthy land owners, merchants, or holding some sort of position like a sheriff (not necessarily nobility), there's a very good chance you'll never be able to reliably research back as far as the 1500s, even if the parish records exist.

Now, I keep saying "unless you have a genuine connection to royalty or nobility". So what if you do? Despite the amount of false links out there to royalty, some of them are genuine, and in those cases, it is possible (likely, even) to go back much further than the 1500s. Most royal and noble lines are well documented even before parish records were kept, because their titles were inherited, so documenting their lineages, especially male lineages, was very important. How far back they go depends entirely on the lineage, but many royal lines go back to Charlemagne, who ruled much of Western Europe in the late 8th century and early 9th century. Charlemagne's ancestry has also been traced back to his 5th great grandfather, a 6th century nobleman named Ansbert, whose wife, Blithilde (or alternate spellings), has been claimed to be the daughter of Chlothar I, but this is highly debatable. Ansbert is generally considered the end of the royal line, and not all lines will go back that far.

As you can see, even royal lines only go back to about the 6th century at the most, so proving European descent from BC is just not possible. There are many theories out there, but none are proven. Any tree that goes back to BC is highly speculative at best. That's not to say the family trees of people who lived in ancient history cannot be traced within Antiquity, just that there's a known genealogical gap between Antiquity and the Middle Ages.

Thursday, December 30, 2021

Antenati's New Site Design - is it Actually Better?

Not too long ago, the Italian Archives website, Antenati, or "Ancestors Portal", got a face lift. At first, everyone raved about what an improvement it was, and admittedly, the ability to find and navigate to the records you're looking for has been a great improvement. Unfortunately, it has come at the cost of the Archives no longer supplying a built-in way to download full resolution images, which means we can't save copies of the records for our personal reference. We can take a screenshot, but to get the whole document, it will be too small to read. And if we zoom in to take a screenshot, we won't get the full document.

There is a way around this - but it's basically a hack, and who knows if it will remain available forever. It's also complicated and includes several steps involving the raw code behind the image viewer (a IIIF manifest). But if you're brave enough, here's how to do it:

Step 1: Navigate to the image you wish to download, and click the icon with 3 horizontal lines located in the upper left corner of the image viewer window (see screenshot below, the icon is highlighted in yellow, click to enlarge).

Step 2: This will bring up a side bar on the left with information. Note the page number listed here (highlighted in yellow in screenshot below), because you'll need that later.

Step 3: Scroll down the side bar to the bottom where you'll see a link just below where it says "IIIF manifest". Click the link (highlighted in yellow in the screenshot below).

Step 4: Here's where it gets tricky. The link opens a page with a bunch of code (technically it's JSON, not HTML). Different browsers seem to display it differently - if you're lucky, it will be organized with nested lines and different colors, making it easier to find what you're looking for, and the URLs will be clickable links. If you're unlucky like me, you'll see a big long block of text/coding with no links and no colors (shown below). What you're looking for first is the page number you took note of in step 2. In the code, it will say "label":"pag. 31" (or whichever page number you're looking for). If you're having trouble finding it, you can use your browser's "Find" or "Find in Page" option to search for it (the screenshot below shows the page number 31 highlighted because I searched for it).

Step 5: Look just above your page number in the code for a URL that looks like this: https://iiif-antenati.san.beniculturali.it/iiif/2/wrZgxjz/full/full/0/default.jpg (URL is highlighted in grey and shows relation to the page number in the screenshot below) - the part that says "wrZgxjz" in my URL will be different for you. That's okay, that's what you want. That's the specific image code you're looking for. Copy and paste the whole URL (or click on it if it's clickable) into a new browser tab.

Step 6: If you're unlucky like me and the URL you copied and pasted includes extra slashes (either a backslash before each slash, or doubled slashes) so you're getting a "Page not found" result, remove the extras. The URL should look like this: https://iiif-antenati.san.beniculturali.it/iiif/2/wrZgxjz/full/full/0/default.jpg, not like this: https:\/\/iiif-antenati.san.beniculturali.it\/iiif\/2\/wrZgxjz\/full\/full\/0\/default.jpg or like this: https://iiif-antenati.san.beniculturali.it//iiif//2//wrZgxjz//full//full//0//default.jpg. If it's annoying to delete all those extra slashes every time, you can always just bookmark the proper URL and then just copy and paste the image code into the URL.

Step 7: Once you get the correct image to load, you can right click it and save the full resolution image.
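If you're comfortable with a little Python, steps 4 through 6 can be sketched in code. This is only a rough sketch under a few assumptions: that you've saved the manifest text yourself (it doesn't download anything), that the image URL sits inside the same block of code as the page label, and that the URL ends in /full/full/0/default.jpg like mine did. The function names are mine, and the toy manifest is a made-up, stripped-down stand-in for the real thing, which is much longer.

```python
import json
import re


def clean_url(url):
    """Remove backslash-escaped slashes and collapse doubled slashes after 'https://'."""
    url = url.replace("\\/", "/")
    scheme, sep, rest = url.partition("://")
    if not sep:  # no scheme found; just collapse whatever is there
        return re.sub(r"/{2,}", "/", url)
    return scheme + sep + re.sub(r"/{2,}", "/", rest)


def jpg_urls_in(node):
    """Collect every string ending in '/full/full/0/default.jpg' anywhere inside node."""
    if isinstance(node, str):
        return [clean_url(node)] if node.endswith("/full/full/0/default.jpg") else []
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        return [url for item in node for url in jpg_urls_in(item)]
    return []


def image_urls_for_page(node, page_label):
    """Find blocks whose "label" equals page_label and return the image URLs inside them."""
    if isinstance(node, dict):
        if node.get("label") == page_label:
            return jpg_urls_in(node)
        node = list(node.values())
    if isinstance(node, list):
        return [url for item in node for url in image_urls_for_page(item, page_label)]
    return []


# Toy manifest (hypothetical structure) with the escaped slashes from step 6.
toy_manifest = r'''
{"sequences": [{"canvases": [
  {"images": [{"resource": {"@id": "https:\/\/example.org\/iiif\/2\/AAAAAAA\/full\/full\/0\/default.jpg"}}],
   "label": "pag. 30"},
  {"images": [{"resource": {"@id": "https:\/\/iiif-antenati.san.beniculturali.it\/iiif\/2\/wrZgxjz\/full\/full\/0\/default.jpg"}}],
   "label": "pag. 31"}
]}]}
'''

urls = image_urls_for_page(json.loads(toy_manifest), "pag. 31")
print(urls)
```

With the toy manifest, this prints a list containing the one clean URL for page 31, which you can then paste into a browser tab and save as in step 7. The clean_url function on its own also handles the doubled-slash version of a URL if you've copied one by hand.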

Although the new site might be faster and easier to navigate, the inability to save crucial documents (which you'd think was the entire purpose of the site) is a huge step backwards. This hack is cumbersome, but for now, it's the only option. Good luck.

Saturday, December 18, 2021

Outlander and the Development of Surnames

For the record, I have never read the Outlander book series, but I have watched the TV show, and as a genealogist, there was one scene in particular about surnames that struck me as hugely unrealistic. It was the suggestion that a somewhat abandoned child never had a surname. There's a few reasons why this seems rather ridiculous to me.

This isn't really a spoiler, because it's not a scene that's crucial to the plot, though I guess it's relevant to character development. I still don't consider it very important, but others might, so you've been warned: read at your own risk.

In the TV show, our male lead character, Jamie, comes across a young pickpocket, Claudel, while in Paris, who never knew who his parents were (his father was unknown, and his mother was probably one of the prostitutes at the brothel where he lives, but he's never been told which one). They have a conversation in which they both agree Claudel isn't a very good name, suggesting that it isn't a very strong or masculine name. Jamie recommends the name Fergus instead, and Claudel, now Fergus, agrees, and they return to Scotland with Fergus as Jamie's foster child. Fast forward to when Fergus is all grown up, now a young man marrying his sweetheart (who happens to be Jamie's step-daughter - it's best not to examine that too closely), and the officiant asks what his last name is. Everyone pauses as they remember he doesn't have a last name, before Jamie graciously steps in and gives Fergus his own surname, Fraser. It was a touching acknowledgement of their father/son relationship. While it's maybe understandable Fergus may not have been given a surname at birth (though even this is a stretch, more on that later), it doesn't make much sense that by this point in his life, he would not have developed an informal surname that he likely would have used on a marriage record.

Fergus gets married with Jamie's surname

Remember, at this point in history, there were no birth certificates or forms of identification. There were parish records, so there would have been a baptism record, but people didn't keep or travel with copies of their baptism record for identification. Names were more fluid and the concept of an official or legal name hadn't really developed yet. So even though no one gave him a surname at birth, that doesn't mean one wouldn't have developed over time. Claudel was probably a fairly common name in France at the time, and Paris was a big city, so there were likely other young boys named Claudel, and people would have needed a way to identify or distinguish them. If one of them didn't have a last name, they would have used a description, which would have then naturally been shortened over time into a surname. Especially as Claudel/Fergus was a bit of a troublemaker, people would certainly have been talking about him periodically, and needed a way to identify him.

The same would have happened in Scotland, if it never had a chance to happen in Paris, or if the names didn't follow him from France. There was likely more than one Fergus in the area of Scotland they lived in, so people would have identified him in the most obvious ways, probably something like "Jamie's ward", "Jamie's foster child", or "the French boy." Over time, these likely would have gotten shortened to just "Ward", "Foster", or "French," and all three of them could have even been in use at the same time. Therefore, when asked what his surname was, he probably would have picked whichever one of the three he liked best, or was most common. When you think about it, these three names are all very real surnames in use today, and they sometimes come from situations exactly like this one.

But that fact doesn't have to have ruined the moment. A name like "Ward" or "Foster" kind of identifies one's undesirable origins (and a name like French among a bunch of Scots kind of does too), so it could have been a source of pain or embarrassment for Fergus, and Jamie still could have stepped in and said, "No, it's not Ward, it's Fraser."

Also, let's not forget how easy it was to change or make up a new name. Again, there were no birth certificates and remember how Claudel simply changed his name to Fergus just by deciding that's what he wanted to be called now? So, if he's able to just make up a new first name, there's no reason he couldn't have just made up a surname at the same time. I'm not saying he definitely would have done that, just that if his lack of a surname was a source of embarrassment for him, he could have just given himself a surname, and picked whatever he wanted. This idea that he could change his first name on a whim, but oh no, he's stuck without a surname his whole life until Jamie steps in to save the day, is a little bit silly. 

And I'll even go so far as to say the fact that he didn't have a surname at birth was silly to begin with. Children born out of wedlock were typically given their mother's surname. Fergus didn't know who his mother was, which is weird to begin with, because she's supposedly the reason he was living at the brothel (so was he even actually "abandoned" to begin with if he was living with his mother?). I've been told that he had no surname because "nobody cared enough about him to give him a surname". But what about the fact that someone cared enough about him to let him live at the brothel that his mother worked at? Someone cared enough about him to feed and clothe him, a financial burden to them, when they didn't have to, when they could have dropped him off at an orphanage or foundling hospital (the first of which opened in Paris in 1670), which would have taken him in, and given him a surname. Foundlings were a common part of history, and they were given surnames by this point in time because by the 18th century, surnames had long been in full use in France and Western Europe. But that's not what happened. He was kept at the brothel, supposedly because someone wanted him there, probably his mother, so why would she not identify herself and give him her surname? Even if she died when he was an infant, someone else at the brothel must have cared enough to keep him there even when they didn't have to, and that person surely would have known who his mother was and told him, and therefore he would have used her name. 

So it just doesn't make any sense that even an abandoned child in 18th century Western Europe would not have a surname to begin with, and that even if for some reason he didn't, that one wouldn't have developed out of a nickname over the course of his young life.