I didn’t set out to build an AI genealogy assistant. It started, as these things often do, with a simple question I couldn’t answer.
One evening, my wife and I were staring at the Ancestry page for a distant relative – one of the 36,000 people we’ve painstakingly mapped over the years. We were trying to explain to our kids how this person was related to them. Not the vague “some kind of cousin”, but the actual relationship. The proper one. The one that makes genealogists feel quietly triumphant.
Except… we couldn’t. Not without a whiteboard, a cup of tea, and a level of mental gymnastics that felt disproportionate for a Tuesday night.
And that’s when it hit me.
We have all this data, decades of family history, thousands of relationships, migrations, marriages, mysteries – but the tools we use don’t help us think with it. They store facts beautifully. They visualise trees elegantly. But they don’t interpret. They don’t reason. They don’t answer the questions real people actually ask:
- “How exactly are we related?”
- “Which counties should I be searching if the borders changed?”
- “Where might this new DNA match fit in my tree?”
- “How are these two people related?”
- “Which cousin moved to Canada?”
- “What am I missing?”
- “What are the most common names?”
- “Which people are missing sources?”
That gap between data and understanding is where this project began.
Not as a commercial idea. Not as an AI experiment. Just as a frustrated genealogist who wanted to ask a question and get a straight answer.
Over time, though, the idea grew legs. And arms. And a surprisingly opinionated brain.
Because once you start imagining a genealogy assistant that can actually reason, you realise how transformative it could be – not just for hobbyists like me, but for anyone trying to make sense of their family’s past.
So I decided to build it.
Not as a toy. Not as a “weekend hack”. But as a proper, thoughtfully engineered system that respects privacy, handles messy data, and gives accurate, trustworthy answers, even to the questions you didn’t know you needed to ask.
This series is the story of that build.
What This Project Is (and Isn’t)
This isn’t a “look what I built in an afternoon with ChatGPT” post. It’s a deep dive into designing an AI‑powered genealogy assistant that:
- ingests GEDCOM files
- understands family relationships
- reasons about DNA matches
- interprets geography and historical boundaries
- answers questions in plain English
- and does all of this reliably, safely, and cost‑effectively
It’s also not open‑sourced (yet). This one has commercial potential, and I’m treating it accordingly.
But it is a transparent look at the thinking behind the system – the architecture, the trade‑offs, the constraints, the failures, the “why on earth does GEDCOM do it like that?” moments, and the design principles that shaped the assistant.
Why Build This At All?
Because genealogy is a perfect storm of:
- messy data
- inconsistent formats
- historical quirks
- fuzzy relationships
- and users who range from tech‑savvy to “I only use the iPad when the grandchildren visit”
It’s a domain crying out for an AI that can bridge the gap between information and insight.
And because, let’s be honest, it’s fun. There’s something deeply satisfying about watching an LLM explain that your great‑grandmother’s cousin’s son is your “second cousin twice removed”, and doing so with more confidence than any human ever has.
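For the curious, the naming rule behind a cousin label is small enough to sketch. What follows is a minimal illustration in Python, not the assistant's actual code: the function names (`ordinal`, `times_removed`, `cousin_label`) are hypothetical, and it assumes you already know how many generations each person sits below their nearest common ancestor. Finding that ancestor in a messy tree is the hard part, and it is not shown here.

```python
def ordinal(n: int) -> str:
    """Spell out small cousin degrees; fall back to numerals for large ones."""
    words = {1: "first", 2: "second", 3: "third", 4: "fourth", 5: "fifth"}
    if n in words:
        return words[n]
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def times_removed(n: int) -> str:
    """Turn a generation gap into 'once', 'twice' or 'N times'."""
    return {1: "once", 2: "twice"}.get(n, f"{n} times")


def cousin_label(gen_a: int, gen_b: int) -> str:
    """Name a cousin relationship.

    gen_a and gen_b are the number of generations from each person up to
    their nearest common ancestor (hypothetical inputs for this sketch).
    Both must be at least 2; anything closer is a sibling, aunt/uncle or
    direct ancestor rather than a cousin.
    """
    if gen_a < 2 or gen_b < 2:
        raise ValueError("cousins share an ancestor at least two generations up")
    degree = min(gen_a, gen_b) - 1   # first, second, third ...
    removed = abs(gen_a - gen_b)     # generation gap between the two people
    label = f"{ordinal(degree)} cousin"
    if removed:
        label += f" {times_removed(removed)} removed"
    return label


if __name__ == "__main__":
    print(cousin_label(2, 2))  # two people who share grandparents
    print(cousin_label(2, 3))  # one of them is a generation further down
```

The arithmetic is trivial. The difficulty in a real tree is everything around it: duplicate people, multiple common ancestors, half relationships and adoptions all complicate the count of generations before this rule can be applied.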
What’s Coming in This Series
Across the next few posts, I’ll walk through:
- the architecture
- the reasoning engine
- the normalisation pipeline
- the UI decisions
- the privacy model
- the cost constraints
- the regulatory considerations
- and the commercial thinking behind the product
I’ll also share the weird edge cases, the unexpected wins, and the moments where the assistant confidently hallucinated people, complete with fictitious backstories.
A Note on Revenue, Regulation, and Reality
Yes, there’s a business model here. Yes, there are regulatory hurdles. Yes, there are costs to control. And yes, those parts need more work, which I’ll cover honestly as the series progresses.
But the heart of this project is simple:
I want to build an AI assistant that helps people understand their families.
And I want to do it properly.
A Teaser on What to Expect
Before we dive into the architecture in Part I, here’s a quick glimpse of what the assistant looks like today in requirement form; I’ll share the actual product later in the series.
- Easy‑to‑use interface
- Integration with Ancestry
I want this to be something everyone finds useful. Please add a comment if you’re interested in the idea or have questions you’d like it to help with.
Next part: Generative AI for Genealogy – Data vs. GEDCOM files
