This follows on from:
- Generative AI for Genealogy – Introduction
- Generative AI for Genealogy – Data vs. GEDCOM files
- Generative AI for Genealogy – Part I
- Generative AI for Genealogy – Part II
- Generative AI for Genealogy – Part III
- Generative AI for Genealogy – Part IV
- Generative AI for Genealogy – Part V
- Generative AI for Genealogy – Part VI
- Generative AI for Genealogy – Part VII
- Generative AI for Genealogy – Part VIII
- Generative AI for Genealogy – Part IX
Tools, Code or Both?
LLMs are magical, versatile, and quite often on their worst behaviour. Give them a simple task and they’ll perform beautifully… until the exact moment you need them most, at which point they’ll rebel like teenagers who’ve just discovered sarcasm. They’ll invent new tools, ignore the syntax you lovingly crafted, or suddenly decide that JavaScript is the answer to everything — despite you explicitly telling them not to use JavaScript.
Basic Tools
I currently have 23 skills that mostly behave. Getting them to that point required:
- prompt engineering
- iteration
- trial
- error
- more error
- and the kind of patience normally reserved for parents of toddlers
They work best when the syntax is consistent:
get-{noun}
For example:
get-mother
Of course, a tool with no parameters is about as useful as a chocolate teapot in genealogy. Our tools need to talk about people and relationships, not perform unconditional actions. Even make-coffee has parameters — strength, size, and whether you’re feeling brave enough for decaf.
After experimentation, I settled on:
get-{noun}:parameters
And when multiple parameters are needed:
get-{noun}:param1:param2:param3
This defines the output grammar — the structured language the LLM must speak.
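As a concrete illustration, here is a minimal sketch (in Python, with hypothetical function names — my actual tool layer differs) of how a reply written in that grammar can be parsed:

```python
# Minimal sketch of parsing the get-{noun}:param1:param2 grammar.
# The function name and return shape are illustrative, not the real tool layer.
def parse_tool_call(text: str):
    """Split an LLM reply like 'get-mother:Jane Hill' into (noun, params)."""
    text = text.strip()
    if not text.startswith("get-"):
        return None  # not a tool call; treat it as ordinary prose
    head, *params = text.split(":")
    noun = head[len("get-"):]   # e.g. "mother"
    return noun, params         # e.g. ("mother", ["Jane Hill"])

print(parse_tool_call("get-mother:Jane Hill"))       # ('mother', ['Jane Hill'])
print(parse_tool_call("get-siblings:Jane Hill:full")) # ('siblings', ['Jane Hill', 'full'])
```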
But grammar alone isn’t enough. Without an operating manual (the prompt), the LLM is like a new hire who shows up on day one, sits at their desk, and stares blankly at the wall until someone explains what the job actually is.
We must set expectations:
- what the LLM is responding to
- what we expect it to respond with
- how it should behave when data is missing
- how to avoid drifting into unrelated nonsense
Don’t assume it will magically do what you hoped. GPT might — sometimes — but with models like Reasoner v1 or Llama, you have better odds of winning the lottery while being struck by lightning and bitten by a shark.
To get deterministic behaviour (or at least behaviour that doesn’t resemble interpretive dance), we give it constraints like:
- “wait for data before answering”
- “if no data is returned, respond politely”
- “you are a genealogy assistant”
That last one is surprisingly important. Without anchoring, an LLM will happily wander off into astrophysics, baking tips, or a 17‑paragraph essay on the socio‑economic impact of turnips.
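Pulled together, those constraints become the skill's operating manual. A rough sketch of what such a system prompt might look like (the wording here is illustrative, not my production prompt):

```python
# Illustrative system prompt combining the constraints above.
# This is a sketch of the idea, not the exact text I ship.
SYSTEM_PROMPT = """You are a genealogy assistant.
When the user asks about a relationship, reply ONLY with a tool call
in the form get-{noun}:param1:param2 and nothing else.
Wait for the tool data before answering the user.
If no data is returned, respond politely that the information is not available.
Do not answer questions unrelated to genealogy."""
```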
The schema concept was one of the biggest surprises. When I define the response format as:
mother: { person }
the LLM uses the schema to interpret the returned data correctly.
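One way to picture the schema (a deliberately simplified, hypothetical shape — the real person record carries far more fields) is as a typed structure that both the tool and the LLM agree on:

```python
# Hypothetical, simplified schema behind "mother: { person }".
# Field names beyond "name" are assumptions for the sketch.
from typing import Optional, TypedDict

class Person(TypedDict, total=False):
    name: str
    birth_date: Optional[str]
    birth_place: Optional[str]

# The tool returns data shaped by the schema, e.g. mother: { name: "Kathy Smith" }
example_response: dict[str, Person] = {"mother": {"name": "Kathy Smith"}}
```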
But schemas alone aren’t enough. To overcome limitations, I introduced different skills — segregated behaviours — each with their own examples.
And the examples matter a lot.
Example
User: Who is Dave’s mother?
Response:
get-mother:Dave
Response format:
mother: person-data
Example response to “Who is Bart’s mother?”
mother: { name: "Marge Bouvier" }
LLM replies:
“Marge Bouvier is Bart’s mother.”
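Here is one way the examples could travel with a skill (a sketch with made-up field names — the real skill format is specific to my setup):

```python
# Sketch of a skill definition that carries its own few-shot examples.
# Field names are illustrative, not my actual skill format.
GET_MOTHER_SKILL = {
    "name": "get-mother",
    "call_syntax": "get-mother:{person}",
    "response_format": "mother: person-data",
    "examples": [
        {
            "user": "Who is Bart's mother?",
            "tool_call": "get-mother:Bart",
            "tool_data": 'mother: { name: "Marge Bouvier" }',
            "reply": "Marge Bouvier is Bart's mother.",
        },
    ],
}
```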
How it works:
- User asks: “Who is Jane Hill’s mother?”
- LLM responds: get-mother:Jane Hill
- My tool parses it, looks up Jane, and returns: mother: { name: "Kathy Smith" }
- The LLM knows to reply: “Kathy Smith is Jane Hill’s mother.”
It’s elegant — and far more reliable than trying to parse free‑form human language.
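Putting the pieces together, the round trip looks roughly like this (the lookup table and helper name are invented purely for the sketch):

```python
# Toy end-to-end round trip: tool call -> lookup -> data -> final answer.
# The data and helper name are invented for illustration.
MOTHERS = {"Jane Hill": "Kathy Smith", "Bart": "Marge Bouvier"}

def run_tool(noun: str, params: list[str]) -> str:
    """Look up the requested relationship and return schema-shaped data."""
    if noun == "mother" and params and params[0] in MOTHERS:
        return f'mother: {{ name: "{MOTHERS[params[0]]}" }}'
    return "mother: {}"  # no data -> the LLM has been told to respond politely

# 1. The LLM emits a tool call instead of an answer:
call = "get-mother:Jane Hill"
noun, *params = call.removeprefix("get-").split(":")
# 2. The tool parses it, looks up Jane, and returns the data:
print(run_tool(noun, params))   # mother: { name: "Kathy Smith" }
# 3. Fed that data, the LLM replies: "Kathy Smith is Jane Hill's mother."
```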
Now… Let’s contrast this with the code tool approach.
