Generative AI for Genealogy – Part V

The Next Challenge: Age vs. “Age If Alive”

Two questions:

  1. “How old is Bart?”
  2. “How old would Bart be if still alive?”

Humans instantly see the difference. One is factual age at death; the other is hypothetical age if alive.

So I tried:

Classify into: AGE, STILL ALIVE, OTHER.
Pick the most specific.

And Llama confidently returned:

AGE

Of course it did.

Because why would we expect nuance from a model that earlier tried to solve classification with Python code it invented on the spot?
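To make the distinction concrete, here is a hedged sketch of what "pick the most specific" should mean for these two questions. This is not the actual pipeline and there is no LLM here; the `classify` helper and its keyword rules are purely hypothetical, a toy stand-in that illustrates the nuance the model missed:

```python
import re

# Hypothetical illustration, not the real pipeline: a toy rule-based
# classifier showing why the "if still alive" phrasing must outrank the
# generic AGE label when the prompt says "pick the most specific".
LABELS = ["AGE", "STILL ALIVE", "OTHER"]

def classify(question: str) -> str:
    q = question.lower()
    # The hypothetical phrasing is the more specific signal, so it is
    # checked first; checking AGE first reproduces Llama's mistake.
    if "still alive" in q or "if alive" in q:
        return "STILL ALIVE"
    if re.search(r"\bhow old\b|\bage\b", q):
        return "AGE"
    return "OTHER"

print(classify("How old is Bart?"))                       # AGE
print(classify("How old would Bart be if still alive?"))  # STILL ALIVE
```

The point of the sketch is ordering: both questions match the AGE pattern, so a classifier that stops at the first plausible label, as the small model apparently did, never reaches the more specific one.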

The Conclusion: Small Models Are Useful… Until They Aren’t

Here’s what I learned:

  • My goal was to reduce tokens sent to the main LLM
  • A small LLM seemed like the perfect tool
  • But small LLMs are brittle
  • They fail unpredictably
  • They can’t reliably do semantic similarity
  • They can barely do nuanced classification
  • And you can’t test every possible input

So for now, I’m stubbing out the “narrow down examples” feature. It’s a good idea, just not feasible with the current tiny models.

In the next part, we’ll give the LLM control, because AI isn’t going to want to kill mankind (er, personkind) after the way we treat it. #cough.

Next part: Generative AI for Genealogy – Part VI
