Generative AI for Genealogy – Part VII

The Basic Observability Process

At the start of each request, the tracking resets. If this were a production system, I’d store everything in a NoSQL database for telemetry. But for now, the logs live in memory – raw, honest, and occasionally judgemental.

Here’s what gets tracked (a sketch of the tracker follows the list):

  • LLM input
  • LLM output
  • Any data transformations (normalisation, trimming, removing rogue Unicode goblins)
  • Before vs. after snapshots
  • Skill classification decisions
  • File‑switch detection
  • Tool usage
  • Execution order and timing
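
The tracker itself is nothing exotic: a per-request, in-memory list of timestamped entries. Here’s a minimal sketch in C# – the type and member names (RequestTrace, TraceEntry, StepKind) are illustrative, not my actual classes, and the real thing records far more detail:

```
using System.Collections.Generic;
using System.Diagnostics;

public enum StepKind { Adjustment, Interpretation, LlmQuestion, LlmResponse, ToolUse }

// One line of the trace: sequence number, elapsed ms since the
// request began (the "@Nms" in the excerpt), kind, and detail text.
public sealed record TraceEntry(int Sequence, long ElapsedMs, StepKind Kind, string Detail);

public sealed class RequestTrace
{
    private readonly List<TraceEntry> _entries = new();
    private readonly Stopwatch _clock = new();
    private int _sequence;

    // Called at the start of each request: wipe the previous trace.
    public void Reset()
    {
        _entries.Clear();
        _sequence = 0;
        _clock.Restart();
    }

    // Every step records its kind and detail, stamped with the
    // cumulative elapsed time.
    public void Record(StepKind kind, string detail) =>
        _entries.Add(new TraceEntry(++_sequence, _clock.ElapsedMilliseconds, kind, detail));

    public IReadOnlyList<TraceEntry> Entries => _entries;
}
```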

Below is an excerpt from a real log. I’ve redacted the long prompts (no one wants to read 20k tokens of SYSTEM instructions), and {some-data} represents tool output.

This was triggered by the question: “who is Bart’s mo…” (Yes, I cut it off mid‑word. Authenticity.)

001 @0ms ADJUSTMENT
   LLM Input Normalisation
     Before:
       |who is Bart's mom?|
     After:
       |who is Bart's mother?|

002 @8ms INTERPRETATION
   SWITCH_FILE: No existing indicators matched, will require LLM to decide

003 @27ms LLM_QUESTION
   
   ============================================================
   LLM REQUEST:
   Model: Llama 3.2 1B Instruct
   PROMPT:
      You must reply with whether the user is asking to load a new file (gedcom) or not.
      If the user appears to want to load a new file, reply "[YES]", else reply "[NO]". Do not attempt to answer the actual question.
      Example - "new gedcom" reply "[YES]"
      Example - "new file" reply "[YES]"
      Example - "load" reply "[YES]"
      Example - "different file" reply "[YES]"
      Example - "who is the pope?" reply "[NO]"
      Example - "where was bart born?" reply "[NO]"
      
   QUESTION:
      who is bart's mother?
   
004 @4320ms LLM_RESPONSE
   LLM RESPONSE:
      Llama 3.2 1B Instruct | prompt tokens: 158 | completion tokens: 3 | total tokens: 161
      #0 | finish reason: stop
         -> [NO]
   
005 @4333ms ADJUSTMENT
   LLM Input Normalisation
     Before:
       |Your role is a skill classifier. You must decide which skills apply ... [11, 10].|
     After:
       |Your role is a skill classifier. You must decide which skills apply ... [11, 10].|

006 @4336ms LLM_QUESTION
   
   ============================================================
   LLM REQUEST:
   Model: Llama 3.2 1B Instruct
   PROMPT:
      Your role is a skill classifier. You must decide which skills apply ... [11, 10].
   QUESTION:
      who is Bart's mother?
   
007 @6776ms LLM_RESPONSE
   LLM RESPONSE:
      Llama 3.2 1B Instruct | prompt tokens: 499 | completion tokens: 3 | total tokens: 502
      #0 | finish reason: stop
         -> [18]
   
008 @6777ms INTERPRETATION
   Llama 3.2 1B Instruct Skills required: 18. answers queries about parentage, including retrieving information about mothers, fathers, and both parents of a specified individual.

009 @6786ms ADJUSTMENT
   LLM Input Normalisation
     Before:
       |## Role Definition<LF>You are a genealogy assistant. Your job is to interpret human ..<LF>|
     After:
       |## Role Definition You are a genealogy assistant. Your job is to interpret human .. .|

010 @6787ms LLM_QUESTION
   
   ============================================================
   LLM REQUEST:
   Model: Reasoner v1
   PROMPT:
      ## Role Definition You are a genealogy assistant. Your job is to interpret human .. .
   QUESTION:
      who is Bart's mother?
   USER: 
      who is Bart's mother?
   ASSISTANT: 
      ``` get-mother:Bart ```
   SYSTEM: 
      Execution Result: mother: { some-data }
   

011 @17710ms LLM_RESPONSE
   LLM RESPONSE:
      Reasoner v1 | prompt tokens: 2085 | completion tokens: 7 | total tokens: 2092
      #0 | finish reason: stop
         -> ``` get-mother:Bart ```   

012 @17712ms TOOL_USE
   Tool Use:
      ``` get-mother:Bart ```   

013 @17723ms INTERPRETATION
   Tool was not code

014 @17738ms TOOL_USE
   Tool Use:
      get-mother:Bart
   Result:
      mother: { some-data }
   
015 @17783ms LLM_QUESTION
   
   ============================================================
   LLM REQUEST:
   Model: Reasoner v1
   PROMPT:
      ## Role Definition You are a genealogy assistant. Your job is to interpret human.. .
   QUESTION:
      who is Bart's mother?
   USER: 
      who is Bart's mother?
   ASSISTANT: 
      ``` get-mother:Bart ```
   SYSTEM: 
      Execution Result: mother: { some-data }
   
016 @19277ms LLM_RESPONSE
   LLM RESPONSE:
      Reasoner v1 | prompt tokens: 2159 | completion tokens: 8 | total tokens: 2167
      #0 | finish reason: stop
         -> Marge Simpson is Bart's mother.
   
TIME PER STEP:
001 | ADJUSTMENT 0ms
002 | INTERPRETATION 8ms
003 | LLM_QUESTION 19ms
004 | LLM_RESPONSE 4293ms
005 | ADJUSTMENT 13ms
006 | LLM_QUESTION 3ms
007 | LLM_RESPONSE 2440ms
008 | INTERPRETATION 1ms
009 | ADJUSTMENT 9ms
010 | LLM_QUESTION 1ms
011 | LLM_RESPONSE 10923ms
012 | TOOL_USE 2ms
013 | INTERPRETATION 11ms
014 | TOOL_USE 15ms
015 | LLM_QUESTION 45ms
016 | LLM_RESPONSE 1494ms

Notice the line:

SWITCH_FILE: No existing indicators matched…

That tiny message is gold. It tells me the system didn’t detect a file‑switch request, so it escalated to the LLM to decide. Without observability, I’d be guessing.
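If you’re wondering what that check might look like: a cheap indicator match runs first, and only when nothing matches do we pay for the LLM round trip you see in step 003. A rough sketch – the indicator list is borrowed from the prompt’s own examples, and the class shape is my illustration, not the actual code:

```
using System;
using System.Linq;
using System.Threading.Tasks;

public sealed class FileSwitchDetector
{
    // Indicator phrases lifted from the prompt's examples; the real
    // list is an assumption on my part.
    private static readonly string[] Indicators =
        { "new gedcom", "new file", "load", "different file" };

    private readonly Func<string, Task<string>> _askLlm; // injected LLM call

    public FileSwitchDetector(Func<string, Task<string>> askLlm) => _askLlm = askLlm;

    public async Task<bool> WantsFileSwitchAsync(string question)
    {
        var lowered = question.ToLowerInvariant();

        // Cheap path: any indicator phrase means "switch file" – no LLM needed.
        if (Indicators.Any(lowered.Contains))
            return true;

        // No indicator matched, so escalate to the LLM (step 003 above)
        // and accept only an explicit "[YES]".
        var reply = await _askLlm(question);
        return reply.Contains("[YES]", StringComparison.OrdinalIgnoreCase);
    }
}
```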

You’ll also see the wonderfully blunt:

Tool was not code

That’s because my LLM code supports C# code generation, including attempts to fix its own errors. (We’ll get into that later in the series. Bring popcorn.)
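The message comes from a check that decides whether the fenced block the model returned is C# to compile and run, or a plain tool command like get-mother:Bart. Something along these lines – a heuristic sketch, not my actual classifier:

```
using System.Text.RegularExpressions;

public static class ToolClassifier
{
    // Pull the payload out of a fenced block like the one in step 012.
    private static readonly Regex Fenced =
        new("`{3}(?:csharp)?\\s*(?<body>.*?)\\s*`{3}", RegexOptions.Singleline);

    public static bool LooksLikeCode(string llmReply)
    {
        var match = Fenced.Match(llmReply);
        var body = match.Success ? match.Groups["body"].Value : llmReply.Trim();

        // A lone "verb:argument" pair (get-mother:Bart) is a tool command,
        // which the log reports as "Tool was not code".
        if (Regex.IsMatch(body, @"^[\w-]+:.+$"))
            return false;

        // Otherwise assume the model emitted C# for the code path.
        return true;
    }
}
```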

And then there’s the timing. Step 011 took 10.9 seconds. That’s… not ideal. It may eventually push me away from Reasoner v1. Slow models are like slow baristas: charming at first, infuriating by the third visit.
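
Incidentally, the TIME PER STEP table at the bottom isn’t stored separately – it’s just first differences of the cumulative @Nms timestamps. Reusing the TraceEntry sketch from earlier:

```
using System.Collections.Generic;

public static class TraceReport
{
    // Each step's duration = its cumulative timestamp minus the
    // previous one, e.g. step 011: 17710ms - 6787ms = 10923ms.
    public static IEnumerable<string> PerStepTimings(IReadOnlyList<TraceEntry> entries)
    {
        long previous = 0;
        foreach (var e in entries)
        {
            yield return $"{e.Sequence:000} | {e.Kind} {e.ElapsedMs - previous}ms";
            previous = e.ElapsedMs;
        }
    }
}
```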
