Building a ReAct Agent from Scratch: MockLLM vs Real LLM

NOTE

This post walks through Assignment 3 from the Engineering GenAI course, where we build a ReAct Agent that acts as a Dungeon Master. We'll compare two implementations: MockLLM (for testing) and a Real LLM (Groq, for production).

Introduction: What is an AI Agent?

Traditional chatbots simply respond to inputs. AI Agents are different. They can:

  • Reason about what to do
  • Act by using external tools
  • Observe the results and incorporate them into their reasoning

The ReAct pattern (Reason + Act) formalizes this into a structured loop:

Thought: <reasoning about what to do>
Action: <tool to call>
Observation: <result from tool>
... (repeat as needed)
Final Answer: <response to user>

In this tutorial, we'll build a Dungeon Master agent that uses a dice-rolling tool to determine the success or failure of player actions. We'll implement it two ways to understand the difference between the testing and production approaches.


Part 1: The Foundation (Same for Both Approaches)

Both MockLLM and RealLLM share the same foundation: tools, system prompt, parser, and agent loop structure.

The Tool: roll_dice

Every agent needs tools: functions it can call to interact with the world.

import random
 
def roll_dice(sides=20):
    """
    Simulates rolling a die.
    Args:
        sides (int): The number of sides on the die (default 20).
    Returns:
        int: A random number between 1 and "sides" (inclusive).
    """
    return random.randint(1, sides)

Why it matters: When roll_dice() returns 17, that's a real result produced by code, not something the LLM imagined.
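
A quick, illustrative sanity check (not part of the assignment code) shows the tool producing real, in-range integers:

# Illustrative check: the tool returns real integers between 1 and 20
for _ in range(3):
    result = roll_dice()
    assert 1 <= result <= 20
    print(f"Rolled a {result}")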

The System Prompt

The LLM needs explicit instructions on how to behave:

SYSTEM_PROMPT = """
You are a text adventure Dungeon Master. Your goal is to narrate an 
exciting, immersive story for the player.
 
## Your Capabilities
You have access to the following tool:
- **roll_dice**: Rolls a 20-sided die and returns a number between 1 and 20.
 
## When to Use Tools
When the player attempts an action with an UNCERTAIN outcome 
(attacking, jumping, persuading, etc.), you MUST:
1. First, explain your reasoning using "Thought:"
2. Then, call the tool using "Action: roll_dice"
 
Example:

Thought: The player is trying to attack the goblin. This requires a roll to determine success. Action: roll_dice


## How to Interpret Results
After receiving the Observation (dice result), narrate the outcome:
- **1-5 (Critical Fail):** Action fails spectacularly
- **6-10 (Fail):** Action fails, but not catastrophically  
- **11-15 (Success):** Action succeeds
- **16-20 (Critical Hit):** Action succeeds spectacularly

## When NOT to Use Tools
For simple conversation, exploration with no risk, or describing the 
environment, respond directly without using tools.

Stay in character as a dramatic, engaging Dungeon Master!
"""

The Parser

Detects if the LLM is requesting a tool:

def parse_response(llm_output):
    """Returns True if LLM requested roll_dice, False otherwise."""
    return "action: roll_dice" in llm_output.lower()
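
A couple of illustrative calls show how the check behaves:

print(parse_response("Thought: The player attacks. Action: roll_dice"))  # True
print(parse_response("You enter a quiet clearing. What do you do?"))     # False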

The Agent Loop

The generic structure that works with any LLM:

def run_turn(user_input, history=None):
    """Executes a single turn of the agent."""
    # Avoid a mutable default argument; start with an empty history per call
    history = history or []

    # Build context
    context = "\n".join(history) + f"\nUser: {user_input}\n"

    # First LLM call
    response = llm.generate(context)
    print(f"🤖 Agent: {response}")

    # Check for tool use
    if parse_response(response):
        # Execute tool
        dice_result = roll_dice()
        print(f"🎲 [Tool Executed] roll_dice() returned: {dice_result}")

        # Add observation
        observation = f"Observation: {dice_result}"
        context = context + f"\nAgent: {response}\n{observation}\n"

        # Second LLM call with result
        response = llm.generate(context)
        print(f"🤖 Agent (Final): {response}")
 
    return response

Key point: The loop doesn't care whether it's talking to MockLLM or RealLLM; it just calls llm.generate().
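
Any object with a generate(prompt) method that returns a string will do. As a rough sketch of that contract, here is an optional typing.Protocol; the LLMBackend class is my own illustration, not part of the assignment code:

from typing import Protocol

class LLMBackend(Protocol):
    """The only contract the agent loop relies on: text in, text out."""
    def generate(self, prompt: str) -> str:
        ...

# Both MockLLM and RealLLM satisfy this shape, so run_turn() works with either.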


Part 2: The Two Approaches - MockLLM vs RealLLM

Now here's where things diverge. Let's compare how each approach implements the llm object and why they behave differently.

Approach 1: MockLLM (Testing Implementation)

What It Is

A fake LLM that uses pattern matching to simulate responses. No API required, no costs, purely for testing your agent loop logic.

Complete Code

class MockLLM:
    """A fake LLM that responds predictably for testing logic."""

    def generate(self, prompt):
        # If the prompt shows we just rolled a die (Observation), narrate
        if "Observation:" in prompt:
            return "The die rolls... a critical hit! The goblin falls."

        # If the user says "attack", the LLM decides to roll
        if "attack" in prompt.lower():
            return "Thought: The player is attacking. I need to check if they hit. Action: roll_dice"

        # Default conversation
        return "What do you want to do next?"
 
# Initialize
llm = MockLLM()

How It Works

MockLLM uses simple pattern matching:

  1. Checks if "attack" is in the input → returns an action request
  2. Checks if "Observation:" is in the prompt → returns a hardcoded narrative
  3. Otherwise → returns a generic response

No actual reasoning, just if/else logic.
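
That predictability is exactly what makes it useful in tests: you can unit-test the agent loop without touching an API. A hypothetical pytest-style example (test names are my own, not from the assignment):

def test_mock_requests_dice_on_attack():
    llm = MockLLM()
    response = llm.generate("User: I attack the goblin!")
    assert parse_response(response) is True

def test_mock_skips_dice_for_plain_conversation():
    llm = MockLLM()
    response = llm.generate("User: What do I see around me?")
    assert parse_response(response) is False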

Execution Flow: "I attack the goblin!"

User: "I attack the goblin!"
        │
        ▼
llm.generate("User: I attack...")
    Checks: "attack" in prompt.lower()?  → YES
    Returns: "Thought:... Action: roll_dice"
        │
        ▼
parse_response() → True
Execute: dice_result = 3                 ← let's say we roll a 3
        │
        ▼
llm.generate("...Observation: 3")
    Checks: "Observation:" in prompt?  → YES
    Returns: "...critical hit! Goblin falls."   ← hardcoded! Ignores the 3!

The Problem

MockLLM doesn't read the dice value. Whether the observation is 3 (critical fail) or 20 (critical hit), it always returns:

"The die rolls... a critical hit! The goblin falls."

It just checks for the keyword "Observation:" and returns a fixed string.
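
If your tests need the mock to at least respect the roll, you can bolt on a little more canned logic. This is still pattern matching, not reasoning; the SmarterMockLLM class below is an optional sketch of mine, not part of the assignment:

import re

class SmarterMockLLM(MockLLM):
    """Still a fake LLM, but it at least reads the observed dice value."""

    def generate(self, prompt):
        match = re.search(r"Observation:\s*(\d+)", prompt)
        if match:
            roll = int(match.group(1))
            if roll <= 5:
                return "The die clatters... a critical fail! You drop your sword."
            if roll <= 10:
                return "The die rolls... a miss. The goblin sidesteps your blow."
            if roll <= 15:
                return "The die rolls... a hit! Your blade bites into the goblin."
            return "The die rolls... a critical hit! The goblin falls."
        return super().generate(prompt)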


Approach 2: Real LLM (Production Implementation with Groq)

What It Is

A real neural network (Llama 3.3, 70 billion parameters) that:

  • Actually reads and understands the conversation
  • Follows the system prompt rules dynamically
  • Generates contextual responses based on dice results

Complete Code

from groq import Groq
import os

class RealLLM:
    """A real LLM using the Groq API for fast inference."""

    def __init__(self, api_key, model="llama-3.3-70b-versatile"):
        self.client = Groq(api_key=api_key)
        self.model = model
        # Initialize conversation with system prompt
        self.messages = [{"role": "system", "content": SYSTEM_PROMPT}]

    def generate(self, prompt):
        """Generate a response from Groq."""
        # Add user message to conversation history
        self.messages.append({"role": "user", "content": prompt})

        # Get response from Groq
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages
        )

        assistant_message = response.choices[0].message.content

        # Add assistant response to conversation history
        self.messages.append({"role": "assistant", "content": assistant_message})

        return assistant_message

# Initialize
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
llm = RealLLM(api_key=GROQ_API_KEY)

How It Works

RealLLM uses API calls to a neural network:

  1. Maintains full conversation history in self.messages
  2. Sends entire conversation to Groq servers
  3. Llama 3.3 reads and understands the context
  4. Generates response following system prompt rules
  5. Stores response in conversation history

Actual reasoning, not pattern matching.

Execution Flow: "I attack the goblin!" (with dice = 3)

User: "I attack the goblin!"
        │
        ▼
llm.generate("User: I attack...")
    Sends to Groq:
        [{"role": "system", "content": SYSTEM_PROMPT},
         {"role": "user", "content": "User: I attack..."}]
    Llama 3.3 reads:
        - System: "For uncertain outcomes, roll dice"
        - User: "I attack the goblin"
        - Reasoning: "Attack = uncertain, need a roll"
    Returns: "Thought:... Action: roll_dice"
        │
        ▼
parse_response() → True
Execute: dice_result = 3                 ← let's say we roll a 3
        │
        ▼
llm.generate("...Observation: 3")
    Sends to Groq:
        [...previous messages...,
         {"role": "user", "content": "Observation: 3"}]
    Llama 3.3 reads:
        - Previous: "I need to roll"
        - Observation: 3                 ← actually reads this!
        - System rules: "1-5 = Critical Fail"
        - Reasoning: "3 is in the 1-5 range, spectacular fail"
    Returns: "You swing wildly and miss! Your blade clatters
              harmlessly against the cave wall. The goblin
              cackles with glee!"        ← contextual!

Why It's Different

RealLLM reads the dice value (3) and applies the system prompt rule (1-5 = Critical Fail), generating a narrative that matches the roll.


Part 3: Side-by-Side Comparison

Code Differences

Aspect         | MockLLM                                  | RealLLM
Import         | None                                     | from groq import Groq
Initialization | llm = MockLLM()                          | llm = RealLLM(api_key=...)
Storage        | No state (stateless)                     | self.messages (conversation history)
Logic          | Pattern matching (if "attack" in prompt) | Neural network inference
API Call       | None                                     | self.client.chat.completions.create(...)
Cost           | Free                                     | Pay-per-token
Speed          | Instant                                  | ~1-2 seconds (Groq is fast!)

Behavior Differences

Same scenario: "I attack the goblin!" with dice roll = 3

Step            | MockLLM                                   | RealLLM
First Response  | "Thought:... Action: roll_dice"           | "Thought:... Action: roll_dice"
Dice Roll       | 3                                         | 3
Reads Value?    | ❌ No (ignores it)                         | ✅ Yes (actually reads "3")
Applies Rules?  | ❌ No                                      | ✅ Yes (sees 3 → Critical Fail)
Final Narrative | "...critical hit! Goblin falls." ❌ WRONG  | "You swing wildly and miss..." ✅ CORRECT

When to Use Which

Use MockLLM when:

  • 🎓 Learning agent architecture
  • 🧪 Testing agent loop logic
  • 🔧 Debugging prompt parsing
  • 💰 Avoiding API costs during development
  • ⚡ Needing instant responses for unit tests

Use RealLLM when:

  • 🚀 Building production applications
  • 🎮 Demonstrating dynamic AI behavior
  • 📊 Showcasing true agent capabilities
  • 🌟 Creating actual user experiences
  • 🎯 Needing contextual, varied responses

Part 4: How RealLLM Actually Works (Deep Dive)

Let's trace through a complete interaction to see what happens under the hood.

Scenario: "I attack the goblin!" (dice rolls 17)

Step 1: Initialization

llm = RealLLM(api_key=GROQ_API_KEY)

What happens:

  • Creates Groq client connection
  • Initializes self.messages with system prompt

State:

self.messages = [
    {"role": "system", "content": "You are a Dungeon Master..."}
]

Step 2: First LLM Call

response = llm.generate("User: I attack the goblin!")

Inside generate():

  1. Add user message:

    self.messages.append({
        "role": "user",
        "content": "User: I attack the goblin!"
    })
  2. Call Groq API:

    response = self.client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=self.messages  # [system, user]
    )
  3. Groq processes:

    • Llama 3.3 reads the system prompt + user input
    • Recognizes "attack" needs a dice roll
    • Generates: "Thought: The player is attacking... Action: roll_dice"
  4. Store response:

    self.messages.append({
        "role": "assistant",
        "content": "Thought: The player is attacking... Action: roll_dice"
    })

State:

self.messages = [
    {"role": "system", "content": "You are a DM..."},
    {"role": "user", "content": "User: I attack the goblin!"},
    {"role": "assistant", "content": "Thought:... Action: roll_dice"}
]

Step 3: Tool Execution

dice_result = roll_dice()  # Returns 17
observation = "Observation: 17"

Step 4: Second LLM Call

response = llm.generate("...\nObservation: 17")

Inside generate() (second time):

  1. Add observation:

    self.messages.append({
        "role": "user",
        "content": "Observation: 17"
    })
  2. Call Groq API:

    response = self.client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=self.messages  # [system, user, assistant, observation]
    )
  3. Groq processes:

    • Llama 3.3 reads the full conversation
    • Sees Observation: 17
    • Applies the rule: "16-20 = Critical Hit"
    • Generates a contextual narrative: "Your blade arcs through the air with devastating precision..."
  4. Store response:

    self.messages.append({
        "role": "assistant",
        "content": "Your blade arcs through the air..."
    })

Final State:

self.messages = [
    {"role": "system", "content": "You are a DM..."},
    {"role": "user", "content": "User: I attack the goblin!"},
    {"role": "assistant", "content": "Thought:... Action: roll_dice"},
    {"role": "user", "content": "Observation: 17"},
    {"role": "assistant", "content": "Your blade arcs through the air..."}
]

This conversation history persists: if the player takes another action, the entire history is sent to Groq, giving the LLM full context!

Why Conversation History Matters

Without history (MockLLM):

# Each call is independent, no memory
generate("User: I attack")       # Returns action request
generate("Observation: 17")      # Doesn't know about previous attack!

With history (RealLLM):

# Each call builds on the previous
self.messages:
  1. System: "You're a DM"
  2. User: "I attack"
  3. Assistant: "Action: roll_dice"
  4. User: "Observation: 17"          ← LLM knows this relates to the attack!
  5. Assistant: "Critical hit! ..."

The LLM can reason: "They attacked, rolled 17, that's a critical hit, I should narrate accordingly."
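
You can see this for yourself: after a turn with RealLLM, print the stored history to inspect exactly what the next request will be built from (a small illustrative snippet):

# Peek at the conversation the next request will be built from
for message in llm.messages:
    print(f"{message['role']:>9}: {message['content'][:60]}")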


Part 5: Practical Examples

MockLLM: Always the Same

No matter what we roll, MockLLM gives the same output:

llm = MockLLM()

# Roll 1 (critical fail)
run_turn("I attack the goblin!")
# 🎲 [Tool Executed] roll_dice() returned: 2
# 🤖 Agent (Final): The die rolls... a critical hit! The goblin falls.

# Roll 2 (critical fail again)
run_turn("I attack the goblin!")
# 🎲 [Tool Executed] roll_dice() returned: 4
# 🤖 Agent (Final): The die rolls... a critical hit! The goblin falls.

# Roll 3 (actual critical hit)
run_turn("I attack the goblin!")
# 🎲 [Tool Executed] roll_dice() returned: 20
# 🤖 Agent (Final): The die rolls... a critical hit! The goblin falls.

Same narrative regardless of the dice value! ❌

RealLLM: Dynamic Responses

With Groq, narratives change based on the actual roll:

llm = RealLLM(api_key=GROQ_API_KEY)

# Critical Fail (dice = 3)
run_turn("I attack the dragon!")
# 🎲 [Tool Executed] roll_dice() returned: 3
# 🤖 Agent (Final): You swing your sword with confidence, but your foot
# catches on a stone! You stumble forward, your blade clattering harmlessly
# against the cave wall. The dragon roars with amusement!

# Fail (dice = 8)
run_turn("I attack the dragon!")
# 🎲 [Tool Executed] roll_dice() returned: 8
# 🤖 Agent (Final): Your strike is solid, but the dragon is quicker than
# expected. It dodges, and your sword whooshes through empty air.

# Success (dice = 13)
run_turn("I attack the dragon!")
# 🎲 [Tool Executed] roll_dice() returned: 13
# 🤖 Agent (Final): Your blade finds its mark! You slice across the dragon's
# scales, drawing green blood. It screeches in pain and stumbles backward.

# Critical Hit (dice = 20)
run_turn("I attack the dragon!")
# 🎲 [Tool Executed] roll_dice() returned: 20
# 🤖 Agent (Final): With perfect precision, your sword pierces through the
# dragon's armor! It crumples to the ground with a final gasp. Victory is yours!

Different narratives for each dice value! ✅


Conclusion

What We Learned

1. MockLLM vs RealLLM is about purpose:

  • MockLLM: Testing your agent architecture without API costs
  • RealLLM: Actual production behavior with dynamic reasoning

2. The agent loop stays the same:

response = llm.generate(context)  # Works with both!

Good abstraction means switching from Mock to Real is just changing one line.
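
For example, you could pick the backend with a single environment variable. A hypothetical sketch (the USE_REAL_LLM toggle is my own convention, not from the assignment):

import os

# Hypothetical toggle: set USE_REAL_LLM=1 to hit the Groq API, leave unset for tests
if os.getenv("USE_REAL_LLM") == "1":
    llm = RealLLM(api_key=os.getenv("GROQ_API_KEY"))
else:
    llm = MockLLM()

# run_turn() works identically either way
run_turn("I attack the goblin!")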

3. Conversation history is crucial:

  • MockLLM: Stateless (no memory)
  • RealLLM: Stateful (self.messages tracks everything)

4. Real LLMs actually read and apply rules:

  • They see Observation: 17
  • Apply 16-20 = Critical Hit
  • Generate contextual narrative

Try It Yourself

See the full Groq implementation: EngGenAI_assignment_3_agents_grok.ipynb

The notebook includes:

  • ✅ Complete setup instructions
  • ✅ API key configuration
  • ✅ Interactive demos
  • ✅ Forced dice scenarios (test all outcome ranges)

Next Steps

To extend this agent:

  1. Multiple tools - Add check_inventory, cast_spell, search_room
  2. Memory persistence - Save conversation history to database
  3. Streaming responses - Show text as it generates
  4. Error handling - What if the Groq API fails? (see the sketch after this list)
  5. Multi-turn context - Remember what happened 3 turns ago
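
As a starting point for item 4, here is a minimal, illustrative retry wrapper. It assumes nothing beyond the RealLLM class above and catches broad exceptions rather than the Groq SDK's specific error types:

import time

def generate_with_retry(llm, prompt, attempts=3, backoff=2.0):
    """Retry the LLM call a few times with exponential backoff before giving up."""
    for attempt in range(attempts):
        try:
            return llm.generate(prompt)
        except Exception as error:  # in practice, catch the SDK's specific error types
            if attempt == attempts - 1:
                raise
            wait = backoff ** attempt
            print(f"LLM call failed ({error}); retrying in {wait:.0f}s...")
            time.sleep(wait)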

The architecture you learned here powers real production systems like ChatGPT's Code Interpreter, Claude's tool use, and autonomous research agents.

Happy adventuring! 🐉