Building a ReAct Agent from Scratch: A Dungeon Master Tutorial

[!NOTE] This post walks through Assignment 3 from the Engineering GenAI course, where we build a ReAct Agent that acts as a Dungeon Master, rolling dice to determine player outcomes.

Introduction: What is an AI Agent?

Traditional chatbots simply respond to inputs. AI Agents are different; they can:

  • Reason about what to do
  • Act by using external tools
  • Observe the results and incorporate them into their reasoning

The ReAct pattern (Reason + Act) formalizes this into a structured loop:

Thought: <reasoning about what to do>
Action: <tool to call>
Observation: <result from tool>
... (repeat as needed)
Final Answer: <response to user>

In this tutorial, we'll build a Dungeon Master agent that uses a dice-rolling tool to determine success/failure of player actions.


Part 1: Defining the Tool 🎲

Task 1.1: The roll_dice Function

Every agent needs tools: functions it can call to interact with the world. Our DM needs to roll dice.

The Problem: Implement a function that returns a random integer between 1 and a given number of sides (default 20).

Solution:

import random
 
def roll_dice(sides=20):
    """
    Simulates rolling a die.
    Args:
        sides (int): The number of sides on the die (default 20).
    Returns:
        int: A random number between 1 and "sides".
    """
    return random.randint(1, sides)

Why this works: random.randint(a, b) returns a random integer N such that a <= N <= b. Simple, but essential: this is the "tool" our agent will call.

Verification:

results = [roll_dice(20) for _ in range(100)]
assert all(1 <= r <= 20 for r in results), "Dice roll out of bounds!"
assert len(set(results)) > 1, "Dice always returns the same number!"
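
Dice rolls are random by design, which can make tests flaky. If you want reproducible rolls while testing, seeding Python's random module works; a quick sketch (the seed value 42 is arbitrary):

random.seed(42)                          # fix the RNG state so rolls are repeatable
first = [roll_dice() for _ in range(5)]

random.seed(42)                          # the same seed replays the same sequence
second = [roll_dice() for _ in range(5)]

assert first == second, "Same seed should reproduce the same rolls"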

Part 2: The System Prompt 🧠

Task 1.2: Teaching the Agent the ReAct Protocol

The LLM needs explicit instructions on how to think and act. This is done via a system prompt.

The Problem: Write a prompt that:

  1. Tells the LLM it’s a Dungeon Master
  2. Explains the available tool (roll_dice)
  3. Specifies the exact format for requesting actions

Solution:

SYSTEM_PROMPT = """
You are a text adventure Dungeon Master. Your goal is to narrate an 
exciting, immersive story for the player.
 
## Your Capabilities
You have access to the following tool:
- **roll_dice**: Rolls a 20-sided die and returns a number between 1 and 20.
 
## When to Use Tools
When the player attempts an action with an UNCERTAIN outcome 
(attacking, jumping, persuading, etc.), you MUST:
1. First, explain your reasoning using "Thought:"
2. Then, call the tool using "Action: roll_dice"
 
Example:

Thought: The player is trying to attack the goblin. This requires a roll to determine success.
Action: roll_dice


## How to Interpret Results
After receiving the Observation (dice result), narrate the outcome:
- **1-5 (Critical Fail):** Action fails spectacularly
- **6-10 (Fail):** Action fails, but not catastrophically  
- **11-15 (Success):** Action succeeds
- **16-20 (Critical Hit):** Action succeeds spectacularly

## When NOT to Use Tools
For simple conversation, exploration with no risk, or describing the 
environment, respond directly without using tools.

Stay in character as a dramatic, engaging Dungeon Master!
"""

Key Design Decisions:

  1. Explicit format - The LLM must see Thought: and Action: so it knows the exact syntax
  2. Clear criteria - Specifies when to use tools vs. just respond
  3. Result interpretation - Gives the LLM context for narrating outcomes
  4. Examples - Concrete examples help LLMs follow patterns
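
A prompt only helps if it is actually sent to the model. One minimal way to wire it in (a sketch; build_context is a hypothetical helper, not part of the assignment) is to prepend it to every context string the agent loop builds:

def build_context(history, user_input):
    """Hypothetical helper: put the system prompt ahead of the conversation."""
    conversation = "\n".join(history)
    return f"{SYSTEM_PROMPT}\n{conversation}\nUser: {user_input}\n"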

Part 3: The Parser 🔍

Task 2.1: Detecting Tool Requests

When the LLM responds, we need to detect if it's requesting a tool.

The Problem: Write a function that returns True if the output contains "Action: roll_dice".

Solution:

def parse_response(llm_output):
    """
    Parses the LLM's output to look for "Action: roll_dice".
    
    Returns:
        bool: True if tool is requested, False otherwise.
    """
    return "action: roll_dice" in llm_output.lower()

Why case-insensitive? LLMs aren't always consistent with capitalization. Using .lower() handles variations like:

  • Action: roll_dice
  • action: roll_dice
  • ACTION: ROLL_DICE

Verification:

assert parse_response("Thought: Attacking. Action: roll_dice")
assert not parse_response("You walk down the hallway.")

Part 4: The Agent Loop 🔄

Task 2.2: Putting It All Together

This is the heart of any agent, the execution loop that:

  1. Takes user input
  2. Queries the LLM
  3. Parses for tool requests
  4. Executes tools and feeds results back
  5. Returns the final response

The Problem: Implement run_turn() that handles the full ReAct cycle.

Solution:

def run_turn(user_input, history=None):
    """
    Executes a single turn of the agent.
    Assumes a global `llm` client (the course's MockLLM, or a real LLM).
    """
    history = history or []  # avoid the mutable-default-argument pitfall

    # Step 1: Build context from history + new input
    # (in a full setup, SYSTEM_PROMPT goes first)
    context = "\n".join(history) + f"\nUser: {user_input}\n"
 
    # Step 2: First call to LLM
    response = llm.generate(context)
    print(f"🤖 Agent: {response}")
 
    # Step 3: Check if LLM requested a tool
    if parse_response(response):
        # 3a. Execute the tool
        dice_result = roll_dice()
        print(f"🎲 [Tool Executed] roll_dice() returned: {dice_result}")
        
        # 3b. Create observation string
        observation = f"Observation: {dice_result}"
        
        # 3c. Append to context
        context = context + f"\nAgent: {response}\n{observation}\n"
        
        # 3d. Call LLM again with the observation
        response = llm.generate(context)
        print(f"🤖 Agent (Final): {response}")
 
    return response
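
The `llm` object above is assumed to be the course's MockLLM (or any client exposing .generate()). If you're following along without it, a throwaway stand-in with canned responses is enough to exercise the loop:

class MockLLM:
    """Toy stand-in for a real LLM: canned responses, just enough to test the loop."""
    def generate(self, context):
        if "Observation:" in context:
            # Second call: an observation is present, so narrate the outcome
            return "The die decides your fate... a mighty blow! The goblin falls!"
        if "attack" in context.lower():
            # Uncertain action: request the dice tool in ReAct format
            return "Thought: The player is attacking the goblin. Action: roll_dice"
        return "What do you want to do, adventurer?"

llm = MockLLM()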

How It Works:

User: "I attack the goblin!"
     β”‚
     β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ LLM Response:                           β”‚
β”‚ "Thought: Player attacking. Need roll." β”‚
β”‚ "Action: roll_dice"                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”‚
     β–Ό parse_response() β†’ True
     β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Execute Tool:                           β”‚
β”‚ roll_dice() β†’ 17                        β”‚
β”‚ Observation: 17                         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”‚
     β–Ό Append to context, call LLM again
     β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ LLM Final Response:                     β”‚
β”‚ "The die rolls... a critical hit!       β”‚
β”‚  The goblin falls!"                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Verification:

# Simple chat - no tools needed (these asserts assume the MockLLM's canned replies)
output_simple = run_turn("Hello?")
assert "What do you want" in output_simple
 
# Action trigger - tools should execute
output_action = run_turn("I attack the goblin!")
assert "goblin falls" in output_action

Key Takeaways

1. Agents = LLM + Tools + Loop

The magic isn't in any single component; it's in the orchestration:

  • LLM provides reasoning
  • Tools provide capabilities
  • Loop connects them together

2. Prompt Engineering is Critical

The system prompt must be explicit about:

  • What tools exist
  • When to use them
  • What format to follow

3. The ReAct Pattern is Powerful

By structuring thoughts → actions → observations, we get:

  • Interpretability - we can see the reasoning
  • Controllability - we can intercept and modify
  • Reliability - deterministic tool execution

4. This Scales to Real Systems

Replace MockLLM with OpenAI/Claude/Llama, and this exact architecture powers:

  • ChatGPT with Code Interpreter
  • Claude with tool use
  • AutoGPT and similar autonomous agents

Complete Code

Here’s the full working implementation:

import random
 
# === TOOL ===
def roll_dice(sides=20):
    return random.randint(1, sides)
 
# === SYSTEM PROMPT ===
SYSTEM_PROMPT = """
You are a Dungeon Master. When outcomes are uncertain, use:
Thought: <reasoning>
Action: roll_dice
 
Then narrate based on the result (1-5: critical fail, 6-10: fail, 11-15: success, 16-20: critical hit).
"""
 
# === PARSER ===
def parse_response(llm_output):
    return "action: roll_dice" in llm_output.lower()
 
# === AGENT LOOP ===
def run_turn(user_input, history=None):
    history = history or []  # avoid the mutable-default-argument pitfall
    context = "\n".join(history) + f"\nUser: {user_input}\n"
    response = llm.generate(context)
    
    if parse_response(response):
        result = roll_dice()
        observation = f"Observation: {result}"
        context += f"\nAgent: {response}\n{observation}\n"
        response = llm.generate(context)
    
    return response
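
With an llm object in place (the MockLLM sketch from Part 4 works), a minimal interactive driver, which also persists history across turns, might look like this:

if __name__ == "__main__":
    history = []
    while True:
        user_input = input("You: ")
        if user_input.lower() in {"quit", "exit"}:
            break
        reply = run_turn(user_input, history)
        # Persist the exchange so later turns see the full story so far
        history.append(f"User: {user_input}")
        history.append(f"Agent: {reply}")
        print(f"DM: {reply}")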

Next Steps

To extend this agent, consider:

  1. Multiple tools - Add check_inventory, cast_spell, search_room
  2. Memory - Persist conversation history across sessions
  3. Real LLM - Swap MockLLM for OpenAI's API (see the sketch after this list)
  4. Streaming - Show responses as they generate
  5. Error handling - What if a tool fails?
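
For step 3, a thin adapter around OpenAI's chat completions API keeps the rest of the code unchanged. A sketch, assuming openai>=1.0 is installed and OPENAI_API_KEY is set; the model name is illustrative:

from openai import OpenAI

class OpenAILLM:
    """Adapter exposing the same .generate(context) interface as MockLLM."""
    def __init__(self, model="gpt-4o-mini"):
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self.model = model

    def generate(self, context):
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": context},
            ],
        )
        return response.choices[0].message.content

llm = OpenAILLM()  # drop-in replacement for MockLLM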

Happy adventuring! 🐉