n8n Chatmate Workflow Report 🤖

Assignment Overview

Course: GenAI
Assignment: 1 - Chatmate (Document Question-Answering System)
Document: “Attention Is All You Need” (Transformer Paper)
Status: ✅ Workflow fully functional and ready for demonstration

Workflow Architecture

The workflow implements a RAG (Retrieval-Augmented Generation) system with two independent flows:

  1. Flow 1: Document Ingestion - One-time PDF processing and vector storage
  2. Flow 2: Conversational Retrieval - Real-time question answering with context

Nodes Used and Configuration

Flow 1: Document Ingestion 📄

1. Manual Trigger - “When clicking Execute workflow”

  • Purpose: Initiates the one-time document ingestion process
  • Configuration: Default settings, no custom parameters required
  • Usage: Executed once to load the PDF into the vector database

2. Read Binary File

  • Purpose: Reads the PDF file from the mounted Docker volume
  • Configuration:
    • File Path: /home/node/antigem/Attention_Is_All_You_Need.pdf
    • Property Name: data (default)

Docker Volume Mount

The file is read from the container’s internal filesystem at /home/node/antigem/

3. Extract From File

  • Purpose: Converts the PDF binary into plain text (chunking into searchable segments happens downstream in the Default Data Loader)
  • Configuration:
    • Input Binary Field: data
    • Operation: Extract From PDF (default settings)
    • Include Metadata: Enabled
  • Output: Extracted text with document metadata

4. Supabase Vector Store (Insert Documents)

  • Purpose: Stores document chunks and their embeddings in Supabase
  • Configuration:
    • Operation: Insert Documents
    • Credential: Supabase API (project URL + anon key)
    • Table Name: documents
    • Embedding Batch Size: 768
  • Sub-nodes:
    • Embeddings: HuggingFace Inference
      • Model: sentence-transformers/all-mpnet-base-v2
      • Dimension: 768
    • Document: Default Data Loader

Result: ✅ Successfully inserted 71 document chunks into Supabase
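
For reference, a minimal sketch of the Supabase schema this node writes into, assuming the default column names listed under Flow 2 (content, metadata, embedding); the vector dimension must match the embedding model actually in use:

-- Enable pgvector (one-time per Supabase project)
CREATE EXTENSION IF NOT EXISTS vector;

-- Minimal documents table; vector(768) matches all-mpnet-base-v2,
-- vector(384) would match all-MiniLM-L6-v2 (see Challenge 2)
CREATE TABLE IF NOT EXISTS documents (
  id bigserial PRIMARY KEY,
  content text,           -- chunk text produced by the Default Data Loader
  metadata jsonb,         -- chunk metadata (source file, page, etc.)
  embedding vector(768)   -- one embedding per chunk
);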


Flow 2: Conversational Retrieval 💬

1. Chat Trigger - “When chat message received”

  • Purpose: Receives user questions through n8n’s chat interface
  • Configuration:
    • Mode: Chat
    • Public: False
  • Notes: Triggers workflow execution for each user message

2. AI Agent

  • Purpose: Orchestrates retrieval and answer generation
  • Configuration:
    • System Message:
You are a helpful AI assistant that answers questions about the research paper "Attention Is All You Need". 
 
Use the retrieval tool to find relevant information from the paper before answering. 
Base your answers strictly on the retrieved content - do not make up information.
If the answer is not in the retrieved content, say so clearly.
 
Provide clear, concise answers with specific details from the paper.
  • Connected Sub-nodes:
    • Groq Chat Model (LLM)
    • Supabase Vector Store Tool (retrieval)
    • Simple Memory (conversation history)

3. Groq Chat Model

  • Purpose: Generates answers based on retrieved context
  • Configuration:
    • Credential: Groq API
    • Model: llama-3.3-70b-versatile
    • Default Settings: Temperature and token limits use Groq defaults
  • Notes: Connected to AI Agent as the reasoning engine

4. Supabase Vector Store (Tool - Retrieve Documents)

  • Purpose: Retrieves semantically relevant document chunks
  • Configuration:
    • Operation: Retrieve Documents (As Tool)
    • Credential: Supabase API (same as Flow 1)
    • Table Name: documents
    • Embedding Column: embedding
    • Content Column: content
    • Metadata Column: metadata
    • Top K: 4 (retrieves the 4 most relevant chunks)
    • Tool Description:
Search the "Attention Is All You Need" paper for relevant information. 
Use this tool whenever you need to find specific details from the paper.
  • Sub-nodes:
    • Embeddings: HuggingFace Inference
      • Model: sentence-transformers/all-mpnet-base-v2 (same as Flow 1)
      • Dimension: 768

Embedding Model Consistency

Both flows MUST use the same embedding model (all-mpnet-base-v2) to ensure vector compatibility
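
One quick way to confirm the stored vectors match the retrieval model (a sanity-check sketch, assuming the schema outlined in Flow 1) is to inspect the dimension directly in the Supabase SQL Editor:

-- Expect 768 for all-mpnet-base-v2 (384 for all-MiniLM-L6-v2);
-- any other value means ingestion used a different model
SELECT vector_dims(embedding) AS dims, count(*) AS chunks
FROM documents
GROUP BY dims;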

5. Simple Memory

  • Purpose: Maintains conversation context across messages
  • Configuration:
    • Session ID Type: Custom Key
    • Session Key: ={{ $json.sessionId }} (auto-generated)
  • Notes: Enables follow-up questions with context awareness

Challenges and Solutions 🔧

Challenge 1: Document Sub-node Connection Error

Problem: When configuring the Supabase Vector Store node for Flow 1, received error: “A Document sub-node must be connected and enabled”

Root Cause: In newer versions of n8n (v1.0+), Vector Store nodes require explicit sub-nodes for document loading and embeddings processing.

Solution:

  • Added Default Data Loader as a Document sub-node to the Supabase Vector Store
  • Added Embeddings HuggingFace Inference as an Embeddings sub-node
  • This modular architecture makes the data transformation pipeline explicit

Challenge 2: HuggingFace Authentication Error

Problem: The initial model sentence-transformers/distilbert-base-nli-mean-tokens failed with the error: “Invalid username or password”

Root Cause: The specified model was not available via HuggingFace’s Inference API or required special access permissions.

Solution:

  • Switched to sentence-transformers/all-MiniLM-L6-v2 which is well-supported by HuggingFace Inference API
  • Updated Supabase table schema to use vector(384) instead of vector(768) to match the new model’s output dimension
  • Verified HuggingFace API credentials were correctly configured
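
The schema change cannot be applied in place (pgvector will not cast vectors between dimensions), so one way to apply it, assuming the old embeddings can simply be re-ingested, is to rebuild the column:

-- Rebuild the embedding column at the new dimension,
-- then re-run Flow 1 to re-embed all chunks
ALTER TABLE documents DROP COLUMN embedding;
ALTER TABLE documents ADD COLUMN embedding vector(384);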

Challenge 3: Missing match_documents Function in Supabase

Problem: Flow 2 retrieval failed with error: “Could not find the function public.match_documents”

Root Cause: Supabase’s n8n integration expects a custom PostgreSQL function for vector similarity search that doesn’t exist by default.

Solution: Created the required function in the Supabase SQL Editor (the vector(384) in the signature must match the embedding column’s dimension):

CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(384),
  match_count int DEFAULT 5,
  filter jsonb DEFAULT '{}'
)
RETURNS TABLE (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM documents
  WHERE documents.metadata @> filter
  ORDER BY documents.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
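
A simple smoke test for the function (assuming at least one chunk has been ingested): query with an existing chunk’s own embedding, which should return that chunk first with a similarity near 1:

-- Borrow one stored embedding as the query vector;
-- match_count 4 mirrors the Top K setting in Flow 2
SELECT id, left(content, 60) AS preview, similarity
FROM match_documents(
  (SELECT embedding FROM documents LIMIT 1),
  4
);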

Challenge 4: Column Ambiguity Error

Problem: Initial match_documents function failed with error: “column reference ‘metadata’ is ambiguous”

Root Cause: The function parameter metadata shared its name with the table column, so PostgreSQL could not resolve which one the WHERE clause referred to.

Solution:

  • Explicitly qualified table columns with the table name (documents.metadata)
  • This resolved the ambiguity and allowed the function to execute correctly
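
For illustration, a minimal, hypothetical reproduction of the same pitfall (the function count_matching is illustrative only, not part of the workflow); qualifying the column with the table name and the parameter with the function name removes the ambiguity:

-- Hypothetical helper: the parameter deliberately shares the column's name.
-- An unqualified "WHERE metadata @> metadata" would raise:
--   column reference "metadata" is ambiguous
CREATE OR REPLACE FUNCTION count_matching(metadata jsonb)
RETURNS bigint
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN (
    SELECT count(*)
    FROM documents
    -- table-qualified column vs. function-qualified parameter
    WHERE documents.metadata @> count_matching.metadata
  );
END;
$$;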

Challenge 5: Understanding Flow Independence

Problem: Initial confusion about how to “connect” Flow 1 to Flow 2 in the workflow canvas.

Root Cause: Conceptual misunderstanding - the flows are independent and communicate through the shared Supabase database, not through n8n node connections.

Solution:

  • Flow 1 writes to Supabase (one-time ingestion)
  • Flow 2 reads from the same Supabase table (per-question retrieval)
  • The database acts as the bridge, not workflow connections
  • Both flows use identical embedding models to ensure vector compatibility

Key Learnings 💡

  1. Vector Store Architecture: Modern n8n vector store nodes use a modular sub-node architecture that explicitly separates document loading, embedding generation, and storage/retrieval logic.

  2. Embedding Model Consistency: The same embedding model must be used for both document ingestion and query retrieval, or vector similarity search will fail due to incompatible vector spaces.

  3. PostgreSQL Functions: Supabase/PostgreSQL requires custom functions for advanced vector search operations, which must be created manually using SQL.

  4. Model Availability: Not all HuggingFace models are available via the Inference API; choosing well-supported, popular models ensures reliability.

  5. Workflow Independence: RAG (Retrieval-Augmented Generation) systems typically have separate ingestion and retrieval pipelines that communicate through a shared data store rather than direct workflow connections.


Performance Metrics 📊

  • Document Processing: Successfully chunked PDF into 71 searchable segments
  • Average Response Time: ~2-3 seconds per query
  • Retrieval Accuracy: High relevance for technical questions about the Transformer architecture
  • Token Usage: Approximately 500-1000 tokens per query (depending on question complexity)
  • Top K Setting: retrieving 4 chunks strikes a good balance between context breadth and token efficiency

Workflow Validation ✅

Test Questions Used

  1. “What is the main idea of the paper?” ✅
  2. “How does multi-head attention work?” ✅
  3. “What are the advantages of the Transformer over RNNs?” ✅

Expected Behavior Confirmed

  • ✅ Agent retrieves relevant document chunks
  • ✅ Answers are grounded in paper content
  • ✅ No hallucinated information
  • ✅ Follow-up questions maintain context
  • ✅ System acknowledges when information is not in the paper

Assignment Metadata

Report Completed: 2025-11-29
Total Development Time: ~2 hours
Status: Workflow fully functional and ready for demonstration