n8n Workflow Assignment - Complete Solution Guide

Chatmate: Document Question-Answering System

Assignment Overview

This solution provides a complete guide to building a document question-answering system using n8n. The system allows users to ask questions about the “Attention Is All You Need” paper and receive accurate, context-grounded answers.

System Architecture

The workflow consists of two independent flows:

  1. Flow 1: Document Ingestion - Runs once to process and index the PDF
  2. Flow 2: Conversational Retrieval - Runs for every user question

Prerequisites

Required Services & API Keys

  1. Supabase Account

    • Create a free account at supabase.com
    • Create a new project
    • Note your project URL and API key (anon/public key)
  2. HuggingFace Account

    • Create account at huggingface.co
    • Generate API token from Settings → Access Tokens
    • Model: sentence-transformers/distilbert-base-nli-mean-tokens
  3. Groq Account

    • Create account at groq.com
    • Generate API key from console
    • Recommended model: llama-3.1-70b-versatile or mixtral-8x7b-32768
  4. n8n Setup

    • Docker installation with file system access
    • Version 1.0+ recommended for AI Agent node

Docker Setup

Run n8n with file system access:

docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v ~/.n8n:/home/node/.n8n \
  -v /path/to/your/pdfs:/data \
  n8nio/n8n

Replace /path/to/your/pdfs with the directory containing your PDF file.

Step-by-Step Implementation

A) Prepare the Data

1. Download the PDF

# Download the "Attention Is All You Need" paper
wget https://arxiv.org/pdf/1706.03762.pdf -O attention_is_all_you_need.pdf
 
# Or use curl
curl -L https://arxiv.org/pdf/1706.03762.pdf -o attention_is_all_you_need.pdf

Place the PDF in a directory accessible to your Docker container (e.g., /data/attention_is_all_you_need.pdf).

2. Set Up Supabase Vector Store

In your Supabase project, create a table for storing document embeddings:

-- Enable the pgvector extension
create extension if not exists vector;
 
-- Create the documents table
create table documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(768)
);
 
-- Create an index for faster similarity search
create index on documents 
using ivfflat (embedding vector_cosine_ops)
with (lists = 100);

Embedding Dimension

The embedding dimension must be 768 to match the HuggingFace model output.
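In addition to the table, the n8n Supabase Vector Store node (built on LangChain) expects a similarity-search function, conventionally named match_documents. A minimal sketch following the LangChain/Supabase convention (function and argument names assumed from that convention; verify against the node's error messages):

```sql
-- Similarity-search function used by the LangChain-based vector store node
create or replace function match_documents (
  query_embedding vector(768),
  match_count int default null,
  filter jsonb default '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```

The `<=>` operator is pgvector's cosine distance, so `1 - distance` yields a similarity score in the familiar 0-to-1 range.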

B) Build Flow 1: Document Ingestion

This flow runs once to process and store the PDF.

Node 1: Manual Trigger

  • Node Type: Manual Trigger
  • Purpose: Starts the workflow when you click “Execute Workflow”
  • Configuration: No configuration needed

Node 2: Read Binary File

  • Node Type: Read Binary File
  • Purpose: Reads the PDF from disk
  • Configuration:
    • File Path: /data/attention_is_all_you_need.pdf
    • Property Name: data (default)

File Path Verification

Verify the file path matches your Docker volume mount. Use absolute paths.

Node 3: Extract From File

  • Node Type: Extract From File
  • Purpose: Converts PDF to text and splits into chunks
  • Configuration:
    • Input Binary Field: data
    • Operation: Extract From PDF
    • Options:
      • Chunk Size: 1000 (characters per chunk)
      • Chunk Overlap: 200 (overlap between chunks)
      • Include Metadata: true

Why these settings?

  • Chunk size of 1000 balances context and specificity
  • Overlap ensures important information isn’t split across boundaries
  • Metadata helps track source locations
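As a mental model, the chunking step can be sketched in Python (illustrative only; the Extract From File node performs this internally, and its exact boundary logic may differ):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks whose boundaries overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # final chunk already reaches the end of the text
    return chunks

# A 2500-character document yields 3 chunks of 1000/1000/900 characters
demo = chunk_text("x" * 2500)
print([len(c) for c in demo])  # → [1000, 1000, 900]
```

Because each chunk repeats the last 200 characters of its predecessor, a sentence that straddles a boundary still appears intact in at least one chunk.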

Node 4: Embeddings HuggingFace Inference

  • Node Type: Embeddings HuggingFace Inference
  • Purpose: Converts text chunks into vector embeddings
  • Configuration:
    • Credential: Add HuggingFace API credentials
    • Model: sentence-transformers/distilbert-base-nli-mean-tokens
    • Output Dimension: 768

Credential Setup:

  1. Go to Credentials → Add Credential
  2. Select “HuggingFace API”
  3. Enter your API token
  4. Save as “HuggingFace Inference”

Node 5: Supabase Vector Store (Insert)

  • Node Type: Supabase Vector Store
  • Purpose: Stores embeddings in Supabase
  • Configuration:
    • Credential: Add Supabase credentials
    • Operation: Insert Documents
    • Table Name: documents
    • Embedding Column: embedding
    • Content Column: content
    • Metadata Column: metadata

Credential Setup:

  1. Go to Credentials → Add Credential
  2. Select “Supabase API”
  3. Enter:
    • Host: Your Supabase project URL (e.g., https://xxxxx.supabase.co)
    • Service Role Key: your Supabase anon/public key (the credential field is labeled “Service Role Key,” but this guide uses the anon key; see Challenge 2 below)
  4. Save as “Supabase”

Testing Flow 1:

  1. Click “Execute Workflow”
  2. Verify each node shows green checkmark
  3. Check the final node output - should show inserted document count
  4. Verify in Supabase: SELECT COUNT(*) FROM documents;
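For a quick sanity check beyond the row count, it can also help to eyeball a few stored rows in the Supabase SQL editor (query sketch using the columns created earlier):

```sql
-- Inspect a few ingested chunks and their metadata
SELECT id, LEFT(content, 80) AS preview, metadata
FROM documents
LIMIT 3;
```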

C) Build Flow 2: Conversational Retrieval

This flow runs every time a user asks a question.

Node 1: Chat Trigger

  • Node Type: When chat message received
  • Purpose: Receives user questions via n8n chat interface
  • Configuration:
    • Mode: Chat
    • Public: false (or true if you want public access)
    • Options: Default settings

Node 2: AI Agent

  • Node Type: AI Agent
  • Purpose: Orchestrates retrieval and answer generation
  • Configuration:
    • Chat Model: Connect to Groq Chat Model node
    • Tools: Connect to Supabase Vector Store (Retrieve) node
    • Memory: Connect to Window Buffer Memory node
    • Options:
      • System Message:

        You are a helpful AI assistant that answers questions about the research paper "Attention Is All You Need". 
        
        Use the retrieval tool to find relevant information from the paper before answering. 
        Base your answers strictly on the retrieved content - do not make up information.
        If the answer is not in the retrieved content, say so clearly.
        
        Provide clear, concise answers with specific details from the paper.
        

Node 3: Groq Chat Model

  • Node Type: Groq Chat Model
  • Purpose: LLM for generating answers
  • Configuration:
    • Credential: Add Groq API credentials
    • Model: llama-3.1-70b-versatile (recommended) or mixtral-8x7b-32768
    • Temperature: 0.3 (lower = more focused, higher = more creative)
    • Max Tokens: 1024

Credential Setup:

  1. Go to Credentials → Add Credential
  2. Select “Groq API”
  3. Enter your Groq API key
  4. Save as “Groq”

Model Selection:

  • llama-3.1-70b-versatile: Best quality, slower
  • mixtral-8x7b-32768: Good balance of speed and quality
  • llama-3.1-8b-instant: Fastest, lower quality

Node 4: Supabase Vector Store (Retrieve)

  • Node Type: Supabase Vector Store
  • Purpose: Retrieves relevant document chunks
  • Configuration:
    • Credential: Use same Supabase credential
    • Operation: Retrieve Documents (As Tool)
    • Table Name: documents
    • Embedding Column: embedding
    • Content Column: content
    • Metadata Column: metadata
    • Top K: 4 (number of chunks to retrieve)
    • Tool Description:

      Search the "Attention Is All You Need" paper for relevant information.
      Use this tool whenever you need to find specific details from the paper.

Why Top K = 4?

  • Balances context breadth and token usage
  • 4 chunks × 1000 chars ≈ 4000 chars of context
  • Adjust based on question complexity
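Conceptually, the retrieval tool ranks stored chunks by cosine similarity to the query embedding and returns the K best. This Python sketch illustrates the idea (in practice pgvector does this server-side over 768-dimensional vectors):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec: list[float],
                 chunk_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return indices of the k stored chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 2-D "embeddings": chunk 0 matches the query exactly, chunk 2 partially
query = [1.0, 0.0]
store = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
print(top_k_chunks(query, store, k=2))  # → [0, 2]
```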

Node 5: Window Buffer Memory

  • Node Type: Window Buffer Memory
  • Purpose: Maintains conversation history
  • Configuration:
    • Session Key: {{ $json.sessionId }} (auto-generated by chat trigger)
    • Window Size: 5 (remember last 5 exchanges)
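The windowing behavior can be pictured with a small Python sketch (a deque that keeps only the last N exchanges; n8n's node manages this per session ID, so the class below is illustrative):

```python
from collections import deque

class WindowBufferMemory:
    """Keep only the most recent `window_size` question/answer exchanges."""

    def __init__(self, window_size: int = 5):
        self.exchanges = deque(maxlen=window_size)  # oldest entries fall off

    def add(self, question: str, answer: str) -> None:
        self.exchanges.append((question, answer))

    def context(self) -> list[tuple[str, str]]:
        """History to prepend on the next LLM turn."""
        return list(self.exchanges)

memory = WindowBufferMemory(window_size=5)
for i in range(7):
    memory.add(f"question {i}", f"answer {i}")
print(len(memory.context()))  # → 5 (the two oldest exchanges were evicted)
```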

Connection Flow:

Chat Trigger → AI Agent
                 ├─ Groq Chat Model (Chat Model connection)
                 ├─ Supabase Vector Store (Tool connection)
                 └─ Window Buffer Memory (Memory connection)

The model, tool, and memory nodes all attach directly to the AI Agent as sub-nodes; they do not run in sequence.

Testing Flow 2:

  1. Click “Test Workflow” or use the chat interface
  2. Ask a question: “What is the main idea of the paper?”
  3. Verify the agent retrieves documents and generates an answer
  4. Check the response is grounded in the paper content

Common Challenges and Solutions

Challenge 1: Docker File Access Issues

Problem: n8n can’t find the PDF file

Solution:

  • Verify Docker volume mount: docker inspect n8n | grep Mounts
  • Use absolute paths in the Read Binary File node
  • Check file permissions: chmod 644 attention_is_all_you_need.pdf

Challenge 2: Supabase Connection Errors

Problem: “Could not connect to Supabase”

Solutions:

  • Verify project URL format: https://xxxxx.supabase.co (no trailing slash)
  • Use the anon/public key, not the service role key
  • Check if pgvector extension is enabled: SELECT * FROM pg_extension WHERE extname = 'vector';

Challenge 3: Embedding Dimension Mismatch

Problem: “Dimension mismatch error”

Solution:

  • Ensure Supabase table uses vector(768)
  • Verify HuggingFace model is sentence-transformers/distilbert-base-nli-mean-tokens
  • Recreate table if dimension is wrong
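If the table was created with the wrong dimension, the simplest fix is to drop and recreate it with the DDL from the setup step (note this discards any stored embeddings, so Flow 1 must be re-run afterward):

```sql
drop table if exists documents;

create table documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(768)
);
```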

Challenge 4: Poor Answer Quality

Problem: Answers are vague or incorrect

Solutions:

  • Increase Top K to retrieve more context (try 5-6)
  • Improve system message with more specific instructions
  • Lower temperature (0.1-0.3) for more focused answers
  • Verify chunks are being retrieved (check AI Agent execution logs)

Challenge 5: Rate Limiting

Problem: “Rate limit exceeded” errors

Solutions:

  • HuggingFace: Use a paid tier or wait between requests
  • Groq: Check your rate limits in the console
  • Add error handling and retry logic

Challenge 6: Memory Not Working

Problem: Agent doesn’t remember previous questions

Solution:

  • Verify Window Buffer Memory is connected to AI Agent
  • Check session ID is being passed correctly
  • Increase window size if needed

Workflow Optimization Tips

1. Chunk Size Optimization

Experiment with different chunk sizes based on your use case:

  • Small chunks (500-800): Better for specific facts
  • Medium chunks (1000-1500): Balanced approach (recommended)
  • Large chunks (2000+): Better for conceptual questions

2. Retrieval Tuning

Adjust Top K based on question type:

  • Simple factual questions: Top K = 2-3
  • Complex conceptual questions: Top K = 5-7
  • Exploratory questions: Top K = 8-10

3. Prompt Engineering

Enhance the system message for better results:

You are an expert AI assistant specializing in the "Attention Is All You Need" paper.

RETRIEVAL STRATEGY:
- Always use the retrieval tool before answering
- Retrieve multiple times if the question has multiple parts
- If initial results are insufficient, try rephrasing the search query

ANSWER GUIDELINES:
- Quote specific sections when relevant
- Explain technical concepts clearly
- Mention figure/table numbers if applicable
- If uncertain, express appropriate confidence levels

CONSTRAINTS:
- Never fabricate information not in the paper
- Clearly distinguish between paper content and general knowledge
- If the paper doesn't address the question, say so explicitly

4. Error Handling

Add error handling nodes:

  • IF node: Check if retrieval returned results
  • Set node: Provide fallback responses
  • HTTP Request node: Log errors to external service

5. Performance Monitoring

Track workflow performance:

  • Execution time per node
  • Token usage (for cost estimation)
  • Retrieval accuracy (manual spot-checks)
  • User satisfaction (if collecting feedback)

Testing Scenarios

Test Questions

Use these questions to verify your workflow:

  1. Basic Factual:

    • “What is the Transformer model?”
    • “Who are the authors of this paper?”
  2. Technical Details:

    • “How does multi-head attention work?”
    • “What is the architecture of the encoder?”
  3. Comparative:

    • “How does the Transformer differ from RNNs?”
    • “What are the advantages over previous models?”
  4. Numerical:

    • “What is the model dimension (d_model)?”
    • “How many attention heads are used?”
  5. Conceptual:

    • “Why is attention important in this model?”
    • “What problem does this paper solve?”

Expected Behavior

✅ Good Response:

  • Directly answers the question
  • Includes specific details from the paper
  • Cites relevant sections or concepts
  • Stays within paper scope

❌ Poor Response:

  • Generic answer not from the paper
  • Fabricated information
  • Overly vague or incomplete
  • Ignores retrieved context

Deliverables Checklist

1. Exported Workflow JSON

How to Export:

  1. Open your workflow in n8n
  2. Click the “…” menu (top right)
  3. Select “Download”
  4. Save as chatmate_workflow.json

What to Include:

  • Both Flow 1 and Flow 2 in the same workflow
  • All node configurations
  • Credentials (will be exported as placeholders)

Credential Export

The exported JSON will NOT include your actual API keys. Document which credentials are needed separately.

2. One-Page Report

Template Structure:

# n8n Chatmate Workflow Report
 
## Nodes Used and Configuration
 
### Flow 1: Document Ingestion
1. **Manual Trigger**: [brief description]
2. **Read Binary File**: [configuration details]
3. **Extract From File**: [chunk size, overlap settings]
4. **Embeddings HuggingFace**: [model, dimensions]
5. **Supabase Vector Store**: [table, columns]
 
### Flow 2: Conversational Retrieval
1. **Chat Trigger**: [configuration]
2. **AI Agent**: [system message, settings]
3. **Groq Chat Model**: [model choice, temperature]
4. **Supabase Vector Store (Tool)**: [Top K, description]
5. **Window Buffer Memory**: [window size]
 
## Challenges and Solutions
 
### Challenge 1: [Name]
**Problem**: [Description]
**Solution**: [How you solved it]
 
### Challenge 2: [Name]
**Problem**: [Description]
**Solution**: [How you solved it]
 
## Key Learnings
 
[2-3 sentences about what you learned]
 
## Performance Notes
 
- Average response time: [X seconds]
- Typical token usage: [X tokens]
- Retrieval accuracy: [subjective assessment]

3. One-Minute Demo Video

Recording Checklist:

  • Show the n8n workflow canvas (both flows visible)
  • Execute Flow 1 (document ingestion) - show success
  • Switch to chat interface
  • Ask a test question (e.g., “What is the Transformer model?”)
  • Show the answer being generated
  • Highlight that the answer is grounded in the paper
  • (Optional) Ask a follow-up question to show memory

Recording Tools:

  • Screen Recording: OBS Studio, Loom, or built-in screen recorder
  • Format: MP4, WebM, or MOV
  • Resolution: 1080p recommended
  • Duration: 45-60 seconds

Script Example:

[0:00-0:10] "Here's my n8n workflow with two flows: document ingestion and conversational retrieval."
[0:10-0:20] "First, I run the ingestion flow to process the PDF." [Click Execute]
[0:20-0:30] "Now I can ask questions in the chat interface." [Type question]
[0:30-0:50] "The AI agent retrieves relevant sections and generates an accurate answer based on the paper."
[0:50-1:00] "The system successfully answers questions using vector search and LLM reasoning."

Evaluation Criteria Breakdown

Workflow Completeness and Correctness (10 points)

Full Credit (9-10 points):

  • Both flows work correctly
  • All nodes properly configured
  • Successful document ingestion
  • Accurate answer generation
  • Proper error handling

Partial Credit (6-8 points):

  • Workflows mostly work with minor issues
  • Some configuration suboptimal
  • Occasional errors but generally functional

Low Credit (3-5 points):

  • Significant functionality issues
  • Missing key components
  • Frequent errors

Minimal Credit (0-2 points):

  • Workflow doesn’t run
  • Major components missing

Report Clarity and Reflection (4 points)

Full Credit (4 points):

  • Clear node descriptions
  • Specific configuration details
  • Meaningful challenges and solutions
  • Insightful reflections

Partial Credit (2-3 points):

  • Basic descriptions
  • Generic challenges
  • Limited reflection

Low Credit (0-1 points):

  • Vague or missing information
  • No meaningful reflection

Workflow Efficiency and Clean Design (6 points)

Full Credit (5-6 points):

  • Optimal node configuration
  • Clean, organized layout
  • Efficient resource usage
  • Good naming conventions
  • Proper error handling

Partial Credit (3-4 points):

  • Functional but not optimized
  • Acceptable organization
  • Some inefficiencies

Low Credit (0-2 points):

  • Messy or confusing layout
  • Inefficient configuration
  • Poor organization

Advanced Enhancements (Optional)

1. Multi-Document Support

Extend the workflow to handle multiple papers:

-- Add document_id to track different papers
ALTER TABLE documents ADD COLUMN document_id text;
CREATE INDEX idx_document_id ON documents(document_id);

Modify ingestion to tag chunks with document ID.

2. Citation Tracking

Include page numbers and sections in responses:

// In metadata during ingestion
{
  "page": pageNumber,
  "section": sectionTitle,
  "document": "Attention Is All You Need"
}

Update system message to include citations.

3. Hybrid Search

Combine vector search with keyword search:

-- Add full-text search
ALTER TABLE documents ADD COLUMN content_tsv tsvector;
CREATE INDEX idx_content_tsv ON documents USING gin(content_tsv);

Use both similarity and keyword matching.
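A hedged sketch of blending the two signals in one query (the 50/50 weights and search terms are illustrative, and content_tsv must first be populated, e.g. with the backfill shown or an insert trigger):

```sql
-- One-off backfill; a trigger would keep content_tsv current on insert
UPDATE documents SET content_tsv = to_tsvector('english', content);

-- Blend keyword rank with vector similarity
SELECT content,
       0.5 * ts_rank(content_tsv, plainto_tsquery('english', 'multi-head attention'))
     + 0.5 * (1 - (embedding <=> '[0.1, 0.2, ...]'::vector)) AS score
FROM documents
WHERE content_tsv @@ plainto_tsquery('english', 'multi-head attention')
ORDER BY score DESC
LIMIT 5;
```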

4. Answer Validation

Add a validation step to check answer quality:

AI Agent → Generate Answer → Validation Agent → Return to User

Validation agent checks:

  • Answer is grounded in retrieved text
  • No hallucinations
  • Appropriate confidence level

5. Feedback Loop

Collect user feedback to improve retrieval:

User Question → Retrieve → Generate → Get Feedback → Log for Analysis

Use feedback to tune Top K, chunk size, and prompts.

Troubleshooting Guide

Issue: “No documents retrieved”

Diagnosis:

  1. Check if documents were inserted: SELECT COUNT(*) FROM documents;

  2. Verify embedding column exists and has data

  3. Test similarity search manually:

    SELECT content, 1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
    FROM documents
    ORDER BY similarity DESC
    LIMIT 5;

Fix:

  • Re-run Flow 1 if no documents
  • Check embedding model consistency
  • Verify Supabase credentials

Issue: “Agent not using retrieval tool”

Diagnosis:

  1. Check AI Agent logs for tool calls
  2. Verify tool description is clear
  3. Check if system message encourages tool use

Fix:

  • Improve tool description
  • Update system message to explicitly instruct tool usage
  • Lower LLM temperature (makes it more likely to follow instructions)

Issue: “Answers are hallucinated”

Diagnosis:

  1. Check retrieved chunks - are they relevant?
  2. Review LLM temperature setting
  3. Examine system message constraints

Fix:

  • Strengthen system message constraints
  • Lower temperature to 0.1-0.2
  • Add explicit “do not hallucinate” instructions
  • Increase Top K to provide more context

Issue: “Slow response times”

Diagnosis:

  1. Check which node is slowest (execution times)
  2. Monitor API rate limits
  3. Check network latency

Fix:

  • Use faster LLM model (e.g., llama-3.1-8b-instant)
  • Reduce Top K to retrieve fewer chunks
  • Optimize chunk size
  • Consider caching frequent queries

Issue: “Out of memory errors”

Diagnosis:

  1. Check chunk size and count
  2. Monitor token usage
  3. Review memory window size

Fix:

  • Reduce chunk size
  • Lower Top K
  • Decrease memory window size
  • Use smaller LLM model

Best Practices Summary

✅ Do’s

  • Test incrementally: Build and test each flow separately
  • Use descriptive names: Name nodes clearly (e.g., “Retrieve Paper Chunks”)
  • Document configurations: Keep notes on why you chose specific settings
  • Monitor costs: Track API usage for HuggingFace and Groq
  • Version control: Export workflow regularly as backup
  • Validate answers: Spot-check responses against the paper
  • Optimize iteratively: Start simple, then tune performance

❌ Don’ts

  • Don’t skip testing: Always test Flow 1 before building Flow 2
  • Don’t use default prompts: Customize system messages for your use case
  • Don’t ignore errors: Address warnings and errors immediately
  • Don’t over-engineer: Start with basic setup, add complexity only if needed
  • Don’t expose credentials: Never commit API keys to version control
  • Don’t forget cleanup: Remove test data from Supabase when done
  • Don’t assume: Verify each step works before moving to the next

Conclusion

This solution provides a complete, production-ready implementation of a document question-answering system using n8n. The workflow demonstrates:

  • Document processing: PDF to searchable vector embeddings
  • Semantic search: Vector similarity for relevant chunk retrieval
  • AI reasoning: LLM-powered answer generation
  • Conversation memory: Multi-turn dialogue support

By following this guide, you should be able to:

  1. ✅ Build both ingestion and retrieval flows
  2. ✅ Configure all nodes correctly
  3. ✅ Troubleshoot common issues
  4. ✅ Optimize for performance and accuracy
  5. ✅ Complete all deliverables successfully

Key Takeaways:

  • RAG (Retrieval-Augmented Generation) grounds LLM answers in factual content
  • Vector embeddings enable semantic search beyond keyword matching
  • Proper chunking and retrieval tuning are critical for answer quality
  • n8n’s visual workflow builder makes complex AI systems accessible

Good luck with your assignment! 🚀

Appendix: Complete Node Configuration Reference

Flow 1 Node Details

| Node | Type | Key Settings |
| --- | --- | --- |
| Trigger | Manual Trigger | - |
| Read File | Read Binary File | Path: /data/attention_is_all_you_need.pdf |
| Extract | Extract From File | Chunk: 1000, Overlap: 200 |
| Embed | HuggingFace Embeddings | Model: distilbert-base-nli-mean-tokens, Dim: 768 |
| Store | Supabase Vector Store | Operation: Insert, Table: documents |

Flow 2 Node Details

| Node | Type | Key Settings |
| --- | --- | --- |
| Trigger | Chat Trigger | Mode: Chat |
| Agent | AI Agent | System message; connected to LLM + Tool + Memory |
| LLM | Groq Chat Model | Model: llama-3.1-70b-versatile, Temp: 0.3 |
| Retrieval | Supabase Vector Store | Operation: Retrieve (As Tool), Top K: 4 |
| Memory | Window Buffer Memory | Window: 5 |

Document Information

Version: 1.0
Last Updated: 2025-11-25
Course: GenAI - Assignment 1