n8n Chatmate - RAG Document QA System 🤖📚

Project Status

Status: ✅ Completed & Functional
Type: Academic Assignment + Learning Project
Timeline: November 2025
Course: GenAI (WiSe25)

📋 Project Overview

Chatmate is an AI-powered document question-answering system that uses Retrieval-Augmented Generation (RAG) to answer questions about the research paper “Attention Is All You Need” (the original Transformer paper). Built entirely using n8n workflow automation, it demonstrates the practical application of vector embeddings, semantic search, and LLM-based answer generation.

Core Concept

Instead of relying purely on an LLM’s pre-trained knowledge (which can hallucinate), this system:

  1. Ingests the PDF document into a vector database
  2. Retrieves semantically relevant chunks based on user questions
  3. Generates accurate answers grounded in the actual document content
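
For intuition, the query-time half of this pipeline (steps 2 and 3) can be sketched in a few lines of Python. This is a hedged stand-in, not the n8n workflow itself: the sentence-transformers, supabase-py, and Groq SDK calls play the roles of the corresponding n8n nodes, the credentials are placeholders, and `match_documents` is the custom Supabase similarity-search function described later in this README.

```python
# Minimal RAG query sketch (stand-in libraries, placeholder credentials).
from sentence_transformers import SentenceTransformer
from supabase import create_client
from groq import Groq

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # same model as ingestion
supabase = create_client("https://<project>.supabase.co", "<service-role-key>")
llm = Groq(api_key="<groq-api-key>")

question = "What is multi-head attention?"
query_vec = embedder.encode(question).tolist()

# Retrieve the 4 most similar chunks via the custom Postgres function.
rows = supabase.rpc(
    "match_documents", {"query_embedding": query_vec, "match_count": 4}
).execute()
context = "\n\n".join(row["content"] for row in rows.data)

# Ask the LLM to answer strictly from the retrieved context.
answer = llm.chat.completions.create(
    model="llama-3.1-70b-versatile",
    temperature=0.3,
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```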

🎯 Project Goals

  • Build a working RAG pipeline using n8n’s visual workflow editor
  • Implement vector-based semantic search with Supabase
  • Integrate LLM (Groq) for natural language answer generation
  • Maintain conversation context across multiple questions
  • Ensure answers are grounded in source material (no hallucinations)
  • Document the learning process and challenges faced

🛠️ Technology Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Workflow Orchestration | n8n | Visual workflow automation platform |
| Vector Database | Supabase (PostgreSQL + pgvector) | Stores document embeddings for similarity search |
| Embedding Model | HuggingFace all-MiniLM-L6-v2 | Converts text to 384-dim vectors |
| LLM | Groq (llama-3.1-70b-versatile) | Natural language answer generation |
| Document Processing | n8n Extract From File node | PDF text extraction and chunking |
| Memory | Window Buffer Memory | Maintains conversation context |
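
For a quick sanity check of the embedding row above, the same model can be loaded locally with sentence-transformers (an illustration only; the workflow itself calls the model through n8n's HuggingFace integration):

```python
# Illustrative check (assumes the sentence-transformers package):
# confirm the model's output width matches the vector(384) column
# that the Supabase table must declare.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.get_sentence_embedding_dimension())         # 384
print(model.encode("Attention Is All You Need").shape)  # (384,)
```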

🏗️ Architecture

Two-Flow Design

```mermaid
graph TB
    subgraph "Flow 1: Document Ingestion (One-Time)"
        A[PDF File] --> B[Read Binary File]
        B --> C[Extract From File]
        C --> D[Chunk Text]
        D --> E[Generate Embeddings]
        E --> F[Supabase Vector Store]
    end

    subgraph "Flow 2: Conversational Retrieval (Per Query)"
        G[User Question] --> H[Chat Trigger]
        H --> I[AI Agent]
        I --> J[Generate Query Embedding]
        J --> K[Supabase Similarity Search]
        K --> L[Retrieve Top 4 Chunks]
        L --> M[Groq LLM]
        M --> N[Grounded Answer]
    end

    F -.->|Shared Database| K
```

Key Design Decisions

  1. Separate Ingestion & Retrieval Flows: One-time setup vs. per-query execution
  2. Consistent Embedding Model: Same model for ingestion and retrieval ensures vector compatibility
  3. Chunk Strategy: 1000 chars with 200-char overlap balances context preservation and token efficiency (sketched in code after this list)
  4. Top K = 4: Retrieves enough context without overwhelming the LLM’s context window
  5. Low Temperature (0.3): Prioritizes factual accuracy over creativity
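
Decision 3 is easy to picture as code. The following is a minimal sketch of a fixed-size character splitter with overlap (my own illustration, not the exact splitter the n8n node uses internally):

```python
# Sketch of the chunking strategy: 1000-char windows where each window
# repeats the last 200 chars of the previous one, so a sentence cut at
# a boundary still appears whole in at least one chunk.
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

print(len(chunk_text("x" * 5000)))  # 6 chunks for a 5000-char input
```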

📊 Performance Metrics

  • Document Size: 71 text chunks from the Transformer paper
  • Average Response Time: 2-3 seconds per query
  • Retrieval Accuracy: High relevance for technical questions
  • Token Efficiency: 500-1000 tokens per query
  • Embedding Dimension: 384 (lightweight and fast)

🔧 Key Technical Challenges Solved

Critical Learnings

See Challenges and Solutions for detailed problem-solving narratives.

  1. Sub-node Architecture: Modern n8n requires explicit Document and Embedding sub-nodes
  2. Custom PostgreSQL Function: Created match_documents() for vector similarity search (a hedged reconstruction appears after this list)
  3. Model Selection: Switched to well-supported HuggingFace model for reliability
  4. Vector Dimension Alignment: Ensured database schema matches embedding model output
  5. Flow Independence: Understood database-mediated communication pattern
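
For reference, the match_documents() function from challenge 2 most likely follows the common pgvector pattern shown below. This is a reconstruction under stated assumptions (a documents table with content and embedding columns, cosine distance via the <=> operator, pgvector enabled); the project's actual function may differ in names and details.

```python
# Hypothetical re-creation of the similarity-search function, run once
# against the Supabase Postgres database (assumes psycopg2 and pgvector).
import psycopg2

DDL = """
create or replace function match_documents(
  query_embedding vector(384),   -- must match the embedding model's width
  match_count int default 4
) returns table (id bigint, content text, similarity float)
language sql stable as $$
  select d.id,
         d.content,
         1 - (d.embedding <=> query_embedding) as similarity  -- cosine similarity
  from documents d
  order by d.embedding <=> query_embedding                    -- nearest first
  limit match_count;
$$;
"""

with psycopg2.connect("postgresql://<user>:<password>@<host>:5432/postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```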

📁 Project Files

  • Assignment_Report.md - Detailed technical report with node configurations
  • Assignment_Process_Log.md - Development process and iteration log
  • n8n_workflow.json - Exported workflow (importable into n8n)
  • demo_video.mp4 - One-minute demonstration

💡 Key Learnings

Technical Insights

  1. RAG Architecture: Understanding the separation of ingestion and retrieval pipelines
  2. Vector Databases: Practical experience with pgvector and similarity search
  3. Embedding Consistency: Critical importance of using identical models across flows
  4. n8n Sub-nodes: Modular architecture for vector store operations
  5. PostgreSQL Functions: Writing custom functions for advanced vector operations

Broader Concepts

  • Grounded Generation: How RAG prevents hallucinations by anchoring LLM responses
  • Semantic Search: Vector embeddings capture meaning beyond keyword matching (see the toy demo after this list)
  • Workflow Automation: Visual programming for AI/ML pipelines
  • LLM Integration: Practical API usage with Groq for cost-effective inference
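
The semantic-search point is easy to demonstrate with a toy comparison (my own example, not project data): a semantically related sentence should score higher than one that merely shares keywords.

```python
# Toy cosine-similarity demo (assumes numpy and sentence-transformers).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

q = model.encode("How does the model weigh different input tokens?")
related = model.encode("Attention assigns weights to each position in the sequence.")
keyword = model.encode("The model train departs from the input platform.")

# Expected: the related sentence scores higher despite sharing fewer words.
print(cosine(q, related), cosine(q, keyword))
```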

🚀 Future Enhancements

Potential Improvements

  • Multi-document support (upload multiple papers)
  • Citation extraction (show which chunks were used)
  • Hybrid search (combine vector + keyword search)
  • Advanced chunking strategies (semantic splitting)
  • Query rewriting for better retrieval
  • Web UI deployment (public chat interface)
  • Metadata filtering (search by paper section)
  • Evaluation metrics (retrieval precision/recall)

Reflection

This project transformed a theoretical understanding of RAG into practical implementation. The visual workflow approach of n8n made complex AI pipelines accessible, while the challenges encountered provided deep insights into vector databases, embedding models, and LLM integration. A perfect blend of learning and building. 🌿→🌺