n8n RAG System Report

Architecture Overview

Flow 1: Document Ingestion 📄

Node | Configuration
Manual Trigger | Initiates one-time ingestion
Read Binary File | Path: /home/node/antigem/Attention_Is_All_You_Need.pdf
Extract From File | Converts PDF to text chunks (default settings)
Supabase Vector Store | Operation: Insert Documents; Table: documents; Embedding Batch Size: 768
Embeddings (Sub-node) | Model: sentence-transformers/all-mpnet-base-v2; Dimension: 768
Document Loader (Sub-node) | Default Data Loader
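
The splitting step in this flow can be illustrated with a short Python sketch. It is not n8n's implementation: the helper name split_text and the chunk_size and chunk_overlap values are assumptions chosen for the example, since the report only states that default settings were used.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the sizes are illustrative, not n8n's defaults.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Usage: a 2,500-character string yields three overlapping chunks.
sample = "".join(str(i % 10) for i in range(2500))
chunk_lengths = [len(chunk) for chunk in split_text(sample)]
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, which tends to help retrieval.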

Flow 2: Conversational Retrieval 💬

Node | Configuration
Chat Trigger | Receives user questions via n8n chat interface
AI Agent | Orchestrates retrieval and answer generation
Groq Chat Model | Model: llama-3.3-70b-versatile; default settings
Supabase Vector Store (Tool) | Operation: Retrieve Documents (As Tool); Table: documents; Top K: 4 (default); Tool Description: “Search the ‘Attention Is All You Need’ paper for relevant information”
Embeddings (Sub-node) | Model: sentence-transformers/all-mpnet-base-v2; Dimension: 768
Simple Memory | Session ID Type: Custom Key; Session Key: ={{ $json.sessionId }}
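
The custom session key means conversation history is stored per sessionId, so separate chats do not share context. The sketch below illustrates that idea only. The class name SessionMemory and the window of 5 exchanges are assumptions for the example and do not describe n8n's internals.

```python
from collections import defaultdict, deque


class SessionMemory:
    """Keep the last `window` exchanges per session ID (hypothetical helper)."""

    def __init__(self, window: int = 5):
        self.window = window
        # One bounded queue per session; the oldest exchange is dropped first.
        self._store = defaultdict(lambda: deque(maxlen=self.window))

    def add(self, session_id: str, question: str, answer: str) -> None:
        self._store[session_id].append((question, answer))

    def history(self, session_id: str) -> list[tuple[str, str]]:
        # .get avoids creating an entry when an unknown session is read.
        return list(self._store.get(session_id, ()))


# Usage: two sessions keep independent histories.
memory = SessionMemory(window=2)
memory.add("session-a", "What is attention?", "A weighting over inputs.")
memory.add("session-b", "Who wrote the paper?", "Vaswani et al.")
```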

Challenges and Solutions 🔧

Challenge 1: Document Sub-node Connection Error

Problem: Configuring the Supabase Vector Store node for Flow 1 produced the error “A Document sub-node must be connected and enabled”.

Root Cause: In newer versions of n8n (v1.0+), Vector Store nodes require explicit sub-nodes for document loading and embeddings processing.

Solution:

  • Added Default Data Loader as a Document sub-node to the Supabase Vector Store
  • Added Embeddings HuggingFace Inference as an Embeddings sub-node
  • This modular architecture makes the data transformation pipeline explicit

Challenge 2: HuggingFace Model Selection

Problem: The initial model, sentence-transformers/distilbert-base-nli-mean-token, failed with authentication errors.

Root Cause: The specified model was not available via HuggingFace’s Inference API or required special access permissions.

Solution:

  • Switched to sentence-transformers/all-mpnet-base-v2, which is well supported by the HuggingFace Inference API
  • Updated Supabase table schema to use vector(768) to match the model’s output dimension
  • Verified HuggingFace API credentials were correctly configured
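
Because the table column is declared as vector(768), an embedding of any other length cannot be stored in it. A small guard like the following can surface a model/schema mismatch before the insert is attempted. This is a minimal sketch: check_embedding and EXPECTED_DIM are hypothetical names, and the check is not part of the n8n workflow itself.

```python
EXPECTED_DIM = 768  # assumed to mirror the vector(768) column in the documents table


def check_embedding(vector: list[float], expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Return the vector unchanged, or raise if its length does not match."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}"
        )
    return vector


# Usage: a 768-dimensional vector passes; any other length raises ValueError.
ok_vector = check_embedding([0.0] * 768)
```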

Challenge 3: Supabase match_documents Function Implementation

Problem: Flow 2 retrieval failed with the error “Could not find the function public.match_documents”. Once the function was created, a second error appeared: “column reference ‘metadata’ is ambiguous”.

Root Cause:

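  • The n8n Supabase integration calls a custom PostgreSQL function (match_documents) for vector similarity search, and that function does not exist by default
  • The function’s output column metadata, declared in RETURNS TABLE and therefore an implicit OUT parameter in PL/pgSQL, had the same name as the table column, so PostgreSQL could not resolve which one an unqualified reference meant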

Solution:

  • Explicitly prefixed table columns with the table name (documents.metadata, documents.id, etc.) to resolve the ambiguity
  • Created the required function in the Supabase SQL Editor with this prefixing applied:
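-- Vector similarity search called by the n8n Supabase Vector Store node
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(768),  -- must match the embedding model's dimension
  match_count int DEFAULT 5,
  filter jsonb DEFAULT '{}'
)
RETURNS TABLE (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    -- Columns are table-qualified because the RETURNS TABLE names above
    -- are also visible as variables inside the function body
    documents.id,
    documents.content,
    documents.metadata,
    -- <=> is pgvector's cosine distance; 1 - distance gives cosine similarity
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM documents
  WHERE documents.metadata @> filter  -- keep rows whose metadata contains the filter
  ORDER BY documents.embedding <=> query_embedding  -- nearest first
  LIMIT match_count;
END;
$$;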
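
To make the function's logic easier to follow, here is a pure-Python sketch of the same three steps: filter on metadata, order by cosine distance, and limit the result. It is an illustration under simplifying assumptions, not a replacement for the SQL. The rows are plain dictionaries invented for the example, and the metadata check only compares top-level keys, which is narrower than PostgreSQL's jsonb @> containment.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, the quantity pgvector's <=> operator computes.

    Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_documents(rows, query_embedding, match_count=5, metadata_filter=None):
    """Mimic the SQL function: filter, sort by distance, limit."""
    metadata_filter = metadata_filter or {}
    # Simplified stand-in for jsonb @>: every top-level filter key must match.
    candidates = [
        row for row in rows
        if all(row["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    candidates.sort(key=lambda row: cosine_distance(row["embedding"], query_embedding))
    return [
        {
            "id": row["id"],
            "content": row["content"],
            "metadata": row["metadata"],
            "similarity": 1.0 - cosine_distance(row["embedding"], query_embedding),
        }
        for row in candidates[:match_count]
    ]


# Usage with toy 2-dimensional vectors (real rows would hold 768 values).
rows = [
    {"id": 1, "content": "a", "metadata": {"source": "paper"}, "embedding": [1.0, 0.0]},
    {"id": 2, "content": "b", "metadata": {"source": "paper"}, "embedding": [0.0, 1.0]},
    {"id": 3, "content": "c", "metadata": {"source": "other"}, "embedding": [1.0, 1.0]},
]
top_two = match_documents(rows, [1.0, 0.0], match_count=2)
```

Sorting by distance and reporting 1 - distance as similarity mirrors the ORDER BY and SELECT expressions in the function, so the closest rows come first with the highest similarity values.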