n8n Workflow Assignment - Complete Solution Guide
Chatmate: Document Question-Answering System
Assignment Overview
This solution provides a complete guide to building a document question-answering system using n8n. The system allows users to ask questions about the "Attention Is All You Need" paper and receive accurate, context-grounded answers.
System Architecture
The workflow consists of two independent flows:
- Flow 1: Document Ingestion - Runs once to process and index the PDF
- Flow 2: Conversational Retrieval - Runs for every user question
Prerequisites
Required Services & API Keys
1. Supabase Account
   - Create a free account at supabase.com
   - Create a new project
   - Note your project URL and API key (anon/public key)
2. HuggingFace Account
   - Create an account at huggingface.co
   - Generate an API token from Settings → Access Tokens
   - Model: sentence-transformers/distilbert-base-nli-mean-tokens
3. Groq Account
   - Create an account at groq.com
   - Generate an API key from the console
   - Recommended model: llama-3.1-70b-versatile or mixtral-8x7b-32768
4. n8n Setup
   - Docker installation with file system access
   - Version 1.0+ recommended for the AI Agent node
Docker Setup
Run n8n with file system access:
docker run -it --rm \
--name n8n \
-p 5678:5678 \
-v ~/.n8n:/home/node/.n8n \
-v /path/to/your/pdfs:/data \
  n8nio/n8n

Replace /path/to/your/pdfs with the directory containing your PDF file.
Step-by-Step Implementation
A) Prepare the Data
1. Download the PDF
# Download the "Attention Is All You Need" paper
wget https://arxiv.org/pdf/1706.03762.pdf -O attention_is_all_you_need.pdf
# Or use curl
curl -L https://arxiv.org/pdf/1706.03762.pdf -o attention_is_all_you_need.pdf

Place the PDF in a directory accessible to your Docker container (e.g., /data/attention_is_all_you_need.pdf).
2. Set Up Supabase Vector Store
In your Supabase project, create a table for storing document embeddings:
-- Enable the pgvector extension
create extension if not exists vector;
-- Create the documents table
create table documents (
id bigserial primary key,
content text,
metadata jsonb,
embedding vector(768)
);
-- Create an index for faster similarity search
create index on documents
using ivfflat (embedding vector_cosine_ops)
with (lists = 100);

Embedding Dimension
The embedding dimension must be 768 to match the HuggingFace model output.
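Depending on your n8n version, the Supabase Vector Store node may also expect a Postgres function for similarity search (commonly named match_documents in Supabase's LangChain template). If retrieval later fails with a missing-function error, the sketch below follows that template; the name, signature, and filter behavior are assumptions to verify against your setup:

```sql
-- Sketch of the match_documents helper from Supabase's LangChain template,
-- adapted to the 768-dimension embeddings used in this guide.
-- Verify the exact name/signature your n8n node expects before relying on it.
create or replace function match_documents (
  query_embedding vector(768),
  match_count int default 4,
  filter jsonb default '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```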
B) Build Flow 1: Document Ingestion
This flow runs once to process and store the PDF.
Node 1: Manual Trigger
- Node Type: Manual Trigger
- Purpose: Starts the workflow when you click "Execute Workflow"
- Configuration: No configuration needed
Node 2: Read Binary File
- Node Type: Read Binary File
- Purpose: Reads the PDF from disk
- Configuration:
  - File Path: /data/attention_is_all_you_need.pdf
  - Property Name: data (default)
File Path Verification
Verify the file path matches your Docker volume mount. Use absolute paths.
Node 3: Extract From File
- Node Type: Extract From File
- Purpose: Converts PDF to text and splits into chunks
- Configuration:
  - Input Binary Field: data
  - Operation: Extract From PDF
  - Options:
    - Chunk Size: 1000 (characters per chunk)
    - Chunk Overlap: 200 (overlap between chunks)
    - Include Metadata: true
Why these settings?
- Chunk size of 1000 balances context and specificity
- Overlap ensures important information isn't split across chunk boundaries (see the sketch below)
- Metadata helps track source locations
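To make the chunking behavior concrete, here is a minimal JavaScript sketch of fixed-size chunking with overlap. It is illustrative only; n8n's Extract From File node uses its own splitter:

```javascript
// Illustrative fixed-size chunker with overlap (not n8n's internal code).
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  const step = chunkSize - overlap; // each chunk starts 800 chars after the last
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// A sentence near position 1000 lands in both chunk 0 and chunk 1,
// so retrieval can still find it even if it straddles a boundary.
const chunks = chunkText("x".repeat(2500));
console.log(chunks.map((c) => c.length)); // [1000, 1000, 900]
```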
Node 4: Embeddings HuggingFace Inference
- Node Type: Embeddings HuggingFace Inference
- Purpose: Converts text chunks into vector embeddings
- Configuration:
  - Credential: Add HuggingFace API credentials
  - Model: sentence-transformers/distilbert-base-nli-mean-tokens
  - Output Dimension: 768
Credential Setup:
- Go to Credentials → Add Credential
- Select "HuggingFace API"
- Enter your API token
- Save as "HuggingFace Inference"
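Before wiring this into the workflow, you can sanity-check the model and token outside n8n. This sketch assumes HuggingFace's standard Inference API endpoint pattern for feature extraction; verify the URL and response shape for your account before relying on it:

```javascript
// Hypothetical standalone check that the model returns 768-dimension vectors.
const HF_TOKEN = process.env.HF_TOKEN; // your HuggingFace API token

async function embed(text) {
  const res = await fetch(
    "https://api-inference.huggingface.co/models/sentence-transformers/distilbert-base-nli-mean-tokens",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${HF_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: text }),
    }
  );
  return res.json();
}

embed("What is multi-head attention?").then((vec) => {
  // For sentence-transformers models this is typically a flat array of floats.
  console.log(Array.isArray(vec) ? vec.length : vec); // expect 768
});
```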
Node 5: Supabase Vector Store (Insert)
- Node Type: Supabase Vector Store
- Purpose: Stores embeddings in Supabase
- Configuration:
  - Credential: Add Supabase credentials
  - Operation: Insert Documents
  - Table Name: documents
  - Embedding Column: embedding
  - Content Column: content
  - Metadata Column: metadata
Credential Setup:
- Go to Credentials → Add Credential
- Select "Supabase API"
- Enter:
  - Host: Your Supabase project URL (e.g., https://xxxxx.supabase.co)
  - Service Role Key field: the anon/public key noted in Prerequisites
- Save as "Supabase"
Testing Flow 1:
- Click "Execute Workflow"
- Verify each node shows a green checkmark
- Check the final node output; it should show the inserted document count
- Verify in Supabase:
SELECT COUNT(*) FROM documents;
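Beyond the row count, it is worth eyeballing a few rows to confirm the chunks and metadata look sane; a quick spot-check:

```sql
-- Preview a few stored chunks and their metadata.
SELECT id, left(content, 80) AS preview, metadata
FROM documents
LIMIT 3;
```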
C) Build Flow 2: Conversational Retrieval
This flow runs every time a user asks a question.
Node 1: Chat Trigger
- Node Type: When chat message received
- Purpose: Receives user questions via the n8n chat interface
- Configuration:
  - Mode: Chat
  - Public: false (or true if you want public access)
  - Options: Default settings
Node 2: AI Agent
- Node Type: AI Agent
- Purpose: Orchestrates retrieval and answer generation
- Configuration:
  - Chat Model: Connect to the Groq Chat Model node
  - Tools: Connect to the Supabase Vector Store (Retrieve) node
  - Memory: Connect to the Window Buffer Memory node
  - System Message:
    You are a helpful AI assistant that answers questions about the research paper "Attention Is All You Need". Use the retrieval tool to find relevant information from the paper before answering. Base your answers strictly on the retrieved content; do not make up information. If the answer is not in the retrieved content, say so clearly. Provide clear, concise answers with specific details from the paper.
Node 3: Groq Chat Model
- Node Type: Groq Chat Model
- Purpose: LLM for generating answers
- Configuration:
  - Credential: Add Groq API credentials
  - Model: llama-3.1-70b-versatile (recommended) or mixtral-8x7b-32768
  - Temperature: 0.3 (lower = more focused, higher = more creative)
  - Max Tokens: 1024
Credential Setup:
- Go to Credentials → Add Credential
- Select "Groq API"
- Enter your Groq API key
- Save as "Groq"

Model Selection:
- llama-3.1-70b-versatile: Best quality, slower
- mixtral-8x7b-32768: Good balance of speed and quality
- llama-3.1-8b-instant: Fastest, lower quality
Node 4: Supabase Vector Store (Retrieve)
- Node Type: Supabase Vector Store
- Purpose: Retrieves relevant document chunks
- Configuration:
  - Credential: Use the same Supabase credential
  - Operation: Retrieve Documents (As Tool)
  - Table Name: documents
  - Embedding Column: embedding
  - Content Column: content
  - Metadata Column: metadata
  - Top K: 4 (number of chunks to retrieve)
  - Tool Description:
    Search the "Attention Is All You Need" paper for relevant information. Use this tool whenever you need to find specific details from the paper.
Why Top K = 4?
- Balances context breadth and token usage
- 4 chunks × 1000 chars ≈ 4000 chars of context
- Adjust based on question complexity
Node 5: Window Buffer Memory
- Node Type: Window Buffer Memory
- Purpose: Maintains conversation history (the idea is sketched below)
- Configuration:
  - Session Key: {{ $json.sessionId }} (auto-generated by the chat trigger)
  - Window Size: 5 (remember the last 5 exchanges)
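Conceptually, a window buffer keeps only the most recent exchanges so the prompt stays bounded. An illustrative JavaScript sketch (not n8n's implementation):

```javascript
// Illustrative sliding-window memory: keep only the last N exchanges.
class WindowBufferMemory {
  constructor(windowSize = 5) {
    this.windowSize = windowSize;
    this.messages = [];
  }

  add(role, content) {
    this.messages.push({ role, content });
    // One exchange = a user message plus an assistant reply.
    const max = this.windowSize * 2;
    if (this.messages.length > max) {
      this.messages = this.messages.slice(-max);
    }
  }
}

const memory = new WindowBufferMemory(5);
memory.add("user", "What is the Transformer model?");
memory.add("assistant", "An attention-based sequence model...");
```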
Connection Flow:
Chat Trigger → AI Agent
  ↳ Groq Chat Model (connected to AI Agent)
  ↳ Supabase Vector Store Tool (connected to AI Agent)
  ↳ Window Buffer Memory (connected to AI Agent)
Testing Flow 2:
- Click "Test Workflow" or use the chat interface
- Ask a question: "What is the main idea of the paper?"
- Verify the agent retrieves documents and generates an answer
- Check the response is grounded in the paper content
Common Challenges and Solutions
Challenge 1: Docker File Access Issues
Problem: n8n can't find the PDF file
Solution:
- Verify the Docker volume mount: docker inspect n8n | grep Mounts
- Use absolute paths in the Read Binary File node
- Check file permissions: chmod 644 attention_is_all_you_need.pdf
Challenge 2: Supabase Connection Errors
Problem: "Could not connect to Supabase"
Solutions:
- Verify the project URL format: https://xxxxx.supabase.co (no trailing slash)
- Use the anon/public key, not the service role key
- Check that the pgvector extension is enabled: SELECT * FROM pg_extension WHERE extname = 'vector';
Challenge 3: Embedding Dimension Mismatch
Problem: "Dimension mismatch error"
Solution:
- Ensure the Supabase table uses vector(768)
- Verify the HuggingFace model is sentence-transformers/distilbert-base-nli-mean-tokens
- Recreate the table if the dimension is wrong
Challenge 4: Poor Answer Quality
Problem: Answers are vague or incorrect
Solutions:
- Increase Top K to retrieve more context (try 5-6)
- Improve system message with more specific instructions
- Lower temperature (0.1-0.3) for more focused answers
- Verify chunks are being retrieved (check AI Agent execution logs)
Challenge 5: Rate Limiting
Problem: "Rate limit exceeded" errors
Solutions:
- HuggingFace: Use a paid tier or wait between requests
- Groq: Check your rate limits in the console
- Add error handling and retry logic
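The "retry logic" item above can be as simple as exponential backoff. A minimal illustrative sketch (inside n8n itself, the node-level "Retry On Fail" setting is usually the easier route):

```javascript
// Illustrative exponential backoff for rate-limited API calls.
async function withRetry(fn, maxAttempts = 3, baseDelayMs = 1000) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      // Wait 1s, 2s, 4s, ... before the next attempt.
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage: withRetry(() => embed("some chunk")).then(console.log);
```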
Challenge 6: Memory Not Working
Problem: Agent doesn't remember previous questions
Solution:
- Verify Window Buffer Memory is connected to AI Agent
- Check session ID is being passed correctly
- Increase window size if needed
Workflow Optimization Tips
1. Chunk Size Optimization
Experiment with different chunk sizes based on your use case:
- Small chunks (500-800): Better for specific facts
- Medium chunks (1000-1500): Balanced approach (recommended)
- Large chunks (2000+): Better for conceptual questions
2. Retrieval Tuning
Adjust Top K based on question type:
- Simple factual questions: Top K = 2-3
- Complex conceptual questions: Top K = 5-7
- Exploratory questions: Top K = 8-10
3. Prompt Engineering
Enhance the system message for better results:
You are an expert AI assistant specializing in the "Attention Is All You Need" paper.
RETRIEVAL STRATEGY:
- Always use the retrieval tool before answering
- Retrieve multiple times if the question has multiple parts
- If initial results are insufficient, try rephrasing the search query
ANSWER GUIDELINES:
- Quote specific sections when relevant
- Explain technical concepts clearly
- Mention figure/table numbers if applicable
- If uncertain, express appropriate confidence levels
CONSTRAINTS:
- Never fabricate information not in the paper
- Clearly distinguish between paper content and general knowledge
- If the paper doesn't address the question, say so explicitly
4. Error Handling
Add error handling nodes (the fallback logic is sketched after this list):
- IF node: Check if retrieval returned results
- Set node: Provide fallback responses
- HTTP Request node: Log errors to external service
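Here is what that fallback might look like if implemented in an n8n Code node instead. The documents field name is an assumption about the upstream node's output shape:

```javascript
// Hypothetical n8n Code node: fall back gracefully when retrieval is empty.
// The `documents` field name is an assumption about the upstream node's output.
const items = $input.all();
const docs = items[0]?.json?.documents ?? [];

if (docs.length === 0) {
  return [
    {
      json: {
        answer:
          "I couldn't find anything relevant in the paper for that question. " +
          "Try rephrasing or asking about a specific section.",
      },
    },
  ];
}
return items;
```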
5. Performance Monitoring
Track workflow performance (a simple timing helper is sketched after this list):
- Execution time per node
- Token usage (for cost estimation)
- Retrieval accuracy (manual spot-checks)
- User satisfaction (if collecting feedback)
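n8n already shows per-node execution times, but if you call APIs from a Code node, a tiny illustrative wrapper can log latency per step:

```javascript
// Illustrative helper: time any async step and log its duration.
async function timed(label, fn) {
  const start = Date.now();
  const result = await fn();
  console.log(`${label}: ${Date.now() - start} ms`);
  return result;
}

// Usage: const vec = await timed("embedding", () => embed("query"));
```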
Testing Scenarios
Test Questions
Use these questions to verify your workflow:
- Basic Factual:
  - "What is the Transformer model?"
  - "Who are the authors of this paper?"
- Technical Details:
  - "How does multi-head attention work?"
  - "What is the architecture of the encoder?"
- Comparative:
  - "How does the Transformer differ from RNNs?"
  - "What are the advantages over previous models?"
- Numerical:
  - "What is the model dimension (d_model)?"
  - "How many attention heads are used?"
- Conceptual:
  - "Why is attention important in this model?"
  - "What problem does this paper solve?"
Expected Behavior
✅ Good Response:
- Directly answers the question
- Includes specific details from the paper
- Cites relevant sections or concepts
- Stays within paper scope
❌ Poor Response:
- Generic answer not from the paper
- Fabricated information
- Overly vague or incomplete
- Ignores retrieved context
Deliverables Checklist
1. Exported Workflow JSON
How to Export:
- Open your workflow in n8n
- Click the "..." menu (top right)
- Select "Download"
- Save as chatmate_workflow.json
What to Include:
- Both Flow 1 and Flow 2 in the same workflow
- All node configurations
- Credentials (will be exported as placeholders)
Credential Export
The exported JSON will NOT include your actual API keys. Document which credentials are needed separately.
2. One-Page Report
Template Structure:
# n8n Chatmate Workflow Report
## Nodes Used and Configuration
### Flow 1: Document Ingestion
1. **Manual Trigger**: [brief description]
2. **Read Binary File**: [configuration details]
3. **Extract From File**: [chunk size, overlap settings]
4. **Embeddings HuggingFace**: [model, dimensions]
5. **Supabase Vector Store**: [table, columns]
### Flow 2: Conversational Retrieval
1. **Chat Trigger**: [configuration]
2. **AI Agent**: [system message, settings]
3. **Groq Chat Model**: [model choice, temperature]
4. **Supabase Vector Store (Tool)**: [Top K, description]
5. **Window Buffer Memory**: [window size]
## Challenges and Solutions
### Challenge 1: [Name]
**Problem**: [Description]
**Solution**: [How you solved it]
### Challenge 2: [Name]
**Problem**: [Description]
**Solution**: [How you solved it]
## Key Learnings
[2-3 sentences about what you learned]
## Performance Notes
- Average response time: [X seconds]
- Typical token usage: [X tokens]
- Retrieval accuracy: [subjective assessment]

3. One-Minute Demo Video
Recording Checklist:
- Show the n8n workflow canvas (both flows visible)
- Execute Flow 1 (document ingestion) - show success
- Switch to chat interface
- Ask a test question (e.g., "What is the Transformer model?")
- Show the answer being generated
- Highlight that the answer is grounded in the paper
- (Optional) Ask a follow-up question to show memory
Recording Tools:
- Screen Recording: OBS Studio, Loom, or built-in screen recorder
- Format: MP4, WebM, or MOV
- Resolution: 1080p recommended
- Duration: 45-60 seconds
Script Example:
[0:00-0:10] "Here's my n8n workflow with two flows: document ingestion and conversational retrieval."
[0:10-0:20] "First, I run the ingestion flow to process the PDF." [Click Execute]
[0:20-0:30] "Now I can ask questions in the chat interface." [Type question]
[0:30-0:50] "The AI agent retrieves relevant sections and generates an accurate answer based on the paper."
[0:50-1:00] "The system successfully answers questions using vector search and LLM reasoning."
Evaluation Criteria Breakdown
Workflow Completeness and Correctness (10 points)
Full Credit (9-10 points):
- Both flows work correctly
- All nodes properly configured
- Successful document ingestion
- Accurate answer generation
- Proper error handling
Partial Credit (6-8 points):
- Workflows mostly work with minor issues
- Some configuration suboptimal
- Occasional errors but generally functional
Low Credit (3-5 points):
- Significant functionality issues
- Missing key components
- Frequent errors
Minimal Credit (0-2 points):
- Workflow doesn't run
- Major components missing
Report Clarity and Reflection (4 points)
Full Credit (4 points):
- Clear node descriptions
- Specific configuration details
- Meaningful challenges and solutions
- Insightful reflections
Partial Credit (2-3 points):
- Basic descriptions
- Generic challenges
- Limited reflection
Low Credit (0-1 points):
- Vague or missing information
- No meaningful reflection
Workflow Efficiency and Clean Design (6 points)
Full Credit (5-6 points):
- Optimal node configuration
- Clean, organized layout
- Efficient resource usage
- Good naming conventions
- Proper error handling
Partial Credit (3-4 points):
- Functional but not optimized
- Acceptable organization
- Some inefficiencies
Low Credit (0-2 points):
- Messy or confusing layout
- Inefficient configuration
- Poor organization
Advanced Enhancements (Optional)
1. Multi-Document Support
Extend the workflow to handle multiple papers:
-- Add document_id to track different papers
ALTER TABLE documents ADD COLUMN document_id text;
CREATE INDEX idx_document_id ON documents(document_id);

Modify the ingestion flow to tag each chunk with a document ID.
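One way to do the tagging is a small Code node between extraction and insert; a hypothetical sketch (field names depend on your upstream node's output):

```javascript
// Hypothetical n8n Code node: stamp each chunk with a document ID before insert.
return $input.all().map((item) => ({
  json: {
    ...item.json,
    metadata: {
      ...(item.json.metadata ?? {}),
      document_id: "attention-is-all-you-need",
    },
  },
}));
```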
2. Citation Tracking
Include page numbers and sections in responses:
// In metadata during ingestion
{
"page": pageNumber,
"section": sectionTitle,
"document": "Attention Is All You Need"
}

Update the system message to include citations.
3. Hybrid Search
Combine vector search with keyword search:
-- Add full-text search
ALTER TABLE documents ADD COLUMN content_tsv tsvector;
CREATE INDEX idx_content_tsv ON documents USING gin(content_tsv);

Use both similarity and keyword matching.
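Note that the snippet above never populates content_tsv. One option is a generated column that stays in sync automatically (a sketch assuming Postgres 12+, which Supabase provides; verify before running against real data):

```sql
-- Replace the plain column with a generated one that stays in sync.
ALTER TABLE documents DROP COLUMN content_tsv; -- also drops the GIN index
ALTER TABLE documents
  ADD COLUMN content_tsv tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(content, ''))) STORED;
CREATE INDEX idx_content_tsv ON documents USING gin(content_tsv);

-- Keyword half of a hybrid query:
SELECT content
FROM documents
WHERE content_tsv @@ plainto_tsquery('english', 'multi-head attention')
LIMIT 5;
```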
4. Answer Validation
Add a validation step to check answer quality:
AI Agent → Generate Answer → Validation Agent → Return to User
Validation agent checks:
- Answer is grounded in retrieved text
- No hallucinations
- Appropriate confidence level
5. Feedback Loop
Collect user feedback to improve retrieval:
User Question → Retrieve → Generate → Get Feedback → Log for Analysis
Use feedback to tune Top K, chunk size, and prompts.
Troubleshooting Guide
Issue: "No documents retrieved"
Diagnosis:
1. Check if documents were inserted: SELECT COUNT(*) FROM documents;
2. Verify the embedding column exists and has data
3. Test similarity search manually:
   SELECT content, 1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
   FROM documents
   ORDER BY similarity DESC
   LIMIT 5;
Fix:
- Re-run Flow 1 if no documents
- Check embedding model consistency
- Verify Supabase credentials
Issue: "Agent not using retrieval tool"
Diagnosis:
- Check AI Agent logs for tool calls
- Verify tool description is clear
- Check if system message encourages tool use
Fix:
- Improve tool description
- Update system message to explicitly instruct tool usage
- Lower LLM temperature (makes it more likely to follow instructions)
Issue: "Answers are hallucinated"
Diagnosis:
- Check retrieved chunks - are they relevant?
- Review LLM temperature setting
- Examine system message constraints
Fix:
- Strengthen system message constraints
- Lower temperature to 0.1-0.2
- Add explicit "do not hallucinate" instructions
- Increase Top K to provide more context
Issue: "Slow response times"
Diagnosis:
- Check which node is slowest (execution times)
- Monitor API rate limits
- Check network latency
Fix:
- Use faster LLM model (e.g., llama-3.1-8b-instant)
- Reduce Top K to retrieve fewer chunks
- Optimize chunk size
- Consider caching frequent queries
Issue: "Out of memory errors"
Diagnosis:
- Check chunk size and count
- Monitor token usage
- Review memory window size
Fix:
- Reduce chunk size
- Lower Top K
- Decrease memory window size
- Use smaller LLM model
Best Practices Summary
✅ Do's
- Test incrementally: Build and test each flow separately
- Use descriptive names: Name nodes clearly (e.g., "Retrieve Paper Chunks")
- Document configurations: Keep notes on why you chose specific settings
- Monitor costs: Track API usage for HuggingFace and Groq
- Version control: Export the workflow regularly as a backup
- Validate answers: Spot-check responses against the paper
- Optimize iteratively: Start simple, then tune performance

❌ Don'ts
- Don't skip testing: Always test Flow 1 before building Flow 2
- Don't use default prompts: Customize system messages for your use case
- Don't ignore errors: Address warnings and errors immediately
- Don't over-engineer: Start with the basic setup, add complexity only if needed
- Don't expose credentials: Never commit API keys to version control
- Don't forget cleanup: Remove test data from Supabase when done
- Don't assume: Verify each step works before moving to the next
Additional Resources
Official Documentation
- n8n Documentation
- n8n AI Nodes Guide
- Supabase Vector Guide
- HuggingFace Inference API
- Groq Documentation
Related Papers
- Attention Is All You Need (Original Paper)
- BERT: Pre-training of Deep Bidirectional Transformers
- Retrieval-Augmented Generation (RAG)
Conclusion
This solution provides a complete, production-ready implementation of a document question-answering system using n8n. The workflow demonstrates:
- Document processing: PDF to searchable vector embeddings
- Semantic search: Vector similarity for relevant chunk retrieval
- AI reasoning: LLM-powered answer generation
- Conversation memory: Multi-turn dialogue support
By following this guide, you should be able to:
- ✅ Build both ingestion and retrieval flows
- ✅ Configure all nodes correctly
- ✅ Troubleshoot common issues
- ✅ Optimize for performance and accuracy
- ✅ Complete all deliverables successfully
Key Takeaways:
- RAG (Retrieval-Augmented Generation) grounds LLM answers in factual content
- Vector embeddings enable semantic search beyond keyword matching
- Proper chunking and retrieval tuning are critical for answer quality
- n8n's visual workflow builder makes complex AI systems accessible
Good luck with your assignment! 🎉
Appendix: Complete Node Configuration Reference
Flow 1 Node Details
| Node | Type | Key Settings |
|---|---|---|
| Trigger | Manual Trigger | - |
| Read File | Read Binary File | Path: /data/attention_is_all_you_need.pdf |
| Extract | Extract From File | Chunk: 1000, Overlap: 200 |
| Embed | HuggingFace Embeddings | Model: distilbert-base-nli-mean-tokens, Dim: 768 |
| Store | Supabase Vector Store | Operation: Insert, Table: documents |
Flow 2 Node Details
| Node | Type | Key Settings |
|---|---|---|
| Trigger | Chat Trigger | Mode: Chat |
| Agent | AI Agent | System message, connected to LLM + Tools + Memory |
| LLM | Groq Chat Model | Model: llama-3.1-70b-versatile, Temp: 0.3 |
| Retrieval | Supabase Vector Store | Operation: Retrieve Tool, Top K: 4 |
| Memory | Window Buffer Memory | Window: 5 |
Document Information
Version: 1.0
Last Updated: 2025-11-25
Course: GenAI - Assignment 1