Using Retrieval-Augmented Generation (RAG)
Enhance your agents with knowledge from external sources using RAG. This allows your agents to answer questions and complete tasks based on your own documents and data.
Overview
RAG (Retrieval-Augmented Generation) allows your agents to access and reference specific information from your documents, databases, and knowledge sources. When a user asks a question, the agent can search through your uploaded content to provide accurate, contextual responses based on your actual data.
Prerequisites
Before using RAG, ensure you have:
- Database and embedding provider configured - RAG stores vectors in PostgreSQL and needs credentials for the selected embedding provider. The default embedding model is OpenAI text-embedding-3-small.
- Agent with searchKnowledgeBase tool enabled - This is the critical step most users miss
Enable the searchKnowledgeBase Tool
IMPORTANT: You must enable the searchKnowledgeBase tool for your agent to access the knowledge base.
Via Admin UI (Recommended)
- Navigate to the Admin Console → Agents
- Select or create your agent
- Go to the Tools section
- Enable the searchKnowledgeBase tool
- Configure RAG sources in the Knowledge Base section
- Save your changes
Via Configuration File
Alternatively, manually edit your agent's config.json:
{
  "toolsAllEnabled": false,
  "toolsEnabled": ["searchKnowledgeBase", "otherTools..."],
  "ragAllEnabled": true,
  "ragEnabled": []
}
Without enabling this tool, your agent cannot access the knowledge base, even if RAG sources are configured.
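As a quick sanity check before deploying an agent, a minimal sketch like the one below can confirm that a config enables the tool. It assumes the config.json fields shown above, and it assumes that toolsAllEnabled: true turns on every tool including searchKnowledgeBase (inferred from the field name, not confirmed here). The function name has_search_tool is hypothetical.

```python
import json

TOOL_NAME = "searchKnowledgeBase"


def has_search_tool(config: dict) -> bool:
    """Return True if the agent config enables searchKnowledgeBase.

    Assumption: toolsAllEnabled=true enables every tool, including this one.
    """
    if config.get("toolsAllEnabled", False):
        return True
    return TOOL_NAME in config.get("toolsEnabled", [])


# Example config mirroring the fields shown in this guide.
config = json.loads("""
{
  "toolsAllEnabled": false,
  "toolsEnabled": ["searchKnowledgeBase"],
  "ragAllEnabled": true,
  "ragEnabled": []
}
""")

print(has_search_tool(config))  # prints True
```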
Setting Up Knowledge Sources
Creating Knowledge Bases
Knowledge bases are collections of related resources that you can selectively enable for different agents:
- Navigate to the Admin Console → Knowledge Base
- Click "Create Knowledge Base"
- Fill in the details:
  - Name: company-policies
  - Description: Employee handbook and company policies
- Save the knowledge base
Uploading Documents
Upload documents to your knowledge bases through the Admin Console:
- Go to your knowledge base
- Click "Upload File" or "Upload Files"
- Select your files (drag & drop or file picker)
- Wait for processing to complete
Supported file types:
- PDF documents (.pdf)
- Text files (.txt)
- Markdown files (.md)
- Word documents (.docx)
- Spreadsheets (.xlsx)
- CSV files (.csv)
- SubRip subtitle files (.srt)
- Rich Text Format files (.rtf)
- Images (.jpg, .jpeg, .png, .gif, .webp)
Processing Pipeline
When you upload documents:
- Text Extraction: Content is extracted from files
- Chunking: Documents are split into manageable chunks
- Embedding Generation: Each chunk is converted to vector embeddings with the current embedding model
- Storage: Embeddings, metadata, and the embedding_model used for each chunk are stored in your database
- Indexing: Content becomes searchable by your agents
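To make the chunking and storage steps more concrete, here is a rough sketch of one common approach: fixed-size character windows with overlap. The actual chunking strategy used by the platform is not documented here, so the window size, overlap, function names, and record fields other than embedding_model are all assumptions for illustration. The embedding call itself is omitted because it depends on your provider.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical strategy)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def build_records(text: str, embedding_model: str = "text-embedding-3-small") -> list[dict]:
    """Attach the embedding model name to each chunk, as the Storage step describes.

    The vector itself would be added by the embedding provider call (not shown).
    """
    return [
        {"chunk_index": i, "content": chunk, "embedding_model": embedding_model}
        for i, chunk in enumerate(chunk_text(text))
    ]


records = build_records("a" * 500)
print(len(records))  # prints 3
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of storing some text twice.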
Embedding Models
Knowledge-base and offer embeddings share the same vector storage rules:
- Embeddings are stored as halfvec(1536).
- The default model is text-embedding-3-small.
- Supported knowledge-base models are text-embedding-3-small, gemini-embedding-001, and gemini-embedding-2.
- Gemini embedding requests are constrained to 1536 output dimensions.
- text-embedding-3-large is not available in the Knowledge Base embedding-model UI because the shared storage is fixed at 1536 dimensions.
Changing the embedding model affects newly generated or regenerated content. Existing chunks keep their stored embedding_model, so mixed content can continue to search safely. Resource APIs expose embeddingModelInfo with the chunk count, stored models, primary model, and none/single/mixed state. The Knowledge page and pathway knowledge-base picker use that metadata to warn when an item was embedded with a different model than the current setting, or when one item contains chunks from multiple embedding models.
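The none/single/mixed state can be illustrated with a short sketch that derives it from the embedding_model values stored on a resource's chunks. The dictionary keys below are hypothetical stand-ins for the embeddingModelInfo fields, and treating the most frequent model as the primary model is an assumption, not confirmed behavior.

```python
from collections import Counter
from typing import Optional


def embedding_model_info(chunk_models: list[str]) -> dict:
    """Summarize which embedding models a resource's chunks were stored with."""
    counts = Counter(chunk_models)
    if not counts:
        state = "none"
    elif len(counts) == 1:
        state = "single"
    else:
        state = "mixed"
    # Assumption: the primary model is the one used by the most chunks.
    primary: Optional[str] = counts.most_common(1)[0][0] if counts else None
    return {
        "chunkCount": len(chunk_models),
        "models": sorted(counts),
        "primaryModel": primary,
        "state": state,
    }


def needs_warning(info: dict, current_model: str) -> bool:
    """Warn on mixed chunks, or when a single stored model differs from the current setting."""
    if info["state"] == "mixed":
        return True
    return info["state"] == "single" and info["primaryModel"] != current_model


info = embedding_model_info(
    ["text-embedding-3-small", "text-embedding-3-small", "gemini-embedding-001"]
)
print(info["state"], needs_warning(info, "text-embedding-3-small"))  # prints: mixed True
```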
Configuring Agents for RAG
Option 1: Enable All Knowledge Sources
Allow your agent to access all available knowledge:
{
  "ragAllEnabled": true,
  "ragEnabled": [],
  "toolsEnabled": ["searchKnowledgeBase"]
}
Option 2: Selective Knowledge Access
Limit your agent to specific knowledge bases:
{
  "ragAllEnabled": false,
  "ragEnabled": ["kb_company_policies_id", "kb_technical_docs_id"],
  "toolsEnabled": ["searchKnowledgeBase"]
}
Use knowledge-base IDs for ragEnabled. Older saved configs that contain names still resolve for compatibility, but new configs should store IDs.
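The two options can be summarized as a small resolution function. This is a sketch under stated assumptions: it treats ragAllEnabled: true as taking precedence over the ragEnabled list, it matches by ID only (the legacy name fallback is left out), and the function name and sample IDs are hypothetical.

```python
def resolve_knowledge_bases(config: dict, available_ids: list[str]) -> list[str]:
    """Return the knowledge-base IDs an agent may search.

    Assumption: ragAllEnabled=true grants access to every available knowledge base.
    """
    if config.get("ragAllEnabled", False):
        return list(available_ids)
    enabled = set(config.get("ragEnabled", []))
    # Keep the order of available_ids and silently drop IDs that no longer exist.
    return [kb_id for kb_id in available_ids if kb_id in enabled]


available = ["kb_company_policies_id", "kb_technical_docs_id", "kb_procedures_id"]

all_access = {"ragAllEnabled": True, "ragEnabled": []}
selective = {"ragAllEnabled": False, "ragEnabled": ["kb_technical_docs_id"]}

print(resolve_knowledge_bases(all_access, available))
print(resolve_knowledge_bases(selective, available))
```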
Real-World Examples
HR Assistant with Company Policies
{
  "id": "hr-assistant-001",
  "name": "HR Assistant",
  "chatDisplayName": "HR Bot",
  "description": "Helps employees understand company policies and procedures",
  "prompt": "You are an HR assistant. Use the searchKnowledgeBase tool to find relevant company policies and procedures when answering employee questions. Always cite the specific source when providing policy information.",
  "toolsEnabled": ["searchKnowledgeBase"],
  "ragAllEnabled": false,
  "ragEnabled": ["kb_employee_handbook_id", "kb_benefits_guide_id", "kb_hr_policies_id"]
}
Technical Support Agent
{
  "id": "tech-support-001",
  "name": "Technical Support",
  "chatDisplayName": "TechBot",
  "description": "Provides technical support using product documentation",
  "prompt": "You are a technical support agent. Search the knowledge base for relevant troubleshooting guides, product documentation, and known solutions before responding to technical issues.",
  "toolsEnabled": ["searchKnowledgeBase", "createTicket"],
  "ragAllEnabled": false,
  "ragEnabled": ["kb_product_docs_id", "kb_troubleshooting_guides_id", "kb_api_docs_id"]
}
How RAG Search Works
Automatic Search Behavior
The searchKnowledgeBase tool is designed to be used automatically by your agent when:
- Users ask questions that might reference specific information
- The agent needs context to provide accurate answers
- Users mention documents, policies, or specific topics
- The agent is unsure about factual information
Search Process
- Query Analysis: The user's question is analyzed and reformulated as a search query
- Candidate Model Discovery: Eligible chunks are grouped by their stored embedding_model
- Embedding Generation: The query is embedded once for each stored model found in scope
- Similarity Search: Each query vector is compared only with chunks stored for the same embedding_model
- Relevance Filtering: Only content above the agent's similarity threshold is returned. The default threshold is 0.3.
- Agent Integration: Retrieved content is provided to the agent for response generation
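The steps above can be sketched in plain Python. This is an illustration, not the platform's implementation: the chunk record layout, the embedders mapping, and the toy two-dimensional vectors are hypothetical, cosine similarity is assumed as the metric, and merging scores from different models into one ranked list is also an assumption.

```python
import math
from collections import defaultdict


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, embedders, threshold=0.3, limit=6):
    # Candidate model discovery: group chunks by their stored embedding_model.
    by_model = defaultdict(list)
    for chunk in chunks:
        by_model[chunk["embedding_model"]].append(chunk)

    results = []
    for model, group in by_model.items():
        # The query is embedded once per stored model found in scope.
        query_vector = embedders[model](query)
        for chunk in group:
            # Each query vector is compared only with chunks of the same model.
            score = cosine(query_vector, chunk["embedding"])
            if score > threshold:  # relevance filtering
                results.append({"content": chunk["content"], "similarity": score})

    results.sort(key=lambda r: r["similarity"], reverse=True)
    return results[:limit]


# Toy embedders standing in for real provider calls (hypothetical).
embedders = {
    "model-a": lambda q: [1.0, 0.0],
    "model-b": lambda q: [0.0, 1.0],
}
chunks = [
    {"content": "vacation policy", "embedding_model": "model-a", "embedding": [1.0, 0.0]},
    {"content": "expense policy", "embedding_model": "model-a", "embedding": [0.0, 1.0]},
    {"content": "benefits guide", "embedding_model": "model-b", "embedding": [0.6, 0.8]},
]

hits = search("How many vacation days do I get?", chunks, embedders)
print([h["content"] for h in hits])
```

In this toy data, the "expense policy" chunk scores 0.0 against the model-a query vector and is filtered out by the 0.3 threshold.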
Search Results
Each search returns relevant content chunks up to the agent's configured citation limit (6 by default). Each result has the following shape:
{
  "content": "The actual text content from your documents",
  "similarity": 0.85,
  "source": {
    "filename": "employee-handbook.pdf",
    "fileType": "pdf",
    "metadata": {
      "page": 15,
      "section": "Vacation Policy"
    }
  }
}
Best Practices
1. Document Organization
Structure your knowledge base logically:
Knowledge Bases:
├── company-policies/
│   ├── employee-handbook.pdf
│   ├── code-of-conduct.pdf
│   └── benefits-guide.pdf
├── technical-docs/
│   ├── api-documentation.md
│   ├── troubleshooting-guide.pdf
│   └── installation-manual.pdf
└── procedures/
    ├── onboarding-checklist.pdf
    ├── expense-policy.pdf
    └── security-guidelines.pdf
2. Agent Prompt Engineering
Include RAG usage instructions in your agent prompts:
{
  "prompt": "You are a customer service agent. When users ask questions about policies, products, or procedures, always use the searchKnowledgeBase tool first to find the most current and accurate information. Cite your sources and provide specific page numbers or section references when available."
}
3. Content Quality
Ensure high-quality source material:
- Clear Structure: Use headings, bullet points, and clear formatting
- Current Information: Keep documents up-to-date
- Comprehensive Coverage: Include all relevant topics
- Consistent Terminology: Use consistent language across documents
4. Testing RAG Performance
Test your knowledge base regularly:
- Ask specific questions about content you know exists
- Test edge cases and ambiguous queries
- Verify accuracy of retrieved information
- Check source attribution in agent responses
5. Monitoring and Optimization
Monitor RAG performance:
- Search Success Rate: How often relevant content is found
- Response Accuracy: Quality of agent answers
- User Satisfaction: Feedback on knowledge-based responses
- Content Gaps: Topics where no relevant content is found
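If you record searches yourself, two of these metrics are simple to compute. The sketch below assumes a hypothetical log format (a list of entries with query and result_count fields); this guide does not describe a built-in search log, so treat the structure as illustrative only.

```python
def summarize_searches(log: list[dict]) -> dict:
    """Compute search success rate and content gaps from a hypothetical search log."""
    total = len(log)
    hits = sum(1 for entry in log if entry["result_count"] > 0)
    # Content gaps: distinct queries that returned no chunks above the threshold.
    gaps = sorted({entry["query"] for entry in log if entry["result_count"] == 0})
    return {
        "total": total,
        "success_rate": hits / total if total else 0.0,
        "content_gaps": gaps,
    }


log = [
    {"query": "vacation days", "result_count": 4},
    {"query": "parental leave", "result_count": 0},
    {"query": "expense limits", "result_count": 2},
    {"query": "reset password", "result_count": 6},
]

summary = summarize_searches(log)
print(summary["success_rate"], summary["content_gaps"])  # prints: 0.75 ['parental leave']
```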
Troubleshooting
Agent Not Using Knowledge Base
Check these common issues:
- Tool Not Enabled: Ensure searchKnowledgeBase is in toolsEnabled array
- No RAG Access: Verify ragAllEnabled: true or specific sources in ragEnabled
- Empty Knowledge Base: Confirm documents are uploaded and processed
- Permissions: Check agent has access to the knowledge bases
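The first three checks can be combined into one diagnostic pass, sketched below. It assumes you can obtain a mapping from knowledge-base ID to processed chunk count (the processed_chunks argument is hypothetical), and it assumes the toolsAllEnabled and ragAllEnabled flags grant everything when true. The permissions check is left out because it depends on your deployment.

```python
def diagnose(config: dict, processed_chunks: dict[str, int]) -> list[str]:
    """List likely reasons an agent is not using its knowledge base."""
    issues = []

    # Check 1: the searchKnowledgeBase tool must be enabled.
    tool_on = config.get("toolsAllEnabled", False) or (
        "searchKnowledgeBase" in config.get("toolsEnabled", [])
    )
    if not tool_on:
        issues.append("searchKnowledgeBase is not enabled")

    # Check 2: the agent needs at least one knowledge base in scope.
    if config.get("ragAllEnabled", False):
        scope = list(processed_chunks)
    else:
        scope = list(config.get("ragEnabled", []))
    if not scope:
        issues.append("no knowledge bases in scope")

    # Check 3: every knowledge base in scope should contain processed content.
    for kb_id in scope:
        if processed_chunks.get(kb_id, 0) == 0:
            issues.append(f"knowledge base {kb_id} has no processed chunks")

    return issues


broken = {"toolsEnabled": [], "ragAllEnabled": False, "ragEnabled": ["kb_a"]}
for issue in diagnose(broken, {"kb_a": 0}):
    print(issue)
```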
Model Warning Badges
The Knowledge page and pathway picker can show a model warning badge for a resource:
- Current-model mismatch: The item was embedded with a different model than the current embedding setting. Search still uses the item's stored model until it is regenerated.
- Mixed chunk models: The item has chunks embedded with more than one model. Search compares each chunk with a query vector from its own stored model, but regenerating the item makes it consistent again.
Poor Search Results
Improve search quality:
- Content Quality: Ensure documents are well-structured and readable
- Query Refinement: Test different ways of asking questions
- Chunk Size: Consider document chunking strategy
- Similarity Threshold: Adjust if needed (default is 0.3)
Performance Issues
Optimize for speed:
- Selective Access: Use specific ragEnabled instead of ragAllEnabled
- Content Volume: Monitor total content size
- Database Performance: Ensure proper indexing
- Embedding Cache: Embeddings are cached for efficiency
Advanced Configuration
Custom Search Behavior
For advanced use cases, you can extend the search functionality by modifying the searchKnowledgeBase tool or creating custom tools that integrate with the embedding system.
Integration with External Systems
RAG can be combined with external APIs and services to provide comprehensive information access:
{
  "toolsEnabled": ["searchKnowledgeBase", "searchExternalAPI", "checkLiveData"]
}
Multi-Language Support
Supported embedding models include multilingual options. If you upload documents in different languages, search can work across language boundaries to some extent, depending on the selected embedding model and source quality.
Summary
RAG transforms your agents from general AI assistants into knowledgeable experts on your specific domain. By properly configuring the searchKnowledgeBase tool and organizing your knowledge sources, you can create agents that provide accurate, contextual responses based on your actual data and documentation.
Remember: Always enable the searchKnowledgeBase tool in your agent configuration - this is the key that unlocks your knowledge base for your agents.