
Tenant-Specific RAG Platform

Build AI-powered knowledge bases scoped to individual tenants, enabling each organization to upload, index, and query their own documents through the Knowledge Hub.

Overview

The Tenant-Specific RAG Platform provides every tenant in Olympus Cloud with an isolated, AI-powered knowledge base. Unlike the platform-wide RAG system that indexes shared documentation (support articles, release notes, user guides), the tenant RAG platform lets each organization maintain its own private knowledge corpus.

Platform-Wide RAG vs. Tenant RAG

| Feature | Platform-Wide RAG | Tenant RAG |
|---|---|---|
| Scope | Shared across all tenants | Private per tenant |
| Content | Olympus docs, guides, FAQs | Tenant-uploaded documents |
| Index naming | `support-kb`, `sales-kb`, `docs-embeddings` | `minerva-knowledge-base-{tenant_id}` |
| Management | Platform team maintains | Tenant admins manage |
| Use case | Agent context (Maximus, Minerva) | Tenant-specific Q&A, support, operations |
| Configuration | See RAG Configuration | This document |

Key Capabilities

  • Multi-format document ingestion -- PDF, Markdown, video transcripts, support tickets, FAQs, release notes, articles, and troubleshooting guides
  • Hybrid search -- Combines semantic vector search with BM25-style keyword matching, fusing the two result sets via Reciprocal Rank Fusion (RRF)
  • Per-tenant isolation -- Each tenant gets a dedicated Vectorize index with tenant-scoped queries
  • Answer generation with citations -- LLM-powered answers that cite source documents with confidence scoring
  • Usage metering -- Track document counts, chunk counts, word counts, and query volume per tenant

Knowledge Hub Architecture

The Knowledge Hub processes documents through a multi-stage pipeline before they become queryable:

```
                 Knowledge Hub Pipeline

  Upload/API           Extract Text           Chunk Document
 +-----------+        +-------------+        +----------------+
 | PDF       |        | pypdf       |        | Semantic       |
 | Markdown  |  --->  | Markdown    |  --->  | Fixed-size     |
 | Ticket    |        | Cleanup     |        | Paragraph      |
 | Video     |        | Normalize   |        | (configurable) |
 +-----------+        +-------------+        +-------+--------+
                                                     |
                                                     v
  Query Pipeline          Index                   Embed
 +-------------+        +-------------+        +------------+
 | Hybrid      |        | Vectorize   |        | BGE Base   |
 | Search      |  <---  | + Keyword   |  <---  | 768-dim    |
 | + Answer    |        | Index       |        | Workers AI |
 | Generation  |        +-------------+        +------------+
 +-------------+
```

Component Responsibilities

| Component | Class | Source File |
|---|---|---|
| Ingestion Pipeline | `DocumentIngester` | `backend/python/app/services/knowledge_base/service.py` |
| Chunking Strategy | `ChunkingStrategy` | `backend/python/app/services/knowledge_base/service.py` |
| Hybrid Search | `HybridSearchEngine` | `backend/python/app/services/knowledge_base/service.py` |
| Answer Generator | `AnswerGenerator` | `backend/python/app/services/knowledge_base/service.py` |
| API Routes | `router` | `backend/python/app/api/knowledge_base_routes.py` |
| Vectorize Client | `VectorizeClient` | `backend/python/app/clients/vectorize_client.py` |
| Minerva KB Service | `MinervaKnowledgeBaseService` | `backend/python/app/services/minerva/knowledge_base.py` |

Document Ingestion

Supported Document Types

The DocumentType enum defines the content types that the Knowledge Hub can ingest:

| Type | Enum Value | Processing Method | Notes |
|---|---|---|---|
| Markdown | `markdown` | Frontmatter extraction, content cleanup | Default type; strips YAML frontmatter |
| PDF | `pdf` | Text extraction via pypdf | Multi-page support with per-page extraction |
| Video Transcript | `video_transcript` | Timestamp removal, speaker label cleanup | Normalizes raw transcription output |
| Support Ticket | `support_ticket` | Issue/resolution formatting | Formats resolved tickets as knowledge articles |
| Release Notes | `release_notes` | Version tagging | Automatically titles with version number |
| FAQ | `faq` | Direct processing | Short-answer optimized |
| Troubleshooting | `troubleshooting` | Direct processing | Step-by-step resolution content |
| Article | `article` | Direct processing | General knowledge content |

Ingestion Pipeline Details

Each document goes through the following stages:

1. Content Extraction

The DocumentIngester class handles format-specific text extraction:

```python
# Markdown: strips frontmatter, normalizes whitespace
document = await ingester.ingest_markdown(
    content="# Getting Started\n...",
    tenant_id="tenant-123",
    title="Getting Started Guide",
    category="onboarding",
    tags=["setup", "quickstart"],
)

# PDF: extracts text from all pages using pypdf
document = await ingester.ingest_pdf(
    pdf_content=pdf_bytes,
    tenant_id="tenant-123",
    title="Operations Manual",
)

# Support Ticket: formats issue + resolution as a knowledge article
document = await ingester.ingest_support_ticket(
    ticket_content={
        "title": "POS Not Printing Receipts",
        "issue": "Receipts stopped printing after firmware update...",
        "resolution": "Reset printer spooler and re-pair Bluetooth...",
        "category": "hardware",
        "tags": ["printer", "pos", "bluetooth"],
    },
    tenant_id="tenant-123",
)
```

2. Content Hashing

Each document receives a SHA-256 content hash (first 16 hex characters) for change detection, enabling efficient re-indexing when content is updated.
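The hashing code itself isn't shown in this document; a minimal sketch of the scheme described above (the helper name is illustrative) looks like:

```python
import hashlib

def content_hash(content: str) -> str:
    """First 16 hex characters of the SHA-256 digest: stable across
    re-ingests of identical content, different whenever the text changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
```

Comparing the stored hash against the hash of newly uploaded content lets the pipeline skip re-chunking and re-embedding unchanged documents.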

3. Chunking

Documents are split into chunks using one of three strategies:

| Strategy | Description | Best For |
|---|---|---|
| Semantic (default) | Splits on markdown headers and paragraphs, respects section boundaries | Structured documents with headers |
| Fixed | Fixed character window with overlap, avoids mid-word splits | Unstructured text, transcripts |
| Paragraph | Splits on double newlines | Simple documents, articles |

Chunking parameters:

| Parameter | Default | Range | Purpose |
|---|---|---|---|
| `chunk_size` | 512 | -- | Target characters per chunk |
| `chunk_overlap` | 50 | -- | Overlap between consecutive chunks |
| `min_chunk_size` | 100 | -- | Discard chunks smaller than this |
| `max_chunk_size` | 1024 | -- | Split sections exceeding this |
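To illustrate how these parameters interact, here is a simplified sketch of a fixed-window chunker with overlap (not the `ChunkingStrategy` implementation itself):

```python
def fixed_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50,
                 min_chunk_size: int = 100) -> list[str]:
    """Fixed character windows with overlap; backs up to the last
    whitespace to avoid mid-word splits, and drops fragments shorter
    than min_chunk_size."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # avoid splitting mid-word: back up to the last space in the window
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        chunk = text[start:end].strip()
        if len(chunk) >= min_chunk_size:
            chunks.append(chunk)
        if end >= len(text):
            break
        # next window starts chunk_overlap characters before the previous end
        start = max(end - chunk_overlap, start + 1)
    return chunks
```

Larger overlap reduces the chance that an answer-bearing sentence is cut across a chunk boundary, at the cost of more chunks (and therefore more embeddings) per document.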

4. Embedding

Chunks are embedded using Workers AI's BGE model (@cf/baai/bge-base-en-v1.5) producing 768-dimensional vectors. The AI Gateway client handles batch embedding for efficiency.

5. Indexing

Embedded chunks are upserted into Cloudflare Vectorize with metadata:

```python
# Metadata stored with each vector
{
    "document_id": "doc-uuid",
    "tenant_id": "tenant-123",
    "doc_type": "markdown",
    "title": "Getting Started Guide",
    "section_title": "Installation",
    "category": "onboarding",
    "tags": ["setup", "quickstart"],
}
```

A parallel keyword index is built in-memory for BM25-style retrieval, with stopword removal and term-frequency scoring.
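The keyword index is not reproduced in this document; a toy sketch of the idea (tokenize, drop stopwords, score by term frequency -- real BM25 additionally normalizes by document length and inverse document frequency) might look like:

```python
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "how", "do", "i"}

def tokenize(text: str) -> list[str]:
    """Lowercase, replace non-alphanumerics with spaces, drop stopwords."""
    normalized = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return [tok for tok in normalized.split() if tok not in STOPWORDS]

class KeywordIndex:
    """Toy in-memory term-frequency index for a single tenant's chunks."""

    def __init__(self) -> None:
        # term -> {chunk_id: term frequency}
        self.postings: dict[str, dict[str, int]] = defaultdict(dict)

    def add(self, chunk_id: str, text: str) -> None:
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][chunk_id] = tf

    def search(self, query: str, top_k: int = 5) -> list[tuple[str, int]]:
        scores: Counter = Counter()
        for term in tokenize(query):
            for chunk_id, tf in self.postings.get(term, {}).items():
                scores[chunk_id] += tf
        return scores.most_common(top_k)
```

Because the index is built in memory at ingestion time, exact product names, error codes, and SKUs remain findable even when their embeddings are not close to the query vector.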


External Data Connectors

The platform supports ingesting data from external systems through the Universal Data Ingestion Engine (Epic #1400) and the Knowledge Hub API. External data can be imported into the knowledge base through two primary patterns:

Supported Connector Types

| Connector | Pattern | Data Flow |
|---|---|---|
| PostgreSQL | ETL pipeline | Extract rows, transform to documents, ingest via KB API |
| MySQL | ETL pipeline | Extract rows, transform to documents, ingest via KB API |
| Salesforce | CRM integration | Sync CRM records as knowledge articles |
| HubSpot | CRM integration | Sync contacts, deals, and knowledge content |
| Webhook | Push-based | External systems push documents to the ingestion endpoint |

Webhook Integration

External systems can push documents directly to the Knowledge Hub API:

```bash
# Webhook-style document ingestion
curl -X POST https://dev.api.olympuscloud.ai/v1/knowledge-base/ingest \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
    "title": "Product Update Q1 2026",
    "content": "## New Features\n\n- Inventory auto-reorder...",
    "doc_type": "article",
    "category": "product-updates",
    "tags": ["product", "q1-2026"],
    "chunking_strategy": "semantic"
  }'
```

Bulk Import Pattern

For large-scale data migration from external databases:

```bash
# Bulk ingest multiple documents
curl -X POST https://dev.api.olympuscloud.ai/v1/knowledge-base/ingest/bulk \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
    "documents": [
      {
        "title": "SOP: Opening Procedures",
        "content": "## Opening Checklist\n...",
        "doc_type": "article",
        "category": "operations",
        "tags": ["sop", "opening"]
      },
      {
        "title": "SOP: Closing Procedures",
        "content": "## Closing Checklist\n...",
        "doc_type": "article",
        "category": "operations",
        "tags": ["sop", "closing"]
      }
    ]
  }'
```

The bulk endpoint returns per-document status including success/failure counts:

```json
{
  "total": 2,
  "success": 2,
  "failed": 0,
  "results": [
    {"document_id": "abc-123", "status": "success", "title": "SOP: Opening Procedures"},
    {"document_id": "def-456", "status": "success", "title": "SOP: Closing Procedures"}
  ]
}
```

Hybrid Search

The HybridSearchEngine combines two retrieval strategies and fuses their results using Reciprocal Rank Fusion (RRF).

Search Methods

| Method | How It Works | Strengths |
|---|---|---|
| Vector | Semantic similarity via Vectorize | Finds conceptually similar content even with different wording |
| Keyword | BM25-style term matching | Precise matches for specific terms, product names, codes |
| Hybrid (default) | Both vector + keyword, fused via RRF | Best overall recall and precision |

How Hybrid Search Works

```
User Query: "How do I reset the printer?"
        |
        +---> Vector Search (Vectorize)
        |       Query embedding --> cosine similarity
        |       Returns: top_k * 2 results with scores
        |
        +---> Keyword Search (BM25-style)
        |       Tokenize, remove stopwords, TF-IDF scoring
        |       Returns: top_k * 2 results with scores
        |
        +---> Reciprocal Rank Fusion (k=60)
                Merge & deduplicate results
                RRF score = sum(1 / (k + rank_i)) for each method
                Results appearing in both methods get "hybrid" tag
                Return top_k final results
```
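Concretely, the fusion step can be sketched as follows (a simplified, self-contained version; the function and variable names are illustrative, not the HybridSearchEngine API):

```python
def rrf_fuse(vector_ids: list[str], keyword_ids: list[str],
             k: int = 60, top_k: int = 5) -> list[tuple[str, float, str]]:
    """Reciprocal Rank Fusion over two ranked lists of chunk IDs.
    RRF score = sum over methods of 1 / (k + rank), rank starting at 1.
    Chunks found by both methods are tagged "hybrid"."""
    scores: dict[str, float] = {}
    methods: dict[str, set] = {}
    for method, ranked in (("vector", vector_ids), ("keyword", keyword_ids)):
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
            methods.setdefault(chunk_id, set()).add(method)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
    return [
        (cid, score, "hybrid" if len(methods[cid]) == 2 else next(iter(methods[cid])))
        for cid, score in fused
    ]
```

Note that RRF scores are small by construction (on the order of 1/k, so roughly 0.01-0.03 with k=60); they are rank-based fusion scores, not cosine similarities.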

Search Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | string | required | Natural language search query |
| `tenant_id` | string | required | Scopes search to the tenant's documents |
| `top_k` | integer | 5 | Maximum results (1-20) |
| `category` | string | null | Filter by document category |
| `doc_types` | array | null | Filter by DocumentType values |
| `search_method` | string | "hybrid" | One of: vector, keyword, hybrid |

Search Example

```python
from app.services.knowledge_base.service import DocumentType, get_knowledge_base_service

service = await get_knowledge_base_service()

results = await service.search(
    query="How do I configure inventory alerts?",
    tenant_id="tenant-123",
    top_k=5,
    category="operations",
    doc_types=[DocumentType.ARTICLE, DocumentType.TROUBLESHOOTING],
    search_method="hybrid",
)

for result in results:
    print(f"[{result.search_method}] {result.document_title}")
    print(f"  Section: {result.section_title}")
    print(f"  Score: {result.score:.4f}")
    print(f"  Content: {result.content[:200]}...")
```

Per-Tenant Isolation

Tenant isolation is enforced at multiple levels to ensure data privacy and prevent cross-tenant data leakage.

Isolation Architecture

```
      Tenant A                       Tenant B
+------------------+         +------------------+
| Documents        |         | Documents        |
| Chunks           |         | Chunks           |
| Keyword Index    |         | Keyword Index    |
+--------+---------+         +--------+---------+
         |                            |
         v                            v
+------------------+         +------------------+
| Vectorize Index  |         | Vectorize Index  |
| minerva-kb-      |         | minerva-kb-      |
| tenant-a-uuid    |         | tenant-b-uuid    |
+------------------+         +------------------+
```

Isolation Mechanisms

| Layer | Mechanism | Implementation |
|---|---|---|
| API Layer | `require_tenant_auth()` dependency | All Knowledge Hub routes require tenant authentication |
| Data Model | `tenant_id` field on Document and DocumentChunk | Every record is tagged with the owning tenant |
| Vector Store | Tenant-scoped index names | `minerva-knowledge-base-{tenant_id}` naming convention |
| Query Filtering | Tenant ID filter on all searches | Both vector and keyword searches filter by tenant_id |
| Vectorize Metadata | `tenant_id` in vector metadata | Stored alongside each embedding for secondary filtering |
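The two storage-layer mechanisms can be illustrated together. This is a hypothetical sketch (the helper names and the exact query payload shape are assumptions, not the VectorizeClient internals); the point is that the metadata filter is applied unconditionally as a second line of defense behind the per-tenant index:

```python
def tenant_index_name(tenant_id: str, prefix: str = "minerva-knowledge-base") -> str:
    """Per-tenant Vectorize index name, following the naming convention above."""
    return f"{prefix}-{tenant_id}"

def scoped_query(tenant_id: str, embedding: list[float], top_k: int = 5) -> dict:
    """Hypothetical query payload: even though each tenant already has a
    dedicated index, the tenant_id metadata filter is always injected so a
    misrouted query cannot return another tenant's chunks."""
    return {
        "vector": embedding,
        "topK": top_k,
        "filter": {"tenant_id": tenant_id},
        "returnMetadata": True,
    }
```

Defense in depth matters here: if the index-name and filter layers are both enforced, a bug in either one alone still cannot leak data across tenants.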

Index Provisioning

Each tenant's Vectorize index is created on demand using the ensure_tenant_index method:

```python
from app.clients.vectorize_client import VectorizeClient

client = VectorizeClient()

# Creates the index if it doesn't exist; returns the existing index if it does
index = await client.ensure_tenant_index(
    tenant_id="550e8400-e29b-41d4-a716-446655449100",
    index_prefix="minerva-knowledge-base",  # default
    dimensions=768,  # matches the 768-dim BGE base embedding model
)
# Index name: minerva-knowledge-base-550e8400-e29b-41d4-a716-446655449100
```

Index Lifecycle

| Operation | Endpoint | When |
|---|---|---|
| Provision | `POST /minerva/knowledge-base/{tenant_id}/provision` | Tenant onboarding / Minerva addon enabled |
| Status check | `GET /minerva/knowledge-base/{tenant_id}/status` | Health monitoring |
| Rebuild | `POST /minerva/knowledge-base/{tenant_id}/rebuild` | Full re-index with fresh data |
| Delete | `DELETE /minerva/knowledge-base/{tenant_id}` | Tenant offboarding / addon disabled |

Usage Metering and Billing

The Knowledge Hub tracks usage metrics per tenant for billing and capacity planning.

Tracked Metrics

| Metric | Source | Description |
|---|---|---|
| `total_documents` | StatsResponse | Number of documents ingested |
| `total_chunks` | StatsResponse | Total chunks across all documents |
| `total_words` | StatsResponse | Aggregate word count |
| `documents_by_type` | StatsResponse | Breakdown by DocumentType |
| `documents_by_category` | StatsResponse | Breakdown by category label |
| `indexed_documents` | StatsResponse | Documents successfully embedded and indexed |
| `pending_documents` | StatsResponse | Documents awaiting processing |

Querying Usage Stats

```bash
# Get tenant knowledge base statistics
curl -X GET "https://dev.api.olympuscloud.ai/v1/knowledge-base/stats?tenant_id=550e8400-e29b-41d4-a716-446655449100" \
  -H "Authorization: Bearer $TOKEN"
```

Response:

```json
{
  "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
  "total_documents": 47,
  "total_chunks": 312,
  "total_words": 89450,
  "documents_by_type": {
    "markdown": 20,
    "pdf": 12,
    "support_ticket": 10,
    "faq": 5
  },
  "documents_by_category": {
    "operations": 15,
    "menu": 12,
    "training": 10,
    "policies": 7,
    "uncategorized": 3
  },
  "indexed_documents": 45,
  "pending_documents": 2
}
```

Ingestion Metering

Each ingestion response includes processing metrics for cost attribution:

| Field | Type | Description |
|---|---|---|
| `document_id` | string | Unique ID of the ingested document |
| `status` | string | "success" or "failed" |
| `chunks_created` | integer | Number of chunks produced |
| `chunks_indexed` | integer | Number successfully embedded and stored |
| `processing_time_ms` | integer | Total ingestion latency |
| `error_message` | string (nullable) | Error details if status is "failed" |

API Endpoints

All Knowledge Hub endpoints are mounted under /v1/knowledge-base and require tenant authentication.

Endpoint Reference

| Method | Path | Summary | Request Body |
|---|---|---|---|
| GET | `/knowledge-base/health` | Health check | -- |
| POST | `/knowledge-base/ingest` | Ingest a text document | `IngestDocumentRequest` |
| POST | `/knowledge-base/ingest/pdf` | Upload and ingest a PDF | Multipart file + query params |
| POST | `/knowledge-base/ingest/support-ticket` | Ingest a resolved support ticket | `IngestSupportTicketRequest` |
| POST | `/knowledge-base/ingest/bulk` | Bulk ingest multiple documents | `BulkIngestRequest` |
| POST | `/knowledge-base/search` | Search the knowledge base | `SearchRequest` |
| POST | `/knowledge-base/answer` | Generate an answer with citations | `AnswerRequest` |
| GET | `/knowledge-base/documents` | List documents (paginated) | Query params |
| GET | `/knowledge-base/documents/{document_id}` | Get document details | -- |
| DELETE | `/knowledge-base/documents/{document_id}` | Delete a document | -- |
| GET | `/knowledge-base/stats` | Get tenant statistics | Query params |

Ingest Document

```bash
curl -X POST https://dev.api.olympuscloud.ai/v1/knowledge-base/ingest \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
    "title": "Employee Handbook",
    "content": "# Employee Handbook\n\n## Attendance Policy\n...",
    "doc_type": "markdown",
    "category": "hr",
    "tags": ["handbook", "policies", "hr"],
    "source_url": "https://internal.example.com/handbook",
    "chunking_strategy": "semantic"
  }'
```

Ingest PDF

```bash
curl -X POST "https://dev.api.olympuscloud.ai/v1/knowledge-base/ingest/pdf?tenant_id=550e8400-e29b-41d4-a716-446655449100&title=Safety%20Manual&category=compliance" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@safety-manual.pdf"
```

Search Knowledge Base

```bash
curl -X POST https://dev.api.olympuscloud.ai/v1/knowledge-base/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the dress code policy?",
    "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
    "top_k": 5,
    "category": "hr",
    "search_method": "hybrid"
  }'
```

Response:

```json
{
  "query": "What is the dress code policy?",
  "results": [
    {
      "chunk_id": "chunk-uuid-1",
      "document_id": "doc-uuid-1",
      "content": "All staff must wear the approved uniform during shifts...",
      "score": 0.032,
      "document_title": "Employee Handbook",
      "doc_type": "markdown",
      "section_title": "Dress Code",
      "source_url": "https://internal.example.com/handbook",
      "search_method": "hybrid"
    }
  ],
  "total": 1,
  "search_method": "hybrid"
}
```

Generate Answer with Citations

```bash
curl -X POST https://dev.api.olympuscloud.ai/v1/knowledge-base/answer \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What should I wear to work?",
    "tenant_id": "550e8400-e29b-41d4-a716-446655449100",
    "top_k": 5,
    "category": "hr"
  }'
```

Response:

```json
{
  "answer": "According to the Employee Handbook [Source 1], all staff must wear the approved uniform during shifts. This includes...",
  "confidence": 0.82,
  "sources": [
    {
      "source_number": 1,
      "document_id": "doc-uuid-1",
      "title": "Employee Handbook",
      "section": "Dress Code",
      "url": "https://internal.example.com/handbook",
      "relevance_score": 0.89
    }
  ],
  "context_chunks": 5,
  "model_tier": "T3",
  "generation_latency_ms": 1250,
  "search_latency_ms": 85,
  "has_direct_answer": true,
  "needs_human_review": false
}
```
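The exact review-gating logic is internal to the AnswerGenerator; a hypothetical gate consistent with the response fields above (the threshold value is illustrative, not the platform's actual setting) might look like:

```python
def needs_human_review(confidence: float, has_direct_answer: bool,
                       threshold: float = 0.6) -> bool:
    """Hypothetical gate: flag answers that are low-confidence or that the
    generator could not answer directly from the retrieved context."""
    return confidence < threshold or not has_direct_answer
```

Routing flagged answers to a support queue instead of returning them verbatim is a common pattern for keeping low-confidence generations out of end-user responses.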

List Documents

```bash
curl -X GET "https://dev.api.olympuscloud.ai/v1/knowledge-base/documents?tenant_id=550e8400-e29b-41d4-a716-446655449100&category=hr&limit=10&offset=0" \
  -H "Authorization: Bearer $TOKEN"
```

Delete Document

```bash
curl -X DELETE https://dev.api.olympuscloud.ai/v1/knowledge-base/documents/doc-uuid-1 \
  -H "Authorization: Bearer $TOKEN"
```

Tip: Deleting a document removes it from both the Vectorize index and the keyword index. The operation cascades to all associated chunks.