
RAG Query Patterns

Query patterns, retrieval strategies, re-ranking approaches, and multi-index query handling for retrieval-augmented generation (RAG) in AI agents.

Query Optimization

Score Thresholds by Agent

| Agent | Min Score | Top-K | Rationale |
|---|---|---|---|
| Menu Assistant | 0.75 | 5 | High precision for menu items |
| Support Agent | 0.65 | 10 | Broader coverage for troubleshooting |
| Maximus (Ops) | 0.70 | 8 | Balance for runbook retrieval |
| Minerva (Sales) | 0.60 | 15 | Wider context for sales queries |
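
The table above maps naturally onto a lookup that the `getTopK` and `getScoreThreshold` helpers used in the query pipeline could read from. The following is a minimal sketch, assuming `AgentType` is a string union. Only `'sales'` appears in this document; the identifiers `'menu'`, `'support'`, and `'ops'`, and the names `RetrievalConfig` and `RETRIEVAL_CONFIG`, are hypothetical.

```typescript
// Hypothetical agent identifiers; only 'sales' appears in the pipeline code.
type AgentType = 'menu' | 'support' | 'ops' | 'sales';

interface RetrievalConfig {
  minScore: number; // matches scoring below this are dropped
  topK: number;     // number of candidates requested from the index
}

// Values copied from the threshold table.
const RETRIEVAL_CONFIG: Record<AgentType, RetrievalConfig> = {
  menu: { minScore: 0.75, topK: 5 },
  support: { minScore: 0.65, topK: 10 },
  ops: { minScore: 0.7, topK: 8 },
  sales: { minScore: 0.6, topK: 15 },
};

function getTopK(agentType: AgentType): number {
  return RETRIEVAL_CONFIG[agentType].topK;
}

function getScoreThreshold(agentType: AgentType): number {
  return RETRIEVAL_CONFIG[agentType].minScore;
}
```

Keeping both values in one record means a new agent type cannot be added without the compiler asking for its threshold and top-K.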

Query Pipeline

async function ragQuery(
  query: string,
  agentType: AgentType,
  tenantId?: string
): Promise<RetrievedContext[]> {
  // 1. Get query embedding
  const queryEmbedding = await embed(query);

  // 2. Query with filters
  const results = await index.query(queryEmbedding, {
    topK: getTopK(agentType),
    filter: tenantId ? { tenant_id: tenantId } : undefined,
    returnMetadata: true
  });

  // 3. Filter by score threshold
  const filtered = results.matches.filter(
    m => m.score >= getScoreThreshold(agentType)
  );

  // 4. Re-rank if needed
  if (agentType === 'sales') {
    return rerank(filtered, query);
  }

  return filtered;
}

Re-Ranking Strategies

| Strategy | When to Use | Implementation |
|---|---|---|
| Reciprocal Rank Fusion | Multi-index queries | score = sum over result lists of 1/(k + rank) |
| Cross-Encoder | High-stakes queries | Separate re-ranking model |
| Freshness Boost | Time-sensitive content | score * freshness_factor |
| Popularity Boost | FAQ queries | score * view_count_factor |
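
The table does not define `freshness_factor`. One common choice is exponential decay by document age, sketched below under that assumption. The `updatedAt` field, the `halfLifeDays` parameter, and its default of 30 days are hypothetical and are not taken from this document.

```typescript
interface ScoredDoc {
  id: string;
  score: number;   // similarity score from the index
  updatedAt: Date; // hypothetical metadata field
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: 1.0 for a document updated now, 0.5 after one half-life.
function freshnessFactor(updatedAt: Date, now: Date, halfLifeDays: number): number {
  const ageDays = Math.max(0, (now.getTime() - updatedAt.getTime()) / MS_PER_DAY);
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Multiply each score by its freshness factor and re-sort, highest first.
function applyFreshnessBoost(
  docs: ScoredDoc[],
  now: Date,
  halfLifeDays = 30
): ScoredDoc[] {
  return docs
    .map(d => ({ ...d, score: d.score * freshnessFactor(d.updatedAt, now, halfLifeDays) }))
    .sort((a, b) => b.score - a.score);
}
```

Because the factor never exceeds 1, this variant demotes stale documents rather than promoting fresh ones; the relative ordering effect is the same.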

Multi-Index Query Patterns

Cross-Index Query

async function multiIndexQuery(query: string): Promise<Context[]> {
  // Embed the query once and reuse the vector for every index
  const embedding = await embed(query);

  // Query multiple indexes in parallel
  const [menuResults, supportResults, opsResults] = await Promise.all([
    menuIndex.query(embedding, { topK: 5 }),
    supportIndex.query(embedding, { topK: 5 }),
    opsIndex.query(embedding, { topK: 5 })
  ]);

  // Reciprocal Rank Fusion with smoothing constant k = 60
  return rrfMerge([menuResults, supportResults, opsResults], 60);
}
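
`rrfMerge` is not defined in this document. A minimal sketch of what it might look like follows, assuming each query result exposes a best-first `matches` array and that documents are identified by an `id` field. The `Match` and `QueryResult` shapes are hypothetical, and the sketch returns fused ids and scores only, so mapping back to `Context` objects is left out.

```typescript
interface Match {
  id: string;
  score: number;
}

interface QueryResult {
  matches: Match[]; // assumed to be sorted best-first by the index
}

// Reciprocal Rank Fusion: each document scores the sum of 1 / (k + rank)
// over every result list it appears in, with rank starting at 1.
function rrfMerge(resultSets: QueryResult[], k = 60): Match[] {
  const fused = new Map<string, number>();
  for (const { matches } of resultSets) {
    matches.forEach((match, i) => {
      const rank = i + 1;
      fused.set(match.id, (fused.get(match.id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(fused.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Fusing on rank rather than raw score sidesteps the problem that similarity scores from different indexes are not directly comparable.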

Fallback Chain

async function queryWithFallback(query: string): Promise<Context[]> {
  const embedding = await embed(query);

  // Try primary index
  let results = await primaryIndex.query(embedding, { topK: 5 });

  // Treat an empty result set as score 0 so it also triggers the fallback
  if ((results.matches[0]?.score ?? 0) < 0.6) {
    // Fallback to secondary
    results = await secondaryIndex.query(embedding, { topK: 10 });
  }

  if (results.matches.length === 0) {
    // Final fallback to general knowledge
    results = await generalIndex.query(embedding, { topK: 15 });
  }

  return results.matches;
}

Tenant Isolation

Namespace Filtering

All multi-tenant queries MUST include tenant filtering:

// CORRECT: Filtered by tenant
const results = await index.query(embedding, {
  topK: 10,
  filter: { tenant_id: { $eq: currentTenantId } }
});

// WRONG: No tenant filter - security vulnerability
const results = await index.query(embedding, { topK: 10 });
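
One way to make the rule hard to forget is to build query options through a helper that refuses to run without a tenant ID. The sketch below is an assumption rather than part of the documented API: `withTenantFilter`, `Filter`, and `QueryOptions` are hypothetical names, and the `$and` operator is assumed to be supported by the vector store alongside the `$eq` and `$or` operators used elsewhere on this page.

```typescript
type Filter = Record<string, unknown>;

interface QueryOptions {
  topK: number;
  filter?: Filter;
}

// Builds query options that always carry a tenant filter. A caller-supplied
// filter is combined with it under $and, so it can narrow results but never
// widen them beyond the tenant.
function withTenantFilter(tenantId: string, options: QueryOptions): QueryOptions {
  if (!tenantId) {
    throw new Error('tenantId is required for multi-tenant queries');
  }
  const tenantFilter: Filter = { tenant_id: { $eq: tenantId } };
  const filter: Filter = options.filter
    ? { $and: [tenantFilter, options.filter] }
    : tenantFilter;
  return { ...options, filter };
}
```

A call site would then read `index.query(embedding, withTenantFilter(currentTenantId, { topK: 10 }))`, and an unfiltered query becomes something a reviewer can search for.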

Permission-Aware Retrieval

async function permissionAwareQuery(
  query: string,
  user: User
): Promise<Context[]> {
  const embedding = await embed(query);

  // Retrieve the user's tenant documents plus anything marked public
  const results = await index.query(embedding, {
    topK: 20,
    filter: {
      $or: [
        { tenant_id: user.tenantId },
        { access_level: 'public' }
      ]
    }
  });

  // Additional permission check on each match
  return results.matches.filter(r =>
    user.hasPermission(r.metadata.required_permission)
  );
}

Agent Prompt Library

System Prompts by Agent Type

Maximus (Operations AI)

You are Maximus, the Olympus Cloud operations AI assistant. You help the operations team monitor systems, respond to incidents, and execute runbooks.

Your capabilities:
- Answer questions about system health and metrics
- Retrieve and explain runbook procedures
- Provide incident response guidance
- Execute approved automation tasks

Guidelines:
- Be concise and technical
- Always cite sources from runbooks
- Request approval for risky actions
- Escalate if unsure

Current context:
{retrieved_context}

Minerva (Sales AI)

You are Minerva, the Olympus Cloud sales AI assistant. You help the sales team qualify leads, prepare demos, and answer prospect questions.

Your capabilities:
- Answer product and pricing questions
- Provide ROI calculations
- Compare features with competitors
- Suggest demo scenarios

Guidelines:
- Be professional and persuasive
- Focus on business value
- Provide accurate pricing (never discount without approval)
- Escalate technical questions to SEs

Current context:
{retrieved_context}

Menu Assistant

You are a helpful restaurant menu assistant. You help customers and staff find menu items, understand ingredients, and accommodate dietary needs.

Your capabilities:
- Describe menu items and ingredients
- Identify allergens and dietary options
- Suggest items based on preferences
- Explain preparation methods

Guidelines:
- Be friendly and helpful
- Always mention allergen warnings
- Don't make health claims
- Suggest alternatives when items are unavailable

Current context:
{retrieved_context}
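
Each prompt above ends with a `{retrieved_context}` placeholder. The sketch below shows one way it might be filled before the prompt is sent to the model. The `text` and `source` fields on `RetrievedContext`, the numbered-list layout, and the empty-context message are assumptions, not something this document specifies.

```typescript
interface RetrievedContext {
  text: string;   // hypothetical field holding the chunk content
  source: string; // hypothetical field naming the source document
}

// Replaces the {retrieved_context} placeholder with a numbered list of chunks.
// Each entry carries its source so the agent can cite it.
function buildSystemPrompt(template: string, contexts: RetrievedContext[]): string {
  const block = contexts.length > 0
    ? contexts.map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`).join('\n')
    : 'No relevant context was retrieved.';
  // split/join replaces every occurrence and treats "$" in the block literally.
  return template.split('{retrieved_context}').join(block);
}
```

Numbering the chunks and including the source gives an agent such as Maximus something concrete to point at when its guidelines ask it to cite runbooks.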

Few-Shot Examples

Menu Query Example

User: Is the Caesar salad gluten-free?
Assistant: The classic Caesar salad is NOT gluten-free because it contains croutons. However, I can suggest ordering it without croutons. Note that the Caesar dressing may contain trace amounts of gluten from Worcestershire sauce - please let the kitchen know about your allergy for the safest option.

Support Query Example

User: My order is stuck on "Preparing"
Assistant: I understand that's frustrating. Here's how to resolve a stuck order:

1. Check if the kitchen has the ticket (KDS should show it)
2. If visible on KDS, the kitchen may be behind - ask for an ETA
3. If NOT on KDS, try resending: Orders → [Order] → Actions → Resend to Kitchen
4. If still stuck, void the items and re-enter them

Would you like me to walk you through any of these steps?
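
Few-shot examples like the ones above are typically placed between the system prompt and the live user query. A minimal sketch of that assembly follows, assuming a role-based chat message format. The `ChatMessage` and `FewShotExample` shapes and the `buildMessages` name are hypothetical; adapt them to whichever model client is in use.

```typescript
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

interface FewShotExample {
  user: string;
  assistant: string;
}

// Orders messages as: system prompt, example turns, then the live user query.
function buildMessages(
  systemPrompt: string,
  examples: FewShotExample[],
  query: string
): ChatMessage[] {
  const messages: ChatMessage[] = [{ role: 'system', content: systemPrompt }];
  for (const ex of examples) {
    messages.push({ role: 'user', content: ex.user });
    messages.push({ role: 'assistant', content: ex.assistant });
  }
  messages.push({ role: 'user', content: query });
  return messages;
}
```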

  • Indexing - How documents are indexed
  • Entities - Entity definitions and relationships
  • Overview - RAG Knowledge Base overview