Architecture Overview

Olympus Cloud is the world's first Autonomous Restaurant Operating System (aROS): a serverless-first, AI-native platform built on three pillars, namely Radical Flexibility, Generative AI Operations, and Total Operational Resilience.

Platform Overview

| Layer | Technologies | Purpose |
|---|---|---|
| Frontend | Flutter (Dart) | 13 cross-platform shells / 14 AppExperience values |
| API Gateway | Go (Gin, gqlgen) | GraphQL + REST routing, service orchestration |
| Core Services | Rust (Axum) | Auth, Platform, Commerce, Creator, CMS, Chat, Alerting |
| AI/ML | Python (FastAPI, LangGraph) | 68 route modules, AI agents, analytics |
| Edge | TypeScript (Hono), Cloudflare Workers | 6 workers: AI proxy, chat, vision, alerting, analytics, docs |
| Data | Cloud Spanner, ClickHouse, Redis | Transactional (OLTP), analytics (OLAP), caching |
| Edge Storage | Vectorize, R2, D1, KV | RAG vectors, media, edge state |

System Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                  USER & DEVELOPER EXPERIENCE LAYER                  │
│  Flutter Apps (13 shells: Restaurant, Cockpit, Portal, KDS, ...)    │
│  MCP Development Server (75 tools for AI-assisted development)      │
└────────────────────────┬────────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────────┐
│                 EDGE LAYER (Cloudflare - 6 Workers)                 │
│  AI Gateway │ Workers AI │ Vectorize RAG │ R2 Storage │ D1 / KV     │
│  ai-proxy │ restaurant-ai-chat-v2 │ vision-inference │ docs-chat    │
│  alert-ingest (Durable Objects) │ analytics-aggregator              │
└────────────────────────┬────────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────────┐
│                 API GATEWAY (Go - Cloud Run :8080)                  │
│  GraphQL + REST + WebSocket + Service Orchestration                 │
│  + Distribution :8011 │ ACP Server :8090 │ IoT :50052 (gRPC)        │
└────────────────────────┬────────────────────────────────────────────┘
                         │
          ┌──────────────┴──────────────┐
          │                             │
┌─────────▼─────────┐         ┌─────────▼──────────┐
│   CORE SERVICES   │         │ PY SERVICES CLUSTER│
│ (Rust - Cloud Run)│         │(Python - Cloud Run)│
│ • Auth      :8001 │         │ • Analytics   :8004│
│ • Platform  :8002 │         │ • ML          :8005│
│ • Commerce  :8003 │◄────────┤ • 68 route modules │
│ • Creator   :8004 │ Events  │ • LangGraph Agents │
│ • CMS       :8005 │         │ • ACP AI Router    │
│ • Chat      :8007 │         └─────────┬──────────┘
│ • Alerting  :8080 │                   │
└─────────┬─────────┘                   │
          │    ┌────────────────────────┘
          │    │
┌─────────▼────▼──────────────────────────────────────────────────────┐
│                       EVENT BUS (GCP Pub/Sub)                       │
└───────────────────────────┬─────────────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────────────┐
│                         DATA & STATE LAYER                          │
│  Cloud Spanner (OLTP) │ ClickHouse (OLAP) │ Redis/Memorystore │ GCS │
│  Cloudflare Vectorize (RAG) │ Cloudflare R2 (Media) │ D1 (Edge)     │
└─────────────────────────────────────────────────────────────────────┘

Service Inventory

Rust Services (7 deployed + 1 edge-only)

| Service | Port | Purpose |
|---|---|---|
| Auth | 8001 | Authentication, JWT issuance, SSO, user management |
| Platform | 8002 | Policy Engine (Gating), tenancy, roles, permissions |
| Commerce | 8003 | Orders, payments, inventory, MDM, menu management |
| Creator | 8004 | Creator studio, content management, audience tools |
| CMS | 8005 | Headless CMS, web builder, content versioning |
| Chat | 8007 | Real-time messaging, WebSocket channels |
| Alerting | 8080 | Alert rules, notification dispatch, escalation (internal only, not exposed via API Gateway) |
| IoT Gateway | -- | Edge device communication (not Cloud Run deployed) |

Plus 6 shared libraries: shared, acp-client, acp-server, analytics, edge, ml.

Go Services (4 deployed + 3 CLI tools)

| Service | Port | Purpose |
|---|---|---|
| API Gateway | 8080 | GraphQL + REST gateway, service orchestration (external entry point) |
| Distribution | 8011 | Content distribution, CDN orchestration |
| ACP Server | 8090 | AI Compute Platform, model routing |
| IoT Service | 50052 (gRPC) | IoT device management, telemetry |

CLI tools: devjwt, seed, seed_policy.

Python Services (2 deployed, 68 route modules)

| Service | Port | Purpose |
|---|---|---|
| Analytics | 8004 | Core analytics, reporting, forecasting, data pipelines |
| ML | 8005 | Machine learning models, predictions, recommendations |

Both share the codebase at backend/python/app/api/ with 68 FastAPI route modules covering AI/ML, analytics, operations, communication, and NLP.

Edge Workers (6 - Cloudflare)

| Worker | Purpose |
|---|---|
| ai-proxy | OpenAI-compatible AI proxy, model routing |
| restaurant-ai-chat-v2 | Minerva Ultra AI chat (customer-facing) |
| vision-inference | CLIP image analysis, vision AI |
| alert-ingest | Durable Objects alerting pipeline |
| analytics-aggregator | Edge analytics aggregation |
| docs-chat | Documentation RAG chat |

AI Architecture (ACP AI Router)

The ACP AI Router achieves cost savings of more than 95% compared with calling Vertex AI directly, by routing each query to the cheapest capable model tier:

| Tier | Model | Cost (USD per M tokens, input / output) | Use For |
|---|---|---|---|
| T1 | Llama 4 Scout (Workers AI) | FREE | Simple queries, classification |
| T2 | Gemini 2.0 Flash | $0.10 / $0.40 | Standard tasks |
| T3 | Gemini 3 Flash | $0.50 / $3.00 | Complex reasoning |
| T4 | Claude Haiku 4.5 | $1.00 / $5.00 | Code generation |
| T5 | Claude Sonnet 4.5 | $3.00 / $15.00 | Complex code, planning |
| T6 | Claude Opus 4.5 | $5.00 / $25.00 | Critical decisions |
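
The tiered routing idea can be illustrated with a minimal sketch. It is not the ACP AI Router's actual implementation: the `MIN_TIER_FOR_TASK` mapping, the task-type names, and the `route` and `request_cost` helpers are hypothetical, and the two prices per tier are read as input / output cost per million tokens. Only the tier names, models, and prices come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    model: str
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens


# Prices copied from the tier table; T1 is listed as FREE, modelled here as 0.
TIERS = [
    Tier("T1", "Llama 4 Scout (Workers AI)", 0.00, 0.00),
    Tier("T2", "Gemini 2.0 Flash", 0.10, 0.40),
    Tier("T3", "Gemini 3 Flash", 0.50, 3.00),
    Tier("T4", "Claude Haiku 4.5", 1.00, 5.00),
    Tier("T5", "Claude Sonnet 4.5", 3.00, 15.00),
    Tier("T6", "Claude Opus 4.5", 5.00, 25.00),
]

# Hypothetical mapping from task type to the lowest tier assumed capable of it.
MIN_TIER_FOR_TASK = {
    "classification": 1,
    "standard": 2,
    "complex_reasoning": 3,
    "code_generation": 4,
    "planning": 5,
    "critical_decision": 6,
}


def route(task_type: str) -> Tier:
    """Return the cheapest tier assumed capable of the task.

    Unknown task types fall back to the top tier, trading cost for safety.
    """
    level = MIN_TIER_FOR_TASK.get(task_type, len(TIERS))
    return TIERS[level - 1]


def request_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request on the given tier."""
    return (
        input_tokens * tier.input_usd_per_m
        + output_tokens * tier.output_usd_per_m
    ) / 1_000_000
```

Under these assumptions, a request with 2,000 input and 500 output tokens costs nothing on T1 and about $0.0004 on T2, compared with about $0.0225 on T6, which is where the savings from routing simple work to low tiers would come from.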

LangGraph Agents: Business Assistant, Support, Inventory, Voice Ordering, Content Suggestion, Minerva (messaging), Maximus (voice AI), Dispute Resolution.


Data Architecture

Transactional Data (Cloud Spanner)

A globally distributed, strongly consistent database. Tables are interleaved under a Tenants root table:

├── Tenants (root table)
│ ├── Locations (interleaved)
│ │ ├── Orders (interleaved)
│ │ ├── Menu Items (interleaved)
│ │ └── Inventory (interleaved)
│ ├── Users (interleaved)
│ └── Roles (interleaved)
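
In Cloud Spanner, an interleaved child table's primary key begins with its parent's primary key columns, so a tenant's rows are stored close together. The sketch below only models that key ordering with Python tuples; the key values (`t1`, `loc-a`, `ord-1`) and the helper names are hypothetical, and it is not a description of the actual Olympus Cloud schema.

```python
from typing import List, Tuple

Key = Tuple[str, ...]

# Each row is (primary key, label). A child key extends its parent's key:
# Tenant -> (tenant_id,), Location -> (tenant_id, location_id),
# Order -> (tenant_id, location_id, order_id).
ROWS: List[Tuple[Key, str]] = [
    (("t2",), "Tenant t2"),
    (("t1", "loc-b"), "Location loc-b"),
    (("t1",), "Tenant t1"),
    (("t1", "loc-a", "ord-2"), "Order ord-2"),
    (("t1", "loc-a"), "Location loc-a"),
    (("t1", "loc-a", "ord-1"), "Order ord-1"),
]


def is_under(child: Key, parent: Key) -> bool:
    """True if the child key is strictly nested under the parent key."""
    return len(child) > len(parent) and child[: len(parent)] == parent


def storage_order(rows: List[Tuple[Key, str]]) -> List[str]:
    """Labels in key order: each parent is followed by its descendants."""
    return [label for _, label in sorted(rows)]


def tenant_rows(rows: List[Tuple[Key, str]], tenant_id: str) -> List[str]:
    """All rows for one tenant form a contiguous run in key order."""
    return [label for key, label in sorted(rows) if key[0] == tenant_id]
```

Sorting by key places `Tenant t1` first, then its locations and their orders, and only then `Tenant t2`, which is why reads scoped to a single tenant or location can stay within a narrow key range.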

Analytics Data (ClickHouse Cloud)

Real-time OLAP analytics (60-65% cost savings vs BigQuery):

  • Real-time streaming from Pub/Sub
  • Sub-second query performance
  • Sales analytics, demand forecasting, anomaly detection
  • P&L reporting, labor optimization

Cache (Redis / Memorystore)

  • Gating Engine policy evaluation cache
  • Session management
  • Rate limiting, real-time counters
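
As an illustration of the policy evaluation cache, here is a minimal sketch that uses an in-memory dictionary as a stand-in for Redis. The `PolicyCache` class, its TTL parameter, and the shape of the cache key are assumptions made for this example, not the Gating Engine's real interface.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class PolicyCache:
    """In-memory stand-in for a Redis-backed policy decision cache."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        # key -> (expiry timestamp, cached decision)
        self._entries: Dict[Hashable, Tuple[float, bool]] = {}

    def get_or_evaluate(self, key: Hashable, evaluate: Callable[[], bool]) -> bool:
        """Return a cached decision, or evaluate and cache it for the TTL."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # cached path: the policy is not re-evaluated
        decision = evaluate()  # uncached path: run the full evaluation
        self._entries[key] = (now + self._ttl, decision)
        return decision
```

A key such as `("tenant-1", "user-7", "menu.edit")` keeps decisions separate per tenant, user, and action. A short TTL bounds how long a revoked permission can still be served from the cache.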

Vector Data (Cloudflare Vectorize)

Edge-native vector database for RAG (up to 5M vectors per index):

  • Documentation, menu, policy, support knowledge bases
  • Semantic search
  • Customer preferences

Communication Patterns

Synchronous (gRPC + HTTP)

Client -> API Gateway -> Core Service -> Database
|
Response (< 100ms)

Asynchronous (Pub/Sub)

Service -> Pub/Sub Topic -> Subscribers
|
Background Processing
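
The asynchronous pattern can be sketched with a small in-memory bus. This is a toy model of topic fan-out, not the GCP Pub/Sub client API: `InMemoryBus`, its methods, and the `order.created` topic name used in the note below are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, List, Tuple

Handler = Callable[[dict], None]


class InMemoryBus:
    """Toy topic bus: publishing enqueues, delivery happens later in drain()."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self._pending: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # The publisher returns immediately; it does not wait for subscribers.
        self._pending.append((topic, message))

    def drain(self) -> int:
        """Deliver every pending message to each subscriber of its topic.

        Returns the number of handler invocations performed.
        """
        delivered = 0
        while self._pending:
            topic, message = self._pending.popleft()
            for handler in self._subscribers[topic]:
                handler(message)
                delivered += 1
        return delivered
```

A service would publish an event such as `order.created` once, and each subscriber (for example analytics and inventory) would receive its own copy during background processing, without the publisher knowing who consumes it.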

Real-Time (WebSocket)

Client <-> WebSocket Gateway <-> Redis Pub/Sub <-> Services

Security Architecture

Defense in Depth

┌─────────────────────────────────────────────────────┐
│ Layer 1: Edge Security (Cloudflare)                 │
│  - DDoS protection  - WAF  - Bot management         │
├─────────────────────────────────────────────────────┤
│ Layer 2: Network Security (VPC)                     │
│  - Private networking  - Firewall rules             │
├─────────────────────────────────────────────────────┤
│ Layer 3: Application Security                       │
│  - JWT auth (RS256)  - RBAC  - Gating Engine        │
├─────────────────────────────────────────────────────┤
│ Layer 4: Data Security                              │
│  - RLS on all tables  - Encryption at rest          │
│  - GCP Secret Manager  - Parameterized queries only │
└─────────────────────────────────────────────────────┘

Tenant Isolation

  • Database-level row isolation (org_id / tenant_id on all tables)
  • API-level tenant verification
  • Network-level VPC separation
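
API-level tenant verification and row-level scoping might look like the following sketch. It assumes the JWT signature has already been verified elsewhere and that the decoded claims carry a `tenant_id` field; the function and exception names are hypothetical.

```python
from typing import Iterable, List, Mapping


class TenantMismatchError(PermissionError):
    """Raised when a token's tenant does not match the requested resource."""


def verify_tenant(claims: Mapping[str, str], resource_tenant_id: str) -> None:
    """Reject the request unless the token belongs to the resource's tenant.

    `claims` must come from an already-verified token; this function does
    not check signatures or expiry.
    """
    token_tenant = claims.get("tenant_id")
    if not token_tenant or token_tenant != resource_tenant_id:
        raise TenantMismatchError("token tenant does not match resource tenant")


def scope_rows(
    rows: Iterable[Mapping[str, str]], tenant_id: str
) -> List[Mapping[str, str]]:
    """Keep only rows tagged with the caller's tenant_id."""
    return [row for row in rows if row.get("tenant_id") == tenant_id]
```

Checking at both levels is deliberate: the API check fails fast with a clear error, while the row filter still limits what is returned if a handler forgets the check.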

Scalability

Horizontal Scaling

| Component | Scaling Strategy |
|---|---|
| API Gateway | Cloud Run auto-scaling based on requests |
| Core Services | Cloud Run auto-scaling based on CPU/memory |
| Database | Spanner automatic node addition |
| Edge | Cloudflare global CDN (300+ locations) |

Performance Targets

| Metric | Target |
|---|---|
| API Latency (p99) | < 100ms |
| Order Creation | < 50ms |
| Gating Engine (cached) | < 10ms |
| Gating Engine (uncached) | < 50ms |
| WebSocket Latency | < 20ms |
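
One way to check measured latencies against these targets is a nearest-rank percentile, sketched below. Only the API latency target is stated as p99; applying the same percentile to the other metrics is an assumption, and the `TARGETS_MS` keys and helper names are hypothetical.

```python
from typing import Dict, Sequence

# Targets in milliseconds, taken from the table above.
TARGETS_MS: Dict[str, float] = {
    "api_latency": 100.0,
    "order_creation": 50.0,
    "gating_cached": 10.0,
    "gating_uncached": 50.0,
    "websocket_latency": 20.0,
}


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p between 1 and 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    # Integer form of ceil(p * n / 100), avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_target(samples_ms: Sequence[float], target_ms: float, p: int = 99) -> bool:
    """True if the p-th percentile latency is strictly below the target."""
    return percentile(samples_ms, p) < target_ms
```

With 100 samples, the p99 value is the 99th smallest, so a single slow outlier does not fail the check but two do.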