Architecture Overview

Olympus Cloud is the world's first Autonomous Restaurant Operating System (aROS), a serverless-first, AI-native platform built on three pillars: Radical Flexibility, Generative AI Operations, and Total Operational Resilience.

Platform Overview

Layer	Technologies	Purpose
Frontend	Flutter (Dart)	13 cross-platform shells / 14 AppExperience values
API Gateway	Go (Gin, gqlgen)	GraphQL + REST routing, service orchestration
Core Services	Rust (Axum)	Auth, Platform, Commerce, Creator, CMS, Chat, Alerting
AI/ML	Python (FastAPI, LangGraph)	68 route modules, AI agents, analytics
Edge	TypeScript (Hono), Cloudflare Workers	6 workers: AI proxy, chat, vision, alerting, analytics, docs
Data	Cloud Spanner, ClickHouse, Redis	Transactional (OLTP), analytics (OLAP), caching
Edge Storage	Vectorize, R2, D1, KV	RAG vectors, media, edge state

System Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                   USER & DEVELOPER EXPERIENCE LAYER                  │
│   Flutter Apps (13 shells: Restaurant, Cockpit, Portal, KDS, ...)   │
│   MCP Development Server (75 tools for AI-assisted development)      │
└────────────────────────┬────────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────────┐
│                     EDGE LAYER (Cloudflare - 6 Workers)              │
│  AI Gateway │ Workers AI │ Vectorize RAG │ R2 Storage │ D1 / KV     │
│  ai-proxy │ restaurant-ai-chat-v2 │ vision-inference │ docs-chat    │
│  alert-ingest (Durable Objects) │ analytics-aggregator              │
└────────────────────────┬────────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────────┐
│                   API GATEWAY (Go - Cloud Run :8080)                 │
│    GraphQL + REST + WebSocket + Service Orchestration                │
│    + Distribution :8011 │ ACP Server :8090 │ IoT :50052 (gRPC)     │
└────────────────────────┬────────────────────────────────────────────┘
                         │
          ┌──────────────┴──────────────┐
          │                              │
┌─────────▼─────────┐         ┌─────────▼──────────┐
│  CORE SERVICES    │         │  PY SERVICES CLUSTER│
│  (Rust - Cloud Run)│        │ (Python - Cloud Run)│
│ • Auth      :8001 │         │ • Analytics    :8004│
│ • Platform  :8002 │         │ • ML           :8005│
│ • Commerce  :8003 │◄───────┤ • 68 route modules  │
│ • Creator   :8004 │  Events │ • LangGraph Agents  │
│ • CMS       :8005 │         │ • ACP AI Router     │
│ • Chat      :8007 │         └─────────┬──────────┘
│ • Alerting  :8080 │                   │
└─────────┬─────────┘                   │
          │    ┌─────────────────────────┘
          │    │
┌─────────▼────▼──────────────────────────────────────────────────────┐
│                     EVENT BUS (GCP Pub/Sub)                          │
└───────────────────────────┬─────────────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────────────┐
│                      DATA & STATE LAYER                              │
│  Cloud Spanner (OLTP) │ ClickHouse (OLAP) │ Redis/Memorystore │ GCS│
│  Cloudflare Vectorize (RAG) │ Cloudflare R2 (Media) │ D1 (Edge)   │
└─────────────────────────────────────────────────────────────────────┘

Service Inventory

Rust Services (7 deployed + 1 edge-only)

Service	Port	Purpose
Auth	8001	Authentication, JWT issuance, SSO, user management
Platform	8002	Policy Engine (Gating), tenancy, roles, permissions
Commerce	8003	Orders, payments, inventory, MDM, menu management
Creator	8004	Creator studio, content management, audience tools
CMS	8005	Headless CMS, web builder, content versioning
Chat	8007	Real-time messaging, WebSocket channels
Alerting	8080	Alert rules, notification dispatch, escalation (internal only, not exposed via API Gateway)
IoT Gateway	--	Edge device communication (edge-only, not deployed to Cloud Run)

Plus 6 shared libraries: shared, acp-client, acp-server, analytics, edge, ml.

Go Services (4 deployed + 3 CLI tools)

Service	Port	Purpose
API Gateway	8080	GraphQL + REST gateway, service orchestration (external entry point)
Distribution	8011	Content distribution, CDN orchestration
ACP Server	8090	AI Compute Platform, model routing
IoT Service	50052 (gRPC)	IoT device management, telemetry

CLI tools: devjwt, seed, seed_policy.

Python Services (2 deployed, 68 route modules)

Service	Port	Purpose
Analytics	8004	Core analytics, reporting, forecasting, data pipelines
ML	8005	Machine learning models, predictions, recommendations

Both share the codebase at backend/python/app/api/ with 68 FastAPI route modules covering AI/ML, analytics, operations, communication, and NLP.

Edge Workers (6 - Cloudflare)

Worker	Purpose
ai-proxy	OpenAI-compatible AI proxy, model routing
restaurant-ai-chat-v2	Minerva Ultra AI chat (customer-facing)
vision-inference	CLIP image analysis, vision AI
alert-ingest	Durable Objects alerting pipeline
analytics-aggregator	Edge analytics aggregation
docs-chat	Documentation RAG chat

AI Architecture (ACP AI Router)

The ACP AI Router routes each query to the cheapest model capable of handling it, for a stated cost saving of 95%+ compared with calling Vertex AI directly:

Tier	Model	Cost (USD per M tokens, input / output)	Use For
T1	Llama 4 Scout (Workers AI)	FREE	Simple queries, classification
T2	Gemini 2.0 Flash	$0.10 / $0.40	Standard tasks
T3	Gemini 3 Flash	$0.50 / $3.00	Complex reasoning
T4	Claude Haiku 4.5	$1.00 / $5.00	Code generation
T5	Claude Sonnet 4.5	$3.00 / $15.00	Complex code, planning
T6	Claude Opus 4.5	$5.00 / $25.00	Critical decisions
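
The routing rule above can be sketched in a few lines. This is a minimal illustration, not the ACP AI Router's actual implementation: the `capability` ranks, the `REQUIRED` task-type mapping, and the function names are hypothetical, and the two prices per tier are read as input / output cost per million tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    model: str
    input_cost: float   # USD per million input tokens (from the table)
    output_cost: float  # USD per million output tokens (from the table)
    capability: int     # hypothetical rank: higher handles harder tasks


TIERS = [
    Tier("T1", "Llama 4 Scout (Workers AI)", 0.00, 0.00, 1),
    Tier("T2", "Gemini 2.0 Flash", 0.10, 0.40, 2),
    Tier("T3", "Gemini 3 Flash", 0.50, 3.00, 3),
    Tier("T4", "Claude Haiku 4.5", 1.00, 5.00, 4),
    Tier("T5", "Claude Sonnet 4.5", 3.00, 15.00, 5),
    Tier("T6", "Claude Opus 4.5", 5.00, 25.00, 6),
]

# Hypothetical mapping from task type to the minimum capability it needs.
REQUIRED = {
    "simple_query": 1,
    "classification": 1,
    "standard": 2,
    "complex_reasoning": 3,
    "code_generation": 4,
    "planning": 5,
    "critical_decision": 6,
}


def route(task_type: str) -> Tier:
    """Return the cheapest tier whose capability covers the task type."""
    needed = REQUIRED[task_type]
    capable = [t for t in TIERS if t.capability >= needed]
    return min(capable, key=lambda t: (t.input_cost + t.output_cost, t.capability))


def estimate_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request on the given tier."""
    return (tier.input_cost * input_tokens + tier.output_cost * output_tokens) / 1_000_000
```

For example, `route("code_generation")` selects the T4 tier, and `estimate_cost` can be used to compare that choice against sending the same request to a higher tier.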

LangGraph Agents: Business Assistant, Support, Inventory, Voice Ordering, Content Suggestion, Minerva (messaging), Maximus (voice AI), Dispute Resolution.

Data Architecture

Transactional Data (Cloud Spanner)

Globally distributed, strongly consistent database:

├── Tenants (root table)
│   ├── Locations (interleaved)
│   │   ├── Orders (interleaved)
│   │   ├── Menu Items (interleaved)
│   │   └── Inventory (interleaved)
│   ├── Users (interleaved)
│   └── Roles (interleaved)
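
In Cloud Spanner, an interleaved child table's primary key begins with its parent's primary key columns, so a tenant's rows are stored close together. The sketch below imitates that ordering with plain tuples. The row values and the `physical_order` helper are illustrative assumptions, and real Spanner storage is more involved than a single sorted list.

```python
# Each row is (table_name, primary_key). Child keys extend the parent key:
# Tenants (tenant_id) -> Locations (tenant_id, location_id)
#                     -> Orders (tenant_id, location_id, order_id)
rows = [
    ("Orders", ("t1", "loc-a", "order-2")),
    ("Tenants", ("t2",)),
    ("Locations", ("t1", "loc-a")),
    ("Tenants", ("t1",)),
    ("Users", ("t1", "user-9")),
    ("Orders", ("t1", "loc-a", "order-1")),
]


def physical_order(rows):
    """Sort rows by key so that children follow the parent they extend."""
    return sorted(rows, key=lambda row: row[1])


def is_descendant(child_key, parent_key):
    """True when child_key starts with parent_key (shares the key prefix)."""
    return len(child_key) > len(parent_key) and child_key[: len(parent_key)] == parent_key


for table, key in physical_order(rows):
    print(table, key)
```

Sorting by key places every `t1` row before the `t2` tenant row, which is the locality property that interleaving is meant to provide for per-tenant reads.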

Analytics Data (ClickHouse Cloud)

Real-time OLAP analytics (60-65% cost savings vs BigQuery):

Real-time streaming from Pub/Sub
Sub-second query performance
Sales analytics, demand forecasting, anomaly detection
P&L reporting, labor optimization

Cache (Redis / Memorystore)

Gating Engine policy evaluation cache
Session management
Rate limiting, real-time counters
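
A policy evaluation cache of the kind listed above can be sketched as a small time-to-live cache. This is an assumption-laden illustration: a dictionary stands in for Redis, and the `PolicyCache` class, its key shape `(tenant_id, role, feature)`, and the 60-second default TTL are hypothetical rather than taken from the Gating Engine.

```python
import time
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]  # (tenant_id, role, feature), a hypothetical key shape


class PolicyCache:
    """Caches policy decisions for a fixed TTL; a dict stands in for Redis."""

    def __init__(
        self,
        evaluate: Callable[[str, str, str], bool],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._evaluate = evaluate
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Key, Tuple[float, bool]] = {}

    def is_allowed(self, tenant_id: str, role: str, feature: str) -> bool:
        key = (tenant_id, role, feature)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # cached path: no policy evaluation
        decision = self._evaluate(tenant_id, role, feature)  # uncached path
        self._store[key] = (now + self._ttl, decision)
        return decision
```

Including the tenant in the cache key keeps one tenant's cached decision from being served to another. The injectable `clock` exists only to make expiry testable.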

Vector Data (Cloudflare Vectorize)

Edge-native vector database for RAG (5M vectors/index):

Documentation, menu, policy, support knowledge bases
Semantic search
Customer preferences
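
Semantic search over a vector index reduces to ranking stored vectors by similarity to a query vector. The sketch below uses cosine similarity over an in-memory dictionary. It does not call the Vectorize API, and the document ids and two-dimensional vectors are made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine(query, vector), doc_id) for doc_id, vector in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical knowledge-base entries with toy embeddings.
index = {
    "menu": [1.0, 0.0],
    "policy": [0.0, 1.0],
    "support": [1.0, 1.0],
}
print(top_k([1.0, 0.1], index))
```

In a RAG flow, the returned ids would identify the passages passed to the model as context.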

Communication Patterns

Synchronous (gRPC + HTTP)

Client -> API Gateway -> Core Service -> Database
                      |
                Response (< 100ms)

Asynchronous (Pub/Sub)

Service -> Pub/Sub Topic -> Subscribers
                         |
                Background Processing
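
The asynchronous pattern can be illustrated with an in-memory stand-in for the event bus. This is a sketch only: `InMemoryBus`, the `order.created` topic name, and the explicit `drain` step are hypothetical, and GCP Pub/Sub itself delivers messages over the network with acknowledgements and retries that are not modeled here.

```python
from collections import defaultdict, deque
from typing import Any, Callable, DefaultDict, Deque, List, Tuple

Handler = Callable[[Any], None]


class InMemoryBus:
    """Toy event bus: publish enqueues, drain delivers to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, Any]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Any) -> None:
        # The publisher returns immediately; no handler runs here.
        self._queue.append((topic, event))

    def drain(self) -> int:
        """Deliver queued events (the background processing step)."""
        delivered = 0
        while self._queue:
            topic, event = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(event)
                delivered += 1
        return delivered
```

The point of the pattern is that the publishing service does not wait for subscribers, so a slow consumer does not add latency to the request that produced the event.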

Real-Time (WebSocket)

Client <-> WebSocket Gateway <-> Redis Pub/Sub <-> Services

Security Architecture

Defense in Depth

┌─────────────────────────────────────────────────────┐
│ Layer 1: Edge Security (Cloudflare)                 │
│ - DDoS protection  - WAF  - Bot management          │
├─────────────────────────────────────────────────────┤
│ Layer 2: Network Security (VPC)                     │
│ - Private networking  - Firewall rules              │
├─────────────────────────────────────────────────────┤
│ Layer 3: Application Security                       │
│ - JWT auth (RS256)  - RBAC  - Gating Engine         │
├─────────────────────────────────────────────────────┤
│ Layer 4: Data Security                              │
│ - RLS on all tables  - Encryption at rest           │
│ - GCP Secret Manager  - Parameterized queries only  │
└─────────────────────────────────────────────────────┘

Tenant Isolation

Database-level row isolation (org_id / tenant_id on all tables)
API-level tenant verification
Network-level VPC separation
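
API-level tenant verification can be sketched as a check that the tenant in the caller's verified token matches the tenant that owns the requested resource. The claim name `tenant_id`, the `TenantMismatch` exception, and both helper functions are assumptions for illustration. Signature verification of the JWT is assumed to have happened already.

```python
from typing import Any, Dict, List


class TenantMismatch(PermissionError):
    """Raised when a caller tries to access another tenant's resource."""


def verify_tenant(claims: Dict[str, Any], resource_tenant_id: str) -> None:
    """Reject the request unless the token's tenant owns the resource."""
    token_tenant = claims.get("tenant_id")  # hypothetical claim name
    if not token_tenant or token_tenant != resource_tenant_id:
        raise TenantMismatch("token tenant does not match resource tenant")


def scoped_rows(rows: List[Dict[str, Any]], claims: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return only the rows that belong to the caller's tenant."""
    tenant = claims.get("tenant_id")
    return [row for row in rows if row["tenant_id"] == tenant]
```

A token with no tenant claim is rejected rather than treated as unrestricted, so a malformed token fails closed.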

Scalability

Horizontal Scaling

Component	Scaling Strategy
API Gateway	Cloud Run auto-scaling based on requests
Core Services	Cloud Run auto-scaling based on CPU/memory
Database	Spanner automatic node addition
Edge	Cloudflare global CDN (300+ locations)

Performance Targets

Metric	Target
API Latency (p99)	< 100ms
Order Creation	< 50ms
Gating Engine (cached)	< 10ms
Gating Engine (uncached)	< 50ms
WebSocket Latency	< 20ms
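
A p99 target is checked against the 99th percentile of observed latencies, not the average. The sketch below uses the nearest-rank method. The function names and sample data are illustrative assumptions, and production monitoring would normally rely on the metrics backend's own percentile calculation.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def meets_p99_target(samples_ms: Sequence[float], target_ms: float) -> bool:
    """True when the p99 latency is strictly below the target."""
    return percentile(samples_ms, 99) < target_ms


# Hypothetical samples: 98 fast requests and 2 slow outliers.
latencies_ms = [10.0] * 98 + [250.0] * 2
print(percentile(latencies_ms, 99), meets_p99_target(latencies_ms, 100.0))
```

With 2 slow requests in 100, the mean is under 15 ms yet the p99 check fails, which is why the targets are stated as percentiles.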

Related Documentation

Multi-Tenancy - Tenant isolation
Edge Infrastructure - Edge architecture
AI Agents Architecture - LangGraph agents
ClickHouse Analytics - OLAP analytics
Data Sync & IoT - Edge data sync
API Reference - API documentation
