Cloud Run Deployment

Deploy and manage Olympus Cloud services on Google Cloud Run.

Overview

Cloud Run hosts all core backend services:

Service               Language    Port   CPU  Memory  Min Instances
API Gateway           Go          8080   1    512Mi   2
Auth Service          Rust        8001   1    512Mi   1
Platform Service      Rust        8002   2    1Gi     2
Commerce Service      Rust        8003   2    1Gi     2
Creator Service       Rust        8004   1    512Mi   1
CMS Service           Rust        8005   1    512Mi   1
Chat Service          Rust        8007   1    512Mi   1
Alerting Service      Rust        8080   1    512Mi   1
Distribution Service  Go          8011   1    512Mi   1
ACP Server            Go          8090   1    512Mi   1
IoT Service           Go (gRPC)   50052  1    512Mi   1
Analytics Service     Python      8004   2    2Gi     1
ML Service            Python      8005   4    4Gi     1

Service Configuration

Basic Deployment

# service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: platform-service
  annotations:
    run.googleapis.com/ingress: internal-and-cloud-load-balancing
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/maxScale: "100"
        run.googleapis.com/cpu-throttling: "false"
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      containerConcurrency: 80
      timeoutSeconds: 300
      serviceAccountName: platform-service@olympuscloud-prod.iam.gserviceaccount.com
      containers:
        - image: gcr.io/olympuscloud-prod/platform-service:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: "2"
              memory: 1Gi
          env:
            - name: ENVIRONMENT
              value: production
            - name: SPANNER_INSTANCE
              value: olympus-prod
          startupProbe:
            httpGet:
              path: /health/startup
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 1
            failureThreshold: 30
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            periodSeconds: 5

Deploy Command

# Deploy service
gcloud run deploy platform-service \
--image gcr.io/olympuscloud-prod/platform-service:v1.2.3 \
--region us-central1 \
--platform managed \
--allow-unauthenticated \
--service-account platform-service@olympuscloud-prod.iam.gserviceaccount.com \
--set-env-vars "ENVIRONMENT=production,SPANNER_INSTANCE=prod-olympus-spanner" \
--min-instances 2 \
--max-instances 100 \
--cpu 2 \
--memory 1Gi \
--concurrency 80 \
--timeout 300

Auto-Scaling Configuration

Scaling Parameters

Parameter       Default  Production  Description
minScale        0        2           Minimum instances
maxScale        100      100         Maximum instances
concurrency     80       80          Requests per instance
cpu-throttling  true     false       CPU allocation
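These parameters interact: concurrency caps the in-flight requests one instance absorbs, so a rough capacity estimate follows from Little's law (in-flight ≈ RPS × average latency). The sketch below is a back-of-the-envelope check, not Cloud Run's actual autoscaler:

```python
import math

def estimated_instances(rps: float, avg_latency_s: float,
                        concurrency: int = 80,
                        min_scale: int = 2, max_scale: int = 100) -> int:
    """Estimate instance count: in-flight requests divided by
    per-instance concurrency, clamped to the scaling bounds."""
    in_flight = rps * avg_latency_s
    needed = math.ceil(in_flight / concurrency)
    return max(min_scale, min(needed, max_scale))

# 2000 req/s at 200 ms each -> ~400 in-flight -> 5 instances
print(estimated_instances(2000, 0.2))  # 5
```

If this estimate routinely approaches maxScale, raise maxScale or reduce per-request latency before traffic grows into the ceiling.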

CPU Allocation

# Always-on CPU (recommended for production)
annotations:
  run.googleapis.com/cpu-throttling: "false"

Always-on CPU Benefits:

  • Consistent performance
  • Background task processing
  • WebSocket connections
  • Lower cold start latency

Scaling Behavior

Requests → Load Balancer → Cloud Run

├── Instance 1 (80 concurrent)
├── Instance 2 (80 concurrent)
├── Instance 3 (80 concurrent)
└── ... up to maxScale

Traffic Management

Blue-Green Deployments

Best Practice

Always deploy new revisions with --no-traffic first and test the canary endpoint before shifting any production traffic. Start with 10% traffic split, monitor error rates for at least 5 minutes, then proceed to full rollout. This approach catches most regressions before they impact the majority of users.

# Deploy new revision without traffic
gcloud run deploy platform-service \
--image gcr.io/olympuscloud-prod/platform-service:v1.3.0 \
--no-traffic \
--tag canary

# Test canary endpoint
curl https://canary---platform-service-xyz.run.app/health

# Gradually shift traffic
gcloud run services update-traffic platform-service \
--to-tags canary=10

# Full rollout
gcloud run services update-traffic platform-service \
--to-latest

# Rollback if needed
gcloud run services update-traffic platform-service \
--to-revisions platform-service-v1-2-3=100
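The "monitor error rates, then promote" step above can be scripted as a gate in the deploy pipeline. A minimal sketch, assuming per-minute canary error rates have already been pulled from Cloud Monitoring (the function name and thresholds are illustrative, not part of any gcloud API):

```python
def should_promote(error_rates: list[float],
                   threshold: float = 0.01,
                   min_samples: int = 5) -> bool:
    """Promote the canary only after at least `min_samples` per-minute
    error-rate readings, every one below `threshold` (1% by default)."""
    if len(error_rates) < min_samples:
        return False  # not enough observation time yet
    return all(rate < threshold for rate in error_rates)

# Five clean minutes of canary traffic: safe to promote.
print(should_promote([0.0, 0.002, 0.001, 0.0, 0.004]))  # True
# One bad minute blocks the rollout.
print(should_promote([0.0, 0.002, 0.05, 0.0, 0.004]))   # False
```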

Traffic Splitting

# Split traffic between revisions
gcloud run services update-traffic platform-service \
--to-revisions \
platform-service-v1-2-3=90,\
platform-service-v1-3-0=10
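Cloud Run rejects a split whose revision percentages do not total 100, so a deploy script can pre-validate the split before calling `update-traffic`. A small illustrative helper (not a gcloud feature):

```python
def validate_traffic_split(split: dict[str, int]) -> None:
    """Raise if revision percentages are out of range or don't sum to 100."""
    for revision, pct in split.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{revision}: {pct}% out of range")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"traffic percentages total {total}, expected 100")

# Matches the split used above: passes silently.
validate_traffic_split({"platform-service-v1-2-3": 90,
                        "platform-service-v1-3-0": 10})
```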

Environment Variables

Configuration

env:
  # Application config
  - name: ENVIRONMENT
    value: production
  - name: LOG_LEVEL
    value: info
  - name: RUST_LOG
    value: info,tower_http=debug

  # Database
  - name: SPANNER_PROJECT
    value: olympuscloud-prod
  - name: SPANNER_INSTANCE
    value: olympus-prod
  - name: SPANNER_DATABASE
    value: olympus

  # Service discovery
  - name: COMMERCE_SERVICE_URL
    value: https://commerce-service-xyz.run.app
  - name: AI_SERVICE_URL
    value: https://ai-service-xyz.run.app

Secrets

env:
  - name: DATABASE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: database-password
        key: latest
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: api-key
        key: latest

# Create secret
echo -n "secret-value" | gcloud secrets create api-key --data-file=-

# Grant access
gcloud secrets add-iam-policy-binding api-key \
--member serviceAccount:platform-service@olympuscloud-prod.iam.gserviceaccount.com \
--role roles/secretmanager.secretAccessor

Health Checks

Probe Configuration

// src/health.rs
use axum::{
    extract::State,
    http::StatusCode,
    routing::get,
    Json, Router,
};
use serde::Serialize;

// Shared application state holding db and cache handles; the module
// path depends on how the service wires up its state.
use crate::state::AppState;

#[derive(Serialize)]
struct HealthResponse {
    status: String,
    version: String,
    checks: Vec<HealthCheck>,
}

#[derive(Serialize)]
struct HealthCheck {
    name: String,
    status: String,
    latency_ms: Option<u64>,
}

pub fn health_routes() -> Router<AppState> {
    Router::new()
        .route("/health/startup", get(startup_check))
        .route("/health/live", get(liveness_check))
        .route("/health/ready", get(readiness_check))
}

// Startup probe: succeeds as soon as the process can serve requests.
async fn startup_check() -> &'static str {
    "OK"
}

// Liveness probe: cheap check that the event loop is responsive.
async fn liveness_check() -> &'static str {
    "OK"
}

// Readiness probe: verifies downstream dependencies before accepting
// traffic. check_database and check_cache are defined elsewhere in
// this module and each return a HealthCheck.
async fn readiness_check(
    State(state): State<AppState>,
) -> Result<Json<HealthResponse>, StatusCode> {
    let db_check = check_database(&state.db).await;
    let cache_check = check_cache(&state.cache).await;

    let all_healthy = db_check.status == "healthy"
        && cache_check.status == "healthy";

    let response = HealthResponse {
        status: if all_healthy { "healthy" } else { "degraded" }.into(),
        version: env!("CARGO_PKG_VERSION").into(),
        checks: vec![db_check, cache_check],
    };

    if all_healthy {
        Ok(Json(response))
    } else {
        Err(StatusCode::SERVICE_UNAVAILABLE)
    }
}
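On the calling side, a deploy script waiting for a new revision can mirror the startupProbe settings (one probe per period, give up after failureThreshold misses). A sketch where the `probe` callable stands in for an HTTP GET against /health/startup; names and the injectable `sleep` are illustrative:

```python
import time
from typing import Callable, Optional

def wait_until_healthy(probe: Callable[[], bool],
                       period_s: float = 1.0,
                       failure_threshold: int = 30,
                       sleep: Optional[Callable[[float], None]] = None) -> bool:
    """Poll `probe` once per period, returning True on the first success
    and False after `failure_threshold` consecutive failures."""
    sleep = sleep or time.sleep
    for _attempt in range(failure_threshold):
        if probe():
            return True
        sleep(period_s)
    return False

# Stubbed probe that succeeds on the third attempt (no real sleeping).
attempts = iter([False, False, True])
print(wait_until_healthy(lambda: next(attempts), sleep=lambda _: None))  # True
```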

CI/CD Pipeline

Cloud Build Configuration

# cloudbuild.yaml
steps:
  # Build container
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/platform-service:$SHORT_SHA'
      - '-t'
      - 'gcr.io/$PROJECT_ID/platform-service:latest'
      - '-f'
      - 'services/platform/Dockerfile'
      - '.'

  # Push to registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '--all-tags', 'gcr.io/$PROJECT_ID/platform-service']

  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'platform-service'
      - '--image'
      - 'gcr.io/$PROJECT_ID/platform-service:$SHORT_SHA'
      - '--region'
      - 'us-central1'
      - '--platform'
      - 'managed'

options:
  logging: CLOUD_LOGGING_ONLY

substitutions:
  _ENVIRONMENT: production

GitHub Actions

# .github/workflows/deploy.yml
name: Deploy to Cloud Run

on:
  push:
    branches: [main]
    paths:
      - 'services/platform/**'

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}

      - uses: google-github-actions/setup-gcloud@v2

      - name: Build and Push
        run: |
          gcloud builds submit \
            --config cloudbuild.yaml \
            --substitutions SHORT_SHA=${{ github.sha }}

Monitoring

Key Metrics

Metric                  Alert Threshold  Description
request_count           -                Total requests
request_latencies       p99 > 500ms      Response time
instance_count          > 80% of max     Scaling headroom
billable_instance_time  Budget alerts    Cost tracking
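The p99 > 500ms threshold can also be checked offline against exported latency samples. A sketch using a plain nearest-rank percentile (the sample data and helper names are illustrative; Cloud Monitoring computes its own aligned percentiles):

```python
import math

def p99(samples_ms: list[float]) -> float:
    """Nearest-rank 99th percentile of latency samples (ms)."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.99 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

def breaches_slo(samples_ms: list[float], threshold_ms: float = 500.0) -> bool:
    """True when the p99 exceeds the alert threshold from the table."""
    return p99(samples_ms) > threshold_ms

# 100 samples: 98 fast requests and two 900 ms outliers push p99 over 500 ms.
samples = [50.0] * 98 + [900.0, 900.0]
print(p99(samples), breaches_slo(samples))  # 900.0 True
```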

Cloud Monitoring Dashboard

{
  "displayName": "Cloud Run Services",
  "gridLayout": {
    "widgets": [
      {
        "title": "Request Latency (p99)",
        "xyChart": {
          "dataSets": [{
            "timeSeriesQuery": {
              "timeSeriesFilter": {
                "filter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_latencies\"",
                "aggregation": {
                  "perSeriesAligner": "ALIGN_PERCENTILE_99"
                }
              }
            }
          }]
        }
      }
    ]
  }
}

Troubleshooting

Common Issues

Issue               Cause                    Solution
Cold start latency  Min instances = 0        Set minScale: 2
Request timeouts    Long operations          Increase timeout, use async
OOM errors          Memory limit             Increase memory allocation
Connection limits   Too many DB connections  Use connection pooling
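The connection-limit row deserves a number: with per-instance pooling, the worst case scales with maxScale, since every instance can open a full pool. A quick budget check (the pool size and database ceiling here are illustrative assumptions, not Spanner or Cloud Run limits):

```python
def max_db_connections(max_instances: int, pool_size: int) -> int:
    """Worst case: every instance holds a full connection pool."""
    return max_instances * pool_size

def pool_fits(max_instances: int, pool_size: int, db_limit: int) -> bool:
    """True when the worst-case connection count stays within the
    database's configured connection ceiling."""
    return max_db_connections(max_instances, pool_size) <= db_limit

# maxScale 100 with 10 connections per instance = 1000 worst case
print(max_db_connections(100, 10))        # 1000
print(pool_fits(100, 10, 1000))           # True: exactly at the ceiling
```

If this check fails, shrink the per-instance pool or lower maxScale rather than raising the database limit first.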

Debug Commands

# View logs
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=platform-service" \
--limit 100 \
--format "table(timestamp,textPayload)"

# List revisions
gcloud run revisions list --service platform-service

# Describe service
gcloud run services describe platform-service --format yaml

# View current traffic
gcloud run services describe platform-service \
--format "value(status.traffic)"