# Cloud Run Deployment

Deploy and manage Olympus Cloud services on Google Cloud Run.

## Overview

Cloud Run hosts all core backend services:
| Service | Language | Port | CPU | Memory | Min Instances |
|---|---|---|---|---|---|
| API Gateway | Go | 8080 | 1 | 512Mi | 2 |
| Auth Service | Rust | 8001 | 1 | 512Mi | 1 |
| Platform Service | Rust | 8002 | 2 | 1Gi | 2 |
| Commerce Service | Rust | 8003 | 2 | 1Gi | 2 |
| Creator Service | Rust | 8004 | 1 | 512Mi | 1 |
| CMS Service | Rust | 8005 | 1 | 512Mi | 1 |
| Chat Service | Rust | 8007 | 1 | 512Mi | 1 |
| Alerting Service | Rust | 8080 | 1 | 512Mi | 1 |
| Distribution Service | Go | 8011 | 1 | 512Mi | 1 |
| ACP Server | Go | 8090 | 1 | 512Mi | 1 |
| IoT Service | Go (gRPC) | 50052 | 1 | 512Mi | 1 |
| Analytics Service | Python | 8004 | 2 | 2Gi | 1 |
| ML Service | Python | 8005 | 4 | 4Gi | 1 |
## Service Configuration

### Basic Deployment
```yaml
# service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: platform-service
  annotations:
    run.googleapis.com/ingress: internal-and-cloud-load-balancing
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "2"
        autoscaling.knative.dev/maxScale: "100"
        run.googleapis.com/cpu-throttling: "false"
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      containerConcurrency: 80
      timeoutSeconds: 300
      serviceAccountName: platform-service@olympuscloud-prod.iam.gserviceaccount.com
      containers:
        - image: gcr.io/olympuscloud-prod/platform-service:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: "2"
              memory: 1Gi
          env:
            - name: ENVIRONMENT
              value: production
            - name: SPANNER_INSTANCE
              value: olympus-prod
          startupProbe:
            httpGet:
              path: /health/startup
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 1
            failureThreshold: 30
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            periodSeconds: 5
```
### Deploy Command
```bash
# Deploy service
gcloud run deploy platform-service \
  --image gcr.io/olympuscloud-prod/platform-service:v1.2.3 \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated \
  --service-account platform-service@olympuscloud-prod.iam.gserviceaccount.com \
  --set-env-vars "ENVIRONMENT=production,SPANNER_INSTANCE=olympus-prod" \
  --min-instances 2 \
  --max-instances 100 \
  --cpu 2 \
  --memory 1Gi \
  --concurrency 80 \
  --timeout 300
```
## Auto-Scaling Configuration

### Scaling Parameters
| Parameter | Default | Production | Description |
|---|---|---|---|
| `minScale` | 0 | 2 | Minimum instances |
| `maxScale` | 100 | 100 | Maximum instances |
| `concurrency` | 80 | 80 | Requests per instance |
| `cpu-throttling` | true | false | CPU allocation |
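The `concurrency` setting drives how many instances a given load needs. As a rough capacity check, steady-state instance count can be estimated with Little's law; a minimal sketch (the traffic numbers below are illustrative, not measured from these services):

```python
import math

def estimate_instances(rps: float, avg_latency_s: float, concurrency: int) -> int:
    """Estimate steady-state Cloud Run instances: in-flight requests
    = arrival rate * average latency (Little's law), divided by the
    per-instance concurrency limit."""
    in_flight = rps * avg_latency_s
    return max(1, math.ceil(in_flight / concurrency))

# e.g. 2000 req/s at 200 ms average latency with concurrency 80:
# 2000 * 0.2 = 400 in-flight requests -> 400 / 80 = 5 instances
print(estimate_instances(2000, 0.2, 80))
```

Keep `maxScale` comfortably above this estimate so traffic spikes have headroom.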
### CPU Allocation

```yaml
# Always-on CPU (recommended for production)
annotations:
  run.googleapis.com/cpu-throttling: "false"
```
**Always-on CPU Benefits:**
- Consistent performance
- Background task processing
- WebSocket connections
- Lower cold start latency
### Scaling Behavior

```text
Requests → Load Balancer → Cloud Run
                               │
                               ├── Instance 1 (80 concurrent)
                               ├── Instance 2 (80 concurrent)
                               ├── Instance 3 (80 concurrent)
                               └── ... up to maxScale
```
## Traffic Management

### Blue-Green Deployments

**Best Practice:** Always deploy new revisions with `--no-traffic` first and test the tagged canary endpoint before shifting any production traffic. Start with a 10% traffic split, monitor error rates for at least 5 minutes, then proceed to full rollout. This approach catches most regressions before they reach the majority of users.
```bash
# Deploy new revision without traffic
gcloud run deploy platform-service \
  --image gcr.io/olympuscloud-prod/platform-service:v1.3.0 \
  --no-traffic \
  --tag canary

# Test canary endpoint
curl https://canary---platform-service-xyz.run.app/health

# Gradually shift traffic
gcloud run services update-traffic platform-service \
  --to-tags canary=10

# Full rollout
gcloud run services update-traffic platform-service \
  --to-latest

# Rollback if needed
gcloud run services update-traffic platform-service \
  --to-revisions platform-service-v1-2-3=100
```
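The gradual traffic shift can be scripted around these commands. A hypothetical sketch of the promote-or-rollback decision only (it assumes the canary error rate is fetched from Cloud Monitoring elsewhere, and the step sizes are our own choice, not a Cloud Run default):

```python
def next_traffic_step(current_pct, error_rate, threshold=0.01,
                      steps=(10, 25, 50, 100)):
    """Decide the next canary traffic percentage.
    Returns (action, pct): roll back to 0 if the error rate exceeds
    the threshold, otherwise promote to the next step, or hold at 100.
    """
    if error_rate > threshold:
        return ("rollback", 0)
    for step in steps:
        if step > current_pct:
            return ("promote", step)
    return ("hold", 100)

print(next_traffic_step(10, 0.002))  # healthy canary at 10% -> promote to 25%
print(next_traffic_step(25, 0.05))   # error rate above 1% -> roll back
```

Each `promote` result would map to a `gcloud run services update-traffic --to-tags canary=<pct>` call.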
### Traffic Splitting

```bash
# Split traffic between revisions
gcloud run services update-traffic platform-service \
  --to-revisions platform-service-v1-2-3=90,platform-service-v1-3-0=10
```
## Environment Variables

### Configuration
```yaml
env:
  # Application config
  - name: ENVIRONMENT
    value: production
  - name: LOG_LEVEL
    value: info
  - name: RUST_LOG
    value: info,tower_http=debug
  # Database
  - name: SPANNER_PROJECT
    value: olympuscloud-prod
  - name: SPANNER_INSTANCE
    value: olympus-prod
  - name: SPANNER_DATABASE
    value: olympus
  # Service discovery
  - name: COMMERCE_SERVICE_URL
    value: https://commerce-service-xyz.run.app
  - name: AI_SERVICE_URL
    value: https://ai-service-xyz.run.app
```
### Secrets
```yaml
env:
  - name: DATABASE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: database-password
        key: latest
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: api-key
        key: latest
```
```bash
# Create secret
echo -n "secret-value" | gcloud secrets create api-key --data-file=-

# Grant access
gcloud secrets add-iam-policy-binding api-key \
  --member serviceAccount:platform-service@olympuscloud-prod.iam.gserviceaccount.com \
  --role roles/secretmanager.secretAccessor
```
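Services can also read secrets at runtime instead of via env injection. The Secret Manager version resource name follows a fixed pattern; a minimal sketch of building it (the `google-cloud-secret-manager` client call is shown only as a comment, since it needs GCP credentials to run):

```python
def secret_version_name(project: str, secret: str, version: str = "latest") -> str:
    """Build the Secret Manager resource name that access_secret_version
    expects -- the same secret/version pair the secretKeyRef above maps to."""
    return f"projects/{project}/secrets/{secret}/versions/{version}"

# Runtime access would look roughly like:
#   from google.cloud import secretmanager
#   client = secretmanager.SecretManagerServiceClient()
#   name = secret_version_name("olympuscloud-prod", "api-key")
#   payload = client.access_secret_version(name=name).payload.data

print(secret_version_name("olympuscloud-prod", "api-key"))
```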
## Health Checks

### Probe Configuration
```rust
// src/health.rs
use axum::{
    extract::State,
    http::StatusCode,
    routing::get,
    Json, Router,
};
use serde::Serialize;

// AppState, check_database, and check_cache are defined elsewhere
// in the service.
use crate::{check_cache, check_database, AppState};

#[derive(Serialize)]
struct HealthResponse {
    status: String,
    version: String,
    checks: Vec<HealthCheck>,
}

#[derive(Serialize)]
struct HealthCheck {
    name: String,
    status: String,
    latency_ms: Option<u64>,
}

pub fn health_routes() -> Router<AppState> {
    Router::new()
        .route("/health/startup", get(startup_check))
        .route("/health/live", get(liveness_check))
        .route("/health/ready", get(readiness_check))
}

// Startup and liveness probes only confirm the process is serving.
async fn startup_check() -> &'static str {
    "OK"
}

async fn liveness_check() -> &'static str {
    "OK"
}

// Readiness verifies downstream dependencies, so an instance only
// receives traffic when it can actually serve it.
async fn readiness_check(
    State(state): State<AppState>,
) -> Result<Json<HealthResponse>, StatusCode> {
    let db_check = check_database(&state.db).await;
    let cache_check = check_cache(&state.cache).await;
    let all_healthy = db_check.status == "healthy"
        && cache_check.status == "healthy";

    let response = HealthResponse {
        status: (if all_healthy { "healthy" } else { "degraded" }).into(),
        version: env!("CARGO_PKG_VERSION").into(),
        checks: vec![db_check, cache_check],
    };

    if all_healthy {
        Ok(Json(response))
    } else {
        Err(StatusCode::SERVICE_UNAVAILABLE)
    }
}
```
## CI/CD Pipeline

### Cloud Build Configuration
```yaml
# cloudbuild.yaml
steps:
  # Build container
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/platform-service:$SHORT_SHA'
      - '-t'
      - 'gcr.io/$PROJECT_ID/platform-service:latest'
      - '-f'
      - 'services/platform/Dockerfile'
      - '.'

  # Push to registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '--all-tags', 'gcr.io/$PROJECT_ID/platform-service']

  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'platform-service'
      - '--image'
      - 'gcr.io/$PROJECT_ID/platform-service:$SHORT_SHA'
      - '--region'
      - 'us-central1'
      - '--platform'
      - 'managed'

options:
  logging: CLOUD_LOGGING_ONLY

substitutions:
  _ENVIRONMENT: production
```
### GitHub Actions
```yaml
# .github/workflows/deploy.yml
name: Deploy to Cloud Run

on:
  push:
    branches: [main]
    paths:
      - 'services/platform/**'

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}

      - uses: google-github-actions/setup-gcloud@v2

      - name: Build and Push
        run: |
          # Pass a 7-char sha so image tags match Cloud Build's SHORT_SHA
          gcloud builds submit \
            --config cloudbuild.yaml \
            --substitutions SHORT_SHA=${GITHUB_SHA::7}
```
## Monitoring

### Key Metrics
| Metric | Alert Threshold | Description |
|---|---|---|
| `request_count` | - | Total requests |
| `request_latencies` | p99 > 500ms | Response time |
| `instance_count` | > 80% of max | Scaling headroom |
| `billable_instance_time` | Budget alerts | Cost tracking |
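The p99 alert threshold above is evaluated over raw latency samples. A minimal illustration of the nearest-rank percentile math behind such an alert (the sample values are made up):

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile over raw latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [120, 95, 480, 210, 150, 90, 620, 130, 110, 140]
p99 = percentile(latencies_ms, 99)
print(p99, "alert!" if p99 > 500 else "ok")  # 620 alert!
```

In production this aggregation is done by Cloud Monitoring (`ALIGN_PERCENTILE_99` in the dashboard below), not by your own code; the sketch just shows why a single slow outlier can trip a p99 alert.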
### Cloud Monitoring Dashboard
```json
{
  "displayName": "Cloud Run Services",
  "gridLayout": {
    "widgets": [
      {
        "title": "Request Latency (p99)",
        "xyChart": {
          "dataSets": [{
            "timeSeriesQuery": {
              "timeSeriesFilter": {
                "filter": "resource.type=\"cloud_run_revision\" AND metric.type=\"run.googleapis.com/request_latencies\"",
                "aggregation": {
                  "perSeriesAligner": "ALIGN_PERCENTILE_99"
                }
              }
            }
          }]
        }
      }
    ]
  }
}
```
## Troubleshooting

### Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Cold start latency | Min instances = 0 | Set minScale: 2 |
| Request timeouts | Long operations | Increase timeout, use async |
| OOM errors | Memory limit | Increase memory allocation |
| Connection limits | Too many DB connections | Use connection pooling |
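The connection-limit row is worth a worked number: each Cloud Run instance keeps its own pool, so worst-case database connections scale with `maxScale`. A quick sanity check, assuming a hypothetical per-instance pool size of 10:

```python
def max_db_connections(max_instances: int, pool_size: int) -> int:
    """Worst-case connections if every instance fills its pool."""
    return max_instances * pool_size

# 100 instances * 10 connections each = 1000 -- compare against the
# database's connection limit before raising maxScale or pool size.
print(max_db_connections(100, 10))
```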
### Debug Commands
```bash
# View logs
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=platform-service" \
  --limit 100 \
  --format "table(timestamp,textPayload)"

# List revisions
gcloud run revisions list --service platform-service

# Describe service
gcloud run services describe platform-service --format yaml

# View current traffic
gcloud run services describe platform-service \
  --format "value(status.traffic)"
```
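When building tooling around these services, the log filter in the first command can be composed programmatically; a small sketch (the helper name is ours, not a gcloud or Cloud Logging API):

```python
def run_log_filter(service, severity=None):
    """Compose a Cloud Logging filter string for a Cloud Run service,
    matching the shape used in the `gcloud logging read` example above."""
    parts = [
        "resource.type=cloud_run_revision",
        f"resource.labels.service_name={service}",
    ]
    if severity:
        parts.append(f"severity>={severity}")
    return " AND ".join(parts)

print(run_log_filter("platform-service", "ERROR"))
```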
## Related Documentation
- Environments - Environment configuration
- Edge Workers - Cloudflare deployment
- Metrics - Monitoring setup