> **Authenticated API:** These endpoints require a valid JWT Bearer token and are accessible via the API gateway at `/v1/ai/*`.

# AI Monitoring API

Real-time monitoring, performance dashboards, alerting, and operational controls for AI agents.

## Overview
| Attribute | Value |
|---|---|
| Base Path | /api/v1/monitoring |
| Authentication | Bearer Token |
| Required Roles | platform_admin, system_admin, super_admin, ai_admin, tenant_admin |
## Dashboard

### Get Dashboard Summary

Retrieve the main monitoring dashboard with agent health and metrics.

`GET /api/v1/monitoring/dashboard`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| tenant_id | uuid | Filter by tenant |
| period | string | `1h`, `24h`, `7d` |
#### Response

```json
{
  "generated_at": "2026-01-24T19:30:00Z",
  "period": "24h",
  "summary": {
    "total_agents": 12,
    "healthy": 10,
    "degraded": 1,
    "offline": 1,
    "health_score": 92
  },
  "metrics": {
    "total_requests": 45280,
    "successful_requests": 44835,
    "failed_requests": 445,
    "success_rate": 99.02,
    "avg_latency_ms": 285,
    "p95_latency_ms": 890,
    "p99_latency_ms": 1450
  },
  "costs": {
    "total_tokens": 12500000,
    "input_tokens": 8500000,
    "output_tokens": 4000000,
    "estimated_cost_usd": 125.50,
    "cost_by_model": [
      {"model": "llama-4-scout", "tokens": 8500000, "cost": 0.00},
      {"model": "gemini-2.0-flash", "tokens": 2500000, "cost": 45.00},
      {"model": "claude-haiku-4.5", "tokens": 1500000, "cost": 80.50}
    ]
  },
  "active_alerts": 2,
  "top_issues": [
    {
      "type": "high_latency",
      "agent": "voice_ai_agent",
      "description": "P95 latency above threshold",
      "severity": "warning"
    }
  ]
}
```
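The `summary` and `metrics` objects carry the headline numbers a dashboard client would render. A minimal sketch (plain Python over the example payload above, no live API call) showing how the success rate and fleet-health percentage relate to the raw counts:

```python
# Derive headline figures from the example dashboard payload above.
dashboard = {
    "summary": {"total_agents": 12, "healthy": 10, "degraded": 1, "offline": 1},
    "metrics": {"total_requests": 45280, "successful_requests": 44835,
                "failed_requests": 445},
}

m = dashboard["metrics"]
# success_rate is successful / total, as a percentage with two decimals.
success_rate = round(100 * m["successful_requests"] / m["total_requests"], 2)

s = dashboard["summary"]
healthy_pct = round(100 * s["healthy"] / s["total_agents"], 1)

print(success_rate)  # matches the payload's "success_rate": 99.02
print(healthy_pct)   # 83.3
```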
## Agent Management

### List Agents

`GET /api/v1/monitoring/agents`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| status | string | `healthy`, `degraded`, `offline`, `paused` |
| type | string | Agent type filter |
| tenant_id | uuid | Filter by tenant |
#### Response

```json
{
  "agents": [
    {
      "id": "agent_voice_ai_001",
      "name": "Voice AI Agent - Drive Thru",
      "type": "voice_ai",
      "status": "healthy",
      "version": "2.5.0",
      "location_id": "loc_123",
      "metrics": {
        "uptime_percent": 99.95,
        "requests_24h": 5420,
        "avg_latency_ms": 180,
        "error_rate": 0.2
      },
      "last_heartbeat": "2026-01-24T19:29:55Z",
      "started_at": "2026-01-20T06:00:00Z"
    },
    {
      "id": "agent_maximus_001",
      "name": "Maximus AI Assistant",
      "type": "chat_agent",
      "status": "healthy",
      "version": "3.1.0",
      "metrics": {
        "uptime_percent": 99.99,
        "requests_24h": 12500,
        "avg_latency_ms": 450,
        "error_rate": 0.1
      }
    },
    {
      "id": "agent_minerva_001",
      "name": "Minerva Marketing Agent",
      "type": "marketing_ai",
      "status": "degraded",
      "version": "1.8.0",
      "metrics": {
        "uptime_percent": 98.5,
        "requests_24h": 890,
        "avg_latency_ms": 2500,
        "error_rate": 2.5
      },
      "issues": [
        {
          "type": "high_latency",
          "message": "Social API rate limiting causing delays"
        }
      ]
    }
  ],
  "total": 12
}
```
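A common client-side use of this list is triage: pull out every agent that is not `healthy`, along with whatever issue detail it reports. A short sketch over a trimmed copy of the example response above (no live call):

```python
# Triage the agent list: surface anything not "healthy".
agents = [
    {"id": "agent_voice_ai_001", "status": "healthy",
     "metrics": {"error_rate": 0.2}},
    {"id": "agent_maximus_001", "status": "healthy",
     "metrics": {"error_rate": 0.1}},
    {"id": "agent_minerva_001", "status": "degraded",
     "metrics": {"error_rate": 2.5},
     "issues": [{"type": "high_latency",
                 "message": "Social API rate limiting causing delays"}]},
]

needs_attention = [a["id"] for a in agents if a["status"] != "healthy"]
print(needs_attention)  # ['agent_minerva_001']

for a in agents:
    # "issues" is only present on unhealthy agents in the example, so default it.
    for issue in a.get("issues", []):
        print(a["id"], issue["type"], "-", issue["message"])
```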
#### Agent Status Values

| Status | Description |
|---|---|
| healthy | Operating normally |
| degraded | Performance issues |
| offline | Not responding |
| paused | Manually paused |
| starting | Initializing |
| stopping | Shutting down |
### Get Agent Details

`GET /api/v1/monitoring/agents/{agent_id}`
#### Response

```json
{
  "id": "agent_voice_ai_001",
  "name": "Voice AI Agent - Drive Thru",
  "type": "voice_ai",
  "status": "healthy",
  "version": "2.5.0",
  "configuration": {
    "model_primary": "gemini-2.0-flash",
    "model_fallback": "claude-haiku-4.5",
    "max_concurrent": 10,
    "timeout_ms": 5000,
    "retry_count": 3
  },
  "health": {
    "status": "healthy",
    "last_check": "2026-01-24T19:29:55Z",
    "checks": [
      {"name": "model_connection", "status": "pass", "latency_ms": 45},
      {"name": "memory_usage", "status": "pass", "value": "68%"},
      {"name": "queue_depth", "status": "pass", "value": 3}
    ]
  },
  "metrics_detailed": {
    "requests": {
      "total": 5420,
      "successful": 5408,
      "failed": 12,
      "by_hour": [...]
    },
    "latency": {
      "avg_ms": 180,
      "p50_ms": 150,
      "p95_ms": 380,
      "p99_ms": 650,
      "by_hour": [...]
    },
    "tokens": {
      "input": 850000,
      "output": 420000,
      "by_model": {...}
    }
  },
  "recent_errors": [
    {
      "timestamp": "2026-01-24T18:45:00Z",
      "error": "timeout_exceeded",
      "message": "Model response timeout after 5000ms",
      "request_id": "req_abc123"
    }
  ],
  "dependencies": [
    {"service": "gemini-api", "status": "healthy"},
    {"service": "claude-api", "status": "healthy"},
    {"service": "speech-to-text", "status": "healthy"}
  ]
}
```
## Agent Controls

### Pause Agent

`POST /api/v1/monitoring/agents/{agent_id}/pause`

#### Request Body

```json
{
  "reason": "Maintenance window",
  "duration_minutes": 30
}
```
#### Response

```json
{
  "agent_id": "agent_voice_ai_001",
  "previous_status": "healthy",
  "current_status": "paused",
  "paused_at": "2026-01-24T19:30:00Z",
  "resume_at": "2026-01-24T20:00:00Z",
  "active_requests_drained": true
}
```
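In the example, `resume_at` is simply `paused_at` plus the requested `duration_minutes`, so a client can compute the same instant locally (for example, to schedule a reminder before auto-resume). A minimal sketch using the standard library:

```python
from datetime import datetime, timedelta, timezone

# resume_at = paused_at + duration_minutes, per the example pause response.
paused_at = datetime(2026, 1, 24, 19, 30, tzinfo=timezone.utc)
duration_minutes = 30

resume_at = paused_at + timedelta(minutes=duration_minutes)
print(resume_at.isoformat().replace("+00:00", "Z"))  # 2026-01-24T20:00:00Z
```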
### Resume Agent

`POST /api/v1/monitoring/agents/{agent_id}/resume`

### Restart Agent

`POST /api/v1/monitoring/agents/{agent_id}/restart`

#### Request Body

```json
{
  "reason": "Memory leak detected",
  "force": false
}
```

### Scale Agent

`POST /api/v1/monitoring/agents/{agent_id}/scale`

#### Request Body

```json
{
  "replicas": 5,
  "reason": "Peak traffic expected"
}
```
## Alerts

### List Alerts

`GET /api/v1/monitoring/alerts`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| status | string | `active`, `acknowledged`, `resolved` |
| severity | string | `critical`, `warning`, `info` |
| agent_id | string | Filter by agent |
#### Response

```json
{
  "alerts": [
    {
      "id": "alert_001",
      "agent_id": "agent_minerva_001",
      "type": "high_latency",
      "severity": "warning",
      "status": "active",
      "title": "High Latency Detected",
      "message": "P95 latency (2500ms) exceeds threshold (1000ms)",
      "threshold": {
        "metric": "p95_latency_ms",
        "operator": "gt",
        "value": 1000,
        "current": 2500
      },
      "triggered_at": "2026-01-24T18:00:00Z",
      "last_occurrence": "2026-01-24T19:25:00Z",
      "occurrence_count": 15,
      "acknowledged_by": null
    },
    {
      "id": "alert_002",
      "agent_id": "agent_vision_001",
      "type": "offline",
      "severity": "critical",
      "status": "acknowledged",
      "title": "Agent Offline",
      "message": "No heartbeat received in 5 minutes",
      "triggered_at": "2026-01-24T19:20:00Z",
      "acknowledged_by": "user_001",
      "acknowledged_at": "2026-01-24T19:22:00Z"
    }
  ],
  "summary": {
    "critical": 1,
    "warning": 1,
    "info": 0
  }
}
```
### Acknowledge Alert

`POST /api/v1/monitoring/alerts/{alert_id}/acknowledge`

#### Request Body

```json
{
  "notes": "Investigating the root cause"
}
```

### Resolve Alert

`POST /api/v1/monitoring/alerts/{alert_id}/resolve`

#### Request Body

```json
{
  "resolution": "Restarted agent, latency back to normal",
  "root_cause": "Memory leak in image processing module"
}
```
### Configure Alert Rules

`PUT /api/v1/monitoring/alerts/rules`

#### Request Body

```json
{
  "rules": [
    {
      "name": "High Latency",
      "condition": {
        "metric": "p95_latency_ms",
        "operator": "gt",
        "threshold": 1000,
        "duration_minutes": 5
      },
      "severity": "warning",
      "channels": ["slack", "email"],
      "recipients": ["oncall@example.com"]
    },
    {
      "name": "Agent Offline",
      "condition": {
        "metric": "heartbeat_age_seconds",
        "operator": "gt",
        "threshold": 300
      },
      "severity": "critical",
      "channels": ["slack", "pagerduty"],
      "auto_remediate": {
        "action": "restart",
        "max_attempts": 2
      }
    },
    {
      "name": "High Error Rate",
      "condition": {
        "metric": "error_rate_percent",
        "operator": "gt",
        "threshold": 5,
        "duration_minutes": 10
      },
      "severity": "critical"
    }
  ]
}
```
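A rule's `condition` compares a live metric value against `threshold` via `operator`. An illustrative evaluator (not part of the API): only `gt` appears in the examples above, so the other operator names here are assumptions, and the sustained-for semantics of `duration_minutes` are not modeled:

```python
import operator

# Operator names: "gt" is shown in the examples; "lt"/"gte"/"lte" are assumed.
OPERATORS = {"gt": operator.gt, "lt": operator.lt,
             "gte": operator.ge, "lte": operator.le}

def rule_fires(condition: dict, current: float) -> bool:
    """True when `current` violates the condition's threshold (instantaneous
    check only; `duration_minutes` windows are out of scope here)."""
    return OPERATORS[condition["operator"]](current, condition["threshold"])

high_latency = {"metric": "p95_latency_ms", "operator": "gt", "threshold": 1000}
print(rule_fires(high_latency, 2500))  # True  (matches alert_001 above)
print(rule_fires(high_latency, 890))   # False
```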
## Logging

### Ingest Client Logs

Push logs from client applications.

`POST /api/v1/monitoring/logs`

#### Request Body

```json
{
  "logs": [
    {
      "timestamp": "2026-01-24T19:30:00.123Z",
      "level": "error",
      "agent_id": "agent_voice_ai_001",
      "message": "Speech recognition failed",
      "context": {
        "session_id": "sess_001",
        "audio_duration_ms": 3500,
        "error_code": "AUDIO_TOO_NOISY"
      },
      "trace_id": "trace_abc123"
    }
  ]
}
```
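Because the body takes a `logs` array, clients can buffer entries and flush them in batches rather than issuing one POST per line. A sketch of building a batch in the shape above; the `make_entry` helper is illustrative, not an official client:

```python
import json
from datetime import datetime, timezone

def make_entry(level: str, agent_id: str, message: str, **context) -> dict:
    """Build one log entry matching the ingest body shown above.
    (Illustrative helper; field names follow the example payload.)"""
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "timestamp": ts.replace("+00:00", "Z"),  # millisecond UTC, Z-suffixed
        "level": level,
        "agent_id": agent_id,
        "message": message,
        "context": context,
    }

batch = {"logs": [make_entry("error", "agent_voice_ai_001",
                             "Speech recognition failed",
                             error_code="AUDIO_TOO_NOISY")]}
print(json.dumps(batch)[:60], "...")
```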
### Query Logs

`GET /api/v1/monitoring/logs`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | Filter by agent |
| level | string | `debug`, `info`, `warn`, `error` |
| start_time | datetime | Period start |
| end_time | datetime | Period end |
| search | string | Full-text search |
| limit | integer | Results limit |
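Composing a query from the parameters in the table above is a plain query-string exercise; the base path is the one documented, while the specific filter values are placeholders:

```python
from urllib.parse import urlencode

# Placeholder filter values; urlencode handles the query-string escaping.
params = {
    "agent_id": "agent_voice_ai_001",
    "level": "error",
    "start_time": "2026-01-24T00:00:00Z",
    "limit": 100,
}
url = "/api/v1/monitoring/logs?" + urlencode(params)
print(url)
```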
## Metrics

### Get Metrics

`GET /api/v1/monitoring/metrics`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | Filter by agent |
| metric | string | Specific metric name |
| period | string | `1h`, `24h`, `7d`, `30d` |
| granularity | string | `minute`, `hour`, `day` |
#### Response

```json
{
  "period": {
    "start": "2026-01-24T00:00:00Z",
    "end": "2026-01-24T19:30:00Z"
  },
  "metrics": {
    "requests": {
      "total": 45280,
      "by_agent": {...},
      "timeseries": [
        {"timestamp": "2026-01-24T00:00:00Z", "value": 1200},
        {"timestamp": "2026-01-24T01:00:00Z", "value": 980}
      ]
    },
    "latency": {
      "avg": 285,
      "p50": 220,
      "p95": 890,
      "p99": 1450,
      "timeseries": [...]
    },
    "tokens": {
      "input": 8500000,
      "output": 4000000,
      "timeseries": [...]
    },
    "errors": {
      "total": 445,
      "by_type": {
        "timeout": 180,
        "rate_limit": 120,
        "model_error": 85,
        "other": 60
      }
    }
  }
}
```
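The `errors.by_type` breakdown should sum to `errors.total`, which makes it easy to sanity-check a response and rank error causes. A sketch over the example payload above:

```python
# Per-type error counts should sum to the reported total (445 in the example).
errors = {"total": 445,
          "by_type": {"timeout": 180, "rate_limit": 120,
                      "model_error": 85, "other": 60}}

assert sum(errors["by_type"].values()) == errors["total"]

# Share of errors by type, largest first.
shares = sorted(((t, round(100 * n / errors["total"], 1))
                 for t, n in errors["by_type"].items()),
                key=lambda kv: kv[1], reverse=True)
print(shares[0])  # ('timeout', 40.4)
```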
### Export Metrics

`POST /api/v1/monitoring/metrics/export`

#### Request Body

```json
{
  "metrics": ["requests", "latency", "tokens", "errors"],
  "period": {
    "start": "2026-01-01",
    "end": "2026-01-24"
  },
  "format": "csv",
  "granularity": "hour"
}
```
## Health Checks

### Get System Health

`GET /api/v1/monitoring/health`
#### Response

```json
{
  "status": "healthy",
  "timestamp": "2026-01-24T19:30:00Z",
  "components": [
    {
      "name": "ai_gateway",
      "status": "healthy",
      "latency_ms": 12
    },
    {
      "name": "model_providers",
      "status": "healthy",
      "providers": [
        {"name": "anthropic", "status": "healthy"},
        {"name": "google", "status": "healthy"},
        {"name": "cloudflare_ai", "status": "healthy"}
      ]
    },
    {
      "name": "vector_db",
      "status": "healthy",
      "latency_ms": 8
    },
    {
      "name": "cache",
      "status": "healthy",
      "hit_rate": 0.85
    }
  ],
  "version": "2.5.0"
}
```
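The API reports its own top-level `status`, but a client can recompute a roll-up from `components` as a cross-check. A sketch assuming a worst-of ordering over the status vocabulary used elsewhere in this document (`healthy` < `degraded` < `offline`); the exact roll-up rule the service uses is not specified here:

```python
# Assumed severity ordering, reusing the agent status vocabulary.
RANK = {"healthy": 0, "degraded": 1, "offline": 2}

def overall(components: list[dict]) -> str:
    """Worst-of roll-up: the most severe component status wins."""
    return max((c["status"] for c in components), key=RANK.__getitem__)

components = [{"name": "ai_gateway", "status": "healthy"},
              {"name": "vector_db", "status": "healthy"},
              {"name": "cache", "status": "healthy"}]
print(overall(components))  # healthy
```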
## Webhooks

| Event | Description |
|---|---|
| monitoring.alert_triggered | New alert triggered |
| monitoring.alert_resolved | Alert resolved |
| monitoring.agent_status_changed | Agent status change |
| monitoring.agent_restarted | Agent was restarted |
| monitoring.threshold_exceeded | Metric threshold exceeded |
## Error Responses
| Status | Code | Description |
|---|---|---|
| 400 | invalid_metric | Metric name invalid |
| 403 | insufficient_permissions | Lacks admin scope |
| 404 | agent_not_found | Agent ID not found |
| 404 | alert_not_found | Alert ID not found |
| 409 | agent_already_paused | Agent already paused |
## Related Documentation
- AI Agents Guide - Agent setup
- AI Safety - Safety guidelines
- Alerting - Alerting configuration