> **Authenticated API:** These endpoints require a valid JWT Bearer token and are accessible via the API gateway at `/v1/ai/*`.

# AI Monitoring API

Real-time monitoring, performance dashboards, alerting, and operational controls for AI agents.

## Overview
| Attribute | Value |
|---|---|
| Base Path | /api/v1/monitoring |
| Authentication | Bearer Token |
| Required Roles | platform_admin, system_admin, super_admin, ai_admin, tenant_admin |
## Dashboard

### Get Dashboard Summary

Retrieve the main monitoring dashboard with agent health and metrics.

`GET /api/v1/monitoring/dashboard`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| tenant_id | uuid | Filter by tenant |
| period | string | `1h`, `24h`, `7d` |
#### Response

```json
{
  "generated_at": "2026-01-24T19:30:00Z",
  "period": "24h",
  "summary": {
    "total_agents": 12,
    "healthy": 10,
    "degraded": 1,
    "offline": 1,
    "health_score": 92
  },
  "metrics": {
    "total_requests": 45280,
    "successful_requests": 44835,
    "failed_requests": 445,
    "success_rate": 99.02,
    "avg_latency_ms": 285,
    "p95_latency_ms": 890,
    "p99_latency_ms": 1450
  },
  "costs": {
    "total_tokens": 12500000,
    "input_tokens": 8500000,
    "output_tokens": 4000000,
    "estimated_cost_usd": 125.50,
    "cost_by_model": [
      {"model": "llama-4-scout", "tokens": 8500000, "cost": 0.00},
      {"model": "gemini-2.0-flash", "tokens": 2500000, "cost": 45.00},
      {"model": "claude-haiku-4.5", "tokens": 1500000, "cost": 80.50}
    ]
  },
  "active_alerts": 2,
  "top_issues": [
    {
      "type": "high_latency",
      "agent": "voice_ai_agent",
      "description": "P95 latency above threshold",
      "severity": "warning"
    }
  ]
}
```
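The `summary` and `metrics` objects carry the headline numbers a dashboard client would render. A minimal sketch (plain Python over the example payload above, no live API call) showing how the success rate and fleet-health percentage relate to the raw counts:

```python
# Derive headline figures from the example dashboard payload above.
dashboard = {
    "summary": {"total_agents": 12, "healthy": 10, "degraded": 1, "offline": 1},
    "metrics": {"total_requests": 45280, "successful_requests": 44835,
                "failed_requests": 445},
}

m = dashboard["metrics"]
# success_rate is successful / total, as a percentage with two decimals.
success_rate = round(100 * m["successful_requests"] / m["total_requests"], 2)

s = dashboard["summary"]
healthy_pct = round(100 * s["healthy"] / s["total_agents"], 1)

print(success_rate)  # matches the payload's "success_rate": 99.02
print(healthy_pct)   # 83.3
```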
## Agent Management

### List Agents

`GET /api/v1/monitoring/agents`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| status | string | `healthy`, `degraded`, `offline`, `paused` |
| type | string | Agent type filter |
| tenant_id | uuid | Filter by tenant |
#### Response

```json
{
  "agents": [
    {
      "id": "agent_voice_ai_001",
      "name": "Voice AI Agent - Drive Thru",
      "type": "voice_ai",
      "status": "healthy",
      "version": "2.5.0",
      "location_id": "loc_123",
      "metrics": {
        "uptime_percent": 99.95,
        "requests_24h": 5420,
        "avg_latency_ms": 180,
        "error_rate": 0.2
      },
      "last_heartbeat": "2026-01-24T19:29:55Z",
      "started_at": "2026-01-20T06:00:00Z"
    },
    {
      "id": "agent_maximus_001",
      "name": "Maximus AI Assistant",
      "type": "chat_agent",
      "status": "healthy",
      "version": "3.1.0",
      "metrics": {
        "uptime_percent": 99.99,
        "requests_24h": 12500,
        "avg_latency_ms": 450,
        "error_rate": 0.1
      }
    },
    {
      "id": "agent_minerva_001",
      "name": "Minerva Marketing Agent",
      "type": "marketing_ai",
      "status": "degraded",
      "version": "1.8.0",
      "metrics": {
        "uptime_percent": 98.5,
        "requests_24h": 890,
        "avg_latency_ms": 2500,
        "error_rate": 2.5
      },
      "issues": [
        {
          "type": "high_latency",
          "message": "Social API rate limiting causing delays"
        }
      ]
    }
  ],
  "total": 12
}
```
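A common client-side use of this list is triage: pull out every agent that is not `healthy`, along with whatever issue detail it reports. A short sketch over a trimmed copy of the example response above (no live call):

```python
# Triage the agent list: surface anything not "healthy".
agents = [
    {"id": "agent_voice_ai_001", "status": "healthy",
     "metrics": {"error_rate": 0.2}},
    {"id": "agent_maximus_001", "status": "healthy",
     "metrics": {"error_rate": 0.1}},
    {"id": "agent_minerva_001", "status": "degraded",
     "metrics": {"error_rate": 2.5},
     "issues": [{"type": "high_latency",
                 "message": "Social API rate limiting causing delays"}]},
]

needs_attention = [a["id"] for a in agents if a["status"] != "healthy"]
print(needs_attention)  # ['agent_minerva_001']

for a in agents:
    # "issues" is only present on unhealthy agents in the example, so default it.
    for issue in a.get("issues", []):
        print(a["id"], issue["type"], "-", issue["message"])
```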
#### Agent Status Values

| Status | Description |
|---|---|
| healthy | Operating normally |
| degraded | Performance issues |
| offline | Not responding |
| paused | Manually paused |
| starting | Initializing |
| stopping | Shutting down |
### Get Agent Details

`GET /api/v1/monitoring/agents/{agent_id}`
#### Response

```json
{
  "id": "agent_voice_ai_001",
  "name": "Voice AI Agent - Drive Thru",
  "type": "voice_ai",
  "status": "healthy",
  "version": "2.5.0",
  "configuration": {
    "model_primary": "gemini-2.0-flash",
    "model_fallback": "claude-haiku-4.5",
    "max_concurrent": 10,
    "timeout_ms": 5000,
    "retry_count": 3
  },
  "health": {
    "status": "healthy",
    "last_check": "2026-01-24T19:29:55Z",
    "checks": [
      {"name": "model_connection", "status": "pass", "latency_ms": 45},
      {"name": "memory_usage", "status": "pass", "value": "68%"},
      {"name": "queue_depth", "status": "pass", "value": 3}
    ]
  },
  "metrics_detailed": {
    "requests": {
      "total": 5420,
      "successful": 5408,
      "failed": 12,
      "by_hour": [...]
    },
    "latency": {
      "avg_ms": 180,
      "p50_ms": 150,
      "p95_ms": 380,
      "p99_ms": 650,
      "by_hour": [...]
    },
    "tokens": {
      "input": 850000,
      "output": 420000,
      "by_model": {...}
    }
  },
  "recent_errors": [
    {
      "timestamp": "2026-01-24T18:45:00Z",
      "error": "timeout_exceeded",
      "message": "Model response timeout after 5000ms",
      "request_id": "req_abc123"
    }
  ],
  "dependencies": [
    {"service": "gemini-api", "status": "healthy"},
    {"service": "claude-api", "status": "healthy"},
    {"service": "speech-to-text", "status": "healthy"}
  ]
}
```
## Agent Controls

### Pause Agent

`POST /api/v1/monitoring/agents/{agent_id}/pause`

#### Request Body

```json
{
  "reason": "Maintenance window",
  "duration_minutes": 30
}
```
#### Response

```json
{
  "agent_id": "agent_voice_ai_001",
  "previous_status": "healthy",
  "current_status": "paused",
  "paused_at": "2026-01-24T19:30:00Z",
  "resume_at": "2026-01-24T20:00:00Z",
  "active_requests_drained": true
}
```
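In the example, `resume_at` is simply `paused_at` plus the requested `duration_minutes`, so a client can compute the same instant locally (for example, to schedule a reminder before auto-resume). A minimal sketch using the standard library:

```python
from datetime import datetime, timedelta, timezone

# resume_at = paused_at + duration_minutes, per the example pause response.
paused_at = datetime(2026, 1, 24, 19, 30, tzinfo=timezone.utc)
duration_minutes = 30

resume_at = paused_at + timedelta(minutes=duration_minutes)
print(resume_at.isoformat().replace("+00:00", "Z"))  # 2026-01-24T20:00:00Z
```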
### Resume Agent

`POST /api/v1/monitoring/agents/{agent_id}/resume`

### Restart Agent

`POST /api/v1/monitoring/agents/{agent_id}/restart`

#### Request Body

```json
{
  "reason": "Memory leak detected",
  "force": false
}
```

### Scale Agent

`POST /api/v1/monitoring/agents/{agent_id}/scale`

#### Request Body

```json
{
  "replicas": 5,
  "reason": "Peak traffic expected"
}
```
## Alerts

### List Alerts

`GET /api/v1/monitoring/alerts`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| status | string | `active`, `acknowledged`, `resolved` |
| severity | string | `critical`, `warning`, `info` |
| agent_id | string | Filter by agent |
#### Response

```json
{
  "alerts": [
    {
      "id": "alert_001",
      "agent_id": "agent_minerva_001",
      "type": "high_latency",
      "severity": "warning",
      "status": "active",
      "title": "High Latency Detected",
      "message": "P95 latency (2500ms) exceeds threshold (1000ms)",
      "threshold": {
        "metric": "p95_latency_ms",
        "operator": "gt",
        "value": 1000,
        "current": 2500
      },
      "triggered_at": "2026-01-24T18:00:00Z",
      "last_occurrence": "2026-01-24T19:25:00Z",
      "occurrence_count": 15,
      "acknowledged_by": null
    },
    {
      "id": "alert_002",
      "agent_id": "agent_vision_001",
      "type": "offline",
      "severity": "critical",
      "status": "acknowledged",
      "title": "Agent Offline",
      "message": "No heartbeat received in 5 minutes",
      "triggered_at": "2026-01-24T19:20:00Z",
      "acknowledged_by": "user_001",
      "acknowledged_at": "2026-01-24T19:22:00Z"
    }
  ],
  "summary": {
    "critical": 1,
    "warning": 1,
    "info": 0
  }
}
```
### Acknowledge Alert

`POST /api/v1/monitoring/alerts/{alert_id}/acknowledge`

#### Request Body

```json
{
  "notes": "Investigating the root cause"
}
```

### Resolve Alert

`POST /api/v1/monitoring/alerts/{alert_id}/resolve`

#### Request Body

```json
{
  "resolution": "Restarted agent, latency back to normal",
  "root_cause": "Memory leak in image processing module"
}
```
### Configure Alert Rules

`PUT /api/v1/monitoring/alerts/rules`

#### Request Body

```json
{
  "rules": [
    {
      "name": "High Latency",
      "condition": {
        "metric": "p95_latency_ms",
        "operator": "gt",
        "threshold": 1000,
        "duration_minutes": 5
      },
      "severity": "warning",
      "channels": ["slack", "email"],
      "recipients": ["oncall@example.com"]
    },
    {
      "name": "Agent Offline",
      "condition": {
        "metric": "heartbeat_age_seconds",
        "operator": "gt",
        "threshold": 300
      },
      "severity": "critical",
      "channels": ["slack", "pagerduty"],
      "auto_remediate": {
        "action": "restart",
        "max_attempts": 2
      }
    },
    {
      "name": "High Error Rate",
      "condition": {
        "metric": "error_rate_percent",
        "operator": "gt",
        "threshold": 5,
        "duration_minutes": 10
      },
      "severity": "critical"
    }
  ]
}
```
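A rule's `condition` compares a live metric value against `threshold` via `operator`. An illustrative evaluator (not part of the API): only `gt` appears in the examples above, so the other operator names here are assumptions, and the sustained-for semantics of `duration_minutes` are not modeled:

```python
import operator

# Operator names: "gt" is shown in the examples; "lt"/"gte"/"lte" are assumed.
OPERATORS = {"gt": operator.gt, "lt": operator.lt,
             "gte": operator.ge, "lte": operator.le}

def rule_fires(condition: dict, current: float) -> bool:
    """True when `current` violates the condition's threshold (instantaneous
    check only; `duration_minutes` windows are out of scope here)."""
    return OPERATORS[condition["operator"]](current, condition["threshold"])

high_latency = {"metric": "p95_latency_ms", "operator": "gt", "threshold": 1000}
print(rule_fires(high_latency, 2500))  # True  (matches alert_001 above)
print(rule_fires(high_latency, 890))   # False
```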
## Logging

### Ingest Client Logs

Push logs from client applications.

`POST /api/v1/monitoring/logs`

#### Request Body

```json
{
  "logs": [
    {
      "timestamp": "2026-01-24T19:30:00.123Z",
      "level": "error",
      "agent_id": "agent_voice_ai_001",
      "message": "Speech recognition failed",
      "context": {
        "session_id": "sess_001",
        "audio_duration_ms": 3500,
        "error_code": "AUDIO_TOO_NOISY"
      },
      "trace_id": "trace_abc123"
    }
  ]
}
```
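Because the body takes a `logs` array, clients can buffer entries and flush them in batches rather than issuing one POST per line. A sketch of building a batch in the shape above; the `make_entry` helper is illustrative, not an official client:

```python
import json
from datetime import datetime, timezone

def make_entry(level: str, agent_id: str, message: str, **context) -> dict:
    """Build one log entry matching the ingest body shown above.
    (Illustrative helper; field names follow the example payload.)"""
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "timestamp": ts.replace("+00:00", "Z"),  # millisecond UTC, Z-suffixed
        "level": level,
        "agent_id": agent_id,
        "message": message,
        "context": context,
    }

batch = {"logs": [make_entry("error", "agent_voice_ai_001",
                             "Speech recognition failed",
                             error_code="AUDIO_TOO_NOISY")]}
print(json.dumps(batch)[:60], "...")
```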
### Query Logs

`GET /api/v1/monitoring/logs`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | Filter by agent |
| level | string | `debug`, `info`, `warn`, `error` |
| start_time | datetime | Period start |
| end_time | datetime | Period end |
| search | string | Full-text search |
| limit | integer | Results limit |
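Composing a query from the parameters in the table above is a plain query-string exercise; the base path is the one documented, while the specific filter values are placeholders:

```python
from urllib.parse import urlencode

# Placeholder filter values; urlencode handles the query-string escaping.
params = {
    "agent_id": "agent_voice_ai_001",
    "level": "error",
    "start_time": "2026-01-24T00:00:00Z",
    "limit": 100,
}
url = "/api/v1/monitoring/logs?" + urlencode(params)
print(url)
```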
## Metrics

### Get Metrics

`GET /api/v1/monitoring/metrics`

#### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| agent_id | string | Filter by agent |
| metric | string | Specific metric name |
| period | string | `1h`, `24h`, `7d`, `30d` |
| granularity | string | `minute`, `hour`, `day` |
#### Response

```json
{
  "period": {
    "start": "2026-01-24T00:00:00Z",
    "end": "2026-01-24T19:30:00Z"
  },
  "metrics": {
    "requests": {
      "total": 45280,
      "by_agent": {...},
      "timeseries": [
        {"timestamp": "2026-01-24T00:00:00Z", "value": 1200},
        {"timestamp": "2026-01-24T01:00:00Z", "value": 980}
      ]
    },
    "latency": {
      "avg": 285,
      "p50": 220,
      "p95": 890,
      "p99": 1450,
      "timeseries": [...]
    },
    "tokens": {
      "input": 8500000,
      "output": 4000000,
      "timeseries": [...]
    },
    "errors": {
      "total": 445,
      "by_type": {
        "timeout": 180,
        "rate_limit": 120,
        "model_error": 85,
        "other": 60
      }
    }
  }
}
```
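The `errors.by_type` breakdown should sum to `errors.total`, which makes it easy to sanity-check a response and rank error causes. A sketch over the example payload above:

```python
# Per-type error counts should sum to the reported total (445 in the example).
errors = {"total": 445,
          "by_type": {"timeout": 180, "rate_limit": 120,
                      "model_error": 85, "other": 60}}

assert sum(errors["by_type"].values()) == errors["total"]

# Share of errors by type, largest first.
shares = sorted(((t, round(100 * n / errors["total"], 1))
                 for t, n in errors["by_type"].items()),
                key=lambda kv: kv[1], reverse=True)
print(shares[0])  # ('timeout', 40.4)
```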
### Export Metrics

`POST /api/v1/monitoring/metrics/export`

#### Request Body

```json
{
  "metrics": ["requests", "latency", "tokens", "errors"],
  "period": {
    "start": "2026-01-01",
    "end": "2026-01-24"
  },
  "format": "csv",
  "granularity": "hour"
}
```
## Health Checks

### Get System Health

`GET /api/v1/monitoring/health`
#### Response

```json
{
  "status": "healthy",
  "timestamp": "2026-01-24T19:30:00Z",
  "components": [
    {
      "name": "ai_gateway",
      "status": "healthy",
      "latency_ms": 12
    },
    {
      "name": "model_providers",
      "status": "healthy",
      "providers": [
        {"name": "anthropic", "status": "healthy"},
        {"name": "google", "status": "healthy"},
        {"name": "cloudflare_ai", "status": "healthy"}
      ]
    },
    {
      "name": "vector_db",
      "status": "healthy",
      "latency_ms": 8
    },
    {
      "name": "cache",
      "status": "healthy",
      "hit_rate": 0.85
    }
  ],
  "version": "2.5.0"
}
```
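The API reports its own top-level `status`, but a client can recompute a roll-up from `components` as a cross-check. A sketch assuming a worst-of ordering over the status vocabulary used elsewhere in this document (`healthy` < `degraded` < `offline`); the exact roll-up rule the service uses is not specified here:

```python
# Assumed severity ordering, reusing the agent status vocabulary.
RANK = {"healthy": 0, "degraded": 1, "offline": 2}

def overall(components: list[dict]) -> str:
    """Worst-of roll-up: the most severe component status wins."""
    return max((c["status"] for c in components), key=RANK.__getitem__)

components = [{"name": "ai_gateway", "status": "healthy"},
              {"name": "vector_db", "status": "healthy"},
              {"name": "cache", "status": "healthy"}]
print(overall(components))  # healthy
```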
## Webhooks

| Event | Description |
|---|---|
| monitoring.alert_triggered | New alert triggered |
| monitoring.alert_resolved | Alert resolved |
| monitoring.agent_status_changed | Agent status change |
| monitoring.agent_restarted | Agent was restarted |
| monitoring.threshold_exceeded | Metric threshold exceeded |
## Error Responses
| Status | Code | Description |
|---|---|---|
| 400 | invalid_metric | Metric name invalid |
| 403 | insufficient_permissions | Lacks admin scope |
| 404 | agent_not_found | Agent ID not found |
| 404 | alert_not_found | Alert ID not found |
| 409 | agent_already_paused | Agent already paused |
## Related Documentation
- AI Agents Guide - Agent setup
- AI Safety - Safety guidelines
- Alerting - Alerting configuration