GCP Vertex AI Model Garden Health - Overview

Description

Troubleshooting and remediation tasks for GCP Vertex AI Model Garden, implemented with the Google Cloud Monitoring Python SDK.

Required IAM Roles:

  • roles/monitoring.viewer (for metrics access)
  • roles/logging.privateLogViewer (for audit logs access)
  • roles/serviceusage.serviceUsageConsumer (for service status checks)

Required Permissions:

  • monitoring.timeSeries.list
  • logging.privateLogEntries.list
  • serviceusage.services.list
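
The roles and permissions above can be checked before running any task. The following is a minimal sketch, not part of the codebundle itself: it assumes you already have a list of permissions granted to the caller (for example, from the Resource Manager `testIamPermissions` method) and reports which required permission is missing and which role would supply it. The names `REQUIRED_PERMISSIONS` and `missing_permissions` are hypothetical.

```python
# Permissions listed in this overview, keyed by the role expected to grant each one.
REQUIRED_PERMISSIONS = {
    "roles/monitoring.viewer": "monitoring.timeSeries.list",
    "roles/logging.privateLogViewer": "logging.privateLogEntries.list",
    "roles/serviceusage.serviceUsageConsumer": "serviceusage.services.list",
}


def missing_permissions(granted):
    """Return {role: permission} for each required permission absent from `granted`."""
    granted_set = set(granted)
    return {
        role: permission
        for role, permission in REQUIRED_PERMISSIONS.items()
        if permission not in granted_set
    }


if __name__ == "__main__":
    # Illustrative input only: a caller that lacks log access.
    granted = ["monitoring.timeSeries.list", "serviceusage.services.list"]
    for role, permission in missing_permissions(granted).items():
        print(f"missing {permission}: consider granting {role}")
```

An empty result means all three permissions are present; anything else names the role to request.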

Available Pages

Tasks

  • Discover All Deployed Vertex AI Models in `${GCP_PROJECT_ID}`
  • Analyze Vertex AI Model Garden Error Patterns and Response Codes in `${GCP_PROJECT_ID}`
  • Investigate Vertex AI Model Latency Performance Issues in `${GCP_PROJECT_ID}`
  • Monitor Vertex AI Throughput and Token Consumption Patterns in `${GCP_PROJECT_ID}`
  • Check Vertex AI Model Garden API Logs for Issues in `${GCP_PROJECT_ID}`
  • Check Vertex AI Model Garden Service Health and Quotas in `${GCP_PROJECT_ID}`
  • Generate Vertex AI Model Garden Health Summary and Next Steps for `${GCP_PROJECT_ID}`
  • Generate Normalized Health Report Table for `${GCP_PROJECT_ID}`
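
As a rough illustration of how the error-pattern task could read response codes through the Cloud Monitoring Python SDK, the sketch below lists time series for one metric and derives an error rate. It rests on several assumptions that should be verified before use: the metric type `aiplatform.googleapis.com/prediction/online/response_count`, its `response_code` label, and the choice to count only 4xx and 5xx codes as errors. The function names are hypothetical and do not come from the codebundle.

```python
import time

# Assumption: metric type and label name; confirm against the Cloud Monitoring metrics list.
RESPONSE_COUNT_METRIC = "aiplatform.googleapis.com/prediction/online/response_count"


def build_filter(metric_type):
    """Build a Cloud Monitoring filter string that selects a single metric type."""
    return f'metric.type = "{metric_type}"'


def fetch_response_counts(project_id, lookback_seconds=3600):
    """Sum response counts per response code over the lookback window."""
    # Imported lazily so the pure helpers work without the SDK installed.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {
            "end_time": {"seconds": now},
            "start_time": {"seconds": now - lookback_seconds},
        }
    )
    series_iter = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": build_filter(RESPONSE_COUNT_METRIC),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    counts = {}
    for series in series_iter:
        code = series.metric.labels.get("response_code", "unknown")
        total = sum(point.value.int64_value for point in series.points)
        counts[code] = counts.get(code, 0) + total
    return counts


def error_rate(counts):
    """Fraction of responses whose code starts with 4 or 5 (assumed error classes)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in counts.items() if str(code).startswith(("4", "5")))
    return errors / total
```

Calling `error_rate(fetch_response_counts("my-project"))` requires `monitoring.timeSeries.list` on the project; `"my-project"` is a placeholder for `${GCP_PROJECT_ID}`.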

Service Level Indicators (SLIs)

  • Quick Vertex AI Log Health Check for `${GCP_PROJECT_ID}`
  • Calculate Error Rate Score for `${GCP_PROJECT_ID}`
  • Calculate Latency Performance Score for `${GCP_PROJECT_ID}`
  • Calculate Throughput Usage Score for `${GCP_PROJECT_ID}`
  • Discover All Deployed Models for `${GCP_PROJECT_ID}`
  • Check Service Availability Score for `${GCP_PROJECT_ID}`
  • Generate Final Vertex AI Model Garden Health Score for `${GCP_PROJECT_ID}`
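
This overview does not state how the final health score is derived from the individual SLIs. One plausible approach, shown here purely as an assumption, is to treat each sub-score as a value between 0 and 1 and take the unweighted mean. The function name and the score keys are hypothetical.

```python
def final_health_score(scores):
    """Average SLI sub-scores, each assumed to lie in the range [0, 1]."""
    if not scores:
        raise ValueError("at least one SLI score is required")
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score {value} is outside [0, 1]")
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Illustrative values only; real scores would come from the SLI tasks.
    example = {
        "error_rate": 1.0,
        "latency": 0.5,
        "throughput": 1.0,
        "availability": 0.5,
    }
    print(final_health_score(example))
```

A weighted mean, or taking the minimum so that one failing SLI dominates, would be equally reasonable; the right choice depends on how the score is consumed.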