GCP Vertex AI Model Garden Health - Overview
Description
Troubleshooting and remediation tasks for GCP Vertex AI Model Garden, built on the Google Cloud Monitoring Python SDK.
Required IAM Roles:
- `roles/monitoring.viewer` (for metrics access)
- `roles/logging.privateLogViewer` (for audit logs access)
- `roles/serviceusage.serviceUsageConsumer` (for service status checks)

Required Permissions:
- `monitoring.timeSeries.list`
- `logging.privateLogEntries.list`
- `serviceusage.services.list`
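Before running the tasks, it can help to confirm that the caller actually holds the permissions listed above. The following is a minimal sketch, not part of this codebundle: the helper name `missing_roles` is hypothetical, and the role-to-permission pairing is inferred from the order in which the two lists appear. It assumes you already have the set of granted permissions from elsewhere (for example, from a `testIamPermissions` call against the project).

```python
# Pairing of each required role with the permission it is listed alongside.
# Assumption: the pairing follows the order of the two lists in this page.
REQUIRED = {
    "roles/monitoring.viewer": "monitoring.timeSeries.list",
    "roles/logging.privateLogViewer": "logging.privateLogEntries.list",
    "roles/serviceusage.serviceUsageConsumer": "serviceusage.services.list",
}


def missing_roles(granted_permissions):
    """Return {role: permission} for every required permission that is absent.

    `granted_permissions` is any iterable of permission strings the caller
    is known to hold. An empty result means nothing is missing.
    """
    granted = set(granted_permissions)
    return {
        role: permission
        for role, permission in REQUIRED.items()
        if permission not in granted
    }


if __name__ == "__main__":
    # Example: the caller can read metrics but nothing else.
    for role, permission in missing_roles(["monitoring.timeSeries.list"]).items():
        print(f"missing {permission}; consider granting {role}")
```

An empty dictionary indicates that all three permissions are present; otherwise each entry names a role that would supply the missing permission.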
Available Pages
Tasks
- Discover All Deployed Vertex AI Models in `${GCP_PROJECT_ID}`
- Analyze Vertex AI Model Garden Error Patterns and Response Codes in `${GCP_PROJECT_ID}`
- Investigate Vertex AI Model Latency Performance Issues in `${GCP_PROJECT_ID}`
- Monitor Vertex AI Throughput and Token Consumption Patterns in `${GCP_PROJECT_ID}`
- Check Vertex AI Model Garden API Logs for Issues in `${GCP_PROJECT_ID}`
- Check Vertex AI Model Garden Service Health and Quotas in `${GCP_PROJECT_ID}`
- Generate Vertex AI Model Garden Health Summary and Next Steps for `${GCP_PROJECT_ID}`
- Generate Normalized Health Report Table for `${GCP_PROJECT_ID}`
Service Level Indicators (SLIs)
- Quick Vertex AI Log Health Check for `${GCP_PROJECT_ID}`
- Calculate Error Rate Score for `${GCP_PROJECT_ID}`
- Calculate Latency Performance Score for `${GCP_PROJECT_ID}`
- Calculate Throughput Usage Score for `${GCP_PROJECT_ID}`
- Discover All Deployed Models for `${GCP_PROJECT_ID}`
- Check Service Availability Score for `${GCP_PROJECT_ID}`
- Generate Final Vertex AI Model Garden Health Score for `${GCP_PROJECT_ID}`
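This page does not state how the individual scores are computed or combined. As a rough illustration only, the sketch below assumes each sub-score is a value between 0 and 1, that the error rate score is the fraction of successful requests, and that the final score is an unweighted mean. The function names `error_rate_score` and `final_health_score` are hypothetical, and the actual codebundle may use different formulas or weights.

```python
def error_rate_score(total_requests, error_requests):
    """Fraction of requests that succeeded (assumed definition).

    Returns 1.0 when there was no traffic, treating "no requests" as healthy.
    """
    if total_requests < 0 or not 0 <= error_requests <= max(total_requests, 0):
        raise ValueError("counts must satisfy 0 <= error_requests <= total_requests")
    if total_requests == 0:
        return 1.0
    return 1.0 - error_requests / total_requests


def final_health_score(sub_scores):
    """Unweighted mean of sub-scores, each expected to lie in [0, 1].

    `sub_scores` maps an SLI name to its score. The equal weighting is an
    assumption made for this sketch.
    """
    if not sub_scores:
        raise ValueError("at least one sub-score is required")
    for name, value in sub_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} is outside [0, 1]: {value}")
    return sum(sub_scores.values()) / len(sub_scores)


if __name__ == "__main__":
    scores = {
        "error_rate": error_rate_score(total_requests=100, error_requests=25),
        "latency": 1.0,
        "throughput": 0.5,
        "availability": 1.0,
    }
    print(final_health_score(scores))
```

Keeping every sub-score on the same 0 to 1 scale makes the final score easy to threshold for alerting, whichever aggregation is actually used.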