This skill should be used when the user asks about "GCloud logs", "Cloud Logging queries", "Google Cloud metrics", "GCP observability", "trace analysis", or "debugging production issues on GCP".
February 5, 2026

Install with: `npx add-skill https://github.com/fcakyon/claude-codex-settings/blob/main/plugins/gcloud-tools/skills/gcloud-usage/SKILL.md -a claude-code --skill gcloud-usage`

Installation paths: `.claude/skills/gcloud-usage/`

# GCP Observability Best Practices
## Structured Logging
### JSON Log Format
Use structured JSON logging for better queryability:
```json
{
"severity": "ERROR",
"message": "Payment failed",
"httpRequest": { "requestMethod": "POST", "requestUrl": "/api/payment" },
"labels": { "user_id": "123", "transaction_id": "abc" },
"timestamp": "2025-01-15T10:30:00Z"
}
```
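
As a minimal sketch of emitting entries in this shape, the Python below writes one JSON object per line to stdout. It assumes a runtime whose logging agent collects stdout and parses JSON lines into structured entries; the helper names `format_log_entry` and `log` are hypothetical, and the field names simply mirror the example above.

```python
import json
import sys
from datetime import datetime, timezone


def format_log_entry(severity, message, **fields):
    """Return one JSON log line; extra keyword arguments become top-level fields."""
    entry = {
        "severity": severity,
        "message": message,
        # UTC timestamp in ISO 8601 form, mirroring the "timestamp" field above.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    entry.update(fields)
    return json.dumps(entry)


def log(severity, message, **fields):
    # One JSON object per line, so a line-oriented collector can parse each entry.
    print(format_log_entry(severity, message, **fields), file=sys.stdout, flush=True)


if __name__ == "__main__":
    log(
        "ERROR",
        "Payment failed",
        httpRequest={"requestMethod": "POST", "requestUrl": "/api/payment"},
        labels={"user_id": "123", "transaction_id": "abc"},
    )
```

Keeping the formatter separate from the writer makes the output easy to check in unit tests without capturing stdout.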
### Severity Levels
Use appropriate severity for filtering:
- **DEBUG:** Detailed diagnostic info
- **INFO:** Normal operations, milestones
- **NOTICE:** Normal but significant events
- **WARNING:** Potential issues, degraded performance
- **ERROR:** Failures that don't stop the service
- **CRITICAL:** Failures requiring immediate action
- **ALERT:** A person must take action immediately
- **EMERGENCY:** System is unusable
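
Python's standard `logging` module has no NOTICE, ALERT, or EMERGENCY levels, so an application has to decide how its levels map onto the list above. The following is one possible mapping, not an official one; the function name `to_severity` is hypothetical, and falling back to `DEFAULT` for levels below DEBUG is an assumption of this sketch.

```python
import logging

# Thresholds are checked from highest to lowest; the first match wins.
_LEVEL_TO_SEVERITY = [
    (logging.CRITICAL, "CRITICAL"),
    (logging.ERROR, "ERROR"),
    (logging.WARNING, "WARNING"),
    (logging.INFO, "INFO"),
    (logging.DEBUG, "DEBUG"),
]


def to_severity(levelno):
    """Map a Python logging level number to a severity name from the list above."""
    for threshold, name in _LEVEL_TO_SEVERITY:
        if levelno >= threshold:
            return name
    # Assumption: anything below DEBUG is reported without a specific severity.
    return "DEFAULT"


if __name__ == "__main__":
    for level in (logging.DEBUG, logging.WARNING, 35, logging.CRITICAL):
        print(level, to_severity(level))
```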
## Log Filtering Queries
### Common Filters
```
-- By severity
severity >= WARNING
-- By resource
resource.type="cloud_run_revision"
resource.labels.service_name="my-service"
-- By time
timestamp >= "2025-01-15T00:00:00Z"
-- By text content
textPayload =~ "error.*timeout"
-- By JSON field
jsonPayload.user_id = "123"
-- Combined
severity >= ERROR AND resource.labels.service_name="api"
```
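
When filters are assembled in code rather than typed by hand, a small helper can keep quoting and `AND` joining consistent. This is a sketch only: `eq` and `build_filter` are hypothetical names, and the escaping covers just backslashes and double quotes.

```python
def eq(field, value):
    """Build an equality clause, quoting and escaping the string value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}="{escaped}"'


def build_filter(*clauses):
    """Join non-empty clauses with AND, as in the combined filter above."""
    return " AND ".join(c.strip() for c in clauses if c and c.strip())


if __name__ == "__main__":
    query = build_filter(
        "severity >= ERROR",
        eq("resource.type", "cloud_run_revision"),
        eq("resource.labels.service_name", "api"),
    )
    print(query)
```

The resulting string can be pasted into the Logs Explorer or passed as the filter argument to `gcloud logging read`.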
### Advanced Queries
```
-- Regex matching
textPayload =~ "status=[45][0-9]{2}"
-- Substring search
textPayload : "connection refused"
-- Multiple values
severity = (ERROR OR CRITICAL)
```
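
Before running a regex filter against production logs, it can help to try the pattern on a few sample lines locally. The sketch below uses Python's `re` module, which is a different engine from the one Cloud Logging uses; for a simple pattern like this one the behaviour should agree, but that is an assumption worth re-checking for more complex expressions. The sample lines are made up for illustration.

```python
import re

# Same pattern as the regex filter above: "status=" followed by a 4xx or 5xx code.
STATUS_4XX_5XX = re.compile(r"status=[45][0-9]{2}")


def matches(line):
    """Unanchored match, like =~ in the filter: the pattern may occur anywhere."""
    return STATUS_4XX_5XX.search(line) is not None


if __name__ == "__main__":
    samples = [
        "GET /api/payment status=503 in 12ms",
        "GET /health status=200 in 1ms",
        "POST /login status=401",
    ]
    for line in samples:
        print(matches(line), line)
```

Because the match is unanchored, a line containing `status=4041` would also match; add a word boundary or trailing delimiter to the pattern if that matters.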
## Metrics vs Logs vs Traces
### When to Use Each
**Metrics:** Aggregated numeric data over time
- Request counts, latency percentiles
- Resource utilization (CPU, memory)
- Business KPIs (orders/minute)
**Logs:** Detailed event records
- Error details and stack traces
- Audit trails
- Debugging specific requests
**Traces:** Request flow across services
- Latency breakdown by service
- Identifying bottlenecks
- Distributed system debugging
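
To make the distinction between logs and metrics concrete, the sketch below turns individual log entries into an aggregated number: errors per minute. It works on in-memory dictionaries shaped like the JSON example earlier; the function name `errors_per_minute` and the sample data are hypothetical, and a real setup would more likely use a log-based metric than client-side counting.

```python
from collections import Counter
from datetime import datetime

ERROR_SEVERITIES = {"ERROR", "CRITICAL", "ALERT", "EMERGENCY"}


def errors_per_minute(entries):
    """Count entries at ERROR or above, bucketed by the minute of their timestamp."""
    counts = Counter()
    for entry in entries:
        if entry.get("severity") not in ERROR_SEVERITIES:
            continue
        # fromisoformat() rejects a trailing "Z" before Python 3.11, so normalise it.
        ts = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
        bucket = ts.replace(second=0, microsecond=0).isoformat()
        counts[bucket] += 1
    return dict(counts)


if __name__ == "__main__":
    logs = [
        {"severity": "ERROR", "timestamp": "2025-01-15T10:30:05Z"},
        {"severity": "ERROR", "timestamp": "2025-01-15T10:30:40Z"},
        {"severity": "INFO", "timestamp": "2025-01-15T10:30:50Z"},
        {"severity": "ERROR", "timestamp": "2025-01-15T10:31:01Z"},
    ]
    print(errors_per_minute(logs))
```

The per-entry detail (message, stack trace, user) is lost in the aggregate, which is why logs remain the place to debug a specific request while the metric is what an alert would watch.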
## Alert Policy Design
### Alert Best Practices
- **Avoid alert fatigue:** Only alert on actionable issues
- **Use multi-condition alerts:** Reduce noise fro