Optimize token usage when delegating to Gemini CLI. Covers token caching, batch queries, model selection (Flash vs Pro), and cost tracking. Use when planning bulk Gemini operations.
January 21, 2026
npx add-skill https://github.com/melodic-software/claude-code-plugins/blob/main/plugins/google-ecosystem/skills/gemini-token-optimization/SKILL.md -a claude-code --skill gemini-token-optimization
Installation paths:
.claude/skills/gemini-token-optimization/

# Gemini Token Optimization

## 🚨 MANDATORY: Invoke gemini-cli-docs First

> **STOP - Before providing ANY response about Gemini token usage:**
>
> 1. **INVOKE** `gemini-cli-docs` skill
> 2. **QUERY** for the specific token or pricing topic
> 3. **BASE** all responses EXCLUSIVELY on the official documentation loaded

## Overview

Skill for optimizing cost and token usage when delegating to Gemini CLI. Essential for efficient bulk operations and cost-conscious workflows.

## When to Use This Skill

**Keywords:** token usage, cost optimization, gemini cost, model selection, flash vs pro, caching, batch queries, reduce tokens

**Use this skill when:**

- Planning bulk Gemini operations
- Optimizing costs for large-scale analysis
- Choosing between Flash and Pro models
- Understanding token caching benefits
- Tracking usage across sessions

## Token Caching

Gemini CLI automatically caches context to reduce costs by reusing previously processed content.

### Availability

| Auth Method | Caching Available |
| --- | --- |
| API key (Gemini API) | YES |
| Vertex AI | YES |
| OAuth (personal/enterprise) | NO |

### How It Works

- System instructions and repeated context are cached
- Cached tokens don't count toward billing
- View savings via `/stats` command or JSON output

### Maximizing Cache Hits

1. **Use consistent system prompts** - Same prefix increases cache reuse
2. **Batch similar queries** - Group related analysis together
3. **Reuse context files** - Same files in same order

### Monitoring Cache Usage

```bash
# Run a query and capture the JSON output, which includes per-model token stats
result=$(gemini "query" --output-format json)

# Sum total and cached tokens across every model used in the request
total=$(echo "$result" | jq '(.stats.models // {}) | to_entries | map(.value.tokens.total) | add // 0')
cached=$(echo "$result" | jq '(.stats.models // {}) | to_entries | map(.value.tokens.cached) | add // 0')

billable=$((total - cached))
# Guard against division by zero when no tokens were reported
savings=0
if [ "$total" -gt 0 ]; then savings=$((cached * 100 / total)); fi

echo "Total: $total tokens"
echo "Cached: $cached tokens ($savings% savings)"
echo "Billable: $billable tokens"
```

## Model Selection

### Model Compariso