Performance optimization patterns for Mem0 memory operations including query optimization, caching strategies, embedding efficiency, database tuning, batch operations, and cost reduction for both Platform and OSS deployments. Use when optimizing memory performance, reducing costs, improving query speed, implementing caching, tuning database performance, analyzing bottlenecks, or when user mentions memory optimization, performance tuning, cost reduction, slow queries, caching, or Mem0 optimization.
February 1, 2026
npx add-skill https://github.com/vanman2024/ai-dev-marketplace/blob/main/plugins/mem0/skills/memory-optimization/SKILL.md -a claude-code --skill memory-optimization

Installation paths:

.claude/skills/memory-optimization/

# Memory Optimization

Performance optimization patterns and tools for Mem0 memory systems. This skill provides comprehensive optimization techniques for query performance, cost reduction, caching strategies, and infrastructure tuning for both Platform and OSS deployments.

## Instructions

### Phase 1: Performance Assessment

Start by analyzing your current memory system performance:

```bash
bash scripts/analyze-performance.sh [project_name]
```

This generates a comprehensive performance report including:

- Query latency metrics (average, P95, P99)
- Operation throughput (searches, adds, updates, deletes)
- Cache performance statistics
- Resource utilization (memory, storage, CPU)
- Slow query identification
- Cost analysis

**Review the output to identify optimization priorities:**

- Query latency > 200ms → Focus on query optimization
- High costs → Focus on cost optimization
- Low cache hit rate < 60% → Focus on caching
- High resource usage → Focus on infrastructure tuning

### Phase 2: Query Optimization

Optimize memory search operations for speed and efficiency.

#### 2.1 Limit Search Results

**Problem**: Retrieving too many results increases latency and costs.

**Solution**: Use appropriate limit values based on use case.

```python
# ❌ BAD: Using default or excessive limits
memories = memory.search(query, user_id=user_id)  # Default: 10

# ✅ GOOD: Optimized limits
memories = memory.search(query, user_id=user_id, limit=5)   # Chat apps
memories = memory.search(query, user_id=user_id, limit=3)   # Quick context
memories = memory.search(query, user_id=user_id, limit=10)  # RAG systems
```

**Impact**: 30-40% reduction in query time

**Guidelines**:

- Chat applications: 3-5 results
- RAG context retrieval: 8-12 results
- Recommendation systems: 10-20 results
- Semantic search: 20-50 results

#### 2.2 Use Filters to Reduce Search Space

**Problem**: Searching entire index is slow and expensive.

**Solution**: Apply filters to narrow search scope.

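
Before moving on to filters, the result-limit guidelines from section 2.1 can be captured in a small helper so that call sites do not hard-code numbers. The following is a minimal sketch, not part of Mem0 itself: the names `LIMIT_RANGES`, `pick_limit`, and `search_with_limit` and the use-case labels are hypothetical. It assumes only that `memory.search` accepts the `user_id` and `limit` keyword arguments shown in the examples above. Falling back to the lower end of each range when no limit is requested is an arbitrary choice made for this sketch.

```python
from typing import Any, Optional

# Ranges copied from the 2.1 guidelines. The keys are hypothetical labels
# invented for this sketch, not Mem0 identifiers.
LIMIT_RANGES = {
    "chat": (3, 5),
    "rag": (8, 12),
    "recommendation": (10, 20),
    "semantic_search": (20, 50),
}


def pick_limit(use_case: str, requested: Optional[int] = None) -> int:
    """Return a result limit that stays inside the guideline range."""
    low, high = LIMIT_RANGES[use_case]
    if requested is None:
        # Assumption: default to the smallest recommended value.
        return low
    # Clamp the requested value into [low, high].
    return max(low, min(requested, high))


def search_with_limit(
    memory: Any,
    query: str,
    user_id: str,
    use_case: str,
    requested: Optional[int] = None,
) -> Any:
    """Call memory.search with a limit chosen by pick_limit.

    `memory` is any object exposing search(query, user_id=..., limit=...),
    as in the examples in section 2.1.
    """
    limit = pick_limit(use_case, requested)
    return memory.search(query, user_id=user_id, limit=limit)
```

A call such as `search_with_limit(memory, query, user_id, "chat")` would then replace a bare `memory.search(query, user_id=user_id)`, and an oversized request like `requested=100` for a chat use case is clamped to 5.
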
```python # ❌