Redis Caching Strategies: From Simple Key-Value to Distributed Caching

Caching is the easiest performance optimization that every team gets wrong at least once. Redis, with its sub-millisecond latency and versatile data structures, is the go-to choice for production caching. But throwing a Redis instance in front of your database without a coherent strategy leads to stale data, cache misses under load, and memory exhaustion.
Cache-Aside vs Write-Through
Cache-aside (lazy loading) is the simplest and most common pattern. The application checks the cache first, falls back to the database on a miss, and populates the cache with the result:
import json

# `redis` and `db` are assumed to be already-configured client objects.
def get_user(user_id):
    # Check cache first
    user = redis.get(f"user:{user_id}")
    if user:
        return json.loads(user)
    # Cache miss - fetch from database
    user = db.query("SELECT * FROM users WHERE id = %s", [user_id])
    if user:
        # Populate cache with TTL (3600 seconds = 1 hour)
        redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    return user
Write-through caching updates the cache synchronously when data changes:
def update_user(user_id, data):
    # Update database first
    db.execute("UPDATE users SET name = %s WHERE id = %s",
               [data["name"], user_id])
    # Then update cache. `data` must be the full user record, because
    # readers get back whatever is stored under this key.
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
Cache-aside degrades gracefully: if Redis goes down, the application can fall through to the database, provided cache errors are caught and treated as misses. Write-through keeps the cache consistent at the cost of write latency. Combine them: use write-through for critical data that must be fresh, cache-aside for everything else.
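The fall-through behavior only holds if a cache failure is handled as a miss rather than raised to the caller. A minimal sketch of that idea follows. The helper name `cached_fetch` and its parameters are illustrative, not part of any library. `cache` is assumed to be any client exposing `get` and `setex` (as redis-py does), and `cache_errors` is the tuple of exception types to swallow. With redis-py that would typically be `(redis.exceptions.RedisError,)`.

```python
import json
from typing import Any, Callable, Optional, Tuple, Type


def cached_fetch(
    cache: Any,
    key: str,
    ttl_seconds: int,
    load: Callable[[], Optional[dict]],
    cache_errors: Tuple[Type[Exception], ...],
) -> Optional[dict]:
    """Cache-aside read that falls back to `load` when the cache is unavailable."""
    try:
        raw = cache.get(key)
        if raw is not None:
            return json.loads(raw)
    except cache_errors:
        # Cache unreachable: treat it as a miss and serve from the source.
        pass

    value = load()
    if value is not None:
        try:
            cache.setex(key, ttl_seconds, json.dumps(value))
        except cache_errors:
            # Failing to populate the cache must not fail the request.
            pass
    return value
```

A call might look like `cached_fetch(r, "user:42", 3600, lambda: load_user(42), (redis.exceptions.RedisError,))`, where `load_user` is a hypothetical database accessor. Keep in mind that during an outage every request reaches the database, so the database still needs enough headroom to absorb that load.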
TTL Strategies and Cache Invalidation
Time-to-live (TTL) is your first line of defense against stale data. Choose TTL based on data volatility:
- Session data: TTL = session duration (15–60 minutes)
- User profiles: TTL = 15–30 minutes for moderate freshness
- Product catalog: TTL = 1–6 hours for e-commerce
- System configuration: TTL = 24 hours or no TTL with explicit invalidation
For explicit invalidation, delete cache keys when the underlying data changes:
def publish_post(post_id, data):
    db.insert("posts", data)
    # Invalidate related caches
    redis.delete(f"post:{post_id}")
    redis.delete("posts:recent")
    redis.delete(f"user:{data['author_id']}:posts")
This pattern works well when you know exactly which keys to invalidate. For complex query caches where the cache key doesn't directly map to the updated entity, use a generational cache: include a version number in the cache key and increment it when any related data changes.
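One way to sketch the generational approach is below. The function names and the `gen:<namespace>` counter key are assumptions made for illustration. `cache` is any client with `get`, `setex`, and `incr` (redis-py provides all three), and `run_query` stands in for whatever executes the real query.

```python
import hashlib
import json
from typing import Any, Callable


def versioned_key(cache: Any, namespace: str, query: str) -> str:
    """Build a cache key that embeds the namespace's current generation."""
    # A missing counter is treated as generation 0.
    generation = cache.get(f"gen:{namespace}") or 0
    digest = hashlib.sha1(query.encode()).hexdigest()[:16]
    return f"{namespace}:v{int(generation)}:{digest}"


def invalidate_namespace(cache: Any, namespace: str) -> None:
    """Bump the generation so keys built from the old one are never read again."""
    cache.incr(f"gen:{namespace}")


def cached_query(
    cache: Any,
    namespace: str,
    query: str,
    ttl_seconds: int,
    run_query: Callable[[str], Any],
) -> Any:
    """Cache-aside lookup keyed by namespace generation plus a hash of the query."""
    key = versioned_key(cache, namespace, query)
    raw = cache.get(key)
    if raw is not None:
        return json.loads(raw)
    result = run_query(query)
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result
```

Entries written under an old generation are not deleted. They simply stop being read and disappear when their TTL expires or when they are evicted, so this pattern relies on every entry having a TTL. The cost is one extra `GET` per lookup to read the generation counter.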
Eviction Policies
When Redis runs out of memory, it evicts keys based on the configured policy:
- noeviction: Return errors on writes (default). Only use if you never hit the memory limit.
- allkeys-lru: Evict the least-recently-used key regardless of TTL. Best general-purpose choice.
- allkeys-lfu: Evict the least-frequently-used key. Good for access patterns with hot spots.
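- volatile-ttl: Evict keys with the shortest remaining TTL first. Only keys that have an expiry set are candidates, so keys without a TTL are never evicted. Useful when you want short-lived data to be evictable.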
Configure via maxmemory-policy in redis.conf. For most workloads, allkeys-lru provides the best hit rate:
maxmemory 4gb
maxmemory-policy allkeys-lru
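A deployment check can confirm that the running server actually has these settings. The sketch below assumes a client whose `config_get(name)` returns a dict of string values, as redis-py's does. The function name `eviction_warnings` is made up for this example, and some managed Redis services disable the `CONFIG` command, in which case the check would not work as written.

```python
from typing import Any, List


def eviction_warnings(client: Any) -> List[str]:
    """Return human-readable warnings about the server's eviction settings."""
    policy = client.config_get("maxmemory-policy").get("maxmemory-policy", "")
    maxmemory = int(client.config_get("maxmemory").get("maxmemory", 0))

    warnings = []
    if maxmemory == 0:
        # 0 means "no limit", so the eviction policy is never triggered.
        warnings.append("maxmemory is 0: no memory limit is enforced")
    if policy == "noeviction":
        warnings.append("noeviction: writes fail once the memory limit is reached")
    elif policy.startswith("volatile-"):
        warnings.append(f"{policy}: keys without a TTL are never evicted")
    return warnings
```

An empty list means the instance has a memory limit and an `allkeys-*` policy, which matches the configuration shown above.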
Caching Aggregated and Computed Data
Some of the biggest performance wins come from caching expensive computations, not individual database rows:
def get_dashboard_stats():
    cache_key = "dashboard:stats"
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)
    # Expensive multi-query aggregation
    stats = {
        "total_users": db.query_one("SELECT COUNT(*) FROM users"),
        "orders_today": db.query_one(
            "SELECT COUNT(*) FROM orders WHERE created_at > NOW() - INTERVAL '1 day'"
        ),
        "revenue_mtd": db.query_one(
            "SELECT SUM(amount) FROM payments WHERE status = 'completed' "
            "AND created_at > DATE_TRUNC('month', NOW())"
        ),
    }
    # Cache for 5 minutes - dashboard doesn't need real-time accuracy
    redis.setex(cache_key, 300, json.dumps(stats))
    return stats
For sorted result sets (leaderboards, top products, feed items), use Redis sorted sets instead of caching serialized arrays. Sorted sets support range queries, pagination, and incremental updates without rewriting the entire cache:
# Maintain a sorted set of top products by view count
redis.zincrby("trending:products", 1, "product:123")
top_products = redis.zrevrange("trending:products", 0, 9, withscores=True)
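Pagination over a sorted set comes down to translating a page number into the inclusive index range that `ZREVRANGE` expects. The helper below is a small sketch of that translation. `trending_page` is an illustrative name, and `client` is assumed to expose `zrevrange(key, start, stop, withscores=True)` the way redis-py does.

```python
from typing import Any, List, Tuple


def trending_page(
    client: Any, key: str, page: int, page_size: int = 10
) -> List[Tuple[str, float]]:
    """Return one page of (member, score) pairs, highest score first."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    stop = start + page_size - 1  # ZREVRANGE bounds are inclusive
    return client.zrevrange(key, start, stop, withscores=True)
```

For example, `trending_page(redis, "trending:products", 2)` asks for ranks 10 through 19. Because scores change between requests, an item can move across a page boundary while a user is paging, which is usually acceptable for trending lists.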
Redis Cluster for Distributed Caching
A single Redis instance handles on the order of 100K ops/sec, depending on hardware, command mix, and payload size. Beyond that, Redis Cluster partitions data across shards using 16,384 hash slots:
from redis.cluster import ClusterNode, RedisCluster

# redis-py expects ClusterNode objects for startup_nodes, not dicts
rc = RedisCluster(
    startup_nodes=[
        ClusterNode("redis-node-0", 6379),
        ClusterNode("redis-node-1", 6379),
        ClusterNode("redis-node-2", 6379),
    ],
    decode_responses=True,
)
Cluster promotes a replica automatically when a primary fails, and cluster-aware clients follow slot redirections while data is being resharded. The trade-off: multi-key operations (MGET, transactions, Lua scripts) only work on keys in the same hash slot. Design your key namespaces so related keys share a hash tag: user:{123}:profile and user:{123}:orders map to the same hash slot, and therefore live on the same node.
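To see why a shared hash tag keeps keys together, it helps to compute the slot by hand. Redis Cluster takes the CRC16 (XMODEM variant) of the key, or of the text inside the first non-empty `{...}` if there is one, modulo 16384. The sketch below reproduces that rule with Python's standard library. The names `hash_slot` and `user_key` are illustrative, and this is meant for reasoning about key design, not as a replacement for `CLUSTER KEYSLOT`.

```python
import binascii


def hash_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key, honoring hash tags."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16/XMODEM checksum.
    return binascii.crc_hqx(data, 0) % 16384


def user_key(user_id: int, suffix: str) -> str:
    """Build a key with the user id as hash tag; braces are doubled in f-strings."""
    return f"user:{{{user_id}}}:{suffix}"
```

With these helpers, `hash_slot(user_key(123, "profile"))` and `hash_slot(user_key(123, "orders"))` return the same slot because only `123` is hashed. Note the doubled braces in the f-string: writing `f"user:{user_id}:profile"` produces `user:123:profile`, which has no hash tag at all.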
Production Monitoring for Redis
Watch these metrics to stay ahead of cache issues:
- Hit rate below 80% suggests missing cache warming or incorrect TTL
- Evicted keys > 0 means you need more memory or a different eviction policy
- Connected clients approaching maxclients (default 10K) needs connection pooling
- Replication lag > 1 second on replicas suggests insufficient network bandwidth
Set alerts on eviction rate and replication lag. Cache misses under load cascade quickly to database overload.
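Hit rate is not reported directly. It is derived from the `keyspace_hits` and `keyspace_misses` counters in the `INFO stats` output. Those counters are cumulative since the server started (or since `CONFIG RESETSTAT`), so an alert is more useful when computed over the interval between two samples. The following sketch assumes each sample is a dict such as the one redis-py returns from `info("stats")`. The function names are illustrative.

```python
from typing import Mapping, Optional


def hit_rate(hits: int, misses: int) -> Optional[float]:
    """Fraction of lookups served from cache, or None if there were no lookups."""
    total = hits + misses
    return hits / total if total else None


def hit_rate_between(
    previous: Mapping[str, int], current: Mapping[str, int]
) -> Optional[float]:
    """Hit rate over the interval between two INFO stats samples."""
    hits = current["keyspace_hits"] - previous["keyspace_hits"]
    misses = current["keyspace_misses"] - previous["keyspace_misses"]
    if hits < 0 or misses < 0:
        # Counters went backwards (restart or reset): use the current totals.
        hits, misses = current["keyspace_hits"], current["keyspace_misses"]
    return hit_rate(hits, misses)
```

Sampling once a minute and comparing consecutive samples gives a rolling hit rate that reacts to a sudden drop, which a lifetime average would hide. The same delta approach applies to the `evicted_keys` counter.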
Optimize Your Caching with SoniNow
Redis caching done right can reduce database load by 90%+ and cut API latency from hundreds of milliseconds to single digits. Our engineers at SoniNow design Redis caching layers that maximize hit rates while keeping data fresh.