Redis
- in-memory key-value store
- persistence options:
  - RDB (Redis Database) using snapshots: taken at a regular interval (e.g. every 5 minutes), so data loss can still occur between snapshots
  - AOF (Append-Only File): all write operations are appended to a buffer, which is then flushed to disk
    - always: synchronous flush (slow); everysec: flush every second (better performance); no: let the OS decide (best performance, least safe)
- persistence is usually too slow, instead replication is preferred method for resiliency
- supports different data types: strings, hashes, sets, sorted sets, JSON, counters
- TTL: important for bounding memory usage and keeping performance predictable; can be set to specific values per key using commands, or keys can be evicted via an LRU policy
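As a sketch of how per-key TTL behaves (SETEX-style lazy expiry), here is a minimal in-memory stand-in; `TTLStore` and the injectable clock are illustrative, not a Redis API:

```python
import time

class TTLStore:
    """Minimal in-memory sketch of SETEX/TTL semantics (illustrative only)."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock          # injectable clock makes the sketch testable
        self.data = {}              # key -> (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        self.data[key] = (value, self.clock() + ttl_seconds)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.data[key]      # lazy expiry, similar to Redis's passive expiration
            return None
        return value
```

Real Redis also expires keys actively in the background; this sketch only shows the lazy path.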
Types
| Type | Commands |
| --- | --- |
| String | set, setex |
| List | lpush, rpush, lpop, rpop |
| Set | sadd, smembers |
| Sorted Set (Zset) | zadd, zrange |
| Hash | hget, hset |
| Bitmaps | setbit, getbit |
| HyperLogLog | pfadd, pfcount |
| Geo | geoadd, geosearch |
| Stream | xadd, xrange |
| Counter | incr, decr |
| JSON | json.set, json.get |
AI
RedisVL: Redis Vector Library
Configurations
- single-node Redis is single-threaded
- a replica replays writes from the primary and takes over on failure
- cluster is sharded amongst different nodes
- the client retrieves a map of keys-to-nodes when it starts and routes each request to the appropriate node
- each node in a cluster usually has a replica
Operations
| Command | Notes |
| --- | --- |
| set mykey tom | set a key |
| set users:100 '{"name": "Tom", "age": 33}' | keys with typed values, e.g. JSON stored as a string |
| hset users:100 name "Tom" | hash-set a particular attribute |
| setnx mykey <value> | set-if-not-exists (for a TTL, use set mykey <value> NX EX <sec>) |
| get mykey | get a key/counter |
| hget users:100 name | hash-get a particular attribute |
| del mykey | delete a key |
| exists mykey | check whether a key exists |
| keys my* | show all keys matching a wildcard |
| flushall | clear all keys |
| setex mykey 10 tom | set a key with expiry in seconds |
| sadd myset 1 2 3 | set-add elements to a set |
| smembers myset | set-members: get all members of a set |
| zadd myzset 1 "one" | sorted-set-add an element with a score |
| zrange myzset 0 -1 [REV] | sorted-set-range over an ordered set |
| incr myctr | increment counter |
| decr myctr | decrement counter |
| xadd mystream * <field> <value> | add an entry to a stream |
| geoadd <key> <long> <lat> <name> | add an item at a specific geo location |
| geosearch <key> FROMLONLAT <long> <lat> BYRADIUS <radius> <unit> | search for items within a radius |
Use cases
- cache
- rate-limiter: set a counter with a TTL on the first request and increment it on each subsequent one; reject once the counter exceeds the limit
  - leaky-bucket-style variant: no specific window start time; instead measure over the trailing period
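The counter-with-TTL idea can be sketched without a server; the dict below stands in for Redis INCR + EXPIRE, and all names (`FixedWindowRateLimiter`, `allow`) are hypothetical:

```python
import time

class FixedWindowRateLimiter:
    """Fixed-window rate limiter sketch: an in-memory stand-in for INCR + EXPIRE."""
    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counters = {}  # key -> (count, window_expires_at)

    def allow(self, key):
        now = self.clock()
        count, expires_at = self.counters.get(key, (0, now + self.window))
        if now >= expires_at:                     # window elapsed: counter "expired"
            count, expires_at = 0, now + self.window
        count += 1                                # Redis equivalent: INCR key
        self.counters[key] = (count, expires_at)  # EXPIRE is set on the first increment
        return count <= self.limit
```

In Redis the counter and its expiry are shared across clients; the fixed window is what allows bursts at window boundaries, which the leaky-bucket-style variant avoids.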
- pub-sub
- streaming: an ordered list of items to be processed by a consumer group
  - only one consumer within a consumer group can hold a claim on a specific entry
  - gives an at-least-once guarantee (vs. exactly-once)
- leader-board: use zadd to add an entry and use zrange to retrieve
  - all items in the same leader-board use the same key, which may cause performance issues
  - alternatively, keep as many leader-boards as shards, and merge them when retrieving
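A minimal in-memory sketch of the ZADD / ZRANGE ... REV leaderboard pattern (the `Leaderboard` class is illustrative, not a Redis client):

```python
class Leaderboard:
    """In-memory sketch of a sorted-set leaderboard."""
    def __init__(self):
        self.scores = {}  # member -> score (ZADD overwrites the score, like Redis)

    def zadd(self, member, score):
        self.scores[member] = score

    def zrange_rev(self, start, stop):
        """Like ZRANGE key start stop REV: highest scores first, stop inclusive, -1 = end."""
        ordered = sorted(self.scores.items(), key=lambda kv: kv[1], reverse=True)
        end = len(ordered) if stop == -1 else stop + 1
        return [member for member, _ in ordered[start:end]]
```

A real sorted set keeps members ordered incrementally (skip list) instead of re-sorting per query, which is why a single hot leaderboard key can become a bottleneck.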
- indexed lookups: build two data stores; the first holds the original records, the second maps an index column to the original record's id
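The two-store pattern might look like this in-memory sketch, with `email` as a hypothetical indexed column (in Redis these could be, say, hashes keyed by id plus plain string keys for the index):

```python
# Primary store: record id -> record (in Redis: e.g. HSET users:<id> ...)
primary = {}
# Secondary index: indexed column value -> record id (e.g. SET idx:email:<email> <id>)
email_index = {}

def insert(record_id, record):
    primary[record_id] = record
    email_index[record["email"]] = record_id

def find_by_email(email):
    record_id = email_index.get(email)          # first lookup: index -> id
    if record_id is None:
        return None
    return primary.get(record_id)               # second lookup: id -> record
```

Both writes must be kept consistent; in Redis that is a natural fit for a MULTI/EXEC transaction.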
- distributed lock: use setnx to obtain a lock
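A sketch of the setnx lock idea, including the random-token and compare-and-delete release that practical deployments add on top; `LockStore` is an in-memory stand-in, not a Redis client:

```python
import uuid

class LockStore:
    """In-memory sketch of a SETNX-based lock."""
    def __init__(self):
        self.locks = {}  # lock name -> holder's token

    def acquire(self, name):
        if name in self.locks:           # SETNX fails: someone else holds the lock
            return None
        token = uuid.uuid4().hex         # unique token identifies this holder
        self.locks[name] = token
        return token

    def release(self, name, token):
        # Only the holder may release; in Redis this compare-and-delete is
        # typically done atomically with a small Lua script.
        if self.locks.get(name) == token:
            del self.locks[name]
            return True
        return False
```

In a real deployment the lock key also gets a TTL so a crashed holder cannot block others forever.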
- counter
- global ID
- web session
- geo-spatial index: add locations by longitude/latitude and use geosearch to search by radius
Transactions
- not ACID-compliant like transactions in relational databases; they are more like a batch of operations run synchronously (no rollback)
- steps:
  - create a pipeline with the multi operation
  - add operations to the pipeline
  - execute the pipeline
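The multi / queue / execute flow above can be sketched with a toy store (`MiniRedis` is illustrative, not redis-py):

```python
class MiniRedis:
    """Toy store sketching MULTI/EXEC: commands queue up and run back-to-back
    on execute, with nothing interleaved in between (batched, no rollback)."""
    def __init__(self):
        self.data = {}
        self.queue = None

    def multi(self):
        self.queue = []                 # start queueing instead of executing

    def set(self, key, value):
        if self.queue is not None:
            self.queue.append(("set", key, value))
            return "QUEUED"             # Redis replies QUEUED inside MULTI
        self.data[key] = value
        return "OK"

    def execute(self):
        results = []
        queued, self.queue = self.queue, None
        for op, key, value in queued:   # the whole batch runs synchronously
            self.data[key] = value
            results.append("OK")
        return results
```

With redis-py the same shape appears as `pipeline()`, queued commands, then `execute()`.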
HA
- sharding: split data across multiple nodes
- a key is hashed (CRC16 mod 16384) into one of 16K logical hash slots; each hash slot is then mapped to a physical shard
- this helps with redistribution when a new shard is added
- replication: create multiple copies of data across multiple nodes
  - asynchronous by default
  - use the wait command to ensure data has been written to a certain number of replicas
- clustering: combine multiple nodes into a single logical unit
  - all shards and replicas check each other's health to determine whether a replica needs to be promoted to primary
  - best practice: keep an odd number of shards and two replicas per shard to avoid split-brain scenarios
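The key-to-hash-slot mapping described under sharding can be reproduced directly: Redis Cluster hashes the key with CRC16 (the XMODEM variant) modulo 16384 (this sketch ignores `{hash tag}` handling):

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): poly 0x1021, init 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, as CLUSTER KEYSLOT does."""
    return crc16_xmodem(key.encode()) % 16384
```

Because the slot space is fixed at 16384, adding a shard only means reassigning some slots to the new node; keys never need rehashing.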
Persistence
- RDB: Redis Database Backup
  - save a snapshot of the data to disk
  - done by a forked process, since it requires a lot of IO
  - redis.conf options:
    - dbfilename, dir: name of the file and directory to save the snapshot to
    - save <sec> <changes>: one or more save options to specify when to trigger a save
  - can be configured to save at regular intervals or on demand
- AOF: Append-Only File
  - logs all write operations to a file
  - appendfsync controls how often to flush the buffer to disk: always, everysec (default), no (OS decides)
  - when the file gets too big, Redis can compact it by rewriting it to a new file
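The rewrite idea can be sketched as replaying the log and emitting one write per surviving key (a simplification: real AOF rewrite regenerates commands from the live dataset, with the same net effect):

```python
def compact_aof(log):
    """Replay a list of ("SET", key, value) / ("DEL", key) entries to the final
    state, then emit the minimal log that reproduces that state."""
    state = {}
    for entry in log:
        if entry[0] == "SET":
            _, key, value = entry
            state[key] = value
        elif entry[0] == "DEL":
            state.pop(entry[1], None)
    # One SET per surviving key: overwritten and deleted history disappears.
    return [("SET", key, value) for key, value in state.items()]
```

This is why a heavily updated key costs almost nothing after a rewrite: only its latest value survives.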