Redis 基础的数据结构

2019-02-16 redis cache Word Count: 2642 words Read Time: 6 minutes

An introduction to Redis data types and abstractions

Redis 的丰富的 data structrues:

Binary-safe strings 二进制安全的字符串
list 列表 sorted by insertion
Set 集合 unique & unsort
Hash 哈希表键值的映射类似Ruby or Python 的hash
Bit array(or simply bitmaps): 就是二进制位数组
Streams: 就是一个类似一个只允许后插入的一个map 实体(entries)集合?

Redis 的Key

redis 的键是一个二进制安全的字符串. 所以可以用string 甚至于JPEG 图片的content , empty string 也是有效的几条关于 key 的rules:

太长的 key 不好, 1024 byte 以上应该就是不仅仅是对于内存的影响, 并且在数据集里面查找键也是会有高昂的花费. 如果有太长的key 可以考虑hash(sha1)
太短了也不好. 这个主要就是关于可读性(readable)
尝试坚持使用 schema(摘要):For instance “object-type:id” is a good idea, as in “user:1000”. Dots or dashes are often used for multi-word fields, as in “comment🔢reply.to” or “comment🔢reply-to”.
key的最大程度512M

Redis Strings

redis 的string 是能和Redis 的key 关联的最简单的类型. 顺便提了一下 Memcached, String 是前者里面唯一的data type.

string value 也不能超过 512MB

set key val  [xx(exsist)|nx(not exsist)]

incr counter 

incrby counter 50

INCE 和 DECR 都是原子(atomic)操作

GETSET set a keys a new value and return old one

mset a 1 b 1 c 3
mget a b (return array) 
(mutil operate keys commnad can reduced latency )

del ke
exists key
type key

Redis expires: keys with limited time to live

一些关于 Redis 过期的 quick info

expires 可以试着两种精度秒级和毫秒级.
expires 的精度是1毫秒
expires 有效信息存储到磁盘上,所以如果你的redis 挂了, 但是你的这个时间还是会流过.

expire key 5

persist key #  change a  volatile to persistent

set key 100 ex 10 # set a key with expire

ttl key  # get key's remaining time

# milliseconds instead seconds
pexpire key
pttl key

Redis Lits

redis 的list 是实现(implement ) 不像 Python 是 Array 而是通过linked List 实现的所以重插入轻查询. 如果想要快速的查询一个大量数据的集合里的 element 的时候可以考虑使用 sorted sets. ps: 应该是一个双向linked list

常规指令

lrange list 0 -1 # all item
lpush list item
rpush list item
# both of them  are variadic commands  meaning that you are free to push mutilple elements into a list in a single command from left/right side

rpop list
lpop list

应用场景 :

在社交网络上记住最近更新的用户 post
两个进程使用生产-消费者时作为队列使用两个现实的例子

For example both the popular Ruby libraries resque and sidekiq use Redis lists under the hood in order to implement background jobs.

The popular Twitter social network takes the latest tweets posted by users into Redis lists. 具现化来说比如用户 po 的图片要展示最新的就可以将图片的id lpush 然后 lrange 0 9 就行了,不用查库.

capped lists (指定容量的容器)

> rpush mylist 1,2,3,5,6

> ltrim mylist 0 2

blocking operations on lists

tow command ; BRPOP/BLPOP 加强版 blocking 的 RPOP/LPOP

few note about BRPOP:

客户端被服务的,是按顺序等待的 ,就类似在银行拿号. 先拿先处理.
这俩指令是可以针对多个队列处理的, 所以在返回的结果里需要显示这个item 来自哪个队列,所以返回的是一个有两个item(队列名和 item 数据) 的list

# 原子操作
RPOPLPUSH source destination
BRPOPLPUSH (blocking version)
ex:
这个提供了一个 reliable 的队列, 其实就是放到一个 temp 队列里 ,如果处理完了,然后 将 temp 队列的对应这个 处理掉. 就是laravel 里面的重试队列.

# 队列的 remove 
LREM key count value

remove value from list key and  count(meaning how many item equal val will remove)

Automatic creation and removal of keys

Redis 不需要自己去创建 key 和remove empty key ,not specific to list ,it applues to all the Redis data type s composed of multiple elements – Streams , Sets, Sorted Sets and Hashes. Summarize the behavior with three rules:

when we add an element to an aggregate data type , if the target key does not exist, and empty aggregate data type is created before adding the element.( 当我们向一个数据类型添加一个元素的时候, 如果这个key 不存在就在添加之前创建一个空的类型的聚集 )
when we remove elements from an aggregate data type, if the value remains empty , the key is automatically

Redis Hashes

> hmset user:1000 username antirez birthyear 1977 verified 1
OK
> hget user:1000 username
"antirez"
> hget user:1000 birthyear
"1977"
> hgetall user:1000
1) "username"
2) "antirez"
3) "birthyear"
4) "1977"
5) "verified"
6) "1"

> HMGET is similar to HGET but returns an array of values

> Hincrby hashmap key value

Redis Sets

Redis Sets are unordered collections of strings ,

some command:

sadd myset 1 2 3

smembers myset

文档里显示是无序的, 但是本机测试是 ASC正序来的 (maybe因为我用的数字.

sismember mysql val 

#多个 set 集合的 交集
sinter  multiple-key 

sunionstore set1 set2
// 交集存储 到set1

SRANDMEMBER set [count]
// 随机去 count个数 不重复. 如果 countsi negative
// 提取的rule就会变 取一个 就放回一个
其实这种分配的方式并不随机, 因为底层的算法实现是 (dict.c) hash 表的存储桶 ,如果一个bucket 里面冲突了,应该就是使用的 linked list 解决冲突, 所以 就会链表里面有多个元素 , 他的随机是基于 bucket 的.

Redis Sorted sets

sorted set 类似 set 与 hash 的混合. 底层是用 skip list 和hash map 做的 skip list 存储的是指向 score 和obj 的指针排序. hash map 做的是 item 实例的数据存储.

zrevrange hackers 0 -1
zrange hackers 0 -1 withscores

// operating on ranges

ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]


zrank key val 
//  in order to  get the  rank
Zrevrank key val

<= >=  <---> [] , <> <----> ()
ZRANGEBYLEX myzset - [c

Bitmaps

主要用在各种实时的分析
存储与对象ID相关联的空间高效并且高效率的 bool info

HyperLogLogs

A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to estimating the cardinality of a set). hyperloglog 是一个基于概率的数据集结构被用作计算唯一的东西, (技术上来说就是被认为是去估计一个set 集合的基数) ??

Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times 通常来讲统计计数唯一的 items 需要消耗你要统计的item 数量的内存空间.(简而言之就是统计多少item, 就需要多少item 的内存消耗) 因为你需要去记住你已经统计过的 elements 以免重复统计.

However there is a set of algorithms that trade memory for precision: you end with an estimated measure with a standard error, which in the case of the Redis implementation is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We’ll just call them HLL from now) has seen very few elements. 不管怎么样, 这里有一个set 的算法, 通过精度和内存的交易吧.(博弈内存多, 精度准反之亦然)

Expires Keys

通常在2.4 以前 Redis 的过期的误差大概 0-1s 左右, 而 2.6 以后基本上也就 one milliseconds 了 Key 的expiring 是存储的 unix 时间戳 ,是个绝对值, 所以当Redis 实例没有启动这个时间也是flowing 的他的定时的原理有俩个

passive way(被动的方式): 当 key 被访问的时候判断是否过期(time out) 但是这样有大量的没有被访问的过期key 一直存在. 所以又有了另外一种方式
active way: 在一个expire set(设置了expire 的集合key里): 随机抽取一部分过期的删掉, 具体来说就是每秒10次, 取20个随机的key , 删除所有过期的, 如果过期的比例超过25% 然后重复执行不用再等100ms

Special encoding of small aggregate data types(对于小聚集的数据类型的特殊编码 ps 对内存的影响.)

hash-max-zipmap-entries 512 (hash-max-ziplist-entries for Redis >= 2.6)
hash-max-zipmap-value 64  (hash-max-ziplist-value for Redis >= 2.6)
list-max-ziplist-entries 512
list-max-ziplist-value 64
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
set-max-intset-entries 512
这种 redis.conf 的配置
5 - 10 倍的 内存的优化.

Bit and Byte Level Operations

使用 bit & byte level 的操作来处理业务, 会减少很多的内存的损耗

Use hashes when possible

Small hashes are encoded in a very small space, 使用一个hash 代替多个不同的key

Using hashes to abstract a very memory efficient plain key-value store on top of Redis

这个有点厉害了哦(使用hash做抽象的存储铭文的键值对是非常的节省内存在redis 里 (大意 ) 因为通常来说是可以用Redis 存储 Model 的 , 里面的明文键值对的存储会比用普通的 key-value 更节省空间, 比 Memcached 节省空间(暗搓搓的踩一 jio memcached , memcached 只有key-value 嘛) Why hashes are memory efficient, 因为 hash 设计来就允许内嵌其他的数据结构(data structure) only string.

we trust simplicity more than features, so nested data structures are not allowed, as expires of single fields are not allowed

Permalink: https://z8n24.top/2019/02/redis_data_type/

License:

z8n24

A Computer Engineer