redis rdb 加载的一些思考 · 王很水的笔记

是和mc团队聊天的一些记录，直接贴出来

rdb格式

https://github.com/sripathikrishnan/redis-rdb-tools/blob/d39c8e5127daf3e109c0f0e101af8ed0e5400493/docs/RDB_File_Format.textile

----------------------------# RDB is a binary format. There are no new lines or spaces in the file.
52 45 44 49 53              # Magic String "REDIS"
30 30 30 33                 # RDB Version Number in big endian. In this case, version = 0003 = 3
----------------------------
FE 00                       # FE = code that indicates database selector. db number = 00
----------------------------# Key-Value pair starts
FD $unsigned int            # FD indicates "expiry time in seconds". After that, expiry time is read as a 4 byte unsigned int
$value-type                 # 1 byte flag indicating the type of value - set, map, sorted set etc.
$string-encoded-key         # The key, encoded as a redis string
$encoded-value              # The value. Encoding depends on $value-type
----------------------------
FC $unsigned long           # FC indicates "expiry time in ms". After that, expiry time is read as a 8 byte unsigned long
$value-type                 # 1 byte flag indicating the type of value - set, map, sorted set etc.
$string-encoded-key         # The key, encoded as a redis string
$encoded-value              # The value. Encoding depends on $value-type
----------------------------
$value-type                 # This key value pair doesn't have an expiry. $value_type guaranteed != to FD, FC, FE and FF
$string-encoded-key
$encoded-value
----------------------------
FE $length-encoding         # Previos db ends, next db starts. Database number read using length encoding.
----------------------------
...                         # Key value pairs for this database, additonal database


FF                          ## End of RDB file indicator
8 byte checksum             ## CRC 32 checksum of the entire file.

大概是这样

magic-version-kv-checksum

kv

key - object - encoding - value

value - obejct - encoding

解析流程

   while(1) {
        sds key;
        robj *val;

        /* Read type. */
        if ((type = rdbLoadType(rdb)) == -1) goto eoferr;

        /* Handle special types. */
        if (type == RDB_OPCODE_EXPIRETIME) {

所有操作封装在rio中，rdbload底层调用rio的read write来读数据

可以拆成buffer + batch read模式，拆分出读和buffer， AIO，加快读取解析速度
如何并发？已知key大概率没有重复（除非rehash期间dump，倒霉了）
- 本身是导入逻辑，不考虑key因果关系

32M page读，并发读10个，分别记住被斩断的key value，处理中间的数据

存在value特别巨大的场景，32M正好在中间，只能pin住，极端场景所有buffer都没找到key，只有value，一整个白读
key value没有明显的区分定位手段

FF rdb结尾 - 读到buffer计算checksum 读完checksum 直接校验 FE db符号，默认就一个0，这是个设计错误，应该也没人使用

只要出现FC FD 就是key

如何快速找到一段字符串中的特殊的几个字符？感觉能SIMD
如果没有，说明是前一段buffer的一部分，pin住
buffer大小？怎么定buffer合理？
dag taskflow buffer parse任务存在依赖关系，类似sync_wait

结论是分片并发导入是能做的。不过一般用rdb大小也控制在8G以下，内存带宽足够，时间不算瓶颈

如果rdb需要下载的话，比如rdb从s3上拉回来，这种部分导入并发导入的需求就很大了