Redis pipeline and list

Redis

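Redis is an open-source, in-memory data structure store. It is commonly used as a database, a cache, and a message broker. It supports strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and several levels of on-disk persistence. It also provides high availability through Redis Sentinel and automatic partitioning through Redis Cluster.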

Pipeline

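When an application needs to send several independent commands to Redis and does not need to block waiting for each reply, it can use a pipeline: the commands are packed together and sent in one batch, which reduces the number of network round trips and improves performance.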

#!/usr/bin/python3
import redis
import time

key = 'Redis:Test'
num = 100

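def test_with_out_pipeline(count=1):
    # One network round trip per INCR.
    r = redis.Redis(host='192.168.192.34', port=6379)
    for i in range(num):
        r.incr(key, count)


def test_with_pipeline(count=1):
    # Queue all INCRs on the client side, then send them in one batch.
    r = redis.Redis(host='192.168.192.34', port=6379)
    pipe = r.pipeline()
    for i in range(num):
        pipe.incr(key, count)  # the key argument is required
    res = pipe.execute()  # list with one reply per queued command
    return res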


def bench(desc):
    # time.clock() was removed in Python 3.8; perf_counter() measures elapsed time.
    start = time.perf_counter()
    desc()
    end = time.perf_counter()
    cost = end - start
    print("function {} cost {}".format(desc.__name__, str(cost)))

if __name__ == '__main__':
    bench(test_with_out_pipeline)
    bench(test_with_pipeline)
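
To see why the pipelined version tends to come out ahead, here is a rough back-of-the-envelope cost model. It is only a sketch: the function name `estimated_seconds` and all the numbers are illustrative assumptions, not measurements, and the model ignores bandwidth and reply parsing.

```python
def estimated_seconds(num_commands, rtt, per_command, pipelined):
    """Rough latency model.

    rtt:         seconds for one network round trip (assumed constant)
    per_command: seconds the server spends executing one command
    """
    # A pipeline pays for one round trip; without it, every command pays.
    round_trips = 1 if pipelined else num_commands
    return round_trips * rtt + num_commands * per_command


# Assumed numbers: 100 commands, 1 ms round trip, 10 microseconds per command.
plain = estimated_seconds(100, 0.001, 0.00001, pipelined=False)
piped = estimated_seconds(100, 0.001, 0.00001, pipelined=True)
print("without pipeline: {:.1f} ms".format(plain * 1000))
print("with pipeline:    {:.1f} ms".format(piped * 1000))
```

Under these assumptions the round trips dominate the unpipelined total, so the gain grows with network latency and shrinks when Redis runs on the same host.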

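Use case: using Redis to count how many times some event occurs, for example the number of purchases. Pipelines have advantages, so they also have drawbacks: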

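1. If a pipeline carries too many commands, Redis uses more memory while processing it, because the replies have to be queued until the batch is done. A pipeline also appears to occupy its connection exclusively while it runs (I have not looked into this closely), so other operations on that connection during this time will fail. The recommendation is to give the pipeline its own dedicated client.
2. A pipeline only batches commands for sending; some of the requests in it may well fail while others succeed. In that case a Lua script is more suitable, and a Lua script also guarantees atomicity.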
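
One way to soften both drawbacks is to send the commands in bounded chunks and to inspect each reply for errors. The sketch below assumes `client` is a redis-py client such as `redis.Redis(...)`; the helper name `incr_in_chunks` and the default chunk size are hypothetical choices for illustration.

```python
def incr_in_chunks(client, key, total, chunk_size=1000):
    """Send `total` INCR commands in pipelines of at most `chunk_size`.

    Returns (replies, errors): the successful replies, and a list of
    (command_index, exception) pairs for the commands that failed.
    """
    replies, errors = [], []
    sent = 0
    while sent < total:
        batch = min(chunk_size, total - sent)
        # transaction=False: plain pipelining, without a MULTI/EXEC wrapper.
        pipe = client.pipeline(transaction=False)
        for _ in range(batch):
            pipe.incr(key, 1)
        # raise_on_error=False puts exceptions in the result list in place
        # of the failed replies instead of raising on the first failure.
        results = pipe.execute(raise_on_error=False)
        for offset, reply in enumerate(results):
            if isinstance(reply, Exception):
                errors.append((sent + offset, reply))
            else:
                replies.append(reply)
        sent += batch
    return replies, errors
```

A call might look like `replies, errors = incr_in_chunks(redis.Redis(host='192.168.192.34', port=6379), 'Redis:Test', 100)`. Smaller chunks bound the memory used for queued replies at the cost of more round trips.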

References:

http://shift-alt-ctrl.iteye.com/blog/1863790
http://www.redis.cn/commands/eval.html
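
The EVAL reference above covers server-side Lua scripts. As a minimal sketch of the Lua alternative, the following applies several increments inside one script, so no other command runs between them. It assumes `client` is a redis-py client; the script text and the helper name `incr_many_atomically` are made up for illustration.

```python
# Lua source executed inside Redis. KEYS[1] is the counter key,
# ARGV[1] is the step, ARGV[2] is how many times to apply it.
INCR_MANY_LUA = """
local value = 0
for i = 1, tonumber(ARGV[2]) do
    value = redis.call('INCRBY', KEYS[1], ARGV[1])
end
return value
"""


def incr_many_atomically(client, key, step, times):
    """Run the increments as a single script and return the final value."""
    # register_script returns a callable that takes keys= and args=.
    script = client.register_script(INCR_MANY_LUA)
    return script(keys=[key], args=[step, times])
```

For example, `incr_many_atomically(r, 'Redis:Test', 1, 100)` would do the same work as the pipelined benchmark in a single request.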

List

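A Redis list is a simple list of strings, sorted by insertion order. You can add an element to the head (left) or the tail (right) of the list.
A list can hold at most 2^32 - 1 elements (4294967295, more than 4 billion elements per list).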

#!/usr/bin/python3
import redis
import time

key = 'Redis:ListTest'


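if __name__ == '__main__':
    r = redis.Redis(host='192.168.192.34', port=6379)
    r.lpush(key, '1')  # push onto the head (left)
    r.lpush(key, '2')  # push onto the head (left)
    r.rpush(key, '3')  # push onto the tail (right)
    print(r.llen(key))  # number of elements in the list
    res = r.rpop(key)  # pop from the tail
    print(res)
    res = r.lpop(key)  # pop from the head
    print(res)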

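When building a web crawler, the URLs extracted from parsed pages need to be stored, and a list works well for that. Different keys can hold queues of different priority: when crawling, read from the highest-priority queue first, and if that queue is empty, fall back to a lower-priority one.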
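
The priority scheme described above can be sketched as follows. This is a minimal illustration, assuming `client` is a redis-py client; the key names in `QUEUE_KEYS` and the helpers `push_url` and `next_url` are hypothetical.

```python
# Hypothetical key names, ordered from highest to lowest priority.
QUEUE_KEYS = ['crawler:urls:high', 'crawler:urls:normal', 'crawler:urls:low']


def push_url(client, url, priority=1):
    """Append a URL to the tail of the queue for the given priority index."""
    client.rpush(QUEUE_KEYS[priority], url)


def next_url(client):
    """Return the next URL from the highest-priority non-empty queue.

    Returns None when every queue is empty.
    """
    for key in QUEUE_KEYS:
        url = client.lpop(key)  # lpop returns None for an empty or missing list
        if url is not None:
            return url
    return None
```

Pushing with `rpush` and popping with `lpop` gives first-in, first-out order within each priority. If the crawler should wait instead of polling, `client.blpop(QUEUE_KEYS, timeout=5)` is worth considering: BLPOP checks the keys in the order given and returns a `(key, value)` pair, or None on timeout.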