Scribe + Kestrel

Scribe是facebook开源的分布式日志系统。项目主页是https://github.com/facebook/scribe

Kestrel是Twitter开源的分布式消息队列系统,采用Scala实现,其原型是采用Ruby实现的starling。项目主页是https://github.com/robey/kestrel

Scribe的server和client之间通信采用Thrift实现,server内部有一个消息队列,接收来自client的log-ertry先进入消息队列,最终写到文件或hdfs。

example给了两种使用模式:

1)single Scribe server:服务端接收到log-entry,直接写到文件

2)multi Scribe sever:server之间有主从之分,master负责执行写操作,slaves负责将接收的log-entry转发给master

二者结合过程中,发现Scribe的最终输出是文件或hdfs,无法与Kestrel直接通信,必然要在他们之间搭个桥。

这里就利用到了上述的第二种模式,Scribe的network type。

slaves和master的通信过程同样是RPC调用,所以,解决方法依旧要靠RPC,

以Python为例,SDK的if目录下有.thrift接口描述文件,编译安装后会有现成的Scribe库,利用该库实现一个Thrift RPC server,在scribeHandler的Log方法将接收到的消息,投递到Kestrel,即完成中转。

消息类型是LogEntry,有两个重要属性category和message,category可以看出是消息队列名称,相同category的message会存放在同一个队列上。

Kestrel有三种协议:memcached、thrift、text。分别在独立的端口提供服务,默认如下:

memcached  => 22133,thrift => 2229 ,text => 2222

text相对简单,模拟消息生产和消费:telnet到Kestrel server上,使用"put <queue_name>:\nmessage\n\n"进行生产和"get <queue_name>\n"进行消费,"\n"看作回车键即可。

使用过程中发现text总出奇怪的问题,比如读/写消息失败,阅读Wiki:

Text protocol

Kestrel supports a limited, text-only protocol. You are encouraged to use the memcache protocol instead.

The text protocol does not support reliable reads.

不建议使用纯文本协议,后用Thrift协议重新实现client,比较稳定,目前尚未遇到问题。

初识Scribe和Kestrel,如有疏漏,还请指出!

下面是示例代码

#!/usr/bin/env python2.6

# file: transition.py
# author: caosiyang
# date: 2013/04/08

import sys 
import socket
from scribe import scribe
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from thrift.server import TServer
from thrift.server import TNonblockingServer
import kestrel_handler
from utils import *


class scribeHandler:
    """Scribe handler."""

    def __init__(self):
        pass

    def Log(self, messages):
        """Log handler."""
        print 'recv:' , messages
        for item in messages:
            print 'category: %s, message: %s' % (item.category, item.message)
        return 0


class scribeKestrelHandler(scribeHandler):
    """Scribe handler with kestrel.
    
    Put the log entries that scribe received into kestrel.
    """

    def __init__(self, _kh):
        self._kh = _kh 

    def Log(self, messages):
        """Log handler."""
        print 'recv:' , messages
        for item in messages:
            print 'category: %s, message: %s' % (item.category, item.message)
            self._kh.put(item.category, item.message[:-1])
        return 0


def main():
    #kh = kestrel_handler.KestrelTextHandler('127.0.0.1', 2222)
    kh = kestrel_handler.KestrelThriftHandler('127.0.0.1', 2229)
    handler = scribeKestrelHandler(kh)
    #handler = scribeHandler()
    processor = scribe.Processor(handler)
    server_transport = TSocket.TServerSocket(port=1463)
    transport_factory = TTransport.TFramedTransportFactory()
    protocol_factory = TBinaryProtocol.TBinaryProtocolFactory()
    #server = TServer.TThreadedServer(processor, server_transport, transport_factory, protocol_factory)
    server = TNonblockingServer.TNonblockingServer(processor, server_transport, protocol_factory, protocol_factory)
    print 'Starting Transition-Server ...'
    server.serve()
    kh.close()


if __name__ == '__main__':
    main()
From http://www.cnblogs.com/caosiyang/
原文地址:https://www.cnblogs.com/caosiyang/p/3008527.html