014_ZooKeeper path filtering analysis

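1. Access latency on the production ZooKeeper cluster was very high, so we needed the top 10 ZooKeeper write paths over a period of time. The implementation is as follows: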

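#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Print the ZooKeeper path found in each line of publisher.log that mentions "path".
import re
import traceback

# Non-greedy capture, so only the text between "path":" and ","version" is taken.
PATH_RE = re.compile(r'"path":"(.*?)","version"', re.I)


def gen_range_hosts(path, n):
    """Return the path value in log line `path` (line number n), or "" if absent."""
    new_path = ""
    try:
        re_match = PATH_RE.search(path)
        if re_match is not None:
            new_path = re_match.group(1)
    except Exception:
        # Report the offending line number and content, then carry on.
        print("++++++++++++{n}++++++++++++{path}".format(n=n, path=path))
        traceback.print_exc()

    return new_path


def main():
    with open('./publisher.log', 'r') as f:
        # Read lazily instead of loading the whole file; n is the 1-based line number.
        for n, line in enumerate(f, 1):
            new_line = line.strip()
            if "path" in new_line:
                print(gen_range_hosts(new_line, n))


if __name__ == '__main__':
    main()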
'''
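<1> Log filtering command (drops blank lines, counts duplicate lines, and sorts by count in descending order):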
cat newlog.log |egrep -v "^$"|sort |uniq -c|sort -rn >> okok.log
'''
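
The counting done by the shell pipeline above can also be sketched directly in Python with collections.Counter. This is only a minimal sketch under stated assumptions, not the author's method: the function name top_paths, its limit parameter, and the sample lines are all hypothetical, and the non-greedy pattern assumes each log line carries one "path":"...","version" pair.

```python
import re
from collections import Counter

# Assumed line shape: ..."path":"<zk path>","version"...
PATH_RE = re.compile(r'"path":"(.*?)","version"')


def top_paths(lines, limit=10):
    """Return the `limit` most frequent paths as (path, count) pairs."""
    counts = Counter()
    for line in lines:
        match = PATH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    # most_common sorts by count, highest first.
    return counts.most_common(limit)


# Made-up sample lines, for illustration only.
sample = [
    '{"path":"/app/config","version":3}',
    '{"path":"/app/lock","version":1}',
    '{"path":"/app/config","version":4}',
]

for path, count in top_paths(sample):
    print('{0:>6} {1}'.format(count, path))
```

To run it against a real log, pass an open file object instead of the sample list, for example top_paths(open('./publisher.log')). Lines without a match are skipped, so no separate blank-line filter is needed.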

2. Still to be implemented: filtering the logged paths by a specified time range.
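
One possible starting point for this feature is sketched below. The post does not show the timestamp format of publisher.log, so the sketch assumes, purely for illustration, that every line begins with a timestamp of the form YYYY-MM-DD HH:MM:SS. The names lines_in_window, TS_FORMAT, and TS_LENGTH and the sample lines are hypothetical and would need adjusting to the real log layout.

```python
from datetime import datetime

TS_FORMAT = '%Y-%m-%d %H:%M:%S'  # assumed timestamp layout, not confirmed by the post
TS_LENGTH = 19                   # length of 'YYYY-MM-DD HH:MM:SS'


def lines_in_window(lines, start, end):
    """Yield lines whose leading timestamp satisfies start <= ts < end."""
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            continue  # no parsable timestamp at the start of this line
        if start <= ts < end:
            yield line


# Made-up sample lines, for illustration only.
sample = [
    '2018-11-12 10:00:01 {"path":"/app/config","version":3}',
    '2018-11-12 10:05:00 {"path":"/app/lock","version":1}',
    'line without a timestamp',
]
start = datetime(2018, 11, 12, 10, 0, 0)
end = datetime(2018, 11, 12, 10, 5, 0)

for line in lines_in_window(sample, start, end):
    print(line)
```

Because the function is a generator over any iterable of lines, its output can be passed straight to the path extraction from part 1, so only the lines inside the window are counted.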

Original article: https://www.cnblogs.com/arun-python/p/9948912.html