Using log keyword detection to tell whether the obb (Osmocom-bb) program is working properly

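In a multi-phone GSM sniffing setup built on C118 + Osmocom-bb, it often happens that after running for a while, one of the phones stops working on the ARFCN it is monitoring.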

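Checking the log shows that it ends with several consecutive lines of: TOA AVG is not 16 qbits, correcting (got 15). After that the log stops moving entirely and no more SMS can be captured; the only fix is to restart the obb program.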
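As a rough illustration, the warning quoted above can be recognized with a regular expression rather than a bare substring test. This is only a sketch: the function name parse_toa_warning is made up for this example, and the pattern assumes the message always has exactly the wording shown in the log line above, with only the number after "got" varying.

```python
import re

# Pattern for the warning quoted above. Assumption: only the number after
# "got" changes from line to line.
TOA_WARNING = re.compile(r"TOA AVG is not 16 qbits, correcting \(got (-?\d+)\)")


def parse_toa_warning(line):
    """Return the reported qbit value if the line is a TOA AVG warning, else None."""
    match = TOA_WARNING.search(line)
    return int(match.group(1)) if match else None
```

Matching the full message is stricter than looking for the keyword AVG alone, at the cost of breaking if the wording of the message ever differs.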

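I don't know whether this is a bug in obb or a result of the base station being adjusted at irregular times each day (some ARFCNs are not on the air 24 hours a day and occasionally drop out for a short while).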

Restarting the obb program is not complicated: first flash the phone (I have not tried a hard flash), then start sniffing again.

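A dedicated method can be written in smsweb that combines Python with shell commands to check the log periodically (every 30 seconds) using tail and diff. When it decides that obb is not working properly, it reflashes the phone (for the hardware modification that makes flashing fully automatic, see the pinned post) and starts the sniffing program again.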

Reference code:

import os
import re
import subprocess
import time
from multiprocessing import Process

# Database, logger, download1 and sniff are defined elsewhere in smsweb.


def restart_sniff(tty, arf):
    # Reflash the phone on this tty, then start sniffing its ARFCN again.
    ps1 = Process(target=download1, args=(str(tty),))
    ps1.start()
    ps1.join(10)

    ps2 = Process(target=sniff, args=(str(tty), str(arf)))
    ps2.start()
    ps2.join(30)


def monitor_log():
    mysql = Database()
    while True:
        print("monitor log:")

        # getusb.sh prints a number that is used below as the row limit.
        getusb = subprocess.Popen(["./osmocom-bb/getusb.sh"],
                                  stderr=subprocess.PIPE, stdout=subprocess.PIPE,
                                  universal_newlines=True)
        usbResult = getusb.communicate()
        # The post as published shows the pattern r'd'; a digit match
        # (backslash lost in publishing) is assumed here.
        device = re.findall(r'\d+', usbResult[0])[0]

        # find arfcn
        str_sql = "SELECT * FROM sniff limit 0," + str(device)
        data = mysql.query(str_sql)
        for row in data:
            arf = str(row['arfcn'])
            power = str(row['power'])
            sptype = str(row['sptype'])
            tty = str(row['tty'])

            # Count how many of the last three log lines contain "AVG".
            counter = 0
            command = 'tail -n3 ./download_' + tty + '.log'
            textlist = os.popen(command).readlines()
            for line in textlist:
                if "AVG" in line:
                    print("find got 15 in log! dangerous!")
                    counter = counter + 1

            if counter == 3:
                print("found 3 got 15! restart osmocon and sniff!")
                logger.info("got 15 mon:" + tty + " arfcn:" + arf)
                restart_sniff(tty, arf)

            # Check whether the log file has changed since the previous pass.
            cur_log = "download_" + tty + ".log"
            old_log = cur_log + ".old"
            getdiff = subprocess.Popen(["./diff.sh", cur_log, old_log],
                                       stderr=subprocess.PIPE, stdout=subprocess.PIPE,
                                       universal_newlines=True)
            diffResult = getdiff.communicate()
            diff_ret = re.findall(r'\d', diffResult[0])[0]
            if int(diff_ret) == 0:
                # The log has not changed in 30 seconds: restart osmocon and sniff.
                logger.info("log diff:" + tty + " arfcn:" + arf)
                restart_sniff(tty, arf)

        time.sleep(30)

diff.sh:

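#!/bin/bash

# Usage: ./diff.sh ./download_0.log ./download_0.log.old
# Prints 0 if the two files are identical, 1 otherwise.
diff "$1" "$2" >> "diff_$1"
if [ $? -eq 0 ]; then
        # No difference: the log has not changed.
        echo "0"
else
        # The log changed (or the old copy is missing): refresh the old copy.
        rm -f "$2"
        cp "$1" "$2"
        echo "1"
fi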
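For comparison, the same "has the log moved?" check could be done in Python without a helper script. The sketch below is not what the post uses: instead of comparing file contents the way diff does, it compares the file size and modification time between two calls, and the names log_signature and StallDetector are invented for this example. It assumes Python 3.

```python
import os


def log_signature(path):
    """Return (size, mtime_ns) for the log, or None if it does not exist yet."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_size, st.st_mtime_ns)


class StallDetector(object):
    """Remembers the last signature seen for each log file."""

    def __init__(self):
        self._last = {}

    def is_stalled(self, path):
        """True if the log is unchanged since the previous call for this path."""
        current = log_signature(path)
        previous = self._last.get(path)
        self._last[path] = current
        return previous is not None and current == previous
```

Calling is_stalled once per 30-second pass for each download_<tty>.log would play the role of diff.sh returning 0. The first call for a path always returns False, which matches how diff.sh reports a change when the .old copy does not exist yet.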

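Notes:

1. When the keyword AVG appears in each of the last three lines of the log, obb is considered to be malfunctioning, so reflash and restart sniffing right away.

2. When the log content is still the same as it was 30 seconds earlier, that is also abnormal, so reflash and restart sniffing.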
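The first rule above can also be checked without spawning tail. The following is a minimal sketch under a few assumptions: the helper names tail_lines and looks_stuck are made up here, the log is read as text under Python 3, and the whole file is scanned on every call, which is fine for small logs but wasteful for very large ones.

```python
from collections import deque


def tail_lines(path, n=3):
    """Return the last n lines of a text file (fewer if the file is shorter)."""
    with open(path, "r", errors="replace") as handle:
        # A deque with maxlen keeps only the most recent n lines while iterating.
        return list(deque(handle, maxlen=n))


def looks_stuck(lines, keyword="AVG", required=3):
    """True if each of the last `required` lines contains the keyword."""
    if len(lines) < required:
        return False
    return all(keyword in line for line in lines[-required:])
```

For example, looks_stuck(tail_lines("download_0.log")) would stand in for the counter == 3 test, with the difference that a log shorter than three lines is never reported as stuck.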

Original article: https://www.cnblogs.com/k1two2/p/6998041.html