django web应用之python logging TimedRotatingFileHandler的执行流程源码分析 Marathon

众所周知,python的自带logging模块是线程安全的,从内部源码也可看到,很多配置或者写入日志的时候,通过加锁-释放锁(threading.RLock())来实现线程安全。

内部源码参见logging.__int__(line 210)

_lock = threading.RLock()

def _acquireLock():
    """
    Acquire the module-level lock for serializing access to shared data.

    This should be released with _releaseLock().
    """
    if _lock:
        _lock.acquire()

def _releaseLock():
    """
    Release the module-level lock acquired by calling _acquireLock().
    """
    if _lock:
        _lock.release()

tips
单进程下,日志记录正常。多进程下日志丢失问题,根据日志切分源码,多进程同时切分日志时,会导致后面的进程删除前面线程切分出来的日志,导致日志丢失。
如果进程共用一个日志文件, rotate只需要执行一次, 但实际是每个进程都会执行一次上面的过程, 第一个rotate之外的进程, 在rm的时候删除的都是没有过期的日志。
解决多进程写同一日志安全问题,包括但不局限于如下几点:

  • 1.django内部的logging配置,使用 TimedRotatingFileHandler 的话,应该考虑日志轮转只有一个进程操作,考虑加锁-python自带的fcntl的文件锁机制,需要继承TimedRotatingFileHandler然后重写doRollover()方法,性能要求不高的可以考虑
  • 2.共享队列,通过将日志操作发送给queue,同步处理
  • 3.cron定时切分+脚本实现(常用的通过uwsgi(uwsgi操作日志,优先级高于django的TimedRotatingFileHandler)定向输出日出,结合脚本执行切分具体命令,再由cron定时执行)

接下来看看logging模块内部的源码文件
就3个文件,分别是:

  • __int__.py: 功能可以简单理解为,一般化的logging配置,读取设置filter/format/handler/logger,另外比较重要的功能是提供Logger的api接口,单个文件直接调用的话,直接通过Logger这个类实现实例调用,或者是handler常用的底层class,继承关系为(父->子):Handler(实现加锁解锁/flush/) -> StreamHandler(实现日志的实际的emit与flush操作) -> FileHandler()
  • config.py: 读取设置参数,如果是在较为大型的应用中,例如django web应用,一般会在django的settings.py中配置LOGGING参数,此参数也包含filter/format/handler/logger四项配置
  • handlers.py: 实现很多额外的不同类型的handler,比如基于大小的日志rotate的RotatingFileHanler,基于时间的TimedRotatingFileHandler,基于Socket,基于syslog,基于queue,基于http等等

重点看看class TimedRotatingFileHandler

从左到右,父类到子类的继承关系:
*Handler -> StreamHandler -> FileHandler -> BaseRotatingHandler -> TimedRotatingFileHandler*

TimedRotatingFileHandler的方法:

- computeRollover(currentTime): 初始化时调用,计算轮转时间点
- shouldRollover(record): 是否轮转
- getFilesToDelete(): 获取要删除的文件,被doRollover()调用
- doRollover(): 轮转日志

继承方法:
handlers.py
- BaseRotatingHandler.emit(record): 先判断是否需要轮转文件,然后发送日志到文件,实际调用FileHandler.emit(),然后调用StreamHandler.emit(), record实际上就是一系列的format参数,如asctime, filename, module, msg等等
- BaseRotatingHandler.rotation_filename(default_name): 返回文件轮转名
- BaseRotatingHandler.rotate(src, dst):轮转文件
__int__.py
- FileHandler.emit(record)
- StreamHandler.emit(record)

下面说说具体调用逻辑。
比如logger.<levelName>("hello world...")触发了logging模块时,
1.首先调用BaseRotatingHandler.emit()
---
def emit(self, record):
        """
        Emit a record.

        Output the record to the file, catering for rollover as described
        in doRollover().
        """
        try:
            if self.shouldRollover(record): # 程序调用TimedRotatingFileHandler.shouldRollover(), 如果当前时间小于rolloverat的时间,就返回0,否则1
                self.doRollover() # 上面检查为1,应该轮转,因为时间已经大于rolloverat的时间
            logging.FileHandler.emit(self, record) # 执行记录发送,如上面所述,执行StreamHandler.emit()方法,写入日志
        except Exception:
            self.handleError(record)
---

2.调用TimedRotatingFileHandler.doRollover(), 只有轮转时才会触发
---
def doRollover(self):
        """
        do a rollover; in this case, a date/time stamp is appended to the filename
        when the rollover happens.  However, you want the file to be named for the
        start of the interval, not the current time.  If there is a backup count,
        then we have to get a list of matching filenames, sort them and remove
        the one with the oldest suffix.
        """
        if self.stream: # stream非空,进入
            self.stream.close() # 关闭流
            self.stream = None # 重新复制None
        # get the time that this sequence started at and make it a TimeTuple
        currentTime = int(time.time()) # e.g. 1638257483
        dstNow = time.localtime(currentTime)[-1] # 0
        t = self.rolloverAt - self.interval # 1638239215
        if self.utc: # pass
            timeTuple = time.gmtime(t)
        else: # +08:00, 东八区
            timeTuple = time.localtime(t)
            dstThen = timeTuple[-1] # 0
            if dstNow != dstThen: # pass
                if dstNow:
                    addend = 3600
                else:
                    addend = -3600
                timeTuple = time.localtime(t + addend)
        dfn = self.rotation_filename(self.baseFilename + "." +          # 调用BaseRotatinghandler.rotation_filename(), 返回dfn="path/xxx.log.2021-11-30_10-26"(10-26为os.stat(file)的mtime)
                                     time.strftime(self.suffix, timeTuple))
        if os.path.exists(dfn): # 文件存在则rm, 本例不存在,所以不走这块
            os.remove(dfn)
        self.rotate(self.baseFilename, dfn) # 调用BaseRotatinghandler.rotate(src, dst),实现文件的轮转, 源文件存在则重命名
        if self.backupCount > 0:
            for s in self.getFilesToDelete(): # 获取需要删除的文件
                os.remove(s) # 返回非空时,遍历删除冗余文件
        if not self.delay: # delay为false, 进入
            self.stream = self._open() # 重置非空
        newRolloverAt = self.computeRollover(currentTime) # 重新计算轮转时间
        while newRolloverAt <= currentTime:
            newRolloverAt = newRolloverAt + self.interval
        #If DST changes and midnight or weekly rollover, adjust for this.
        if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc: # 非utc时间且(午夜更新或者每周x更新)
            dstAtRollover = time.localtime(newRolloverAt)[-1]
            if dstNow != dstAtRollover:
                if not dstNow:  # DST kicks in before next rollover, so we need to deduct an hour
                    addend = -3600
                else:           # DST bows out before next rollover, so we need to add an hour
                    addend = 3600
                newRolloverAt += addend
        self.rolloverAt = newRolloverAt # 更新轮转时间点
---

3.BaseRotatinghandler.rotation_filename()
---
def rotation_filename(self, default_name):
        """
        Modify the filename of a log file when rotating.

        This is provided so that a custom filename can be provided.

        The default implementation calls the 'namer' attribute of the
        handler, if it's callable, passing the default name to
        it. If the attribute isn't callable (the default is None), the name
        is returned unchanged.

        :param default_name: The default name for the log file.
        """
        if not callable(self.namer): # 通常走这里,返回传入的name
            result = default_name
        else:
            result = self.namer(default_name)
        return result
---
4.BaseRotatinghandler.rotate(src, dst)
---
def rotate(self, source, dest):
        """
        When rotating, rotate the current log.

        The default implementation calls the 'rotator' attribute of the
        handler, if it's callable, passing the source and dest arguments to
        it. If the attribute isn't callable (the default is None), the source
        is simply renamed to the destination.

        :param source: The source filename. This is normally the base
                       filename, e.g. 'test.log'
        :param dest:   The destination filename. This is normally
                       what the source is rotated to, e.g. 'test.log.1'.
        """
        if not callable(self.rotator): # self.rotator->None,执行这里
            # Issue 18940: A file may not have been created if delay is True.
            if os.path.exists(source): # 源文件存在,则重命名
                os.rename(source, dest)
        else:
            self.rotator(source, dest)
---
5.TimedRotatingFileHandler.getFilesToDelete()
---
def getFilesToDelete(self):
        """
        Determine the files to delete when rolling over.

        More specific than the earlier method, which just used glob.glob().
        """
        dirName, baseName = os.path.split(self.baseFilename) # 如path/xxx.log.2021-11-30_10-26,分离目录及文件名
        fileNames = os.listdir(dirName) # 获取当下目录的文件数组
        result = [] # 接下来的for循环中,如果文件数组的文件名符合前后缀的比较,以absPath加入到result数组中
        prefix = baseName + "."
        plen = len(prefix)
        for fileName in fileNames:
            if fileName[:plen] == prefix:
                suffix = fileName[plen:]
                if self.extMatch.match(suffix):
                    result.append(os.path.join(dirName, fileName))
        if len(result) < self.backupCount: # 如果数量小于备份数,认为不用删除, 重置[]
            result = []
        else: # 数量冗余,排序后返回需要删除的文件切片
            result.sort()
            result = result[:len(result) - self.backupCount]
        return result
---
到这里,一条日志的完整处理流程就结束了。

django启动时,在加载LOGGING的config时,加载到此模块,首先执行此模块的初始化:

依次从上往下压栈,
1.BaseRotatingHandler.__int__(self, filename, 'a', encoding, delay)
2.logging.FileHandler.__init__(self, filename, mode, encoding, delay)
3.StreamHandler.__init__(self, self._open())
4.Handler.__init__(self)

最终通过BaseRotatingHandler首先一系列的初始化,初始化后,获得以下属性:
- baseFilename
- delay
- mode
- name
- ratator # 执行轮转时用,默认None
- stream
...

接着通过自身的属性初始化获取了:
- when: 轮转时间点, 如midnight
- backCount: 备份数
- utc: 是否utc时间, false
- atTime: bool, fasle
- interval: 轮转间隔-秒, 86400
- suffix: 年-月-日
- excMatch: 匹配模式,删除冗余备份时用
- rolloverAt: 很重要,轮转时间点,比如现在2021-11-30 14:54:00,该值应是2021-12-01 00:00:00
关于rolloverAt这个属性,
  - 如果轮转时间不是按照星期或者midnight模式的话,举个例子,按照60min一次,此时,rolloverAt=文件的最后一次修改时间 + interval(60min)
  - 如果按照星期或者midnight的话,如midnight,按照当前的时间,当天剩余时间 r = 一天的秒数86400-当前的时间换算成秒,最终rolloverAt=文件的最后一次修改时间 + r

总的来说,在django起来的时候,对于TimeRotating部分,只会初始化其相关参数

1.django应用启动流程分析

1.1 运行项目目录下的manage.py

通过给定的参数进行启动,主要是

  • 指定文件名-manage.py,
  • 运行命令,如runserver
  • 启动主机ip:port

1.2 执行初始化settings

加载设置各种settings.py中涉及的参数,典型的如:

  • CACHES
  • INSTALLED_APPS
  • LOGGING
  • MIDDLEWARE
  • TIME_ZONE

1.3 加载设置目录下的初始化

如项目可能用到celery, 将celery.py中的的配置加载

2.django的LOGGING初始化

加载LOGGING中的四项设置参数:

  • filters-过滤,不同粒度
  • formatters(重要的)-输出日志格式
  • handlers(重要的)-日志的rotate
  • loggers-哪些需要记录日志

django启动时配置logging的顺序:

# django.utils.log
1.django setup时会加载settings.py中的LOGGING参数,然后调用logging.config.dictConfig,内部调用logging.config.dictConfigClass.configure()

# logging.config.dictConfigClass.configure
2.通过传入的settings.py中设置的LOGGING的map配置,依次配置
- 2.1 formatter -> 格式化
- 2.2 filter -> 过滤器
- 2.3 handler -> 发送给哪个文件记录
  - 遍历handlers进行配置
- 2.4 logger -> api,供应用程序调用

# logging.config.dictConfigClass.configure
3.该模块实质上就是用来解析django中的settings的LOGGING的配置用的

2.1 主要说说logging部分的handlers

初始化时会遍历LOGGINGh中的各个handler,针对具体的handler class,进行对应的初始化,比如 TimedRotatingFileHandler

1.初始化init
- rotate时间 when
- backcount, 备份数
- filename-日志name
- rolloverAt-计算轮转时间(TimedRotatingFileHandler.computeRollover())

2.是否轮转,根据时间差返回1/0,需要轮转返回1,否则返回0

3.打开日志文件

参考:

原文地址:https://www.cnblogs.com/davis12/p/15620667.html