Python Revisited Day 09 (Debugging, Testing, and Profiling)

9.1 Debugging
9.1.1 Handling syntax errors
9.1.2 Handling runtime errors
9.1.3 Scientific debugging
9.2 Unit testing
9.3 Profiling

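9.1 Debugging

Making regular backups is a key part of programming, however reliable our machine and operating system are and however small the chance of failure, because failure is still possible. Backups are usually coarse-grained: the backup files are several hours old, or even several days old.

9.1.1 Handling syntax errors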


if True
    print("stupid!!!")
else:
    print("You will never see me...")

  File "C:/Py/modeltest.py", line 5
    if True
          ^
SyntaxError: invalid syntax

In the example above, the ":" after if was left out, so Python reports an error.


try:
    s = "Tomorrow is a new day, {0}"
    s2 = "gone with the wind..."
    print(s.format(s2)

except ValueError as err:
    print(err)

  File "C:/Py/modeltest.py", line 10
    except ValueError as err:
         ^
SyntaxError: invalid syntax

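In the example above, the line that is reported contains no error at all. The real mistake is the missing closing parenthesis after print. Python did not recognize the problem when it reached that line, because an expression inside parentheses may continue on the following lines, so the error was reported on a later line.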
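The reported position can be inspected without running the faulty file. The following is a minimal sketch that compiles the same source text with the built-in compile() and reads the line number from the SyntaxError. The helper name find_syntax_error is made up for this illustration, and the line that gets reported depends on the Python version.

```python
# Source text with the same mistake as above: print( is never closed.
source = (
    "try:\n"
    "    s = 'Tomorrow is a new day, {0}'\n"
    "    s2 = 'gone with the wind...'\n"
    "    print(s.format(s2)\n"
    "\n"
    "except ValueError as err:\n"
    "    print(err)\n"
)


def find_syntax_error(text):
    """Return (line number, message) of the first syntax error, or None.

    Hypothetical helper for illustration only.
    """
    try:
        compile(text, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


result = find_syntax_error(source)
# The reported line depends on the Python version, so we only print it.
print(result)
```

If the reported line looks correct, the lines just before it are the next place to look.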

9.1.2 Handling runtime errors

pass

9.1.3 Scientific debugging

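If the program runs but its behavior differs from what is expected or required, the program contains a bug: a logic error that must be eliminated. The best way to deal with this kind of error is to prevent it in the first place with TDD (test-driven development). Some bugs will still slip through, so even with TDD, debugging remains a skill that must be learned and mastered.

To eliminate a bug, we must take the following steps:

Reproduce the bug
Locate the bug
Fix the bug
Test the fix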
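The four steps can be sketched on a toy problem. Everything below is invented for illustration: median_buggy and median_fixed are hypothetical functions, and only odd-length inputs are considered to keep the example short.

```python
def median_buggy(values):
    """Intended to return the median, but forgets to sort first (the bug)."""
    middle = len(values) // 2
    return values[middle]


def median_fixed(values):
    """Step 3: the fix sorts a copy before picking the middle element."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    return ordered[middle]


# Step 1: reproduce the bug with the smallest input that shows it.
failing_input = [3, 1, 2]
reproduced = median_buggy(failing_input) != 2

# Step 2: locate it by trying an input that is already sorted.
# The result is correct here, which points at the missing sort.
sorted_case_ok = median_buggy([1, 2, 3]) == 2

# Step 4: test the fix, including the input that used to fail.
assert median_fixed(failing_input) == 2
assert median_fixed([1, 2, 3]) == 2
assert median_fixed([5]) == 5
print(reproduced, sorted_case_ok)
```

Keeping the failing input as a permanent test case helps ensure the same bug does not return later.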

Further reading: "Pycharm Debug调试心得" (PyCharm debugging notes), from the blog 放下扳手&拿起键盘

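9.2 Unit testing

Unit testing means testing individual functions, classes, and methods to make sure they behave as expected.

This is what we have been doing so far: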

if __name__ == "__main__":
    import doctest
    doctest.testmod()

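Another way to run doctests is to use the unittest module to create a separate test program. The unittest module can build test cases from doctests without knowing anything about what the program or module contains, other than that it contains doctests.

We create a program named docunit.py: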



def test(x):
    """
    >>> test(-1)
    'hahahaha'
    >>> test(1)
    'lalalala'
    >>> test('1')
    'wuwuwuwuwuwu'
    """
    s1 = "hahahahha"
    s2 = "lalalalala"
    s3 = "wuwuwuwuwuwu"
    try:
        if x <= 0:
            return s1
        else:
            return s2
    except TypeError:  # in Python 3, comparing a str with an int raises TypeError
        return s3

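Note that when the tests are run, the first two examples fail, because the returned strings do not match the expected ones.

Next, create another new program: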



import doctest
import unittest
import docunit


suite = unittest.TestSuite()
suite.addTest(doctest.DocTestSuite(docunit))
runner = unittest.TextTestRunner()
print(runner.run(suite))

Note that the third import is our own program. The output is:

<unittest.runner.TextTestResult run=1 errors=0 failures=1>
F
======================================================================
FAIL: test (docunit)
Doctest: docunit.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:Analibdoctest.py", line 2198, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for docunit.test
  File "C:Pydocunit.py", line 7, in test

----------------------------------------------------------------------
File "C:Pydocunit.py", line 9, in docunit.test
Failed example:
    test(-1)
Expected:
    'hahahaha'
Got:
    'hahahahha'
----------------------------------------------------------------------
File "C:Pydocunit.py", line 11, in docunit.test
Failed example:
    test(1)
Expected:
    'lalalala'
Got:
    'lalalalala'


----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (failures=1)

Process finished with exit code 0

There is one catch: for this to work, the file name of the program we wrote must be a valid module name.
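As a rough check of that requirement, the sketch below tests whether a file name could be used in an import statement. The helper is_valid_module_name is hypothetical, and it only covers the simple case of a single .py file, not packages.

```python
import keyword


def is_valid_module_name(filename):
    """Check whether 'name.py' could be imported with 'import name'."""
    if not filename.endswith(".py"):
        return False
    stem = filename[:-3]
    # The stem must be a legal identifier and must not be a reserved word.
    return stem.isidentifier() and not keyword.iskeyword(stem)


for candidate in ("docunit.py", "doc-unit.py", "9test.py", "class.py"):
    print(candidate, is_valid_module_name(candidate))
```

Only docunit.py passes; the other names contain a hyphen, start with a digit, or collide with a keyword.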

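The unittest module defines four key concepts. A test fixture is the term for the code needed to set up a test (and to clean up after it has run); a typical example is creating an input file for the test and finally deleting the input file and the resulting output file. A test suite is a collection of test cases. A test case is the basic unit of testing. A test runner is an object that executes one or more test suites.
Typically, a test suite is built by creating a subclass of unittest.TestCase, in which every method whose name begins with "test" is a test case. If we need any setup, we can do it in a method named setUp(); similarly, any cleanup can go in a method named tearDown(). Inside the tests there are many unittest.TestCase methods we can use, including assertTrue(), assertEqual(), assertAlmostEqual() (useful for testing floating-point numbers), assertRaises(), and more, along with their inverses such as assertFalse() and assertNotEqual(). The older aliases failIfEqual() and failUnlessEqual() are deprecated, so prefer the assert* names.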

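Below is an example. I could not think of anything interesting to write, so it is about as simple as it gets; the only goal is to show how unittest is used.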


import unittest



class List(list):

    def plus(self, other):
        return list(set(self + other))



class TestList(unittest.TestCase):

    def setUp(self):
        self.list1 = List(range(3))
        self.list2 = list(range(2, 5))

    def test_list_add(self):
        addlist = self.list1 + self.list2
        self.assertEqual(
            addlist, [0, 1, 2, 2, 3, 4]
        )

    def test_list_plus(self):
        pluslist = self.list1.plus(self.list2)
        self.assertNotEqual(
            pluslist, [0, 1, 2, 2, 3, 4]
        )
        def process():
            self.list2.plus(self.list1)
        self.assertRaises(
            AttributeError, process   # note: the second argument to assertRaises must be a callable object
        )

    def tearDown(self):
        """
        Not strictly needed here: "del self" only removes the local
        name and does not destroy the fixture objects.
        :return:
        """
        del self

if __name__ == "__main__":
    suite = unittest.TestLoader().loadTestsFromTestCase(
        TestList
    )
    runner = unittest.TextTestRunner()
    print(runner.run(suite))
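The file-based fixture described earlier (create an input file, then delete it afterwards) could look roughly like this. It is a sketch under assumptions: count_lines is a hypothetical function under test, and the temporary file comes from the standard tempfile module.

```python
import os
import tempfile
import unittest


def count_lines(path):
    """Hypothetical function under test: count the lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


class TestCountLines(unittest.TestCase):

    def setUp(self):
        # Fixture: create the input file that each test needs.
        handle, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(handle, "w", encoding="utf-8") as stream:
            stream.write("one\ntwo\nthree\n")

    def tearDown(self):
        # Clean up the fixture so the tests leave nothing behind.
        if os.path.exists(self.path):
            os.remove(self.path)

    def test_count(self):
        self.assertEqual(count_lines(self.path), 3)

    def test_float_sum(self):
        # assertAlmostEqual compares floats after rounding (7 places by default).
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_missing_file(self):
        os.remove(self.path)
        # Used as a context manager, assertRaises needs no separate callable.
        with self.assertRaises(FileNotFoundError):
            count_lines(self.path)


suite = unittest.TestLoader().loadTestsFromTestCase(TestCountLines)
result = unittest.TextTestRunner().run(suite)
```

setUp() and tearDown() run around every test method, so each test gets a fresh file even if an earlier test deleted it.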

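More assertion methods are covered, in quite some detail, in this blog post:
"python的unittest单元测试框架断言整理汇总" (a summary of the assertions in Python's unittest framework), by 黑面狐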

9.3 Profiling

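A few sensible Python programming habits help performance:

When a read-only sequence is needed, prefer a tuple to a list.
Use generators rather than creating large tuples and lists and iterating over them.
Use Python's built-in data structures (dicts, lists, tuples) wherever possible instead of implementing custom ones.
When building a large string from small ones, do not concatenate the small strings; accumulate them in a list and join the list into a single string at the end.
Finally, if an object is accessed many times through an attribute, or looked up repeatedly in a data structure, it is better to create and use a local variable that refers to that object.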
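Several of these habits fit into one short sketch. The function names and the sample data are invented for illustration, and no timing claims are made here; whether a given change pays off should be measured.

```python
def build_with_join(words):
    """Accumulate the pieces in a list, then join once at the end."""
    pieces = []
    append = pieces.append      # local name avoids repeated attribute lookup
    for word in words:
        append(word.upper())
    return " ".join(pieces)


def total_length(words):
    """A generator expression avoids building an intermediate list."""
    return sum(len(word) for word in words)


words = ("gone", "with", "the", "wind")   # read-only sequence, so a tuple
print(build_with_join(words))   # GONE WITH THE WIND
print(total_length(words))      # 15
```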

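In a Jupyter notebook, %%time reports the time taken by a single run of the cell, while %%timeit runs the cell many times (the number of loops is chosen automatically unless it is set with -n) and reports the average time.

Using the timeit module: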

import timeit

def function_a(x, y):
    for i in range(10000):
        x + y

def function_b(x, y):
    for i in range(10000):
        x * y

def function_c(x, y):
    for i in range(10000):
        x / y



if __name__ == "__main__":
    repeats = 1000
    X = 123.123
    Y = 43.432
    for function in ("function_a", "function_b",
                     "function_c"):
        t = timeit.Timer("{0}(X, Y)".format(function),
                         "from __main__ import {0}, X, Y".format(function))
        sec = t.timeit(repeats) / repeats
        print("{function}() {sec:.6f} sec".format(**locals()))

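The first argument to timeit.Timer() is the statement to time, given as a string. The second argument is also a string of executable code: it is the setup code, which runs once and here imports the function and its arguments. The output is: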
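The string-based form is not the only option. As a sketch of two alternatives in the standard timeit module, a callable can be passed to timeit.Timer() directly, and the globals parameter can replace the setup import. The repeat counts here are kept small and arbitrary so that the example finishes quickly.

```python
import timeit


def function_a(x, y):
    for i in range(10000):
        x + y


X = 123.123
Y = 43.432

# A callable can be passed instead of a statement string.
timer = timeit.Timer(lambda: function_a(X, Y))
# repeat() returns one total per run; the minimum is usually the least noisy.
totals = timer.repeat(repeat=3, number=100)
best = min(totals) / 100
print("function_a() {0:.6f} sec".format(best))

# The string form can use globals= in place of a setup import.
sec = timeit.timeit("function_a(X, Y)", globals=globals(), number=100) / 100
print("function_a() {0:.6f} sec".format(sec))
```

The lambda adds a small call overhead to every measurement, which matters mainly when the timed code is very short.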

function_a() 0.000386 sec
function_b() 0.000384 sec
function_c() 0.000392 sec

The cProfile module reports running time more conveniently and in more detail:

import cProfile
import time


def function_a(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()

def function_b(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()
    function_d()

def function_c(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()
    function_d()
    function_d()

def function_d():
    time.sleep(0.01)

def function_f(x, y):
    x * y


if __name__ == "__main__":
    repeats = 1000
    X = 123.123
    Y = 43.432
    for function in ("function_a", "function_b",
                     "function_c"):
        cProfile.run("for i in range(1000): {0}(X, Y)"
                     .format(function))

         10003003 function calls in 16.040 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.007    0.007   16.040   16.040 <string>:1(<module>)
     1000    3.878    0.004   16.033    0.016 modeltest.py:13(function_a)
     1000    0.006    0.000   10.241    0.010 modeltest.py:31(function_d)
 10000000    1.915    0.000    1.915    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   16.040   16.040 {built-in method builtins.exec}
     1000   10.235    0.010   10.235    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


         10005003 function calls in 28.183 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008   28.183   28.183 <string>:1(<module>)
     1000    4.873    0.005   28.175    0.028 modeltest.py:18(function_b)
     2000    0.015    0.000   20.903    0.010 modeltest.py:31(function_d)
 10000000    2.399    0.000    2.399    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   28.183   28.183 {built-in method builtins.exec}
     2000   20.887    0.010   20.887    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


         10007003 function calls in 38.968 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008   38.968   38.968 <string>:1(<module>)
     1000    5.004    0.005   38.959    0.039 modeltest.py:24(function_c)
     3000    0.024    0.000   31.498    0.010 modeltest.py:31(function_d)
 10000000    2.457    0.000    2.457    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   38.968   38.968 {built-in method builtins.exec}
     3000   31.474    0.010   31.474    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

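ncalls: the number of calls
tottime: the total time spent in the function itself, excluding time spent inside the functions it calls
percall: the average time per call, tottime / ncalls
cumtime: the cumulative time spent in the function, including time spent inside the functions it calls
percall (second column): the average time per call including the functions it calls, cumtime / ncalls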
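The reports above are ordered by name, which is rarely the most useful order. A possible refinement, sketched here with the standard pstats module, is to collect the data with cProfile.Profile and sort it by cumtime. The functions are cut-down versions of the ones above, with smaller loop counts so that the sketch runs quickly.

```python
import cProfile
import io
import pstats
import time


def function_d():
    time.sleep(0.01)


def function_f(x, y):
    x * y


def function_a(x, y):
    for i in range(1000):
        function_f(x, y)
    function_d()


profiler = cProfile.Profile()
profiler.enable()
for i in range(5):
    function_a(123.123, 43.432)
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Sort by cumtime and show only the five most expensive entries.
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by "tottime" instead highlights the functions whose own bodies are slow, as opposed to the ones that merely call slow code.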