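An In-Depth Analysis and Implementation of super in Python (repost, a great read)

The author produced this after only half a month of learning Python, and I was amazed reading it. I recommend reading the original post; it will give you a much firmer grasp of Python descriptors.

Original link: https://blog.csdn.net/zhangjg_blog/article/details/83033210

An In-Depth Analysis and Implementation of super in Python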

        Overview
        The definition of super
        Function binding and descriptors
        Typical usage of super
        What super really is
        A custom super
        How Python implements super
        Closing remarks

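Overview

super in Python is a somewhat magical thing. This article explains it in depth: it first presents the definition of super and lists its typical uses, then covers the language features that super relies on, such as the MRO (method resolution order), descriptors, and function binding, and finally tries to implement a super by hand and takes a brief look at how Python itself implements it.
The definition of super

Let's start with the definition of super, which naturally means reading the documentation via help(super):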

Help on class super in module builtins:

class super(object)
| super() -> same as super(__class__, <first argument>)
| super(type) -> unbound super object
| super(type, obj) -> bound super object; requires isinstance(obj, type)
| super(type, type2) -> bound super object; requires issubclass(type2, type)
| Typical use to call a cooperative superclass method:
| class C(B):
|      def meth(self, arg):
|          super().meth(arg)
| This works for class methods too:
| class C(B):
|      @classmethod
|      def cmeth(cls, arg):
|          super().cmeth(arg)

The documentation tells us the following:

1 super is a class

super is not a keyword but a class; calling super() creates a super object:

>>> class A:
...     def __init__(self):
...         su = super()
...         print(su)
...         print(type(su))
...
>>> a = A()
<super: <class 'A'>, <A object>>
<class 'super'>

Or:

>>> class A:
...     pass
...
>>> a = A()
>>> su = super(A, a)
>>> su
<super: <class 'A'>, <A object>>
>>> type(su)
<class 'super'>
>>>

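2 super supports four call forms

    super()
    super(type, obj)
    super(type)
    super(type, type2)

Of these, super(type) creates an unbound super object, while the other three create bound super objects. super() is a form supported in Python 3 and is a calling convenience: it is equivalent to passing the class in which super is called as the first argument, and the first argument of the method that calls super as the second argument.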
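
As a quick check of these forms, the sketch below compares the zero-argument form with the explicit two-argument forms and inspects an unbound super object. The class names Base and Child are made up for illustration and do not appear in the original article.

```python
class Base:
    def greet(self):
        return 'Base.greet'

    @classmethod
    def make(cls):
        return 'Base.make for ' + cls.__name__


class Child(Base):
    def greet_implicit(self):
        # Zero-argument form (Python 3 only).
        return super().greet()

    def greet_explicit(self):
        # Same lookup with both arguments spelled out: super(type, obj).
        return super(Child, self).greet()

    @classmethod
    def make(cls):
        # super(type, type2): the second argument is a class, not an instance.
        return super(Child, cls).make()


c = Child()
print(c.greet_implicit())   # Base.greet
print(c.greet_explicit())   # Base.greet
print(Child.make())         # Base.make for Child

# super(type) is unbound: it carries no instance or class to bind to.
unbound = super(Child)
print(unbound.__self__)     # None
```

Both greet variants go through the same lookup; the zero-argument form only saves typing.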

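That is all for the definition of super. Next comes the concept of binding, which in turn depends on descriptors, so the next section covers function binding and descriptors.
Function binding and descriptors

To understand binding, you first need to understand that in Python every function is an object, and is also a descriptor.

Functions are objects: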

>>> def test():
...     pass
...
>>> test
<function test at 0x10a989268>
>>> type(test)
<class 'function'>
>>>

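test is a function and, at the same time, a function object. So defining a function with def amounts to creating a function object. Because function implements the __call__ method, the object can be called: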

>>> getattr(test, '__call__')
<method-wrapper '__call__' of function object at 0x10a989268>
>>>

Because function also implements the __get__ method, a function object is a descriptor as well:

>>> getattr(test, '__get__')
<method-wrapper '__get__' of function object at 0x10a989268>

This is because, by Python's definition, any object whose type implements one or more of __get__, __set__, and __delete__ is considered a descriptor.
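
That definition can be turned into a small check. The helper name is_descriptor is hypothetical (it is not a standard-library function); it simply tests whether the object's type defines any of the three methods.

```python
def is_descriptor(obj):
    """True if type(obj) defines at least one of the descriptor methods."""
    return any(hasattr(type(obj), name)
               for name in ('__get__', '__set__', '__delete__'))


def plain_function():
    pass


print(is_descriptor(plain_function))  # True: functions define __get__
print(is_descriptor(property()))      # True: property defines all three
print(is_descriptor(42))              # False: int defines none of them
```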

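Descriptors and binding do not show up with module-level functions, but both become very visible once a function is defined inside a class.

Let's define a function inside a class: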

>>> class A:
...     def test(self):
...         pass
...

First, verify that a function defined in a class is also a function object:

>>> A.__dict__['test']
<function A.test at 0x10aab4158>
>>>
>>> type(A.__dict__['test'])
<class 'function'>
>>>
>>>

Next, verify that a function defined in a class is also a descriptor, that is, that it implements __get__:

>>> getattr(A.__dict__['test'], '__get__')
<method-wrapper '__get__' of function object at 0x10aab4158>
>>>

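The checks above show that a function defined in a class is a descriptor object too. Defining a function in a class can therefore be seen as defining a descriptor. So when we write the following code: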

class A:
    def test(self):
        pass

it is equivalent to this:

class A:
    test = function()  # conceptual only: `function` stands for the built-in function type

Now a brief look at how descriptors behave. Consider the following code:

class NameDesc:
    def __get__(self, instance, cls):
        print('NameDesc.__get__:', self, instance, cls)
        if instance is None:  # instance is None when the descriptor is accessed through the class
            return self
        else:
            return instance.__dict__['_name']

    def __set__(self, instance, value):
        print('NameDesc.__set__:', self, instance, value)
        if not isinstance(value, str):
            raise TypeError('expect str')
        instance.__dict__['_name'] = value

class Person:
    name = NameDesc()

p = Person()

p.name = 'zhang'
print(p.name)
print(Person.name)

The output is:

NameDesc.__set__: <__main__.NameDesc object at 0x10babaf60> <__main__.Person object at 0x10babaf98> zhang
NameDesc.__get__: <__main__.NameDesc object at 0x10babaf60> <__main__.Person object at 0x10babaf98> <class '__main__.Person'>
zhang
NameDesc.__get__: <__main__.NameDesc object at 0x10babaf60> None <class '__main__.Person'>
<__main__.NameDesc object at 0x10babaf60>

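When a class (Person) has a descriptor attribute (name), accessing that attribute automatically invokes the descriptor's __get__ or __set__ method:

    When the descriptor is accessed through the class (Person.name), __get__ returns the descriptor itself
    When the descriptor is accessed through an instance (p.name), __get__ returns a custom value (instance._name); we are free to return anything, including a function

Back to the two equivalent snippets above: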

class A:
    def test(self):
        pass

class A:
    test = function()

Since test is a descriptor, what do we get when we access test through A and through a? Here are the results:

>>> class A:
...     def test(self):
...         pass
...
>>> A.test
<function A.test at 0x1088db0d0>
>>>
>>> A.test is A.__dict__['test']
True
>>>
>>> a = A()
>>> a.test
<bound method A.test of <__main__.A object at 0x1088d9780>>

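Accessing test through the class A (A.test) returns the test descriptor itself, that is, A.__dict__['test'].
Accessing test through the instance a (a.test) returns a bound method.

So we can conclude:

    When function's __get__ method is called without an instance (equivalent to A.test), it returns the function itself
    When it is called with an instance (equivalent to a.test), it returns a bound method.

The following code verifies this conclusion: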

>>> A.test.__get__(None, A)
<function A.test at 0x1088db158>
>>> A.test.__get__(None, A) == A.test
True
>>>
>>> A.test.__get__(a, A)
<bound method A.test of <__main__.A object at 0x1088d9860>>
>>> A.test.__get__(a, A) == a.test
True

So we can picture the function descriptor as being implemented roughly like this:

class function:

    def __get__(self, instance, cls):
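        if instance is None:  # accessed through the class
            return self
        else:  # accessed through an instance
            return self._translate_to_bound_method(instance)

    def _translate_to_bound_method(self, instance):
        # Bind instance to this function and return the resulting
        # bound method (details omitted in this sketch).
        ...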

class A:
    test = function()
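
For a version of this idea that actually runs, here is a minimal pure-Python sketch. The class name Function and its _func attribute are made up for illustration; the binding step uses types.MethodType from the standard library. It mimics the behaviour described above and is not how CPython defines functions internally.

```python
import types


class Function:
    """Pure-Python stand-in for the built-in function type (illustrative)."""

    def __init__(self, func):
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __get__(self, instance, cls=None):
        if instance is None:  # accessed through the class
            return self
        # Accessed through an instance: bind it as the first argument.
        return types.MethodType(self, instance)


class A:
    test = Function(lambda self: 'called with ' + type(self).__name__)


a = A()
print(A.test is A.__dict__['test'])  # True: class access returns the descriptor
print(a.test.__self__ is a)          # True: instance access returns a bound method
print(a.test())                      # called with A
```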

Now let's see what the actual difference between bound and unbound is. Consider the following example:

>>> class A:
...     def test(self):
...         print('*** test ***')
...
>>> a = A()
>>>
>>> A.test(a)
*** test ***
>>>
>>> a.test()
*** test ***
>>>

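Note that when A was defined, the test method took one parameter, self.
A.test returns a function object, which is an unbound function, so the instance has to be passed explicitly in the call (A.test(a)).
a.test returns a bound method object, so the instance does not need to be passed again (a.test()).

As we can see, binding simply means attaching the object the function is called on to the function's first parameter.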
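
The bound method object keeps both halves of that pairing, which can be inspected through its __self__ and __func__ attributes. A small sketch:

```python
class A:
    def test(self):
        return self


a = A()
bound = a.test

print(bound.__self__ is a)       # True: the instance that was bound
print(bound.__func__ is A.test)  # True: the underlying plain function
print(bound() is A.test(a))      # True: both calls pass a as self
```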

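To summarize, this section covered functions, descriptors, and binding. The conclusion is that a function is a callable (it implements __call__) descriptor (it implements __get__) object: when the function is retrieved through the class, __get__ returns the function itself, and when it is retrieved through an instance, __get__ returns a bound method, that is, the instance bound to the function.

Now back to super.
Typical usage of super

Many people's intuitive understanding of super is that it calls a method of the parent class: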

class A:
    def test(self):
        print('A.test')

class B(A):
    def test(self):
        super().test()
        print('B.test')

b = B()
b.test()

The result is:

A.test
B.test

Judging from this example, super does call the method in the parent class. But look at the following code:

class A:
    def test(self):
        print('A.test')

class TestMixin:
    def test(self):
        print('TestMixin.test')
        super().test()

class B(TestMixin, A):
    def test(self):
        print('B.test')
        super().test()

b = B()
b.test()

Printed result:

B.test
TestMixin.test
A.test

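The code above creates b, an instance of B, and calls b.test(). Through super(), B's test function reaches the test function of TestMixin, because TestMixin is B's first parent class.

The test function in TestMixin then reaches the test function in A through super, yet A is not a parent of TestMixin. In this hierarchy both A and TestMixin are parents of B, but A and TestMixin have no inheritance relationship with each other. So why does the super call in TestMixin end up in A's test function?
What super really is

super was not actually designed just for calling the parent class. In essence, it searches an ordered collection of classes for a particular class, finds a particular function in that class, binds an instance to that function to produce a bound method, and returns that bound method.

The ordered collection of classes mentioned above is the class's MRO, the method resolution order, which determines the order in which the inheritance hierarchy is searched for the function to call. It can be obtained through inspect.getmro or the class's __mro__ attribute. Take A, TestMixin, and B from above again: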

class A:
    def test(self):
        print('A.test')

class TestMixin:
    def test(self):
        print('TestMixin.test')
        super().test()

class B(TestMixin, A):
    def test(self):
        print('B.test')
        super().test()

#b = B()
#b.test()

print(B.__mro__)

The output is:

(<class '__main__.B'>, <class '__main__.TestMixin'>, <class '__main__.A'>, <class 'object'>)

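So B's MRO is (B, TestMixin, A, object). What this list means is that when b, an instance of B, calls a function, the function is first looked up in class B; if B calls super, the lookup continues from the class after B (TestMixin); and if TestMixin calls super in turn, the lookup continues from the class after TestMixin (A).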

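In Python 2.x, a successful super call requires passing both arguments, that is, super(type, obj) or super(type, type2). To make things explicit, let's rewrite the example above in this form with arguments: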

class A:
    def test(self):
        print('A.test')

class TestMixin:
    def test(self):
        print('TestMixin.test')
        super(TestMixin, self).test()

class B(TestMixin, A):
    def test(self):
        print('B.test')
        super(B, self).test()

print(B.__mro__)

b = B()
b.test()

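These two arguments are essential. The first is the class in which super is being called; its purpose is to locate the next class in the MRO, from which the search for the function starts. The second has two roles: it determines which class the MRO is taken from, and it is the instance that gets bound to the function being called.

Let's use super(TestMixin, self).test() in TestMixin to explain what the two arguments mean.

Start with the second argument. As the call travels upward from b.test() level by level, self is always the instance b. So no matter which class's super we reach, self is b, and the MRO obtained through self is always B's MRO. With the MRO in hand, super looks for the class that follows the first argument, TestMixin, which here is A, and checks whether A has the target function; if not, it looks in the class after A, and so on.

Also, super(TestMixin, self) creates a super object, and super itself has no test method, so why can super(TestMixin, self) call test?

This is because when an attribute is not found on an object through normal lookup, the __getattr__ method of its class is called. If super implements this method, it intercepts the access to test on super(TestMixin, self). (The built-in super does this interception in C, in the super_getattro function shown later in this article.) As described above, from the TestMixin and self passed in, super can work out that the lookup should start in A, so it looks for test in A directly, and if A does not have it, it goes through the classes after A in the MRO in order.

Once the function is found, the test function cannot be returned as is, because it is not bound yet. Instead, the __get__ method of the function (which is also a descriptor) is called with the self instance, which yields a bound method, and that bound method is returned. At this point super(TestMixin, self).test has produced a bound method: the function comes from A and it is bound to the self instance (which is b). Appending () then calls it, so super(TestMixin, self).test() means calling this bound method. That is how the call reaches A's test function: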

class A:
    def test(self):
        print('A.test')

Because the bound instance is b, the self passed into test above is the instance b.
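
The lookup steps just described can be written out as a single function. manual_super_lookup is a hypothetical helper, not part of Python; it only handles the instance case and assumes the attribute it finds is a descriptor such as a plain function.

```python
def manual_super_lookup(thisclass, obj, name):
    """Roughly what super(thisclass, obj).<name> does for an instance."""
    mro = type(obj).__mro__              # the MRO comes from the instance
    start = mro.index(thisclass) + 1     # begin right after thisclass
    for cls in mro[start:]:
        if name in cls.__dict__:         # raw lookup, no binding yet
            attr = cls.__dict__[name]
            return attr.__get__(obj, type(obj))  # bind via the descriptor
    raise AttributeError(name)


class A:
    def test(self):
        return 'A.test'


class TestMixin:
    def test(self):
        return 'TestMixin.test'


class B(TestMixin, A):
    pass


b = B()
found = manual_super_lookup(TestMixin, b, 'test')
print(found.__self__ is b)                   # True: bound to b
print(found.__func__ is A.__dict__['test'])  # True: the function comes from A
print(found == super(TestMixin, b).test)     # True: same bound method
print(found())                               # A.test
```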

That completes the explanation of how super works.
A custom super

With super's nature explained, let's implement a my_super of our own based on that explanation:

class my_super:
    def __init__(self, thisclass=None, target=None):
        self._thisclass = thisclass
        self._target = target

    def _get_mro(self):
        if isinstance(self._target, type):
            return self._target.__mro__  # the second argument is a class
        else:
            return self._target.__class__.__mro__  # the second argument is an instance

    def _get_function(self, name):
        mro = self._get_mro()
        if self._thisclass not in mro:
            return None

        index = mro.index(self._thisclass) + 1
        while index < len(mro):
            cls = mro[index]
            # Look in cls.__dict__ rather than using hasattr/getattr:
            # hasattr is also true for inherited names, and getattr would
            # bind a classmethod to cls right away, while we need the raw,
            # unbound attribute here.
            if name in cls.__dict__:
                attr = cls.__dict__[name]
                if callable(attr) or isinstance(attr, classmethod):
                    return attr
            index += 1
        return None

    def __getattr__(self, name):
        func = self._get_function(name)
        if func is None:
            # Behave like a normal failed attribute lookup.
            raise AttributeError(name)
        if isinstance(self._target, type):
            return func.__get__(None, self._target)
        else:
            return func.__get__(self._target, None)

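Like super, my_super's __init__ takes two arguments: thisclass is the class in which super is being called, and target is the first argument of the function that calls my_super, that is, self or cls. So target may be an instance or a class (when my_super is called inside a classmethod, the second argument must be cls), and my_super has to handle both cases.

The _get_mro function in my_super obtains the MRO from the second argument. If target is an instance, it takes its __class__ and then the __mro__ of that class; if target is a class, it takes target's __mro__ directly.

The _get_function function in my_super first gets the MRO, then finds the classes that come after thisclass in the MRO and looks for the function in them; the name parameter is the name of the function to find. Note that if the class right after thisclass has no function called name, the search continues in the following classes, which is why a while loop is used.

The __getattr__ function in my_super intercepts method access on the my_super object. For example, if my_super is asked for test, then name is 'test'. __getattr__ first calls _get_function to obtain the target function, then calls the function's descriptor method __get__ to bind target, and returns the bound method. Here too we must distinguish whether target is an instance or a class. If it is an instance (my_super is being called from an instance method), the binding uses function.__get__(instance, None); if it is a class (my_super is being called from a class method), it uses function.__get__(None, cls).

Let's rewrite the example above to verify that my_super works: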

from my_super import my_super

class A:
    def test(self):
        print('A.test')

class TestMixin:
    def test(self):
        print('TestMixin.test')
        my_super(TestMixin, self).test()

class B(TestMixin, A):
    def test(self):
        print('B.test')
        my_super(B, self).test()

print(B.__mro__)

b = B()
b.test()

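Running it prints:

(<class '__main__.B'>, <class '__main__.TestMixin'>, <class '__main__.A'>, <class 'object'>)
B.test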
TestMixin.test
A.test

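The effect is the same as with super.

Next, let's write a diamond-inheritance example to verify it further, and also check that my_super works inside class methods: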

from my_super import my_super

class A:
    def test(self):
        print('A.test')

    @classmethod
    def test1(cls):
        print('A.test1')

class B(A):
    def test(self):
        print('B.test')
        my_super(B, self).test()

    @classmethod
    def test1(cls):
        print('B.test1')
        my_super(B, cls).test1()

class C(A):
    def test(self):
        print('C.test')
        my_super(C, self).test()

    @classmethod
    def test1(cls):
        print('C.test1')
        my_super(C, cls).test1()

class D(B,C):
    def test(self):
        print('D.test')
        my_super(D, self).test()

    @classmethod
    def test1(cls):
        print('D.test1')
        my_super(D, cls).test1()

d = D()
d.test()

D.test1()

The output is:

D.test
B.test
C.test
A.test
D.test1
B.test1
C.test1
A.test1

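The output is as expected, so our custom my_super works both in instance methods and in class methods.

One remaining shortcoming is that my_super must be given arguments, whereas super in Python 3 can be called without any; presumably the interpreter captures the calling class and the first argument of the calling function automatically under the hood.

APIs such as inspect.stack(), inspect.signature(), and sys._getframe() should make it possible to obtain the first argument of the function that calls my_super, but I do not know how to obtain the class that calls my_super. If anyone has a solution, please leave a comment.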
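
One possible direction, offered as a sketch and not a complete answer: the compiler creates a hidden __class__ closure variable for any method whose body mentions super or __class__, and that variable is visible in the caller's frame. The helper name caller_class_and_first_arg is hypothetical, and the approach has a clear limitation: a method that only calls my_super never mentions either name, so the calling method has to reference __class__ itself for the cell to exist.

```python
import sys


def caller_class_and_first_arg():
    """Return (class, first argument) of the function that called us."""
    frame = sys._getframe(1)  # frame of the calling method
    code = frame.f_code
    if code.co_argcount == 0:
        raise RuntimeError('caller has no positional arguments')
    first_arg = frame.f_locals[code.co_varnames[0]]
    # The __class__ cell only exists when the method body mentions
    # `super` or `__class__`, so this lookup can legitimately fail.
    if '__class__' not in code.co_freevars:
        raise RuntimeError('caller has no __class__ cell')
    return frame.f_locals['__class__'], first_arg


class A:
    def test(self):
        return 'A.test'


class B(A):
    def test(self):
        __class__  # mention the name so the compiler creates the cell
        thisclass, target = caller_class_and_first_arg()
        return thisclass, target, super(thisclass, target).test()


b = B()
thisclass, target, result = b.test()
print(thisclass is B, target is b, result)  # True True A.test
```

The pair returned here is exactly what would be passed as my_super(thisclass, target).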
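How Python implements super

Python's super is implemented in C. In the Python 3.7.0 source (the latest at the time of writing), super is implemented in Python-3.7.0/Objects/typeobject.c, and the counterpart of the Python-level super is the C-level superobject: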

typedef struct {
      PyObject_HEAD
      PyTypeObject *type;
      PyObject *obj;
      PyTypeObject *obj_type;
} superobject;

The super_getattro function contains the following code:

do {
          PyObject *res, *tmp, *dict;
          descrgetfunc f;

          tmp = PyTuple_GET_ITEM(mro, i);
          assert(PyType_Check(tmp));

          dict = ((PyTypeObject *)tmp)->tp_dict;
          assert(dict != NULL && PyDict_Check(dict));

          res = PyDict_GetItem(dict, name);
          if (res != NULL) {
              Py_INCREF(res);

              f = Py_TYPE(res)->tp_descr_get;
              if (f != NULL) {
                  tmp = f(res,
                      /* Only pass 'obj' param if this is instance-mode super
                         (See SF ID #743627) */
                      (su->obj == (PyObject *)starttype) ? NULL : su->obj,
                      (PyObject *)starttype);
                  Py_DECREF(res);
                  res = tmp;
              }

              Py_DECREF(mro);
              return res;
          }

          i++;
      } while (i < n);

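This shows that the lookup really does walk the classes in the MRO list.
tmp = PyTuple_GET_ITEM(mro, i) first picks a class from the MRO, then dict = ((PyTypeObject *)tmp)->tp_dict gets that class's __dict__, and res = PyDict_GetItem(dict, name) looks the function up in that dictionary.

The super_init function corresponds to the __init__ method of the Python-level super: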

static int
super_init(PyObject *self, PyObject *args, PyObject *kwds)
{
      superobject *su = (superobject *)self;
      PyTypeObject *type = NULL;
      PyObject *obj = NULL;
      PyTypeObject *obj_type = NULL;

      if (!_PyArg_NoKeywords("super", kwds))
          return -1;
      if (!PyArg_ParseTuple(args, "|O!O:super", &PyType_Type, &type, &obj))
          return -1;

      if (type == NULL) {
          /* Call super(), without args -- fill in from __class__
             and first local variable on the stack. */
          PyFrameObject *f;
          PyCodeObject *co;
          Py_ssize_t i, n;
          f = PyThreadState_GET()->frame;
          if (f == NULL) {
              PyErr_SetString(PyExc_RuntimeError,
                              "super(): no current frame");
              return -1;
          }
          co = f->f_code;
          if (co == NULL) {
              PyErr_SetString(PyExc_RuntimeError,
                              "super(): no code object");
              return -1;
          }
    ......
    ......

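The if branch for type == NULL in the code above handles calling super() without arguments in Python. As we can see, the C code also walks back through the call stack (PyFrameObject) to obtain the class that calls super and the first argument of the function that calls super.
Closing remarks

The my_super in this article is based only on my own understanding of super, and may not account for some implementation details of the real super in Python. I have also not tested my_super thoroughly, so I cannot guarantee that it works in every scenario.

I am a beginner who has been learning Python for only half a month. If there are mistakes in this article, corrections in the comments are welcome.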
————————————————
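Copyright notice: this article is an original work by CSDN blogger 「昨夜星辰_zhangjg」, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/zhangjg_blog/article/details/83033210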