Python DBAPI 2.0规范

Python DB-API 2.0规范
PEP:  249
标题: Python Database API Specification v2.0
版本: 83893e13db91
修改日期: 2008-03-03 12:37:19 +0000 (Mon, 03 Mar 2008)
作者: Marc-André Lemburg <mal at lemburg.com>
翻译: 佣工7001
讨论邮件: <db-sig at python.org>
状态: 最终
类别: 信息
创建:  
历史:   
替代:
248

译者注:PEP是Python Enhancement Proposals的缩写,意为Python扩展协议


简介:

定义本API的目的在于鼓励促进所有用于访问数据库的Python模块相互之间的一致性。为了做到这点,我们希望实现一个一致性的指引,以便实现更易于理解的模块,
更加通用的易于在不同数据库间移植的代码,和Python中更广泛的数据库访问手段。
本接口规格由下面几个方面构成:
        * 模块接口
        * 数据库连接对象
        * 游标对象
        * Type Objects and Constructors
        * 给模块作者的实现提示
        * 可选的DB API扩展
        * 可选的错误处理扩展
        * 可选的两阶段提交(Two-Phase Commit)扩展
        * 常见问题
        * 从版本1.0到2.0的主要变化
        * 遗留问题(Open Issues)
        * 脚注(Footnotes)
        * 鸣谢(Acknowledgments)
    
    对于本规范的意见和为题可以去讨论列表the SIG for Database Interfacing with Python
    (db-sig@python.org)。

    更多关于Python书库接口的信息可以访问 http://www.python.org/topics/database/.

    本文档描述了Python DB-API2.0规范和一些常见可选扩展。上一个版本1.0
    仍然可以在这里PEP 248获得。鼓励程序包的作者们使用上一版的规范来实现,可以作为新版本的基础版本。

模块接口(Module Interface):

    数据库的访问是通过连接对象(connection objects)来实现的。
    程序模块中必须提供以下形式的连接对象构造函数:

        connect(parameters...)

            数据库连接对象的构造函数,返回值为Connection对象实例。 
            由于目的数据库不同,函数接收数量不等的一些参数。[1]
        
    模块中必须定义下面这些模块级的变量:

        apilevel

            字符串常量,表明支持的DB API版本。
            目前只允许取值'1.0'和'2.0'。
            
            如果没有定义本常量,默认为DB-API 1.0。
            
        threadsafety

            整数常量,表明模块支持的线程安全级别,可能的值为:

                0     线程不安全,线程不能共享模块。
                1     线程可以共享模块,但是不能共享连接对象(connections)。
                2     线程可以共享模块和连接对象。
                3     线程安全,线程间可以共享模块、连接对象以及游标对象(module,connections,cursors)。

            上文中共享的意思是两个线程在没有使用互斥信号(mutex semaphore)锁的情况下, 同时使用一个资源。
            要注意的是,你并不总能使用互斥信号来确保一个外部资源线程安全,这是因为外部资源很有可能依赖于一
            个全局变量或是其他的外部资源,然而这些是你不能控制的。

        paramstyle
          
            字符串常量声明模块使用的SQL语句中的参数引出方式。可能的取值如下[2]:

                'qmark'         问号方式,例如:'...WHERE name=?'
                'numeric'       序数方式,例如:'...WHERE name=:1'
                'named'         命名方式,例如:'...WHERE name=:name'
                'format'        通用方式(ANSI C printf format codes)例如:'...WHERE name=%s'
                'pyformat'      python扩展方式(Python extended format codes),例如:'...WHERE name=%(name)s'

    模块中应该按照下面所阐述的错误类别和层次关系来处理各种错误信息:

        Warning 
            
            当有严重警告时触发,例如插入数据是被截断等等。必须是Python StandardError的子类(定义于exceptions模块中)。
            
        Error 

            这应该是警告以外所有其他错误类的基类。你可以使用这个类在单一
            的‘except’语句中捕捉所有的错误。警告(Warnings)不应认为是错误,
            因此不应该以此类作为基类,而只应该以Python StandardError作为基类。
            
        InterfaceError

            当有数据库接口模块本身的错误(而不是数据库的错误)发生时触发。
            必须是Error的子类。

        DatabaseError

            和数据库有关的错误发生时触发。
            必须是Error的子类。
            
        DataError
          
            当有数据处理时的错误发生时触发,例如:除零错误,数据超范围等等。
            必须是DatabaseError的子类。
            
        OperationalError
          
            指非用户控制的,而是操作数据库时发生的错误。例如:连接意外断开、
            数据库名未找到、事务处理失败、内存分配错误等等操作数据库是发生的错误。
            必须是DatabaseError的子类。
            
        IntegrityError             
          
            完整性相关的错误,例如外键检查失败等。必须是DatabaseError子类。
            
        InternalError 
                      
            数据库的内部错误,例如游标(cursor)失效了、事务同步失败等等。
            必须是DatabaseError子类。
            
        ProgrammingError
          
            程序错误,例如数据表(table)没找到或已存在、SQL语句语法错误、
            参数数量错误等等。必须是DatabaseError的子类。
            
        NotSupportedError
          
            不支持错误,指使用了数据库不支持的函数或API等。例如在连接对象上
            使用.rollback()函数,然而数据库并不支持事务或者事务已关闭。
            必须是DatabaseError的子类。
        
    下面是错误类的层次关系:

        StandardError
        |__Warning
        |__Error
           |__InterfaceError
           |__DatabaseError
              |__DataError
              |__OperationalError
              |__IntegrityError
              |__InternalError
              |__ProgrammingError
              |__NotSupportedError
        
    注:这些例外(Exceptions)的值(错误信息)并没有明确的定义,但是它们应该能够给用户指出足够好的错误提示。
        

连接对象(Connection Objects):

    连接对象应该具有下面的方法:

        .close() 
          
            马上关闭数据连接(而不是当__del__方法被调用的时候)。
            连接应该此后变得不可用,再次访问本连接对象应该触发
            一个错误(Error或其子类),同样所有使用本连接对象的游标(cursor)
            对象,也会导致例外发生。
            需要注意的是,在关闭连接对象之前,没有首先提交对数据库的改变
            将会导致一个隐含的回滚动作(rollback),这将丢弃之前的数据改变操作。

            
        .commit()
          
            提交任何挂起的事务到数据库中。
            需要注意的是,如果数据库支持自动提交(auto-commit),必须
            在初始化时关闭。一般会有一个接口函数关闭此特性。
            
            不支持事务的数据库也应该实现此方法,只需什么都不做。
            
        .rollback() 
          
            由于并非所有数据库都支持事务,此方法是可选的。[3]
            
            对于支持事务的数据库,调用此方法将导致数据库回滚到事务
            开始时的状态。关闭数据库连接之前没有明确调用commit()提交
            数据更新,将隐含导致rollback()被执行。
            
        .cursor()
          
            方法返回给定连接上建立的游标对象(Cursor Object)。如果数据库
            没有提供对应的游标对象,那么将由程序来模拟实现游标功能。[4]
            

Cursor Objects:

.execute*()    游标对象表示数据库游标,游标用来管理获取结果操作的上下文对象。
    同一个连接对象创建的游标对象不是彼此隔离的,也就是说一个游标对象
    对数据库造成的变化将会对于其他游标对象立即可见。而不同的连接对象
    创建的游标,则可能是隔离的也可能是非隔离的,这取决于数据库对事务
    的支持的实现(参见连接对象的.rollback().commit()方法)。
        
    游标对象应具有以下的方法和属性:

        .description 
          
            这是一个只读属性,是7个项目组成的tulip的序列。

            每个tulip包含描述一个结果集中的列的信息描述:

              (name, 
               type_code, 
               display_size,
               internal_size, 
               precision, 
               scale, 
               null_ok)

            其中,前两个项目(name and type_code)时必须的,其他的五项是可选的。
            如果没有意义可以设置为None。

            对于没有返回结果集的操作或者游标对象还没有执行过任何.execute*()的操作
            本属性可以为空(None)。
            
            type_code的含义可以比对下面Type对象的描述。
            
        .rowcount 
          
            这是一个只读属性,描述的是最后一次数据库操作影响的数据行数
            (执行.execute系列方法)。 可以是数据查询语句(DQL),比如
            'select'等返回的数据行,也可以是数据操纵语句(DML )比如'update' 和'insert'语句
            等所影响的数据行。

            如果还没有执行过任何语句,或者操作本身影响的函数由于数据访问接口的原因不能检测到。
            则本属性的值为-1  [7]

            注:将来的版本有可能重新定义后一种情况,使其取值为空(None)而不是-1。
            
        .callproc(procname[,parameters])
          
            (由于并非每种数据库都支持存储过程,此方法是可选的 [3])
            
            调用数据库存储过程时,首先必须给出存储过程的名字,其次,对于存储过程需要的
            每一个参数都必须依次给出。调用结果按照调用时的次序,输入型参数(Input parameters)
            原样不动,输出型和输入输出二合一型参数可能会被新的内容代替。
            
            存储过程也很可能以数据结果集作为返回结果,此时就要用标准的fech系列方法来
            获取结果了(.fetch*() methods)。
            
        .close()
          
            立即关闭游标(不论 __del__方法是否已调用)。从此刻开始游标对象就变得不可用了。
            任何试图访问此游标对象的方法或属性的动作都将导致一个错误Error或其子类被抛出。
            
        .execute(operation[,parameters]) 
          
            准备和执行数据库操作(查询或其他命令)。所提供参数将会被绑定
            到语句中的变量。变量的定义和数据库模块有关。(请参见模块的
            paramstyle属性的描述)。[5]
            
            游标对象将会保留这个操作的引用,如果一个后续的相同的操作被调用,
            游标对象将会以此来进行优化。当有相同的操作调用(不同的参数变量被传递)
            时,这是最为有效的优化。
            
            一项数据库操作,为了获得最大的执行效率,最好先期使用方法.setinputsizes() 来
            指定参数的类型和大小。执行时实际给出的参数和预定义的不同也是合法的,模块的实现
            需要容忍这个问题,即使以效率的损失为代价。
            
            参数可以以tuples的tuple或list的形式提供,例如当在一次调用中插入多行数据。但是
            这种调用应被认为是抛弃的不建议使用,应该使用专用的方法.executemany() 。
            
            没有对返回值进行明确界定。
            
        .executemany(operation,seq_of_parameters) 
          
            准备数据库操作(查询或其他命令),然后以序列的序列形式的函数
            来执行该操作。
            
            模块开发这可以自由选择是转化为一系列的.execute() 方法调用,还是以
            数组操作的形式,以便使数据库把这个序列的操作作为一个整体。
            
            使用此方法,可能产生一个或多个由未知的行为构成的结果集。
            建议模块作者(而不是要求)当检测到一次调用已经产生结果集时抛出例外。
            
            对于.execute()方法的描述同样可适于此方法。
            
            返回值未定义。
            
        .fetchone() 
          
            从一查询结果集中获取下一行数据,返回值为一个值的序列,如果没有更多数据
            了则返回None。[6]
            
            如果上次的.execute系列方法的调用没有生成任何结果集()或还没有进行任何数据
            库操作的调用,则调用此方法将抛出例外(Error或其子类)。

        .fetchmany([size=cursor.arraysize])
          
            Fetch the next set of rows of a query result, returning a
            sequence of sequences (e.g. a list of tuples). An empty
            sequence is returned when no more rows are available.
            
            The number of rows to fetch per call is specified by the
            parameter.  If it is not given, the cursor's arraysize
            determines the number of rows to be fetched. The method
            should try to fetch as many rows as indicated by the size
            parameter. If this is not possible due to the specified
            number of rows not being available, fewer rows may be
            returned.
            
            An Error (or subclass) exception is raised if the previous
            call to .execute*() did not produce any result set or no
            call was issued yet.
            
            Note there are performance considerations involved with
            the size parameter.  For optimal performance, it is
            usually best to use the arraysize attribute.  If the size
            parameter is used, then it is best for it to retain the
            same value from one .fetchmany() call to the next.
            
        .fetchall() 

            Fetch all (remaining) rows of a query result, returning
            them as a sequence of sequences (e.g. a list of tuples).
            Note that the cursor's arraysize attribute can affect the
            performance of this operation.
            
            An Error (or subclass) exception is raised if the previous
            call to .execute*() did not produce any result set or no
            call was issued yet.
            
        .nextset() 
          
            (This method is optional since not all databases support
            multiple result sets. [3])
            
            This method will make the cursor skip to the next
            available set, discarding any remaining rows from the
            current set.
            
            If there are no more sets, the method returns
            None. Otherwise, it returns a true value and subsequent
            calls to the fetch methods will return rows from the next
            result set.
            
            An Error (or subclass) exception is raised if the previous
            call to .execute*() did not produce any result set or no
            call was issued yet.

        .arraysize
          
            This read/write attribute specifies the number of rows to
            fetch at a time with .fetchmany(). It defaults to 1
            meaning to fetch a single row at a time.
            
            Implementations must observe this value with respect to
            the .fetchmany() method, but are free to interact with the
            database a single row at a time. It may also be used in
            the implementation of .executemany().
            
        .setinputsizes(sizes)
          
            This can be used before a call to .execute*() to
            predefine memory areas for the operation's parameters.
            
            sizes is specified as a sequence -- one item for each
            input parameter.  The item should be a Type Object that
            corresponds to the input that will be used, or it should
            be an integer specifying the maximum length of a string
            parameter.  If the item is None, then no predefined memory
            area will be reserved for that column (this is useful to
            avoid predefined areas for large inputs).
            
            This method would be used before the .execute*() method
            is invoked.
            
            Implementations are free to have this method do nothing
            and users are free to not use it.
            
        .setoutputsize(size[,column])
          
            Set a column buffer size for fetches of large columns
            (e.g. LONGs, BLOBs, etc.).  The column is specified as an
            index into the result sequence.  Not specifying the column
            will set the default size for all large columns in the
            cursor.
            
            This method would be used before the .execute*() method
            is invoked.
            
            Implementations are free to have this method do nothing
            and users are free to not use it.
            

数据类型对象及构造(Type Objects and Constructors):

    Many databases need to have the input in a particular format for
    binding to an operation's input parameters.  For example, if an
    input is destined for a DATE column, then it must be bound to the
    database in a particular string format.  Similar problems exist
    for "Row ID" columns or large binary items (e.g. blobs or RAW
    columns).  This presents problems for Python since the parameters
    to the .execute*() method are untyped.  When the database module
    sees a Python string object, it doesn't know if it should be bound
    as a simple CHAR column, as a raw BINARY item, or as a DATE.

    To overcome this problem, a module must provide the constructors
    defined below to create objects that can hold special values.
    When passed to the cursor methods, the module can then detect the
    proper type of the input parameter and bind it accordingly.

    A Cursor Object's description attribute returns information about
    each of the result columns of a query.  The type_code must compare
    equal to one of Type Objects defined below. Type Objects may be
    equal to more than one type code (e.g. DATETIME could be equal to
    the type codes for date, time and timestamp columns; see the
    Implementation Hints below for details).

    The module exports the following constructors and singletons:
        
        Date(year,month,day)

            This function constructs an object holding a date value.
            
        Time(hour,minute,second)

            This function constructs an object holding a time value.
            
        Timestamp(year,month,day,hour,minute,second)

            This function constructs an object holding a time stamp
            value.

        DateFromTicks(ticks)

            This function constructs an object holding a date value
            from the given ticks value (number of seconds since the
            epoch; see the documentation of the standard Python time
            module for details).

        TimeFromTicks(ticks)
          
            This function constructs an object holding a time value
            from the given ticks value (number of seconds since the
            epoch; see the documentation of the standard Python time
            module for details).
            
        TimestampFromTicks(ticks)

            This function constructs an object holding a time stamp
            value from the given ticks value (number of seconds since
            the epoch; see the documentation of the standard Python
            time module for details).

        Binary(string)
          
            This function constructs an object capable of holding a
            binary (long) string value.
            

        STRING

            This type object is used to describe columns in a database
            that are string-based (e.g. CHAR).

        BINARY

            This type object is used to describe (long) binary columns
            in a database (e.g. LONG, RAW, BLOBs).
            
        NUMBER

            This type object is used to describe numeric columns in a
            database.

        DATETIME
          
            This type object is used to describe date/time columns in
            a database.
            
        ROWID
          
            This type object is used to describe the "Row ID" column
            in a database.
            
    SQL NULL values are represented by the Python None singleton on
    input and output.

    Note: Usage of Unix ticks for database interfacing can cause
    troubles because of the limited date range they cover.


给模块作者的实现方法示意(Implementation Hints for Module Authors):

    * 时间/日期对象(Date/time objects)可以用Python的datetime模块中
      的对象来实现(Python 2.3版本开始提供,2.4版本开始提供C版本API),
      或者使用mxDateTime包中对象(可供Python1.5.2版本以上使用)。
      它们都提供有足够的构造方法,使用方法(在Python中和C中使用都可以)。
        
    * 下面是一个Unix下基于ticks构造为通用的date/time对象的代理示例:

        
import time

        def DateFromTicks(ticks):
            return Date(*time.localtime(ticks)[:3])

        def TimeFromTicks(ticks):
            return Time(*time.localtime(ticks)[3:6])

        def TimestampFromTicks(ticks):
            return Timestamp(*time.localtime(ticks)[:6])

* The preferred object type for Binary objects are the buffer types available in standard Python starting with version 1.5.2. Please see the Python documentation for details. For information about the C interface have a look at Include/bufferobject.h and Objects/bufferobject.c in the Python source distribution. * This Python class allows implementing the above type objects even though the description type code field yields multiple values for on type object: class DBAPITypeObject: def __init__(self,*values): self.values = values def __cmp__(self,other): if other in self.values: return 0 if other < self.values: return 1 else: return -1 The resulting type object compares equal to all values passed to the constructor. * Here is a snippet of Python code that implements the exception hierarchy defined above: import exceptions class Error(exceptions.StandardError): pass class Warning(exceptions.StandardError): pass class InterfaceError(Error): pass class DatabaseError(Error): pass class InternalError(DatabaseError): pass class OperationalError(DatabaseError): pass class ProgrammingError(DatabaseError): pass class IntegrityError(DatabaseError): pass class DataError(DatabaseError): pass class NotSupportedError(DatabaseError): pass In C you can use the PyErr_NewException(fullname, base, NULL) API to create the exception objects.

可选的DB API扩展(Optional DB API Extensions):

    在 DB API 2.0生命期内,模块作者经常扩展他们的实现提供超越规范要求的
    内容,为了增强兼容性和指明清晰的升级到将来DB API版本的道路,本章节
    定义了一系列的通用对于DB API2.0核心规范内容的扩展。

    对于所有的DB API可选规范,数据库模块作者可自由选择是否实现这些附加的
    方法和属性,因此使用它们很有可能导致抛出一个AttributeError或者是
    NotSupportedError,因为这些方法属性是否支持智能在运行时做检查。

    因此,已建议在用户使用这些扩展内容时会得到警告消息(Python warnings)
    为了使这项特性好用,警告消息必须标准化以便可以屏蔽它们。这些标准的信息
    下面称之为“警告信息”("Warning Message")。

    Cursor Attribute .rownumber

        This read-only attribute should provide the current 0-based
        index of the cursor in the result set or None if the index
        cannot be determined.

        The index can be seen as index of the cursor in a sequence
        (the result set). The next fetch operation will fetch the row
        indexed by .rownumber in that sequence.

        警告信息(Warning Message): "DB-API extension cursor.rownumber used"

    Connection Attributes .Error, .ProgrammingError, etc.

        All exception classes defined by the DB API standard should be
        exposed on the Connection objects as attributes (in addition
        to being available at module scope).

        These attributes simplify error handling in multi-connection
        environments.

        
         警告信息(Warning Message): "DB-API extension connection.<exception> used"

    Cursor Attributes .connection

        This read-only attribute return a reference to the Connection
        object on which the cursor was created.

        The attribute simplifies writing polymorph code in
        multi-connection environments.

        Warning Message: "DB-API extension cursor.connection used"

    Cursor Method .scroll(value[,mode='relative'])

        Scroll the cursor in the result set to a new position according
        to mode.

        If mode is 'relative' (default), value is taken as offset to
        the current position in the result set, if set to 'absolute',
        value states an absolute target position.

        An IndexError should be raised in case a scroll operation would
        leave the result set. In this case, the cursor position is left
        undefined (ideal would be to not move the cursor at all).

        Note: This method should use native scrollable cursors, if
        available , or revert to an emulation for forward-only
        scrollable cursors. The method may raise NotSupportedErrors to
        signal that a specific operation is not supported by the
        database (e.g. backward scrolling).

        
        警告信息(Warning Message): "DB-API extension cursor.scroll() used"

    Cursor Attribute .messages

        This is a Python list object to which the interface appends
        tuples (exception class, exception value) for all messages
        which the interfaces receives from the underlying database for
        this cursor.

        The list is cleared by all standard cursor methods calls (prior
        to executing the call) except for the .fetch*() calls
        automatically to avoid excessive memory usage and can also be
        cleared by executing "del cursor.messages[:]".

        All error and warning messages generated by the database are
        placed into this list, so checking the list allows the user to
        verify correct operation of the method calls.

        The aim of this attribute is to eliminate the need for a
        Warning exception which often causes problems (some warnings
        really only have informational character).

        
        警告信息(Warning Message): "DB-API extension cursor.messages used"

    Connection Attribute .messages

        Same as cursor.messages except that the messages in the list
        are connection oriented.

        The list is cleared automatically by all standard connection
        methods calls (prior to executing the call) to avoid excessive
        memory usage and can also be cleared by executing "del
        connection.messages[:]".

        警告信息(Warning Message):"DB-API extension connection.messages used"

    Cursor Method .next()
 
        Return the next row from the currently executing SQL statement
        using the same semantics as .fetchone().  A StopIteration
        exception is raised when the result set is exhausted for Python
        versions 2.2 and later. Previous versions don't have the
        StopIteration exception and so the method should raise an
        IndexError instead.

        警告信息(Warning Message):"DB-API extension cursor.next() used"

    Cursor Method .__iter__()

        Return self to make cursors compatible to the iteration
        protocol [8].

        警告信息(Warning Message):"DB-API extension cursor.__iter__() used"

    Cursor Attribute .lastrowid

        This read-only attribute provides the rowid of the last
        modified row (most databases return a rowid only when a single
        INSERT operation is performed). If the operation does not set
        a rowid or if the database does not support rowids, this
        attribute should be set to None.

        The semantics of .lastrowid are undefined in case the last
        executed statement modified more than one row, e.g. when
        using INSERT with .executemany().

        
        警告信息(Warning Message): "DB-API extension cursor.lastrowid used"

        

可选的错误处理扩展(Optional Error Handling Extensions):

    The core DB API specification only introduces a set of exceptions
    which can be raised to report errors to the user. In some cases,
    exceptions may be too disruptive for the flow of a program or even
    render execution impossible. 

    For these cases and in order to simplify error handling when
    dealing with databases, database module authors may choose to
    implement user defineable error handlers. This section describes a
    standard way of defining these error handlers.

    Cursor/Connection Attribute .errorhandler

        Read/write attribute which references an error handler to call
        in case an error condition is met.

        The handler must be a Python callable taking the following
        arguments:

          errorhandler(connection, cursor, errorclass, errorvalue) 

        where connection is a reference to the connection on which the
        cursor operates, cursor a reference to the cursor (or None in
        case the error does not apply to a cursor), errorclass is an
        error class which to instantiate using errorvalue as
        construction argument.

        The standard error handler should add the error information to
        the appropriate .messages attribute (connection.messages or
        cursor.messages) and raise the exception defined by the given
        errorclass and errorvalue parameters.

        If no errorhandler is set (the attribute is None), the
        standard error handling scheme as outlined above, should be
        applied.

        Warning Message: "DB-API extension .errorhandler used"

    Cursors should inherit the .errorhandler setting from their
    connection objects at cursor creation time.


可选两阶段提交扩展(Optional Two-Phase Commit Extensions):

    Many databases have support for two-phase commit (TPC) which
    allows managing transactions across multiple database connections
    and other resources.

    If a database backend provides support for two-phase commit and
    the database module author wishes to expose this support, the
    following API should be implemented. NotSupportedError should be
    raised, if the database backend support for two-phase commit
    can only be checked at run-time.

    TPC Transaction IDs

        As many databases follow the XA specification, transaction IDs
        are formed from three components:
    
         * a format ID
         * a global transaction ID
         * a branch qualifier

        For a particular global transaction, the first two components
        should be the same for all resources.  Each resource in the
        global transaction should be assigned a different branch
        qualifier.

        The various components must satisfy the following criteria:
    
         * format ID: a non-negative 32-bit integer.
    
         * global transaction ID and branch qualifier: byte strings no
           longer than 64 characters.

        Transaction IDs are created with the .xid() connection method:

        .xid(format_id, global_transaction_id, branch_qualifier)

            Returns a transaction ID object suitable for passing to the
            .tpc_*() methods of this connection.

            If the database connection does not support TPC, a
            NotSupportedError is raised.

        The type of the object returned by .xid() is not defined, but
        it must provide sequence behaviour, allowing access to the
        three components.  A conforming database module could choose
        to represent transaction IDs with tuples rather than a custom
        object.

    TPC Connection Methods

    .tpc_begin(xid)

        Begins a TPC transaction with the given transaction ID xid.

        This method should be called outside of a transaction
        (i.e. nothing may have executed since the last .commit() or
        .rollback()).

        Furthermore, it is an error to call .commit() or .rollback()
        within the TPC transaction. A ProgrammingError is raised, if
        the application calls .commit() or .rollback() during an
        active TPC transaction.

        If the database connection does not support TPC, a
        NotSupportedError is raised.

    .tpc_prepare()

        Performs the first phase of a transaction started with
        .tpc_begin().  A ProgrammingError should be raised if this
        method outside of a TPC transaction.

        After calling .tpc_prepare(), no statements can be executed
        until tpc_commit() or tpc_rollback() have been called.

    .tpc_commit([xid])

        When called with no arguments, .tpc_commit() commits a TPC
        transaction previously prepared with .tpc_prepare().

        If .tpc_commit() is called prior to .tpc_prepare(), a single
        phase commit is performed.  A transaction manager may choose
        to do this if only a single resource is participating in the
        global transaction.

        When called with a transaction ID xid, the database commits
        the given transaction.  If an invalid transaction ID is
        provided, a ProgrammingError will be raised.  This form should
        be called outside of a transaction, and is intended for use in
        recovery.

        On return, the TPC transaction is ended.

    .tpc_rollback([xid])

        When called with no arguments, .tpc_rollback() rolls back a
        TPC transaction.  It may be called before or after
        .tpc_prepare().

        When called with a transaction ID xid, it rolls back the given
        transaction.  If an invalid transaction ID is provided, a
        ProgrammingError is raised.  This form should be called
        outside of a transaction, and is intended for use in recovery.

        On return, the TPC transaction is ended.

    .tpc_recover()

        Returns a list of pending transaction IDs suitable for use
        with .tpc_commit(xid) or .tpc_rollback(xid).

        If the database does not support transaction recovery, it may
        return an empty list or raise NotSupportedError.


常见问题(Frequently Asked Questions):

    在论坛中经常看到关于DB API规范的重复性的问题。本节包括了一些人们常问的问题。

    Question: 

       当我使用.fetch*()之类的函数获取结果时,如何获取到一个字典形式的结果集而不是tuples。
       

    Answer:

       有几个可用工具来解决这个问题。多数都是利用了游标对象的.description
       属性作为基础来实现数据行的字典。

       注意,之所以没有扩展DB API规范来支持.fetch系列方法来返回字典,是因为
       这种方法有几个弊端。

       * 一些数据库及服务不支持区分字段名的大小写,或者自动把字段名转化为大
         写或小写。
 
       * 查询所生成的结果集中的字段不一定是表的字段名,并且数据库经常为这些列
         使用自己的方法来为这些字段生成名字。

       因此,要做到在不同的数据库中,都通过使用字典键值来分访问字段值,并且做到可移植的是
       不可能的。


1.0到2.0的主要改变(Major Changes from Version 1.0 to Version 2.0):

    Python DB API 2.0相对于1.0来说引入了几个很重大的改变。由于其中一些
    变动会导致已有的基于DB API 1.0的脚本不能运行。因此做了主版本号的改变,
    升级为DB-API 2.0规范来反映这些变化。
        
    下面这些是从1.0 到2.0最重要的改变:
        
        * 不在需要单独的dbi模块,而是直接打包进数据库访问模块当中。

        * 日期/时间类型添加了新的构造,RAW类型改名为BINARY。结果集中应该
          覆盖现代SQL数据库中的基本数据类型。

        * 为了更好的数据库绑定,添加了新的常量(apilevel, threadlevel, paramstyle)
          和方法(.executemany(), .nextset())。
            
        * 明确定义了需要用来访问存储过程的的方法.callproc()。
            
        * 方法.execute()的返回值定义有所改变。前期版本中,返回值定义是基于
          所执行的SQL语句类型的(这经常比较难以实现)-- 下载它没有了明确的
          定义;代替它的是用户应该访问更适合的.rowcount属性。模块作者可以仍然
          返回旧式的定义值,但是规范中不再会有明确定义,而且应该认为是取决于
          不同的数据访问模块的。
            
        * 例外的类在新的规范中有统一明确的定义。模块作者可以任意的以继承类的
          形式来扩展新规范中所定义例外的层次。

    DB API 2.0规范的追加扩展规范:

        * 定义了附加的可选的对核心DB-API功能的扩展功能集。


遗留问题(Open Issues):

    尽管2.0版本阐明了许多1.0版本遗留的问题 ,但是仍有一些遗留问题留待以后的版本来
    实现解决:
        
        * Define a useful return value for .nextset() for the case where
          a new result set is available.
        
        * Integrate the decimal module Decimal object for use as
          loss-less monetary and decimal interchange format.


脚注(Footnotes):

    [1] 作为实现准则,连接对象the connection constructor parameters should be
        implemented as keyword parameters for more intuitive use and
        follow this order of parameters:
        
          dsn         Data source name as string
          user        User name as string (optional)
          password    Password as string (optional)
          host        Hostname (optional)
          database    Database name (optional)
        
        E.g. a connect could look like this:
        
          connect(dsn='myhost:MYDB',user='guido',password='234$')
        
    [2] Module implementors should prefer 'numeric', 'named' or
        'pyformat' over the other formats because these offer more
        clarity and flexibility.

    [3] If the database does not support the functionality required
        by the method, the interface should throw an exception in
        case the method is used.
        
        The preferred approach is to not implement the method and
        thus have Python generate an AttributeError in
        case the method is requested. This allows the programmer to
        check for database capabilities using the standard
        hasattr() function.
        
        For some dynamically configured interfaces it may not be
        appropriate to require dynamically making the method
        available. These interfaces should then raise a
        NotSupportedError to indicate the non-ability
        to perform the roll back when the method is invoked.
          
    [4] a database interface may choose to support named cursors by
        allowing a string argument to the method. This feature is
        not part of the specification, since it complicates
        semantics of the .fetch*() methods.
        
    [5] The module will use the __getitem__ method of the parameters
        object to map either positions (integers) or names (strings)
        to parameter values. This allows for both sequences and
        mappings to be used as input.
        
        The term "bound" refers to the process of binding an input
        value to a database execution buffer. In practical terms,
        this means that the input value is directly used as a value
        in the operation.  The client should not be required to
        "escape" the value so that it can be used -- the value
        should be equal to the actual database value.
        
    [6] Note that the interface may implement row fetching using
        arrays and other optimizations. It is not
        guaranteed that a call to this method will only move the
        associated cursor forward by one row.
       
    [7] The rowcount attribute may be coded in a way that updates
        its value dynamically. This can be useful for databases that
        return usable rowcount values only after the first call to
        a .fetch*() method.

    [8] Implementation Note: Python C extensions will have to
        implement the tp_iter slot on the cursor object instead of the
        .__iter__() method.

鸣谢(Acknowledgements):

    非常感谢Andrew Kuchling,是他把Python Database API Specification 2.0由
    原始的HTML格式转为了PEP格式。

    非常感谢James Henstridge领导两阶段提交扩展API的讨论并最终使其标准化。




原文地址:https://www.cnblogs.com/dajianshi/p/2827096.html