Oracle Core 学习笔记二 Transactions 和 Consistency 说明

This chapterexamines the mechanisms Oracle uses to create the linked liststhrough undo records and, most importantly, how the code locates the end pointsof those lists.

--这一章主要看一下Oracle 创建和linked undo records,并且确认undorecords list的结束点。在前面的blog里提到Oracle 采用这种UNDO 机制的3个原因：readconsistency 和 Rollback，最后一个原因就是今天这篇Blog的内容。

Oracle Core 学习笔记一-- Redo 和 Undo 机制详解

http://blog.csdn.net/tianlesoftware/article/details/7626421

We’ll be lookingat the transaction table that Oracle keeps in each undo segment header block toanchor one set of linked lists, and the interested transaction list (ITL) thatOracle keeps in every single data (and index) block as the anchor toanother set of linked lists. Then we’ll take a closer look into the undosegment header to examine the transaction table control section (hereinafterreferred to as the transaction control) that Oracle uses as the anchor pointfor the final linked list.

在每个undo sement headerblock里都有一个linked lists，这个就是我们说的transaction table。同样在每个data（index） block 也会有一个linkedlist，叫interested transaction list(ITL).

We’ll finishwith a short note on LOBs (large objects), as Oracle deals with undo, redo,read consistency, and transactions differently when dealing with LOBs—or, atleast, the LOB data that is stored “out of row.”

--注意Oracle 处理普通字段与LOBs字段的区别，如果是是LOBs，Oracle 的redo ，undo，readconsistency 都会按照LOBs字段来处理，因为LOB的数据是存储是单独存储的，即“out of row”。

Oracle LOB 详解

http://blog.csdn.net/tianlesoftware/article/details/6905406

一．Conflict Resolution --冲突解决

Let’s imagine wehave to deal with a system where there are just two users, you and I, who areconstantly modifying and querying data in the small portion of a database.

--假设存在2个用户，不断的去修改和查询一部分数据。

If you areapplying a transaction to a database and I am simply querying the database, Imust not see any of your changes until the moment you tell me (by executing acommit; call) that I can see all of your changes. But even when you havecommitted your transaction, the moment at which I am allowed to see the changesyou’ve made depends on my isolation level (see the sidebar“Isolation Levels” in Chapter 2) and the nature of the work I am doing. So,from an internal point of view, I have to have an efficient method foridentifying (and ignoring) changes that are not yet committed as well aschanges that have been committed so recently that I shouldn’t yet be able tosee them. To make things a little more challenging, I need to remember that“recently” might not be all that recent if I’ve been executing a long-runningquery, so I may have to do a lot of work to get an accurate idea of when yourtransaction committed.

--假如用户A做了一个修改的事务,用户B正在查询，那么用户B看到的数据必须是用户A修改之前的，等用户A提交之后，才可以看到修改之后的数据。事实上，即使用户B提交了数据，用户A能否看见修改的数据还会受isolation level影响。

我们必须制定一种有效的方法来辨别事务最近是否提交. 实现这种方法有一定的难度，我们需要弄清楚这里的‘recently’可能不是所有的’recent’, 比如我们执行一个long-runing query，因此在事务commit时我们必须做很多的工作来获取准确的信息。

Viewing theactivity from the opposite perspective, when you commit your transaction(allowing your changes to become visible to other users), you need an efficientmechanism that allows you to let everyone see that you’ve committed thattransaction, but you don’t want to revisit and mark all the blocks that youhave changed, because otherwise this step could take just as much time as thetime it took to make the changes in the first place. Of course, if you decideto roll back your work rather than commit it, you will also need a mechanismthat links together all the undo records for the changes you have made, in theorder you made them, so that you can reverse out the changes in the oppositeorder. Since rolling back real changes is (or ought to be) a rare eventcompared to committing them, Oracle is engineered to make the commit as fast aspossible and allows the rollback mechanism to be much slower.

--从另一个观点来看，当我们提交事务，我们需要一个有效的机制来保证我们修改的数据所有人都可以看的到，但我们不想重新访问和标记我们修改的block,因为额外的操作会占用和我们之前修改数据同样多的时间。

当然我们可以可以rollback我们的事务，而不是提交，但是我们同样需要一个机制按照正确的顺序link 我们所有修改的undo record，这样才能回滚我们的数据。因为rollback 操作要比commit少很多，所以Oracle 设计时让commit尽可能的快，而允许rollback 机制慢一些。

One of the firstthings we need so that we can coordinate our activity is some sort of focalpoint for change. Since, in this scenario, you are the agent of change, yousupply the focal point or, rather, two focal points—the first is a single entryin a special part of the database to act as the primary reference point for thetransaction, and the second appears as an entry in every single table or indexblock that you change. We’ll start by looking at the reference point for thetransaction.

二．Transactions and Undo

When you createa database, you have to create an undo tablespace (and if you’re using RAC,this is extended to one undo tablespace for each instance that will access thedatabase). Unless you’re using old-style manual rollback management, Oraclewill automatically create several undo segments in that tablespace and willautomatically add, grow, shrink, or drop undo segments as the workload on thedatabase changes.

--在我们创建数据库的时候，我们必须创建undo tablespace（如果是RAC 环境，那么每个实例都会对应一个undo tablespace）。除非我们使用manul rollback 管理，这种情况下，Oracle 会在SYSTEM表空间下创建一些undo segments，然后用这些undo segments 来作为工作平台进行add，grow，shrink 或者drop undo segment操作。

Transactionmanagement starts with, and revolves around, the undo segments. The segmentheader block, which (for undo segments) is the first block of the segment,contains a lot of the standard structures that you will see in the segmentheader block of other types of segment—the extent map and the extent controlheader, for example—but it also contains a number of very special structures(see Figure 3-1), in particular the transaction table (TRN TBL:, a short listidentifying recent transactions) and the transaction table control section (TRNCTL::, a collection of details describing the state and content ofthe transaction table).

--事务管理从undo segment开始或者围绕undo segment进行，undosegment header block是undo segment的第一个block，header block 由一些保准的structures组成：extentmap，extent control header。在transaction table和transactiontable control section里也包含一些特殊的structures，如transaction table中的TRNTBL:，transaction table control section的TRNCTL::。

Figure 3-1. Schematic comparing key contentof different types of segment headers

The followingdump is an extract from a transaction table, restricted to just the first fewand last few entries and hiding some of the columns we don’t need to discuss.This extract includes one entry (index = 0x02) that represents anactive transaction.

--下面是从transaction table里抽取的dump文件，我们这里只看开头和结尾的部分entries，中间的的部分我们不讨论。 Entry(index = 0x02))代表的就是一个活动的事务。

TRN TBL: index statecflags wrap# uel scn dba nub cmt

---------------------------------------------------------------------------------

0x00 9 0x00 0x20130x001b 0x0000.016f1fc1 0x0180083e 0x00000001 1302762364

0x01 9 0x00 0x20140x001a 0x0000.016f1f54 0x0180083e 0x00000001 1302762364

0x02 10 0x80 0x20130x0002 0x0000.016f20fc 0x0180083e0x00000001 0

0x03 9 0x00 0x200c0x001c 0x0000.016f20d8 0x0180083e 0x00000001 1302762364

...

0x20 9 0x00 0x200f0x001f 0x0000.016f1c75 0x0180083f 0x00000001 1302762364

0x21 9 0x00 0x20130x0010 0x0000.016f1e0c 0x0180083f 0x00000001 1302762364

This dump isfrom an 8KB block size using automatic undo management on a system runningOracle Database 11g, and the restrictions on space imposed by the 8KB blockmean that the transaction table holds just 34 rows. (Earlier versions of Oracleheld 48 entries in automatic undo segments and 96 entries in manually managedrollback segments—which didn’t have an extent retention map—when using 8KBblocks).

--这个内容是从一个8KB的block 里dump出来的，环境是Oracle11g下的自动undo 管理,因为8KB大小的限制，每个transaction table只能保存34条记录，在Oracle11g之前的版本中，在使用automatic undo segments可以保存48条entries，手工管理rollbacksegment可以保存96条记录，使用manually managed rollback segment时，没有extentretention map。

Since there’sonly a limited number of entries in a transaction table and a limited number ofundo segments in an undo tablespace, you can only record details about arelatively small number of recent transactions, and you will have to keepreusing the transaction table entries. Reusing the entries is where the columnlabeled wrap# becomes relevant; each time you reuse anentry in the table, you increment the wrap# for that entry.

--因为每个事务表里只能保存有限的entry，并且undo 表空间里的undosegment也是有限的，我们只能记录最近小部分事务的详细信息, 因此我们必须重用transaction tableentry，重用entry与entry中的wrap#有关，每次使用事务表里的entry，该entry的wrap#值就会增加。

Note：

Occasionally Ihear the question, “Does the wrap# get reset every time the instance restarts?”The answer is no. As a general principle, any sort of counter that is stored onthe database is unlikely to be reset when the instance restarts. Remember,every slot in every undo segment has its own wrap#, so it would be a lot ofwork at startup to reset them all.

--一次偶然的机会我听到一个问题：每次instance 重启时wrap# 会被重置吗？答案是不会，一般来说，任何计数都会存储在数据库中，因此不会在实例重启时被重置，记住，任何undo segment中的任何slot都有自己的wrap#，所以如果在实例重启时重置他们就需要做大量的工作。

2.1 Start and End of Transaction --事务开始与结束

When a sessionstarts a transaction, it picks an undo segment, picks an entry from thetransaction table, increments the wrap#, changes the state to “active” (value10), and modifies a few other columns. Since this is a change to a databaseblock, it will generate a redo change vector (with an OP code of 5.2) that willultimately get into the redo log file; this declares to the world and writesinto the database the fact that the session has an active transaction.

--当一个事务开始时，它会申请一个undo segment，在从这个undo segment中的transactiontable中申请一个entry，增加entry的wrap#，并将其状态改成active（即值为10），然后修改其他的一些字段值。因为这些是改变数据库的block，所以也会生成redo change vector（这种情况的OP值是5.2）, 这些redo vector最终也会写入redolog file。这些操作完成后在表明是一个active transaction。

Similarly, whenthe transaction completes (typically through a commit; call), the session setsthe state back to “free” (value 9) and updates a few other columns in theentry—in particular, by writing the current SCN into the scn column. Again,this constitutes a change to the database so it generates a redo change vector(with an OP code of 5.4) that will go into the redo log. This moment is alsorather special because (historically) this is the “moment” when your sessionprotects its committed changes by issuing a call to the log writer (lgwr) towrite the current content of the redo log buffer to disc and then waiting forthe log writer to confirm that it has finished writing. Once the log writer haswritten, you have a permanent record of the transaction—in the ACID jargon, thetransaction is now durable.

--同样，当一个事务完成时（即commit），session 会重新将transactiontable中的entry对应的wrap# 状态改为9，然后更新其他的一些字段值，注意一点，这里会把current SCN 写入entry的scn 字段。同样，这个些操作也会生成redo vector（此时对应的OP值为5.4），并写入redo log，这个时候非常特殊，因为session 会调用logwrite 进程把当前的redo log buffer中的数据写入disc，然后等待logwriter确认已经写完，这样就保护它所提交的数据不会丢失，所有事务的record都会写入disc。

Note：

You will oftenfind comments on the Internet and in the Oracle documentation about the logwriter “creating a commit record.” There is no such action. When you commit,you modify a database block, specifically the undo segment header block holdingthe transaction table slot that you’re using, and this block change firstrequires you to generate a redo change vector (historically as a stand-aloneredo record) and copy it into the redo log buffer. It is this change vectorthat (very informally) could be called “the commit record”; but it’s yoursession (not the log writer) that generates it and puts it into the redo logbuffer, it’s just a specific example of the standard logging mechanism. Theonly special thing about “the commit record” is that once it has been copiedinto the log buffer, the session calls the log writer to write the currentcontents of the log buffer to disk, and waits for that write to complete.

--在文档中经常可以看一句：log writer “creating a commit record”,实际上并没有这个操作，commit时，会修改block，特使是undo segment headerblock，其记录了我们使用的transaction table slot，在修改之前也是需要先生成redo change vector（一个独立的redo record），然后把redovector写入redo log buffer，这个操作可以称为： the commit record，但其是在session中生成并写入buffer。

关于the commit record，一旦redovector copy进入log buffer，session 就会调用log writer 将redolog buffer中的当前内容写入disc，并等待写入操作结束，session 操作才能完成。这个机制也保证了修改的数据不会丢失，如果在没有写完之前，db crash了，那么session 就是失败的，在下次启动时会回滚相关的操作。

A transaction isdefined by the entry it acquires in a transaction table and is given atransaction ID constructed from the undo segment number, the index number ofthe entry in the transaction table, and the latest wrap# of that entry—so whenyou see a transaction ID like 0x0009.002.00002013, you can translate this into:undo segment 9, entry 2, wrap# 0x2013 (8,211 decimal). If you want to checkwhich undo segment this is and the location of the header block, you can alwaysquery view dba_rollback_segs by segment_id.

--事务由transaction table中的entry来定义，会提供一个transaction ID，其由undo segment number，transactiontable中entry的index number和entry中最新的wrap#构成，所以我们可以看到transaction ID，如：0x0009.002.00002013，从这个ID，我们可以获得如下信息： undo segment 号是9，entry 号是2，wrap#值是 0x2013 (8,211 十进制)，如果我们想检查使用了哪个undo segment和undohead block的位置，可以通过segment_id 来查询dba_rollback_segs 视图。

This transactionID will appear in several different places—a couple of the well-known placesare in the dynamic performance views v$transaction and v$lock. The examples ofdumps that I’ve printed so far came from an instance where nothing else wasrunning, so when I ran the following queries, I knew they would return just onerow which would be for the transaction I had started:

--在除了dump 文件，在v$transaction 和v$lock视图中我们也可以查到transaction ID的信息。如：

select xidusn, xidslot, xidsqn fromv$transaction;

XIDUSN XIDSLOT XIDSQN

---------- ---------- ----------

9 2 8211

select trunc(id1/65536) usn, mod(id1,65536)slot, id2 wrap, lmode from V$lock where type = 'TX';

USN SLOT WRAP LMODE

---------- ---------- ---------- ----------

9 2 8211 6

You’ll noticethat the lock mode on this “transaction lock” is 6 (exclusive, or X, mode).While my transaction is active, no one else can change that entry in thetransaction table, although, as you will see in Chapter 4, other sessions maytry to acquire it in mode 4 (share, or S, mode) so that they can spot themoment the transaction commits (or rolls back). You’ll also notice that whereI’ve been talking about an “entry” in the transaction table, the view refers toit as a slot, and this is how I’ll refer to it from now on.

--我们可以注意到lock 的模式是6，排他锁. 关于锁的更多说明参考：

ORACLE 锁机制

http://blog.csdn.net/tianlesoftware/article/details/4696896

2.2 The Transaction Table –事务表

Table 3-1 lists and describes the columnsfrom the transaction table extract presented earlier in the chapter.

--下表列出了事务表字段及其描述：

In fact, youdon’t need to do block dumps to see the transaction table information becauseit’s exposed in one of the x$ structures: x$ktuxe. This is one of the strangerstructures in Oracle because a query against the structure will actually causeOracle to visit each undo segment header block of each undo segment in thedatabase. The formatting of the contents is different and the cmt column(transaction commit time) isn’t available.

--一般来说，我们不需要dump block 来查看transaction table里的内容，我们可以直接通过x$ktuxe 来查看。每次查询这个字典，都会导致数据库访问所有undo segment 的undo segment header block。

/* Formatted on 2012/6/2 19:18:03 (QP5 v5.185.11230.41888) */
SELECT indx,
       ktuxesta,
       ktuxecfl,
       ktuxesqn wrap#,
       ktuxescnw scnW,
       ktuxescnb scnB,
       ktuxerdbf dba_file,
       ktuxerdbb dba_block,
       ktuxesiz nub
  FROM x$ktuxe
 WHERE ktuxeusn = 9 AND ktuxeslt <= 5

INDXKTUXESTA KTUXECFL WRAP# SCNW SCNB DBA_FILEDBA_BLOCK NUB

---- ---------- ---------- ------ -------------- --------- --------- -----

0INACTIVE NONE 8211 0 24059841 6 2110 1

1INACTIVE NONE 8212 0 24059732 6 2110 1

2ACTIVE NONE 8211 0 24060156 6 2110 1

3 INACTIVE NONE 8204 0 24060120 6 2110 1

4INACTIVE NONE 8212 0 24059364 6 2111 1

5INACTIVE NONE 8212 0 24059497 6 2110 1

So what we have in a transaction table is a“focal point” that records a transaction ID as follows:

--因此transactiontable是一个‘focal point’，其用如下信息来记录transaction ID。

1.A specific physical location stored inthe database

2.An indicator showing whether thattransaction has committed or is still active

3.The SCN for a committed transaction

4.Information about where we can find themost recent undo record generated by the transaction

5.The volume of undo generated by the transaction

This means wecan typically access critical information about the most recent N × 34transactions (where N is the number of undo segments available to end-userprocesses, 34 is the number of transaction table slots in an undo segment in11g, and assuming a fairly steady pattern of transactions) that have affectedthe database.

--因此着我们可以访问最近的N*34个事务（N是undo segments的数量，34是Oracle11g中每个单个transactiontable最多保存的entry数）。

In particular,if a transaction has to roll back, or if a session is killed and smon (systemmonitor) has to roll its transaction back, or if the instance crashes and,during instance recovery, smon has to roll back all the transactions that wereactive at the moment of the crash, it is easy to spot any active transactions(state = 10) and find the last undo block (the dba) each transaction was using.Then we can start walking backward along the chain of undo blocks for eachtransaction, applying each undo record as we go, because (as you saw in Chapter2) each undo record points to the previous undo record for the transaction. Itisn’t commonly realized, by the way, that when Oracle has applied all therelevant undo records, the last thing it does is update the transaction tableslot to show that the transaction is complete—in other words, it commits.

--如果事务发生了回滚，或者session 被kill，那么smon进程会回滚相关的事务。如果instance crash，那么在instance recovery时，smon 也会将事务回滚事务crash的时刻，我们可以通过entry(transaction table slot)的状态来判断事务是否是active（state=10），然后根据DBA查找每个事务最近使用的undo block，然后根据这些undo block的chain来backward每个事务，应用每个undo record。最后更新transaction tableslot（entry）标记事务完成，即commit。

Note:

It is possibleto declare named savepoints in mid-transaction and then rollback to savepointX. If you do this, your session keeps a list of the current savepointsin the session memory with the address of the last undo record created beforethe savepoint call was issued. This allows the session to apply undorecords in reverse order and stop at the right place. An interesting (butperhaps undocumented) side effect of creating a savepoint in a transaction isthat it seems to disable some of the array-processing optimization thatsometimes takes place in the construction of undo records.

--在事务操作中，我们可以可以创建一个savepoints，然后rollback到这个savepointX。如果我们创建了savepoints，那么session 会在创建savepoint之前，在sessionmemory中使用最新的last undo record address创建并保持一个list。这样允许session应用这些undo record来回复操作。

2.3 Reviewing the Undo Block --回顾Undo Block

It’s worthlooking at a small extract from an undo block at this point, because there is alittle detail about block “ownership” that you need to understand to completethe picture. Here’s the start of an undo block dump showing the recorddirectory and a little bit of the first and last records:

--下图是一个undo block的dump 内容，其中包含了开始和结束的部分records。

Without lookingtoo closely at the details, an undo block appears to be similar in many ways toan ordinary data block—there’s a header section with some control informationand metadata; there’s a row directory that lists the locations of the itemsthat have been stacked in the block, there’s a heap of items (in this case,undo records) stacked up from the end of the block, and then there’s the blockfree space in the middle. One important difference between table rows and undorecords, though, is that undo records don’t get changed (except in one specialcircumstance), so they always stay in the same place once they’ve been put intothe block—unlike table rows, which, as you saw in Chapter 2, may get copiedinto the free space as they are updated, leaving tangled pointers and(temporary) holes in the block (see Figure 3-2).

--如果不仔细区别，undo block 和data block还是很类似的，他们的header 部分都包含了一些control information 和metadata, 都有一个rowdirectory存储了对应block中存放stack 位置的lists，都有item的heap信息（recordheap 和row heap）。注意这里的heap是倒叙存放的，即从block的最后面开始存储，free space 是在中间。

Table row和undo record一个非常重要的区别：undo records 不会改变，所以一旦放入了undo block，他们会保留在相同的位置，而table rows在修改时会随机的存放在block中。

Figure 3-2. Schematic comparison of an undoblock and table block

Note：

There is one case where undo records do getmodified, but the modification is a change to a single byte flag, which meansthe record doesn’t change size and therefore doesn’t need to be copied for themodification. That single-byte change will still generate a few dozen bytes ofredo. The change occurs when a session is rolling back a transaction (orrolling back to a savepoint) and, when it uses the undo record, it sets a flagbyte in the record to a value for User Undo Applied. You can see this work reportedin the statistic rollback changes - undo records applied.

注意：

在这个case里，undo record 也会被修改，但是这里的修改是改变singlebyte flag，也就是说record的大小不发生变化，因此这里的undo record的修改也不需要进行copy操作。 Single-style的修改也是会生成0bytes redo， single-byte的修改反生在session回滚时或者rollback 到某个savepoint时，当使用undorecord

，它会在undo record中设置flag byte为一个有效值。可以使用undo recordapply来查看相关rollback change的统计信息。

Looking at thetop line of the preceding block dump, the xid: (transaction ID) is0x0008.029.00002068, which means that this is undo segment 8 (0x0008), the“owner” of this undo block is currently a transaction that is using slot 41(0x029) from the transaction table (since the slot number is over 34, we caninfer that this is from an older version of Oracle, rather than 11g), and thisis the 8,296th time (0x00002068) that the transaction slot has been used. Wecan also see from the incarnation number (seq: 0x97a) that the undo blockitself has been wiped clean (newed in Oracle-speak) and reused 2,426 times.

Note：

When Oracle is about to reuse an undo block,it doesn’t care about the previous content, so it doesn’t bother to read itfrom disk before reusing it; it simply allocates a buffer and formats a newempty block in the buffer. This process is referred to as newing the block. Ifyou have enabled Flashback Database, though, Oracle will usually decide that itneeds to copy the old version of the block into the flashback log, so it willread it before newing it. This action can be seen in the statistic physical readsfor flashback new. This mechanism isn’t restricted to undo blocks – you willsee the same effect when you insert new rows into a freshly truncated table,for example – but it is the most common reason for this statistic to startappearing when you enable database flashback.

--注意：

当Oracle 重用undo block时，它不会关心undoblock里之前的数据，因此在使用之前不会先去磁盘读该undo block里的数据，而是简单的分配一个buffer，然后在buffer里个格式化一个emptyblock。这个过程被成为: newing the block. 如果我们启用了Flashback Database， Oracle一般会决定是否copy 就的block到flashbacklog里，因此在这种情况下会先读这个undo block。该动作的具体信息可以查看statisticphysical read for flashback new。这个机制不限于undo blocks.

There’s an odddiscrepancy, though, in the first line of record #0x1, where we can see thetext slt: 0x17, which doesn’t match the first line of the last record (#0xC) inthe block, where we see the text slt: 0x29. This means the first record was putinto this undo block by a transaction using slot 23 (0x17) of the transactiontable while the last record was put there by the transaction using slot 41(0x29)—which is what we expect since that’s the one that “owns” the block.

It is alittle-known fact that a single undo block may contain undo records frommultiple transactions. This oversight is, I think, amisinterpretation of a comment in the Oracle documentation that transactionsdon’t share undo blocks—a true, but slightly deceptive, statement. Atransaction will acquire ownership of an undo block exclusively, pin it, andthen use it until either the block is full (at which point the transactionacquires another undo block and updates its transaction table slot to point tothe new block) or the transaction commits.

--另一个鲜为人知的事实，一个undo block里可能包含多个事务的undo record。事务会获取自己专有的undo block，pin block，然后使用该block，直到该block变满（此事事务会申请另一个undo block，并更新transaction table slot指向新的undo block）, 或者提交事务。

If there’s stillenough empty space left in the block when the transaction commits(approximately 400 bytes the last time I tested it), the block will be added toa short list in the undo segment header called the free block pool. If thishappens, the next transaction to start in that undo segment is allowed to takethe block from the pool and use up the remaining space. So active transactionswill not write to the same undo block at the same time, but severaltransactions may have used the same undo block one after the other.

--如果在事务提交时还有足够空间剩余（大约400 bytes），那么该block就被添加到undosegment header中叫作free block pool区的short list中。如果做了此次操作，那么在下次事务开始时，undosegment 会允许从block pool里拿到这个block并使用剩下的空间，所以对于active 事务，不会在同时写入相同的undo block，但是对于多个事务可能会使用相同的undo block。

In general,then, the last record in an undo block will belong to the transaction thatcurrently “owns” the block, but in extreme circumstances,any transaction that has put records into that block will be able to identifyits own records because it has stamped its records with its slot number.

Note：

Occasionallypeople get worried about the number of user rollbacks their systems arerecording, more often than not because they’velooked at the statistic Rollback per transaction %: in an Automatic WorkloadRepository (AWR) or Statspack report. Don’t worry about it until after you’velooked at the instance activity statistics transaction rollbacks and rollbackchanges - undo records applied. It’s quite possible that you’ve using one ofthose web application servers that issue a redundant rollback; call after everyquery to the database. This will result in lots of user rollbacks that don’tturn into transaction rollbacks, but do no work.

--注意：

有人会担心他们系统使用user rollback来进行record的数量，多半因为他们在AWR或者statspack中看到： Rollback pertransaction %。不用担心这个问题，除非我们看到statistics transaction rollback 和rollback change（undo record applied）。还有一种可能就是我们的web应用执行了redundantrollback。

So you know howto start and end a transaction and how to deal with rolling back a transaction,either voluntarily or after a session or system crash. There are lots moredetails we could investigate about the inner workings of transaction control,but we’ve covered the main activity that surrounds the transaction table. It’stime now to look at undo from another perspective and turn our attention to thedata blocks and the ITL structure that transactions use as the focal point forthe changes they make to a block.

三．Data Block Visits and Undo –访问DataBlock 和 Undo

Any time yoursession looks at a data block, it needs to ensure that what you see is theappropriate version of the data. This means that, from an external point ofview, your session should not see any uncommitted data, or data that wasmodified and committed since the start of your query (or DML statement or eventransaction—depending on the isolation level). This is referred to as aread-consistent version of the data.

--任何时间查看data block，都需要保证我们看到的数据是合适的版本，即不能看到任何没有提交的数据，或者从我们查询开始时数据已经修改并提交。具体可以参考数据的read-consistent version。

Note：

It’s easy toforget that read consistency is also a necessary prerequisite to changing data.If your session is supposed to modify the data in a block, then, from aninternal point of view, it has to see it in two different ways—it has to seethe current version of the data, because that’s the only thing that can legallychange, and it has to see a read-consistent version of the data, because ifthere are critical differences between the two views, your session may have towait, it may have to restart the current statement, or it may even have to failand raise an error (typically ORA-08177: can't serialize accessfor this transaction).

--read consistency是change data的先决条件。如果session要去修改block里的数据，那么从内部的角度来看，会看到两种不同的版本：current version 和read-consistent version。

We’re going towalk through the details of how read consistency works in the next fewsections, so we need to set up a little data, see exactly what it looks like,and then watch it very closely as one session makes changes and another sessionworks to avoid seeing those changes.

3.1 Setting the Scene –设置场景

We’ll start withthe example of querying the data. Imagine the following sequence of events in amultiuser environment where there are three other sessions apart from your ownsession connected to the database, and a table defined and loaded by the followingSQL :

假设在多用户下执行如下操作：

create table t1(id number, n1 number);

insert into t1 values(1,1);

insert into t1 values(2,2);

insert into t1 values(3,3);

commit;

Before allowinganything else to happen, let’s take a look at a block dump of the single blockin our table at this point. We need this dump so that we can watch the internalinformation changing as one session modifies the user data and another sessiontries to work backward toward a specific version of that data.

--在执行其他操作之前先dump 表中的一个block，这样好方便我们查看修改数据时block中的变化：

The detail weare really going to pursue in this section is the interested transaction list(ITL)—the tabular section near the top of this dump starting with a set oflabels at line 5 and holding two rows of information

3.2 The Interested Transaction List -- ITL 说明

Table 3-2 lists and describes each item inthe ITL.

下表是ITL中各字段的说明：

OraceITL(Interested Transaction List) 说明

http://blog.csdn.net/tianlesoftware/article/details/6573988

We can see inthe initial block dump that the ITL for our block contains two entries—this isthe default number when creating a table or index in Oracle Database 9i orlater. If you want to create (or rebuild) an object with a larger ITL in eachblock, which you might do to avoid contention at higher levels of concurrentmodification, you can do so by setting the initrans parameterat object creation time—but the ITL in any block can grow dynamically if itneeds to, provided there is enough free space available in the block. The sizeof the ITL is limited by the maxtrans parameter (at least forearlier versions of Oracle), which has a syntactic limit of 255 but a hardlimit dictated by the size of the object’s data block. The syntactic limit is,unfortunately, ignored in versions 10g and later; and in the case of an 8KBblock size, the hard limit is 169.

--从dump 文件中，我们可以看出该block的ITL有2个entry。这个也是从Oracle9i之后版本中创建表或者索引时默认的entry数。如果我们要创建（重建）一个对象，且对象block使用较大的ITL数，那么这样可以避免在高并发情况下的争用。我们也可以在创建对象时通过设置initrans来控制ITL数，实际上任何block中的ITL 都是可以根据需要动态增长的，只要block里还有足够的空间。ITL的最大数限制是255，实际上其大小受datablock的大小决定。在Oracle 10g之后的版本，对于8KB的blocksize，最大的ITL数只能是169.

Initrans ：默认值1，该参数表示在单一块中最初活动的交易事务数。

Maxtrans ：默认值是255，表示在单一块中最大交易事务数。

Oracle Table创建参数说明

http://blog.csdn.net/tianlesoftware/article/details/4954417

There are a few odditieswith initrans. For an index, the value for initrans applies only to the leafblocks—each branch block (which includes the root block) will get one ITLentry, which is used only for block splits. The first ITL of a leaf block isreserved for leaf block splits, and there is one special case: when you createa new index on an empty (or sufficiently small) table, the single block of theindex has to behave both as a root (branch) and a leaf, so it gets two ITLentries—one because it’s a branch and the second because it’s also a leafand therefore needs an ITL for actions other than block splits.Historically the default value of initrans for a table was 1, but in recentversions of Oracle this changed to 2 even though the data dictionary will stillreport the value as 1. (And if you load some blocks using direct path loads,you will find that they initially have three ITL entries.)

--注意initrans参数一个特异性。对于索引（二叉树结构），initrans 只在leaf blocks上申请，每个branch block(包含rootblock) 都只获取一个ITL entry，这个操作仅用来做block splits。leafblock的第一个ITL用来做leaf block splits。

有一个特例：当在一个空表或者非常小的表上创建索引时，索引单个block 需要做为root 和leaf，在这种情况下，就有两个ITL entry。第一个原因：它是一个分支，第二个原因：它也是一个leaf，因此需要ITL来进行block splits。

之前的Oracle中，表的initrans参数默认值是1，但是最近的版本中Oracle改成2，即使数据字典还是显示为1.（如果使用directpath load来装载一些block，那么可以发现他们的的初始化有3个ITLentries。）

Oracle 10g 对象默认 ITL 数量测试

http://blog.csdn.net/tianlesoftware/article/details/7640766

The ITL existsto identify transactions that recently changed a data block, but it takes spaceto identify a transaction, and Oracle likes to keep the ITL as short aspossible. (Oracle doesn’t shrink an ITL once it has grown; arguably, there’sgenerally no good time to do so, but you may occasionally come across a fewextreme cases where it seems to be an “obvious” thing to do.) Moreover, on anindex leaf block split, the old ITL is copied forward into the new leafblock—and this strategy can end up wasting a lot of space.

--ITL 的作用是为了识别最近对data block做修改的事务，但它也需要占用block的空间，所以Oracle更希望将ITL保存的近可能的短。一旦ITL分配之后就不会进行shrink。

在索引的leaf block split上,也是直接将旧的ITL复制到新的leaf block上，这个策略可以节省大量的空间。

Note: There aretwo other SCNs recorded at fixed locations on each data block: the cleanout SCN(labeled in the dump as csc:), which records the SCN at which the block waslast subject to full cleanout (see “Delayed Block Cleanout” later in thischapter), and the last change SCN (labeled in the dump as scn:), which recordsthe SCN as at the most recent change to the block and is linked with an extrabyte (labeled seq:) that records the number of times the block has changed atthat SCN (if the seq: reaches 254, it rolls over to 1 and triggers an incrementin the instance SCN).

--在每个data block的固定位置都有两种类型的SCN record：cleanoutSCN（dump里的标记是csc）和last change SCN（dump里暴击是scn），CSC 记录的是最后一次fullcleanout，而scn 则是最近一次change 时的SCN。Scn会与另个一值相关（标记是seq：），其记录是的在SCN时block改变的次数，如果seq达到了254，那么就会重1开始从新增加。

--每隔5 分钟，系统产生一次系统时间标记与scn 的匹配并存入SYS.SMON_SCN_TIME 表(由SMON 进程来进行Update操作)。

So what we seein this block is a short list of recent transactions. In fact, this block is sonew that one of those ITL entries hasn’t even been used yet—every item in ITLentry 0x02 is zeroed out. But in ITL entry 0x01 we see that transaction1.1.1de4 (undo segment 1, slot 1, sequence 7,652) changed the block recently.It committed at SCN 0x01731c46, but the block has not yet been cleaned outproperly because the flag is --U- for upper bound commit (we’lllook at the different effects that can appear from a commit call in a littlewhile), and the lock counter reports three rows locked. (If you glance furtherdown the dump, you can see that all three rows in the block show lb: 0x01—thethree rows in this block were all locked by the transaction currently reportedat ITL entry 0x01.) Finally, you can see from the uba that if you go to record5 of block 11,976 of file 6 (which should have sequence number 0x0543), you’llfind a description of how to reverse out the most recent change applied by the transaction.

--因为ITL，所以我们可以看到block上最近的事务的列表。在上面的dump文件中，ITL的entry0x02是0. 即这个entry还没有使用。但ITL entry 0x01我可以看见最近的事务：1.1.1de4 (undo segment 1, slot 1, sequence 7,652)。该事务的commit在SCN 0x01731c46，但是该事务还没有clean out. 因为flag 标记是：--U-。

Without dumpingall the relevant undo, I’ll just let you know that record 5 from that undoblock says “clear slot (row) 0x02 from the table block and change the uba forITL entry 1 to read 0x01802ec8.0543.04 (i.e. point to record 0x04 in thecurrent undo block)”; record 4 says: “clear slot (row) 0x01 from the tableblock and change the uba for ITL entry 1 to read 0x01802ec8.0543.03”; and,finally, record 3 says “clear slot (row) 0x01 from the block and change ITLentry 1 to ‘no previous use’.”

3.3 Concurrent Action –并发活动

Now that we havesome data ready, and a block with one used and one empty ITL entry, let’s startfour separate sessions and take the following step—in exactly the followingorder:

--分别开启4个session，然后按照如下顺序执行操作：

Session1: updatet1 set n1 = 101 where id = 1;

Session2: updatet1 set n1 = 102 where id = 2; commit;

Mysession: set transaction readonly;

Session3: updatet1 set n1 = 99 where id = 3; commit;

Mysession: select id, n1 from t1;

The first callin the session that I’ve labeled “My session” sets the isolation level (seeChapter 2) to read only, in effect freezing the database (from my point ofview) to a specific point in time or, to be a little more precise, to aspecific SCN.

From that pointonward, two restrictions apply to my session: first, I am not allowed to seeany uncommitted changes made by any other user (which is the standard readcommitted behavior in Oracle anyway), and second, more stringently, I am noteven allowed to see any committed changes made after that moment.

So when I run myselect statement, the uncommitted change made by session 1 and the committedchange made by session 3 must remain invisible, and I should see only thecommitted change made by session 2. My query has to return the result set:(1,1), (2,102), (3,3). On the other hand, because of the near-real-time natureof the way that Oracle changes blocks, all the changes will have been made tothe copy of the block that is in memory before I actually start to run myquery. So what’s going on inside Oracle to make it possible for me to see thecorrect result?

This (strippedto a minimum) is what the block looks like after the three sessions have madetheir changes and just before my query starts to run:

I forced Oracleto write this block to disk with a call to alter system checkpoint beforedumping it, to show that all the changes (including the uncommitted change fromsession 1) are not just in the buffered copy of the critical block but couldalso be on disk already. You might notice, by the way, that rows 0 and 1 are ata different location in the block (0x1f7b and 0x1f71 rather than the offsets0x1f97 and 0x1f8d that appeared in the previous dump) because their updatesincreased the row length, so Oracle had to copy the rows to the block’s freespace to make the change; on the other hand, row 2 is still in the samelocation (0x1f85) because the change to its value didn’t change the length ofthe row, thus allowing Oracle to do the update in place.

Look carefullyat the ITL: although we have executed three transactions affecting this blockin the very recent past, there are still only two ITL entries. This is becauseOracle tries to keep the ITL as short as possible, and will reuse ITL entriesfor committed transactions rather than extend the list. (The ITL entries willbe reused in the order of oldest commit SCN first.)

ITL entry 0x01is showing the effects of a commit cleanout, the rapid but incomplete cleanoutthat may get done to some of the changed blocks as a transaction commits. Theflag is set to upper bound commit, but the lock count is still set to 1, andthe Scn/Fsc label is still set to fsc even though the value itself is showing acommit SCN of 0x01731c83. If we check the body of the block, we see that row 2(the third row) is showing its lock byte (lb:) as 0x01—row 2 is the one rowthat had been locked by the transaction in ITL 1. This ITL entry is the ITLentry for our third transaction, the one that changed n1 to 99 for id = 3 andthen committed.

ITL entry 0x02looks as if it is not yet committed. For reasons we will see later, this may bedeceptive—it is possible that the transaction committed some time ago withoutOracle doing anything to tidy up the mess and mark the transaction ascommitted. In fact, this ITL is recording our first, uncommitted transactionand we can see that it has one row locked (Lck = 1 in the ITL) and that row 0is the one row in the block that is locked by ITL 0x2 (lb: = 0x2 in the body ofthe block).

It would be niceat this point to have an animated sequence of diagrams, showing the linksbetween an ITL entry and the relevant parts of the associated undo segment, butthe best I can do is Figure 3-3, which is a generic example showing how an ITLpoints to the related slot in the undo segment header’s transaction table,while the uba points to a specific record in an undo block in that segment,and—if this happens to be the most recent block in the transaction—how thetransaction table slot would also point to the same undo block.

四．Creating Consistency --CR block创建说明

So what happensnext? Oracle clones the block in memory and does a load of work to get from thecurrent version to a version that (again, stripped to the minimum) looks likethis:

--在我们修改data时，Oracle 会将对应的block copy一份到内存，然后做一些改变version的操作。

当我们想要查找某个数据的时候，发现这个数据块的版本比我们要查找的要新，那么我们只能从UNDO中去查找这个数据的前映像（PRE IMAGE），在回滚段中找到这个数据的前映像后，把前映像和CURRENT的数据块合并，就形成一个CR BLOCK。

Notice particularly that ITL entry 0x2 is back to the state “never been used” and thatITL entry 0x1 is reporting transaction ID 0x0009.00d.00002100 rather thantransaction ID 0x000a.00e.00001b93. Moreover, this transaction is marked ascommitted and cleaned out (Flag = C---) with the lock count reset to zero. Thistransaction is the committed transaction from session 2 that we are supposed tosee. The changes due to sessions 1 and 3—which hadn’t yet committed—havedisappeared completely from this version of the block.

--注意这里的ITL entry 0x2 又被标记为: never been used,但是ITLentry 0x1 报告的transaction ID 发生了变化，并且该事务被标记为commited和cleaned out(Flag 是C---)，lock 统计被重置为0.

Take a look atthe data: the values in the three rows are (1,1), (2,102), and (3,3), and allthree rows are showing a zero lock byte (i.e., none of the rows is locked).Here’s another little point to note, though—check the address in the block ofrow 0: it’s moved again, from 0x1f7b to 0x1f68.

Oracle’salgorithm has done the following:

--Oracle算法操作如下：

1. Clone the block into anotherbuffer, using the clone as the target for the next four steps.

--将block 克隆到buffer中，然后使用clone的block进行修改（CRblock）。

2. Apply cleanout to any committedtransactions that need it.

--对任何已经commited的事务应用cleanout。

3. Reverse out the changes fromany uncommitted transactions.

--回滚任何uncommited的事务的修改。

4. If there are any changes due totransactions with a commit SCN higher than the target SCN (i.e., the snapshotSCN at the start of the query, DML, or transaction), reverse out the changesfor the transaction with the highest commit SCN.

--如果因为修改（其他session 的操作）导致导致原block的commit SCN 高于我们clone block的SCN，根据高SCN 对cloneblock进行回滚操作，即保证block的数据是一致的（把其他的修改也同步到cloneblock中）。

5. Repeat step 4 as necessary.

--根据需要重复第四步。

经过以上5步骤的操作，就完成了CR block的创建。之前的blog，参考：；

CR(consistent read) blocks create 说明

http://blog.csdn.net/tianlesoftware/article/details/6529401

In our case,this didn’t take many steps. Here’s the ITL for the current version of theblock again, before any read-consistency work:

--在我们的示例中，并没有多少操作，在我们开始任何read-consistency 工作之前，ITL的信息如下：

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x000a.00e.00001b93 0x01800518.04f5.34 --U- 1 fsc0x0000.01731c83

0x02 0x0004.00c.00001964 0x018036ad.05ff.3a ---- 1 fsc0x0000.00000000

Oracle’s firststep is to check (by looking at the state of slot 12 in the transaction tablein undo segment header 4) that the transaction in ITL entry 0x02 is stillactive. Since it is, the changes it has made to this block have to be reversedout, so Oracle goes to undo block 0x018036ad, checks that its sequence(incarnation) number is 0x5ff, and then looks at record 0x3a, which tells itthat “column 1 of row 0 of table 0 should be changed to value c1 02” (Oracle’sinternal representation of decimal 1). This change affects the length of therow, so Oracle copies the row into the block’s free space and makes the change.The undo record also has the instruction to rewrite the ITL entry as “neverbeen used.” So the ITL in our clone now looks like this:

--Oracle 的第一步是检查ITL entry 0x02还是active状态的，因为它是活动的，所以它是活动的，所以我们需要先回滚这个事务的操作，从ITL的uba中获取对应的undo block，检查对应的sequence，然后找到对应的undorecord。因为这个操作会影响row的长度，索引oracle copy row到block中的空闲空间并进行change。操作完成之后，undo record 有一个指令来重新ITL的信息，标记为:never been used。所以最后ITL的信息如下：

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x000a.00e.00001b93 0x01800518.04f5.34 --U- 1 fsc0x0000.01731c83

0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc0x0000.00000000

There are nomore uncommitted transactions, so Oracle checks the SCN it’s supposed to reachand all the SCNs in the ITL and decides that 0x01731c83 is too high, and thatit’s the highest of the guilty SCNs. So Oracle starts the process of reversingout the changes due to transaction 0x000a.00e.00001b93 by reading undo record0x01800518.04f5.34. This undo record tells it to change column 1 of row 2 oftable 0 to c1 04 (decimal 3). Since this doesn’t affect the size of the row,Oracle makes the change in place. The undo record also tells Oracle that it hasjust unwound the first change made to this block by this transaction and thatit’s also holding the previous version of this ITL entry, which should also becopied back. As explained in Chapter 2, the first change a transaction makes toa block is a special case—it’s the need to select an ITL and preserve thecurrent contents that makes it special (and, occasionally all the current ITLentries will be in use and the transaction will have to create a new one—ifthere’s enough space). Specifically, the undo record contains some informationthat is clearly an ITL entry, which looks like this:

--这里没有uncommited 的事务，所有Oracle检查到达的SCN和ITL里的所有SCN，并决定0x01731c83是最高的SCN，所以Oracle启动恢复进程使用undo来回滚事务。

因为这个操作不影响row大小，所以Oracle直接替换。Undo record 也会记录相应的操作信息。这里要注意的是undo record里也会包含ITL entry 的一些信息，如下：

op: L itl: xid: 0x0009.00d.00002100uba: 0x018030ca.074c.11 flg:C--- lkc: 0 scn:0x0000.01731c7a

So Oracle copies this back in place,leaving the ITL of our clone looking like this:

--最后Oracle把我们修改之后的buffer中的block在copy回去，并修改我们克隆block中ITL的信息，如下：

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x0009.00d.00002100 0x018030ca.074c.11 C--- 0 scn0x0000.01731c7a

0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc0x0000.00000000

Again Oraclechecks to see if there are any SCNs in the ITL that are higher than the targetSCN—and at this point there aren’t. So this is the correct read-consistent (andvery private) version of the block that this session needs to use.

--最后在检查ITL 中是否有SCN 大于我们克隆block的SCN。如果有，会继续进行回滚操作，我们这里没有，所以就创建了一个read-consistent 版本的block。

Note：

When Oracleaccesses a row through an indexed access path, it may be able to use the RowCR(row consistent read) mechanism when it gets to the table block. This meansthat Oracle has gone through the complex processing I’ve just described to getthe index leaf block into a read-consistent state and has then been able tocheck that the row itself has not changed since the reconstructed commit SCNfor the index entry.

--如果oracle 使用index access path来访问某个row，它可能会使用RowCR（rowconsistent read）机制来获取block。

五．Consistent Doesn’t Mean Historic -- 一致状态并不是永久的

An interestinglittle point to pick up here is that the thing that we’ve constructed is aversion of the block that has never actually existed. We now have a block wherethe change from session 1 simply doesn’t exist; but the change from session 2is in place and visible. In the train of events up to the point where westarted our query, this never happened.

--这里有一个很有趣的知识点，我们构造的block version 实际上并不存在.

In fact, theclosest we came to the state of our read-consistent clone was probably themoment just before we started the update from session 3. At that moment the ITLfor the block looked like this:

--在我们开始在session 3中update 之前，是最接近read-consistentclone的状态，那时的ITL 信息如下：

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x0009.00d.00002100 0x018030ca.074c.11 --U- 1 fsc0x0000.01731c7a

0x02 0x0004.00c.00001964 0x018036ad.05ff.3a ---- 1 fsc0x0000.00000000

Notice, by theway, that ITL 0x01 is showing the effects of a commit cleanout(--U-, 1, fsc) even though the undo record that we eventually retrieved fromthe next transaction to use this ITL entry stored (C---,0,fsc). This is anexample of delayed logging block cleanout. The next transaction didn’t copy theinformation that was in the ITL entry, but rather stored the information thatwould have been in the ITL entry if Oracle had cleaned it out properly thefirst time around.

--注意这里ITL 0x01 显示的是commit cleanout效果，即使下个事务使用这个ITL时会获取对应的undo record。这个是一个delay cleanout的示例.下个事务不会使用直接使用这个ITL来存储信息，而是先clean 完上次事务的遗留工作后在存储。

Note：

There’s a common(and reasonably valid) approximation that read consistency takes cloned blocksback into “the past.” But it’s actually more subtle than that because we mayhave to deal with a mix of committed and uncommitted transactions, as well astidying up the side effects of commit cleanout. This is probably why RACdifferentiates between the CR (consistent read) state and the PI (past image)state. The PI state identifies copies of the block that are versions of theblock that really did exist at some point in the past, and could therefore beused as a basis for rolling forward if the instance holding the current versioncrashed.

--这是一个简单有效的方法，来把clone block back 到“the past”状态来实现read consistency. 实际处理上要比上面讲的更复杂，因为需要处理commited 和uncommited 事务的混合状态和commit cleanout的剩余操作。

这个也是为什么RAC 需要区分CR(consistentRead) 状态和PI（past image）状态。PI 状态用来识别copy的block，该version 的block在过去的某个时间点是真实存在的，并且在instance 持有的current version block chrash掉了，也可以用PIblock进行rolling forward操作。

There arevariations on this theme that we could consider, but I think I’ve said enoughabout the mechanism and given you enough block dumps to deal with in one go.It’s worth mentioning that things do get a little messier (particularly in 10g)because of the side effects of in-memory undo and private redo, but thedifferences are really just details of when changes actually become visible inblock dumps (from memory and from disk). To make things a little easier tofollow, I ran all my demonstrations with repeated calls to alter systemcheckpoint and alter system flush buffer_cache so that I could always pick upthe latest state of a changed block from disk and avoid some of the timingoddities from private redo and in-memory undo; this does affect the way thatOracle works, of course, but I don’t think the differences are material to thebasic understanding of what’s going on.

六．Work Done – 完成修改

There’s one lastlittle detail about read consistency that’s worth taking special note of atthis point: how to measure the work done. There are three significantstatistics:

还有一些关于read consistency 的细节在这里需要讨论：就是如何计算work done. 有三个重要的标准：

（1）CR blockscreated

（2）data blocksconsistent reads - undo records applied

（3）consistentgets - examination

Every time you startto construct a read-consistent copy of a data block, you increment the firststatistic, and every time you read an undo record from an undo block in theprocess of constructing a read-consistent copy, you increment the secondstatistic, and every time you visit an undo block to do this, you perform atype of block visit that requires you to acquire and hold the relevant cachebuffers chains latch as you read the block—this “single latch hold” visit isrecorded under the third statistic.

--每次当我们开始构建CR块是：是从data block copy一份到buffer中，然后增加第一次的statistic，当每次我们在进程中使用undo record来构建read-consistent copy时，增加第二次的statistic。每次我们访问undoblock来做这个操作时，我们需要申请并持有相关的cache buffers chains latch。那么singlelatch hold 就会出现在第三次的statistic。

Note:

It’s worthpointing out that whenever you do a flashback query, you’re doing the same typeof work that Oracle normally uses for read consistency—but you could be doing alot more of it because you’re trying to go read-consistent in the past.

--需要指出一点，当我们做flashback 查询时，我们也是做类似的操作。

Unfortunatelythere are other actions that visit blocks using the same “single latch hold”strategy—some accesses to index blocks, including index organized tables, andsingle table hash clusters, for example—and there are two otherreasons for undo records applied (we’ll see the third one soon), sothese figures don’t give you an immediate and perfect understanding of theimpact of constructing read-consistent blocks; but they are quite helpful.

--还有其他的一些actions 访问block 也使用”singlelatch hold” 策略。比如访问index block，包括index organized tables，singletable hash cluster。

Another relatedstatistic is the consistent changes figure, which has the same (or virtuallythe same) meaning as the data blocks consistent reads - undo records appliedbut is an older version of the statistic. Finally, you should be aware of thefact that you often find that you don’t need to do any work at all to constructthe correct version of a block, and many of your consistent reads may berecorded as no work - consistent read gets.

--另一些相关的statistic 是consistent changes figure，其和consistentread 意义一样，使用undo record apply，但它应用的是一个older version。最后我们意识到，在构建block 的correct version 时，我们不需要做任何操作。

Note：

Statisticsrelated to consistent gets record the work done getting the “right version” ofthe block. Unfortunately the statistics cover a multitude of sins. The basicstatistic, for example, counts the number of blocks that have beenreconstructed as well as the number of block visits you have made to undoblocks to do that reconstruction. Fortunately the visits to the undo blocks arealso recorded under consistent gets - examination, and the reconstructed blocksare recorded under CR blocks created. Unfortunately both these statisticsinclude counts from other tasks as well.

--statistics 记录了consistent gets 的过程，但是 statistics也掩盖了一些信息…

七．Commit SCN

Oracle Blockscn/commit scn/cleanout scn 说明

http://blog.csdn.net/tianlesoftware/article/details/6660530

Oracle blockcleanout 说明

http://blog.csdn.net/tianlesoftware/article/details/6663527

Finally we cometo the third linked list that runs through the undo records—the history ofcommit SCNs—and this is a topic that covers a lot of ground.

--在学习笔记一中已经提高undo records的2种linkedlist，这里看的是第三种linked list：history of commit SCN.

Earlier in thischapter I pointed out that when a transaction commits, it writes the currentSCN into its transaction table slot, and that the redo change vector for thisdata block change constitutes the entirety of the so-called “commit record.”

--当事务commit时，会将current SCN 写入transactiontable的slot里，然后redo change vectory 会对这个block上的操作构建一个：commit record。

Historicallythis was the only block change your session would make when committing atransaction, after which it would post a message to the log writer to copy thelog buffer to disk and wait for the log writer to confirm that it had completedthe write. The benefit of this approach (combined with the fact that the logwriter would sit in the background constantly writing the log buffer to diskanyway) was that a commit would take the same amount oftime irrespective of how much work you had done modifying data in the course ofthe transaction. This strategy earned the name fast commit.

--一般来说，仅在block change时才需要commit 事务，然后发布一条消息给log writer，让log 进程将log buffer 写入disk，然后等待logwriter确认，最后完成change 操作。这种方法的好处是commit 需要相同的时间，而不管我们数据所做的修改量的大小，因为这个原因，我们也称为：fast commit。

With the adventof Oracle 7.3, though, this strategy changed to address the“pinging” problem suffered by Oracle Parallel Server (OPS)—the precursor toRAC. If the only thing you’ve done to show that a transaction has committed isto update a transaction table slot, then it will be up tosomeone else to clear up the mess you’ve left behind in the data blocks—namely,clear the lock bytes, set the Fsc/Scn to the commit SCN, and set the commitflag to C---.

--从Oracle 7.3 开始，为了解决OPS的‘pinging’问题，这个策略发生改变。如果我们只做一件事，把事务commit 更新到transaction table slog里，然后由其他人来处理我们之前留下来的数据，也就是说，清理lock bytes，设置Fsc/Scn 为commitSCN，然后设置commit flag 为C---.

Note:

There are a fewplaces in Oracle which “hide” work by sharing it out. The principle is simple:if I tidy up the 1,000 blocks that I’ve changed, it will take me a lot of extratime; if the next 200 sessions that visit those blocks tidy up a few each, nosingle session will notice the overhead. There are times when lazy is good.

--Oracle 为了完成提交也做了一些其他的工作：假如我们需要处理1000个我们修改的block，如果在commit时就全部完成，那么会使用其他额外的时间。但是如果在接下来的200个session里面都会访问这些block，那么我们可以处理的工作丢给访问这些block的session来处理，这样每个session 只处理部分的block，那么效率就会高很多。

This lateraction is the mechanism known as delayed block cleanout, and most ofthe time it works very well because it’s an almost invisible overhead,especially in OLTP systems running on a single instance. But in RAC (OPS), ifyour session is the one that needs to do delayed block cleanout, you might haveto send a request to another instance, asking for the relevant undo segmentheader blocks so that you can find out the commit SCN of any apparentlyuncommitted transactions in the block you’re looking at. Under OPS, passing ablock around meant one instance writing it down to disk—possibly flushing someof the redo log buffer to disk before doing so—and the other instance readingit back; this is what is meant by “pinging.” The same thing can still happenwith RAC (but only rarely, we hope) where writes of this type are recorded bythe statistic fusion writes; in general, though, cache fusion means that weexpect most blocks that move between instances to travel across the network.

--上面的处理方法就是：delayed block cleanout, 大部分情况下其都会运行的非常好，尤其是单实例的OLTP实例下。如果是RAC 环境，使用delay block cleanout，就可能需要发送请求到其他的实例，来申请相关的undosegment header block，从而获取还未提交事务的commit SCN。

7.1 Commit Cleanout

Because of theold OPS problem, Oracle changed the fast commit to be a little slower. Whileyour session is changing data blocks, it is also building in its session memorya list (limited to 10 percent of the size of the cache) of those blocks, and onthe commit it revisits any of those blocks that are still in the buffer,updates the flag in the relevant ITL entry with the value -U--, and sets the commitSCN, but doesn’t bother to log those changes. This process is known as commitcleanout. It’s just enough work to tell any session that subsequently views theblock that the transaction has committed and exactly when it committed.

--因为OPS 的问题，Oracle 将fast commit边的稍慢一些。当session修改data block时，也会在session memory中构建一个modified block list，其最大可占用sessioncache 10% 的buffer。当提交时，会访问buffer中的这些block，更新block上相关ITLentry中的flag 为:-U--,然后设置commit SCN，这些操作都会被记录到redo log里。当进程被标记为commit cleanout时，它的工作就是通知访问这个block的session，事务已经提交或者什么时候提交。

Dave 说明：

Buffer cache中block 和disk上blockversion 不一致时，buffer cache中的block就会被标记为dirtyblock，dirty block 一般都是包含了我们的修改，为了数据的持久性，我们需要把这些dirty block刷新到disk上，commit操作会触发一些机制，比如checkpoint，会进行一个dbwr写的操作。 dirty block从不一致到一致的过程就是cleanout。

Note：

You might wonderwhy Oracle doesn’t do a complete cleanout on commit. The reason is probablyspeed. If you know which ITL entry you’ve been using (and that’s part of theinformation the session keeps in its list of modified blocks), youcan calculate exactly where in the buffered block you have to apply yourchanges. Remember, ITL entries do not move; on the other hand, thelocation of a row in a block can change, so the only way to clear a row lockbyte is by following pointers—which could take more time (especially in the badold days when CPUs had very small caches). There’s also the point that the fullcleanout would have to be logged, but Oracle has a strategy to avoid loggingthe commit cleanout.

--为什么Oracle 不在commit时完成cleanout？主要原因是速度。通过ITL entry，我们可以精确的计算缓存中我们需要修改的block。这里要记住，ITL entries 不会移动。另一方面，block中row的位置是可以改变的，所以只有一个方法，使用指针来清除row lock byte，但这可能需要更多的时间，也正式因为这个原因，fullcleanout 必须写log，但是Oracle有策略来避免log commit cleanout.

So the generalstrategy for dealing with commits is to make a logged change to the transactiontable slot, and an unlogged change to some of the data blocks. In fact, thework done during a commit cleanout has amazingly low visibility—not only is thechange to the ITL entry not logged, the action isn’t even reported as any typeof buffer visit that you would normally recognize. The script cleanout.sql(available in the Source Code/Download area of the Apress web site), forexample, creates a table with 500 rows spread across 500 blocks and thenupdates every row. On the commit, the work done—according to a snapshot ofv$sessstat—was as follows:

--一般处理commit操作的策略是只log transactiontable slot，而unlog 一些data block。事实上，在commit cleanout 期间的工作也不是很透明，不仅因为ITL entry 的改变没有被log，任何访问buffer 的行动也没有被report。

Name Value

---- -----

session logicalreads 13

db blockgets 1

consistentgets 12

db block changes 1

redo synchwrites 1

commitcleanouts 500

commit cleanouts successfullycompleted 500

redoentries 1

redosize 96

We have visitedall 500 blocks on the commit (commit cleanouts 500), but we’ve done virtuallyno block visits of the type we would normally think about (that is, sessionlogical reads, db block gets, consistent gets); moreover, despite changingthose 500 blocks, we’ve recorded just one db block change—which ismisleading—and a single, tiny redo entry—which is the change to the undosegment header as we updated the transaction table slot, and is truthfullyrecording the one thing we logged.

--在commit时，有500个commitcleanouts，但是实际上没有有任何我们通常意义上的block的访问（session logical reads,db block gets, consistent gets），而且，尽管我们改变了500个blocks，但是记录的只有一个block，这个change是我们在undosegment header上更新transaction table slot时的记录。

Note ：

Technically thelist of candidate buffers that a session keeps for a single commit cleanout isallowed to be 10 percent of the size of the buffer cache. So it is possible fora single commit to use quite a lot of CPU and cause quite a lot of latchactivity on commit—without showing exactly what’s going on, unless you knowwhere to look.

--singe commit cleanout 最大允许使用session buffer cache的10%来保存修改的block，这些block已list的形式存在在buffer中。因此对于singlecommit就可能会使用大量的CPU资源，同事导致大量的latch 操作.

An interestingvariation on this test is to flush the buffer cache (alter system flushbuffer_cache;) before issuing the commit—just to see what happens; the changein statistics is a little surprising:

--一个有趣的变化是我们在commit之前flush buffer cache（使用altersystem flush buffer_cache命令），我们就会收集的信息就会如下：

Name Value

---- -----

commit cleanoutfailures: block lost 100

commitcleanouts 100

As you can seeOracle seemed to give up after 100 attempts at doing a commit cleanout, anddidn’t even look at the other 400 blocks. This is an interesting example of thetype of “statistical guess” that Oracle makes about the best run-timestrategy—the commit cleanout doesn’t seem to be working very well here, so ithas stopped trying.

--正如上面的结果显示，Oracle 在尝试做了100次commitcleanout之后，放弃了这个操作，甚至没有去查看剩下的400个blocks。这个就是:statistical guess 的一个有趣示例，在这里Oracle 使用了：best run-time 策略，当commit cleanout不能很好工作时就会停止该操作。

In thisvariation of the test, of course, I now have 500 blocks on disk that have beenchanged by a committed transaction, but the relevant ITL entry in each blockmakes it appear as if the transaction has not yet committed. Interestingly, ifI, or another session, read those blocks back into the cache before I commit,then I am still able to clean them out when I commit.

--在上面的测试中，我们有500个blocks被修改，并被事务提交，但是每个block对应的ITL并没有标记为commit,有趣的事，如果本session或则会其他的session，在我commit之前读取这些blocks到cache中，我依旧可以进行commit操作。

Before taking a closerlook at what’s going to happen to those blocks if they aren’t in the cache,I’ll just pick up one last point about how the fast commit affects the workload.Script core_cleanout_2.sql is a version of the original commit test that I’vemodified to execute the following statement three times, first after updatingmy 500 rows, then after a checkpoint (alter system checkpoint;), and finallyafter the commit:

--这里看一下fast commit 有多少工作量：

/* Formatted on 2012/6/2 19:43:36(QP5 v5.185.11230.41888) */

SELECT objd, COUNT (*)

FROM v$bh

WHERE dirty = 'Y'

GROUP BY objd

ORDER BY count (*);

The code simplycounts the number of buffers in the cache which have their “dirty bit” set,grouping by data object ID to make it easy for me to pick out the blocks frommy table. Ignoring some of the very small numbers that appeared because ofbackground activity, I found the following:

--这个代码查询buffer cache中block 状态为dirty的数量。查询的结果分析如下：

1. After the initial update, 504blocks from the table and 212 undo blocks were dirty.

--初始化的update后，504blocks和212 个undo block被标记为dirty。

2. After the checkpoint there were(as we would expect—see Chapter 6) no dirty blocks in memory. Remember that acheckpoint call makes Oracle copy any dirty blocks to disk; it doesn’t makeOracle remove them from the buffer cache.

--在checkpoint 之后，内存中没有dirtyblock，checkpoint 会copy 任何dirty block 到disk，但它不会从buffercache中移除这些block。

3. After the commit cleanout, 500blocks from the table and two undo blocks were dirty.

--在commit cleanout之后，500个blocks和2个undo blocks被标记为dirty。

Because of the“pseudo-delay” I introduced between the update and commit, Oracle is going toend up writing those 500 table blocks a second time. Realistically thisparticular detail probably won’t have any significant impact on most systems,but it’s just a little hint to remind you not to let your code wait a long timebetween the update and the commit.

Note:

When the currentversion of a block in the buffer cache doesn’t match the copy on disk, thebuffer is said to be dirty (a dirty block). When dbwr copies the buffer todisk, the buffer becomes clean because once again it matches the block on disk.

--当buffer cache中block的currentversion 和disk 的version 不匹配，那么buffer中的就会被标记为dirty。当dbwr 进程将dirtyblock 从buffer中copy到disk后，buffer 就会变成clean，因为disk和buffer中block的状态又一致了。

Let’s go back,now, to the case where we flushed the buffer cache and saw Oracle report 100commit cleanout failures and then ignore the last 400 blocks. What’s going tohappen to the 500 blocks the next time anyone reads them?

7.2 Delayed Block Cleanout

We have 500table blocks with committed changes, but the blocks weren’t in memory when weissued the commit, so Oracle wasn’t able to update their ITL entries to showthat the transaction had committed. So how will anyone find out, when theyquery the blocks, that the transaction is committed and (if they need to know)exactly when it committed?

--我们对500个blocks进行了commit 的操作，但是当我们执行commit命令时，block并不在内存中，所有Oracle 不能去更新block对应的ITL为commit状态。

If at this pointwe run a query that does a tablescan on the table, we’ll see statisticssomething like the following in 11g:

--那么在我们执行一个tablescan的查询时，收集的统计信息如下：

Name Value

---- -----

db blockgets 0

consistentgets 1,012

consistent gets from cache(fastpath) 501

consistent gets -examination 506

physicalreads 504

db blockchanges 500

calls tokcmgrs 500

redoentries 500

redosize 36,000

cleanouts only - consistent readgets 500

immediate (CR) block cleanoutapplications 500

commit txn count duringcleanout 500

cleanout - number of ktugctcalls 500

table scan blocksgotten 500

I have omittedvarious “small” values that aren’t relevant to the discussion, and a few of thestatistics that re-iterated the 500-block count of the table. The items I wantto pick up particularly are the following:

1. We looked at 500 table blocksin our tablescan but did 1,012 consistent gets: where did the extra 500 (and abit) come from?

--查询了500个tableblocks，但consistnet gets有1012个。

2. Of the 1012 consistent gets,506 are reported as examinations—and by now you know that one of the possiblecauses of examinations is looking at blocks from the undo tablespace.

3. Oracle has reported no db blockgets, but we have done 500 db block changes and generated 500 redo entries(averaging 72 bytes each). This is another example, like the one we saw withcommit cleanouts, where the well-known statistics are not giving us the fullpicture.

4. We have done 500 calls tokcmgrs (Kernel Cache Miscellaneous Get Recent SCN). As a side note on the wayin which the code and instrumentation changes, this statistic wasn’tincremented when I ran the test under 10g.

5. We have a number of statisticsrelating to “cleanouts,” including two that seem to be saying something aboutcommitted transactions: commit txn count during cleanout and cleanout - numberof ktugct calls (Kernel Transaction Undo Get Commit Time).

What’s happeningis this: as we read a table block from disk, Oracle can see that there is anITL that seems to hold an uncommitted transaction and can see the rows markedby that transaction, so it has to think about creating a read-consistentversion of the block. Our session reads the correct undo segment header block (consistentget - examination) to check the state of the transaction (it’spossible that it checks v$transaction before visiting the block, but that’sonly a conjecture), at which point it finds that the transaction has committedand can see the commit time (kcmgrs, kcmgct).

--当我们从disk上读取一个table block 时，Oracle 可以看到该block对应ITL的信息，判断该Block是否持有uncommit transaction，并且看到事务对应的row，如果存在uncommit 的事务，那么oracle 就必须创建一个CRblock。Session 从正确的undo segment header block（consistentget - examination）中检查事务的状态，这个状态可以指出事务已经提交，并且可以查看commit的时间。

Having found thecommitted transaction, Oracle can copy the commit SCN back to the ITL entry,set the commit flag, and clear the lock bytes from the block—but that’s a datablock change to something that we know is the current version of the block(it’s just come off disk, so it must be the current version even though wedidn’t report a db block get), and a data block change means we generate someredo.

--发现了commit 的事务，Oracle 就copy commitSCN 到ITL entry中，修改ITL中commit flag，clear lock byte。

This processhappens for every single block of the tablescan—hence the frequent appearanceof a value close to 500 in many of the statistics – and when the tablescan iscomplete, all the data blocks from the table are dirty and we’ve generated 500redo records. You might note, by the way, that our session doesn’t show anychange in redo synch write (that’s the counter the session increments to showthat it needed the log writer to write the log buffer to disk), so the redorecords won’t be written to disk immediately, but they will get there in a fewseconds, or when the next transaction commits, or when dbwr decides to writesome of the dirty table blocks to disk and posts lgwr to write the redoprotecting those blocks before it writes the blocks themselves (see Chapter 6).

--在tablescan时，每个block都会执行上面的操作。当tablescan 完成后，所有的data block 都变成了dirty 状态，并且也会生成500 条redo records。这里要注意，我们本次的session 并没有显示任何redo synch write的操作，因此redorecord并不是及时写的，可能需要过几秒或者在下次事务提交时，亦或者dbwr 决定写dirty blocks时，在写dirtyblock之前，都会写先对应的redo records。

因为延时写的原因，select也会产生redo和undo。

Note：

Parallelqueries—and serial direct path reads in 11g—havean interesting side effect on delayed block cleanout. When you do a direct pathread of a block (i.e., into private memory, the PGA, rather than the publicmemory, the SGA), Oracle still has to go through the routine of creating thecorrect consistent read version of the block. If this means doing delayed blockcleanout, the operation is carried out in private. In this case we don’tgenerate the redo and we don’t put a dirty copy of the block back into thebuffer cache. If I take the final tablescan from core_cleanout_2.sql and makeit a parallel tablescan (see script core_cleanout_3.sql, available in theSource Code/Download area of the Apress web site) the extra 500 consistentgets, calls to ktugct, and so forth are performed every single time I repeatthe tablescan. Similar behavior appears when you make tablespaces read only andthe tablespaces hold objects that are in need of some delayed blockcleanout—although 11g has introduced the concept of a minimum active SCN, whichmay have some impact on this behavior. 11g has also introduced the read onlytable, but when a table (rather than a tablespace) is declared read only, 11gwill still apply and write back the changes due to delayed block cleanout.

--并行查询(在Oracle 11g里是一系列的direct path reads) 也会影响delayedblock cleanout。当我们用direct path 来读取block时（写入privatememory（PGA），而不是写入public memory（SGA）），Oracle还是会通过机制来创建CR block。如果这里是做delay blockcleanout, 那么创建CR block的操作会按照private的执行，在我们的这个案例中，这里没有生成redo，我们也没有把block的dirtycopy回buffer cache。

We’ve seen thatOracle is able to recognize that a block is in need of delayed block cleanout,and has a mechanism for finding out the commit SCN of the relevant transactionsand cleaning out the block properly, but all we’ve seen so far is the tip of theiceberg.

--Oracle 可以识别到block是否需要进行delay block cleanout，有一个机制来查找相关事务的commit SCN并进行合适的cleanout。

In the examplewe ran our tablescan moments after we had committed the data, so when Oraclesaw the offending ITL entry, it could read the transaction ID (xid:) from theITL entry and say “let’s go to the correct undo segment header and transactiontable slot and see what’s happened.”

--在我们的示例中，是提交之后在查询table block，所以我们可以看到ITL entry，从ITLentry中，可以看到transaction ID，并且可以看到对应的undo segmentheader 和transaction table slot。

Remember,though, there are only 34 (or 48, or 96, depending on version and option)transaction table slots per undo segment, and only a limited number of undosegments per undo tablespace. If I create a brand-new undo tablespace in my 11gdatabase, switch into it, and drop the old tablespace, I’ll have 10 undosegments available (to start with). So how will Oracle deal with delayed blockcleanout if I execute 17,000 transactions (that’s 50 × 10 × 34) somewhere elsein the database before I reread the table? After all, with that much extra workhappening before I look at the table again, the transaction table slot I usedwhen updating the table is sure to have been overwritten. The answer, perhapssurprisingly, is that the resulting statistics look no different from theprevious test (see core_cleanout_4.sql, available in the Source Code/Downloadarea of the Apress web site) even though I must have overwritten every singletransaction table slot in the entire database about 50 times.

--每个undo segment 中只能存放34（48或96）个transactiontable slot，并且undo tablespace 中undo segment 的数量也是有限的，在Oracle 11g中，如果我们创建一个新的undo tablespace，并替换就的undo tablespace，那么它可以有10个undosegment 可用，如果我们执行了17000个事务（50*10*34），oracle处理delayblock cleanout时需要做大量额外的工作，当我们重用transaction table slot 时，必须确保其可以被重写。

So how did Ifind the information I need to tidy up the ITL entries? The answer (for thisexample) is that I don’t need to know when my original transaction committed; Ijust need to know that it has committed. As usual a data block dump is veryrevealing; this is what the ITL looked like in one of the table blocks afterthe cleanout (the relevant transaction entry is 0x02—notice the flag columnreads, C-U-):

--那么我们又该如何来查看我们需要处理的ITL entry？答案是我们不需要知道原始事务何时提交的，我们只需要知道它已经提交。Data block dump 可以揭秘这里的真相。下面是一个经过cleanout 的ITL 信息。

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x0001.007.00001e7b 0x01800381.0561.13 C--- 0 scn0x0000.0186e2e6

0x02 0x0003.005.000021a9 0x01800376.06fd.0c C-U- 0 scn0x0000.01877523

This is anexample of the delayed block cleanout using upper bound commit. Thesignificance of the C is that it marks the transaction as committed and showsthat the lock bytes for the rows will be clear. The significance of the U isthat the commit SCN is not guaranteed to be exactly correct—the transaction hadcommitted by the time Oracle got to this SCN, but it may have committedearlier. The querying session has done just enough work to figure out that thetransaction committed before the query started, and it’s used the oldest SCNthat it can find cheaply to determine this fact, and the work it has done isonly just a tiny bit more than the previous example where it had to look forthe right transaction table slot.

What’s happeningis similar to the previous case, to start with: Oracle looks at the block andsees that there is an ITL that seems to hold an uncommitted transaction and cansee the rows marked by that transaction, so it has to think about creating aread-consistent version of the block. As before, it goes to read the correctundo segment header block (consistent get - examination) to check the state ofthe transaction. But in this case the wrap# of the transaction is higher thanexpected (the slot has been reused about 50 times since the originaltransaction ran).

Because thetransaction slot has been overwritten, Oracle can infer that the transactionhas committed, which (in this case) is all it cares about. But it needs to putsomething in as the commit SCN on the ITL, so what value should it use?

It could simplyuse the SCN it finds in the overwritten transaction slot, but—in much the sameway that Oracle picks the oldest (least recently committed) entry when it hasto reuse an ITL entry from a data block—Oracle also reuses the oldest (leastrecently committed) transaction table slot in the undo segment when it needs tostart a new transaction, so a better approximation would simply be to use theoldest commit SCN it can find in the transaction table. In fact, Oracle can dojust a little bit better than that, because each time Oracle reuses atransaction table slot, it copies the previous commit SCN from that slot intothe transaction control mentioned at the start of this chapter. So, for theupper bound commit SCN, Oracle identifies the relevant undo segment, picks theSCN from its transaction control, and copies that into the ITL.

7.3 Transaction Table Rollback

There are caseswhere it’s very cheap to get a “sufficiently accurate” commit SCN for blocksthat require delayed block cleanout, but are there any circumstances that mightrequire us to do more work to improve the accuracy of the “upper bound”estimate?

The first thought is that our test case used a single session, which makes it a bitspecial, so what happens if we use the two sessions shown in Table 3-3?

It makes nodifference to the results; again, session 1 need only check that its originaltransaction committed before the final select statement began. It doesn’t needto know exactly when the transaction committed, so again it can use the SCNfrom the transaction control as a good-enough approximation to the commit SCN.

How about thesequence of events provided in Table 3-4, though?

By setting thetransaction to read-only, I am, in effect, faking a long-running query. Whenthe select statement runs at line 4, it has to see the database as it was atline 2. This produces some interesting results (the test case is a variationdescribed in core_cleanout_4.sql) from which I’ve extracted some key figures:

session logicalreads 2,407

consistentgets 2,407

consistent getsfromcache 2,407

consistent gets-examination 1,900

CR blockscreated 1

transactiontables consistent reads - undo recordsapplied 1,395

transactiontables consistent readrollbacks 1

Our first testshowed roughly 1,000 session logical reads with 500 consistent gets -examination. This test shows (roughly) an increase of 1,400 on both these statistics,and the same count on a statistic called transaction tables consistent reads -undo records applied. We have done a lot of work to find a suitable commit SCN;we have applied 1,395 undo records to create a single read-consistent copy ofthe undo segment header so that we can see what the transaction table lookedlike at the moment of the original commit.

八．Transaction Control (TRN CTL::) –事务控制

It’s time to saya little more about the transaction control content and what happens at thestart of a transaction. Here’s an example of the transaction control justbefore a new transaction starts in the undo segment:

--这里也该说一些事务控制的内容，以及开始一个事务时发生了什么。下面是我们开始一个事务之前undo segment中的的信息：

TRN CTL:: seq:0x08f5 chd: 0x000d ctl: 0x0017 inc: 0x00000000nfb: 0x0001

mgc: 0xb000 xts:0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)

uba: 0x0180120f.08f5.21 scn: 0x0000.018bc704

You’ll noticethat there’s an scn: recorded at the end of the transaction control, and justbefore it there’s a thing labeled uba:, a label we’ve seen before as part of anITL entry, meaning undo byte address but holding (as this uba also seems to) anundo record address. The obvious questions when we see these items are, “What’sin that uba?” and “Where does that scn come from?”

--我们可以注意到这里的2个值：scn：和 uba：，这里的uba是我们undo address。

You’ll alsonotice the chd (chain head, perhaps) and ctl (chain tail).These are the ends of a linked list that tell us, respectively, the nexttransaction table slot to use and the last transaction table slot used in thisundo segment.

--同时还有参数值：chd(chain head, perhaps) 和 ctl(chain tail). 这2个参数告诉我们linked list的结尾，所以下个事务在使用这个undosegment时可以使用的位置。

Since the next(chd) slot is 0x0d, the following is what thatslot looks like at this point (with a few details deleted to make the line fitthe page):

--因为下个slot（chd）是0x0d,所以对应的slot如下：

index statecflags wrap# uel scn dba nub cmt

--------------------------------------------------------------------------------

0x0d 9 0x00 0x6a600x000b 0x0000.018bc75e 0x0180120d 0x000000011305214727

If we now starta new transaction, the transaction control and transaction table slot change tobecome as follows:

--如果我们此时开始一个新的事务，那么transaction control 和transaction table slot 会变成如下：

TRN CTL:: seq:0x08f5 chd: 0x000b ctl: 0x0017 inc: 0x00000000nfb: 0x0000

mgc: 0xb000 xts:0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)

uba:0x0180120f.08f5.24 scn: 0x0000.018bc75e

index statecflags wrap# uel scn dba nub cmt

--------------------------------------------------------------------------------

0x0d 10 0x80 0x6a610x0017 0x0000.018bcd3e 0x0180120f0x00000001 0

Then, as wecommit (and assuming no other transactions have been using the same undosegment), they change to

--当我们commit时（这里假设没有其他事务使用相同的undosegment），变化如下：

TRN CTL:: seq:0x08f5 chd: 0x000b ctl: 0x000d inc: 0x00000000nfb: 0x0001

mgc: 0xb000 xts:0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)

uba:0x0180120f.08f5.24 scn: 0x0000.018bc75e

index statecflags wrap# uel scn dba nub cmt

--------------------------------------------------------------------------------

0x0d 9 0x00 0x6a610xffff 0x0000.018bcd92 0x0180120f 0x000000011305278699

We’ll take amoment to look at the chd, ctl, and uel values first, because that part isstraightforward. You will recall from table 3-1 that the uel column in thetransaction table slot points to the slot that should be used next. Before westarted the transaction, the transaction control was pointing to slot 0x000d,and slot 0x0d had a uel of slot 0x000b. As the transaction starts you can seethat the chd value changes to 0x000b, and the uel value changes to 0x0017

--这里我们要注意chd，ctl 和uel的值发生的变化。Uel 列的值指向了下一个slot。在我们开始事务之前，transaction control 指向slot 0x000d，并且slot0x0d 有一个slot 0x000b。当我们开始事务之后，可以看到chd 变成了0x000b，uel 变成了0x0017.

Note：

In some versionsof Oracle the uel will change to point back to “itself” when it’s in use, whilein other versions it will change to zero. This is probably an irrelevant changein detail because the state shows that the current slot is not available whateverthe uel is pointing to.

--在一些版本中，uel在使用时会指向其自己本身，而在另一些版本则可能指向0. 不过这个是一个不相干的问题，因为state 显示无论uel 指向什么，当前的slot都不可用。

Finally, afterwe commit, the ctl changes to 0x000d and the uel changes to 0xffff (“nothingafter me”). So we pick a slot from the head of the list, use it, and return itto the tail of the list. Consequently, the slot we pick for a new transactionis always going to be the oldest committed (lowest SCN) slot in the transactiontable.

--最终，在我们提交之后，ctl 变成了0x00d，uel 变成了0xfff。当我们从list的开头检查到一个slot，然后使用它，在将其指向list的结尾。因为，我们当前事务使用的slot总是transaction table中最低的SCN.

Another littlechange you might spot is the nfb entry in the transaction control. This is thenumber of free blocks in the free block pool, a short list of blocks in thisundo segment that still have space available for undo records. There was oneblock in this list before we started the transaction (nfb: 0x0001), but we tookit off the list to use it (nfb: 0x0000), and when we committed there was stillplenty of space in it so we put it back in the free pool (nfb: 0x0001). Here’sa dump of the free block pool (limited to the first entry) just after wecommit:

--另一个改变是nfb，nfb参数代表的是free block pool中free blocks的数量，在我们的示例中，在我们开始之前就剩一个可用（nfb=0x0001）,当我们用完之后，nfb变成0x0000,当我们提交之后，又有block放如了freepool，nfb 变成了0x0001，此时free block pool的dump 结果如下：

FREE BLOCKPOOL:: uba: 0x0180120f.08f5.24 ext: 0x17 spc: 0xe56

As you can see,the entry tells us how much space is still available in the block (spc), whichextent (ext) the block is in, and the undo record address (uba) of the lastundo record in that block. Because my transaction was a very small transactionusing just one undo record, the uba here is the same as the uba I wrote to thetransaction control—the uba in the transaction control is for the first recordof my transaction, and the uba in the free pool is for the last record of mytransaction, and in this case they’re the same thing.

--从这个dump结果，可以告诉我们block中还有多少空间可用（spc），extent的block是哪个（ext），block中最近一个undo record 的undorecord address（uba）。因为我们的事务非常简单，仅使用了一个undo record，所以这里的uba 地址相同。实际上，transaction control中的uba是我们事务的first record，free pool中的uba 是事务的最后lastrecord。

Let’s get backto the transaction control, and its scn and uba.

The scn has beenreplaced by the commit SCN from the transaction table slot I’ve just reused.This is what gives Oracle the fast way of finding the “oldest SCN in thetransaction table” very quickly in the simpler cases of the upper bound commit.It checks the slot it wants, sees that it is at the wrong wrap# for the transactionit is checking, and goes straight to the transaction control.

--scn的值被我们在transaction table slot中使用的commitSCN 所替换.

The uba has beenreplaced by the uba of the first undo record of the new transaction. Actually,although I’ve already claimed twice that 0x0180120f.08f5.24 is the first undorecord of the new transaction, that’s not immediately visible in theinformation I’ve printed so far. I had to dump the relevant undo block toconfirm that fact—and here are the first few lines dumped for that specificrecord:

--uba的值被新事务的first undo record中的uba值替换。

*-----------------------------

* Rec #0x24 slt:0x0d objn: 99692(0x0001856c) objd:99692 tblspc: 9(0x00000009)* Layer: 11 (Row) opc:1 rci 0x00

Undo type: Regularundo Begin trans Last buffersplit: No

Temp Object: No

Tablespace Undo: No

rdba: 0x00000000Ext idx: 0

flg2: 0

*-----------------------------

uba: 0x0180120f.08f5.21 ctl max scn:0x0000.018bc704 prv tx scn: 0x0000.018bc75e

txn start scn: scn: 0x0000.018bcd3e logonuser: 86 prev brb: 25170445 prev bcl: 0

I mentioned inChapter 2 that the first undo record of a transaction is a special case—this iswhere we finally discover how special.

九．First Change

Before westarted the transaction, the transaction control listed uba 0x0180120f.08f5.21and scn 0x0000.018bc704. These have appeared in the undo record (three linesfrom the bottom) with the labels uba: and ctl max scn:—the transaction controlinformation has been copied to the first record of the next transaction and hasthen been updated with a pointer to tell us where that record can be found.

--在我们开始一个事务之前，transaction control listed的uba是：0x0180120f.08f5.21，scn 0x0000.018bc704. 可以在undorecord的uba:,ctl max scn 发现这些值。Transaction control 的信息copy 到下个事务的firstrecord里，然后更新指针，告诉我们record的位置。

But there’smore. The previous contents of the transaction table slot that we have pickedfor our new transaction have also been copied into the new undo record. The scnand dba values in our slot were 0x0000.018bc75e and 0x0180120d, and we can nowsee them in the undo record with the labels prv tx scn: and brb (although thebrb has been dumped in decimal rather than hexadecimal). We’ve also saved thestart scn (0x0000.018bcd3e) for the new transaction in the undorecord with the (strangely repetitive) label txn start scn: scn:.

--transaction table slot里面之前的内容我们pick出来供新事务使用，并写入新的undo record。所以slot中Scn和dba的值分别是：0x0000.018bc75e 和0x0180120d，可以看到undorecord里面标签：prv tx scn：和brb。我们也会在新事务的undo record里使用txn,scn 标签来保存开始scn(0x0000.018bcd3e)。

In summary,then, every time a new transaction starts, it saves the old transaction tableslot information to its first undo record, updates the pointer in thetransaction control to point to that record, saving the previous value of thepointer into the same undo record. This makes it possible for a process to walka linked list of “first undo records” to reconstruct the transaction table backto any point in the past (until it runs out of undo records). Let’s walkthrough the process to see how it works.

--总的来说，每次新事务开始时，都会保存旧事务的slot信息到新事务的first undo record里，更新指针指向事务的 record, 然后把之前指针里的值保存到undo record里。这样就使进程已link的方式访问这个list，从而构造出block在过去某个时间的状态。

Imagine I visita block at some time in the future and see from the ITL that there are somerows that appear to be locked by transaction 0006.00d.00006a5f (that’s undosegment 6, slot 13, wrap# 27231).

--假设我们在将来的某个时刻访问一个block，从block的ITL中可以看到有一些row被事务locked，事务是： 0006.00d.00006a5f ( undo segment 6, slot 13,wrap# 27231).

I look at undosegment 6, transaction slot 13, and find that its wrap# is currently at 27233(0x6a61). Because the wrap# shows me the slot has been used a coupleof times since the changes made by transaction 0006.00d.00006a5f,I can conclude that the row changes have been committed but at a lower SCN thanI can currently see in the slot.

--我们查看undo segment 6，transaction slot 13，当前的wrap：27233(0x6a61). Wrap# 显示这个slot 从我们上次的事务（0006.00d.00006a5f）后被使用了2次。我们可以推断row的修改已经提交，且 commit SCN 要小于我们我们在slot中看到的SCN值。

Before I do anymore work, I may decide that that SCN is a good enough approximation for an“upper bound commit,” but if it isn’t, I can look at the SCN in the transactioncontrol because it’s either the previous commit SCN from slot 13 (althoughstatistically that’s very unlikely) or the commit SCN from anothertransaction/slot that committed after the previous commit from slot 13. So thetransaction control SCN is another possible “upper bound commit.”

At this pointall I’ve done is a quick visit to the undo segment header to get an approximatecommit SCN. But if that small amount of work hasn’t yet produced a good enoughapproximation, I now have to start working quite hard to create aread-consistent view of the transaction table that is sufficiently old that itgives me an even better, or possibly exact, commit SCN.

十．Transaction Table Consistent Read --事务表的一致读

The stepsinvolved in creating a read-consistent copy of the transaction table are asfollows:

--创建事务表一致读的步骤如下：

1.I clone theundo segment header block in memory—that’s the action that increments thestatistic transaction tables consistent read rollbacks.

--在内存中克隆undo segment header block。

2.I use the ubafrom the transaction control to identify the first undo record of thetransaction that last updated the transaction control. This was, as we haveseen, the oldest transaction slot available in the transaction table at thatmoment.

--使用transaction control中的uba来识别transactioncontrol中最后一个事务的first undo record。这样可以确定transaction table中可用的slot。

3.The undorecord tells me which transaction table slot its undo should be applied to(remember the slt: 0xNN entry on the record), and the commit SCN for that slot,so I can apply the undo to my clone—that’s the action that increments thestatistic transaction tables consistent reads - undo records applied. At thesame time I read back the uba and scn values that need to be written to thetransaction control and apply them.

--undo record提示哪个transaction table slot的undo需要被应用和该slot的commitSCN，所以我们可以通过apply来完成我们的克隆。此时我们读取到之前需要写入transactioncontrol并应用他们的uba和scn 值。

4. At this point I have taken thetransaction table and transaction control one step back into the past. It’spossible that the commit SCN I’ve recovered is “good enough” (i.e., the firsttime I’ve seen an SCN lower than the exact value I’m interested in) for anupper bound commit. It’s possible that this step actually took slot 13 (the oneI was interested in) back to exactly the right wrap# and gave me the exactcommit SCN. (In my case I said that I was starting with a wrap# of 0x6a61 andneeded to get back to wrap# 0x6a5f, so I’m not going to be that lucky thatsoon; I’ll have to get through wrap# 0x6a60 first.)

--此时我们将transaction table 和transactioncontrol 的状态带到了之前的，可能此时的commit SCN 正好接近与upper bound commit。如slot13 返回到了正确的wrap#和精确的commit SCN。在我们的案例中，开始的wrap#是0x6嗄1，而我们需要需要返回到0x6a56。

5. If I haven’t found a low enoughSCN yet, I have at least managed to construct an older version of thetransaction control, so I go back to step 2 and repeat until I reach a suitablevalue or run out of undo records and crash out with Oracle error “ORA-01555snapshot too old” because I haven’t kept enough history in the undo segment tolook that far back into the past.

--在第四部中，如果我们没有确定到合适的SCN，那么我们构建一个更老版本的transaction control，所以我们需要返回到第二部，重新读取一次，直到我们获取合适的值，或者到undo record 被crash out，我们接收到ORA-01555快照过旧的错误。

This, then, isthe last chain that runs through the undo segment. The first undo record ofeach transaction points to the first undo record of the previous transactionthat started in the same undo segment; the transaction control points to thefirst “first undo record”; and by following first undo records backward, we canreclaim critical details about the history of transaction table slots, andreconstruct old versions of the transaction table and identify the exact commitSCN of any transaction that took place at some time in the past and is now inneed of an accurate delayed block cleanout.

--这也是undo segment的最后一个chain。每个事务的firstundo record都指向上个事务的first undo record，然后在同一个undo segment里开始新的事务，transactioncontrol 也指向first undo record，在first undo record之后，我们可以增加其他的block，用来描述transaction table slots的历史信息，也可构建transaction table的old version和识别任何事务在某个时间的精确的commit SCN，以及用来判断是否需要进行delayed block cleanout。

ORA-01555

If you don’tknow that Oracle error 1555 translates into “snapshot too old,” you can’t be areal DBA. It is probably the most well known of error numbers in the Oracleworld. It happens because we can’t keep history indefinitely. (Anyone who’stried to create a full audit system for a database to hold seven years ofhistory knows the problem—a complete history requires a lot of space, and aftera while you hope that no one ever queries it because of the amount of I/O asingle audit trail query can take.) We define an undo tablespace (per instance)to hold history, but in the normal course of events we only want to hold enoughhistory to make sure that our big, slow queries can find enough history forread consistency as they run. This is quite difficult to do, because we createmultiple undo segments in the undo tablespace so that incoming transactionsdon’t end up competing for too few transaction tables; and sometimes a fewsegments can grow very large, using up an “unfair” share of the space, whichcan make it hard for Oracle to use the undo tablespace effectively.

To improve thesituation, Oracle Corp. introduced automatic undo management, allowing theinstance to take undo segments offline and bring them online, to grow andshrink them, to move available extents from one segment to another, and even toallocate and drop undo segments (the latter is under the control of thebackground process smon and is only triggered once every 24 hours). The targetdriving automatic undo management is the parameter undo_retention, which theDBA uses to state how long history should be kept (the default is 15minutes/900 seconds). It was the introduction of automatic undo management thatdemanded the presence of the retention table shown earlier in Figure 3-2. Thetable allows Oracle to decide when it is safe to remove an extent from an undosegment. In Oracle 11.2, the tracking has been enhanced by the internalintroduction of the minimum active SCN, which is also used to avoid some of thework done in finding upper bound commit times.

快照过旧的内容参考：

OracleORA-01555 快照过旧说明

http://blog.csdn.net/tianlesoftware/article/details/4745898

十一. LOBs

It’s worthsaying a few words about LOBs since Oracle has some special methods forhandling undo and redo on LOBs, and these methods follow through to thehandling of transactions and read consistency. If a LOB value has been storedin the row, it is in no way special; it’s just another column. But if a LOBvalue was stored “out of row” (either because the specific value was too largeor because the LOB has been declared with the disable storage in row option),then the LOB data itself is not subject to the normal undo handling and neednot be subject to the normal redo handling.

--Oracle 采用了特别的方式来处理LOBs上的undo 和redo。如果LOB的值存储在row里，那么方法一样，但如果存储在：out of row（因为LOB字段值可能很大，需要单独的来存储），那么普通的redo 和undo 的方法就不合适。

There arevariations on the basic theme, but the critical thing to remember with LOBvalues that have been stored out of row is that you have to access them throughpointers stored in the row or through the LOBINDEX, which, although it’s aslightly special variation of Oracle’s standard B-tree index, is subject to thetypical undo and redo processing and is therefore amenable to the normaltransactional and read-consistency behavior.

--当LOB的值存储在out of row时，我们访问LOB时，就必须通过存储在row里的指针或者LOBINDEX，LOB 索引和普通的B-tree索引有一些区别，其是受undo 类型和 redo 进程来控制，也因此支持normal transactional和read-consistency 的动作。

When you updatea LOB value that is stored out of row, Oracle creates a new copy of that valueand leaves the old one in place—the old copy is the undo, and part of the LOBdefinition tells Oracle how long it can keep old copies of LOB values. But aswe create the new copy, we update the LOBINDEX in two places: one to say wherethe new copy is, and the other to control the order in which old copies will beoverwritten. The limit for history can be set by reference to space (percentageof LOB space used for old copies) or time (number of seconds to keep old LOBs).In either case, it is possible to cause massive space and performance problemsif the nature of your LOB activity causes very frequent updates to occur.

--当我们更新out of row的LOB时，Oracle会对该值copy 一份并存放在一个位置，old copy就是该LOB的undo，LOB的定义属性会告诉Oracleold copy 要保存多长时间。当我们创建一个new copy时，我们更新LOBINDEX的2个位置：一个是告诉new copy的位置，另一个是将被重写的old copy 的位置。LOB的undo 保存时间可以根据空间或者根据时间来定义。

总结：

There are twokey structures in the database that make it possible for Oracle to handletransaction processing and read consistency efficiently: the interestedtransaction list (ITL) that appears in each data block and lists recenttransactions that affected that block, and the transaction table that appearsin the segment header block of each undo segment and lists recent transactionsthat affected the database.

--有2种结构让Oracle 处理transaction 和readconsistency 成为可能：

（1）ITL（InterestedTransaction List）：其存在每个data block，该list 会显示最近block上操作的事务的信息。

（2）transactionTable：其存在每个undo segment的segment header block中，该list 会显示数据库上最近操作的事务的信息。

An ITL entryrecords a transaction ID (xid:), an undo record address (uba:), and a commitSCN. The commit SCN tells Oracle if (and when) the transaction committed. Ifthe commit SCN is not available, then the transaction ID identifies atransaction table slot with sequence number, and this information allows Oracleto check the state of the transaction and when (if) it committed. If yoursession needs to hide the changes made by that transaction, it can use the undorecord address to find the start of a chain of undo records that tells it howto reverse the changes made by that transaction to that data block.

--一个ITLentry 记录包含：xid：，uba：，commitSCN. Commit SCN 告诉Oracle 事务是否提交，何时提交的。如果commit SCN 不可用，那么xid会用sequence号来标记transaction table slog，这个信息允许Oracle 来检查事务是否提交。

如果想回滚事务的操作时，可以使用uba地址来发现事务开始的undo record list的开始位置，然后根据这些undo record来回滚数据。

A transactiontable slot records the state of a transaction, the address of the last undoblock that it wrote to, and (when committed) the commit SCN. Because therearen’t many transaction table slots in a database, they are continuouslyreused, so the slot has a wrap# number counting the number of times it has beenused, and this also becomes part of the transaction ID. If a transaction has tobe rolled back, the pointer to the last undo block allows Oracle to find thelast undo record created by the transaction, and each undo record points to theprevious undo record for its transaction, so the undo records can be visitedand applied in the opposite order to the order in which they were created.

--transaction table slot 记录了事务的状态，最后一个undo block的地址，和commitSCN。因为数据库中slot的数量是有限的，所以需要循环使用，因此slot 需要wrap#号来统计slot使用次数，wrap#也是transactionID的一部分。如果事务回滚了，指向允许Oracle 查看最后的undo record，而每个undorecord又指向之前的undo record，通过这个undo record的list可以进行回滚操作。

Transactiontable slots can be overwritten fairly quickly, which means the commit SCN for atransaction would be lost if Oracle didn’t have a mechanism to preserveinformation from the transaction table slot before reusing it. Each undosegment header has a transaction control section summarizing the use of thetransaction table. As a transaction table slot is overwritten, its commit SCNis written into the transaction control along with the address of the firstundo record of the transaction that has just acquired the slot. The previousinformation in the transaction control is then written into the first undorecord of the new transaction, and this effectively builds a linked list ofundo records (the “first record” of each transaction) that can be used to rollthe transaction table back to an earlier point in time.

--Transaction table slots也可以被重写，为了保护事务的commit SCN不丢失，Oracle有一套机制来保护slot里的信息。每个undo segment header 有一个transaction control区域，其保存了所有transaction table的使用信息，在事务con写slot时，对应的commit SCN 会和first undo record地址一起写入transactioncontrol。Transaction control 里之前的信息会被写入新事务的first undorecord里，通过这个undo record的link list，可以将事务回滚到之前的某个时间点。

Any time thatOracle is following a list of pointers through the undo segment and arrives atan undo block with the wrong seq: (i.e., incarnation) number, you will get anOracle error ORA-01555, “snapshot too old,” because the block you wanted to seehas been reused.

--任何时候，Oracle 使用使用指针list 来访问undosegment，然后根据seq:来查到对应的undo block ，如果undo block 被重用，那么就会遇到ORA-01555的错误。

Read consistencyfor LOBs is completely different. Essentially, Oracle doesn’t update LOBs; itsimply keeps old copies of LOBs for a while before overwriting them, but usesthe standard read-consistency mechanism on the LOBINDEX to allow it to point tothe correct old copy. There is a special “snapshot too old” error for LOBs(ORA-22924) that appears when the index has become read consistent but the LOBvalue has been overwritten.

--LOBs的Read consistency 与普通字段有一定区别，本质上，Oracle 不会更新LOBs，而是简单的在重写之前copy 一份LOBs，但是会在LOBINDEX上使用正常的机制来访问对应的old copy。对于LOBs的快照过旧，赌赢的错误代码是：ORA-22924.

说明：根据<OracleCore Essential Internals for DBAs andDevelopers> 进行翻译整理。

-------------------------------------------------------------------------------------------------------

Skype: tianlesoftware

QQ: tianlesoftware@gmail.com

Email: tianlesoftware@gmail.com

Blog: http://www.tianlesoftware.com

Weibo: http://weibo.com/tianlesoftware

Twitter: http://twitter.com/tianlesoftware

Facebook: http://www.facebook.com/tianlesoftware

Linkedin: http://cn.linkedin.com/in/tianlesoftware

-------加群需要在备注说明Oracle表空间和数据文件的关系，否则拒绝申请----

DBA1 群：62697716(满); DBA2 群：62697977(满) DBA3 群：62697850(满)

DBA 超级群：63306533(满); DBA4 群：83829929 DBA5群： 142216823

DBA6 群：158654907 DBA7 群：172855474 DBA总群：104207940

道森Oracle，国内最早、最大的网络语音培训机构，我们提供专业、优质的Oracle技术培训和服务！我们的官方网站：http://www.daosenoracle.com 官方淘宝店：http://daosenpx.taobao.com/