Why TRUNCATE TABLE can be slow: enq: RO

When you use an ASSM tablespace (the default), TRUNCATE can indeed be very slow on a DSS system, but the problem does not reproduce 100% of the time; whether you hit it is a matter of probability. SQL trace shows how the time is spent internally (whenever v$sysstat suggests resource consumption is low, SQL trace can still uncover the root cause, which makes it an essential analysis tool, even though it will not necessarily help you fix the problem). We have never hit the slow-truncate case directly ourselves, so I won't analyse it here; but we did run into it with FDA, and I searched for the cause along the way (the FDA problem had a different cause, covered in another post on this blog). The reason TRUNCATE itself can be slow is as follows:
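
As a minimal sketch of that diagnostic step (the tracefile_identifier value is an arbitrary tag of my own; level 8 adds wait events to the trace), you could wrap the slow statement like this:

alter session set tracefile_identifier = 'trunc_test';
alter session set events '10046 trace name context forever, level 8';

truncate table t1;

alter session set events '10046 trace name context off';

-- locate the trace file, then summarise it with tkprof
select value from v$diag_info where name = 'Default Trace File';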

Here’s one that started off with a tweet from Kevin Closson, heading towards a finish that shows some interesting effects when you truncate large objects that are using ASSM. To demonstrate the problem I’ve set up a tablespace using system allocation of extents and automatic segment space management (ASSM). It’s the ASSM that causes the problem, but it requires a mixture of circumstances to create a little surprise.

create
    tablespace test_8k_auto_assm
    datafile    -- OMF
    SIZE 1030M
    autoextend off
    blocksize 8k
    extent management local
    autoallocate
    segment space management auto
;
 
create table t1 (v1 varchar2(100)) pctfree 99 tablespace test_8k_auto_assm storage(initial 1G);
 
insert into t1 select user from dual;
commit;
 
alter system flush buffer_cache;
 
truncate table t1;

I’ve created a table with an initial definition of 1GB, which means that (in a clean tablespace) the autoallocate option will jump straight to extents of 64MB, with 256 table blocks mapped per bitmap block for a total of 32 bitmap blocks in each 64MB extent. Since I’m running this on 11.2.0.4 and haven’t included “segment creation immediate” in the definition I won’t actually see any extents until I insert the first row.
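
If you want to confirm that extent layout, a quick look at user_extents after the insert will do it (a sketch; run as the owner of t1):

select extent_id, blocks, bytes/1048576 as mb
from   user_extents
where  segment_name = 'T1'
order  by extent_id;

-- with autoallocate and a 1GB initial setting you would expect to see
-- 16 extents of 64MB (8,192 blocks of 8KB) each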

So here’s the big question – when I truncate this table (using the given command) how much work will Oracle have to do?

Exchanging notes over twitter (140 characters at a time) and working from a model of the initial state, it took a little time to understand what was (probably) happening and then produce this silly example – but here’s the output from a snapshot of v$session_event for the session across the truncate:

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
local write wait                                    490           0          83.26        .170          13
enq: RO - fast object reuse                           2           0         104.90      52.451         105
db file sequential read                              47           0           0.05        .001           0
db file parallel read                                 8           0           0.90        .112           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.67        .067         153
events in waitclass Other                             2           0           0.04        .018         109

The statistic I want to highlight is the number recorded against “local write wait”: truncating a table that holds just one row, we wait for 490 blocks to be written! We also have 8 “db file parallel read” waits which, according to a 10046 trace file, were reading hundreds of blocks. (I think the most significant time in this test – the RO enqueue wait – may have been waiting for the database writer to complete the work needed for an object checkpoint, but I’m not sure of that.)
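
For reference, a before/after snapshot like the one above needs nothing more than v$session_event; here is a minimal sketch (the snapshot table event_snap is my own name, and creating it assumes select privilege on the underlying v$ view):

-- capture this session's cumulative wait events before the statement
create table event_snap as
select event, total_waits, time_waited_micro
from   v$session_event
where  sid = sys_context('userenv', 'sid');

truncate table t1;

-- report the deltas across the truncate; dividing microseconds by
-- 10,000 gives the centiseconds shown above
select e.event,
       e.total_waits - nvl(s.total_waits, 0)                       as waits,
       (e.time_waited_micro - nvl(s.time_waited_micro, 0)) / 10000 as csec
from   v$session_event e
left join event_snap s on s.event = e.event
where  e.sid = sys_context('userenv', 'sid')
and    e.total_waits > nvl(s.total_waits, 0)
order by csec desc;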

The blocks written were the space management bitmap blocks for the extent(s) that remained after the truncate – even the ones that referenced extents above the high water mark for the table. Since we had set the table’s initial storage to 1GB, we had a lot of bitmap blocks. At 32 per extent and 16 extents (64MB * 16 = 1GB) we might actually expect something closer to 512 blocks, but in fact Oracle had formatted the last extent with only 8 space management blocks, and the first extent had an extra 2 to cater for the level 2 bitmap block and the segment header block, giving: 32 * 15 + 8 + 2 = 490.
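
One way to convince yourself that these really are space management blocks is to look at the buffer classes the segment holds just after the insert, before the buffer cache flush (a sketch assuming DBA access to v$bh and dba_objects; the class# mapping is the commonly documented one, so treat it as an assumption):

select class#, count(*)
from   v$bh
where  objd = (select data_object_id
               from   dba_objects
               where  owner = user
               and    object_name = 'T1')
and    status <> 'free'
group  by class#
order  by class#;

-- class# 1 = ordinary data block, 4 = segment header,
-- 8/9 = first/second level ASSM bitmap blocks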

As you may have seen above, the impact on the test that Kevin was doing was quite dramatic – he had set the initial storage to 128GB (lots of bitmap blocks), partitioned the table (more bitmap blocks) and was running RAC (so the reads were running into waits for global cache grants).

I had assumed that this type of behaviour happened only with the “reuse storage” option of the truncate command, and I hadn’t noticed before that it also appeared even if you didn’t reuse storage – but that’s probably because the effect applies only to the bit you keep, which may typically mean a relatively small first extent. It’s possible, then, that in most cases this is an effect that isn’t going to be particularly visible in production systems – but if it is, can you work around it? Fortunately another tweeter asked the question “What happens if you ‘drop all storage’?”

TRUNCATE has three storage options, as follows:

  • DROP STORAGE, the default option, reduces the number of extents allocated to the resulting table to the original setting for MINEXTENTS. Freed extents are then returned to the system and can be used by other objects.

  • DROP ALL STORAGE drops the segment. In addition to the TRUNCATE TABLE statement, DROP ALL STORAGE also applies to the ALTER TABLE TRUNCATE (SUB)PARTITION statement. This option also drops any dependent object segments associated with the partition being truncated.

    DROP ALL STORAGE is not supported for clusters.


    Note:

    This functionality is available with Oracle Database 11g release 2 (11.2.0.2).
    TRUNCATE TABLE emp DROP ALL STORAGE;
    
  • REUSE STORAGE specifies that all space currently allocated for the table or cluster remains allocated to it. For example, the following statement truncates the emp_dept cluster, leaving all extents previously allocated for the cluster available for subsequent inserts and deletes:

    TRUNCATE CLUSTER emp_dept REUSE STORAGE;
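
Applying the second of those options to the test case is a one-line change (11.2.0.2 or later, as the note above says); the truncate in the test then becomes:

truncate table t1 drop all storage;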

Here’s the result from adding that clause to my test case:

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
enq: RO - fast object reuse                           1           0           0.08        .079           0
log file sync                                         1           0           0.03        .031           0
db file sequential read                              51           0           0.06        .001           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.56        .056         123
events in waitclass Other                             3           0           0.87        .289         186

Looking good – if you don’t keep any extents you don’t need to make sure that their bitmaps are clean. (The “db file sequential read” waits are almost all about the data dictionary, following on from my “flush buffer cache”).

Footnote

The same effect appears in 12.1.0.2

Footnote 2

It’s interesting to note that the RO enqueue wait time seems to parallel the local write wait time: perhaps a hint that there’s some double counting going on. (To be investigated, one day).

Footnote 3 (June 2018)

The same effect appears in 12.2.0.1

Original post: https://www.cnblogs.com/zhjh256/p/9806307.html