Monday 18 April 2011

UNDO TABLESPACE MANAGEMENT

This document intends to provide information about UNDO tablespace that is not commonly available. I have not included syntax of tablespace creation, dropping etc commands as this can be easily available in books, manuals and websites.


WHAT IS UNDO?

Word UNDO means to reverse or erase the change. In oracle world, UNDO allows to reverse the transaction so that database looks like before the transaction started. Also it provides read consistent image of data i.e. if during a transaction if one session is changing a data where as at the same time other session wants to read that data then UNDO will provide the state of data before the transaction starts.


USE OF UNDO

There is few main usage of UNDO in oracle database.

* UNDO is used to reverse the uncommitted transaction on the issue of ROLLBACK command.
* It is used to provide read consistent image of a record
* It is used to during database recovery to apply any uncommitted transaction in redo logs to datafiles.
* Flashback Query also uses UNDO to get the image of data back in time.

UNDO MANAGEMENT

Oracle needs UNDO tablespace to create undo segments. If undo tablespace is not created then it will use SYSTEM tablespace as UNDO and not recommended. To create UNDO tablespace you have to use UNDO keyword in create tablespace command. Most common configuration of UNDO_MANAGEMENT is AUTO (Default is MANUAL in 8i & 9i).

One can have multiple UNDO tablespace in a database but only one can be active at any given time.

In AUTOMATIC undo management oracle create and manages undo itself and DBAs don’t have to worry about its management. Oracle attempts to assign one undo segment to each transaction. When it cannot, it creates and assigns additional undo segments. Multiple transactions can write to one undo segment if space in undo tablespace is depleted which can cause contention. In order to avoid this situation add more space in undo tablespace. Later this article will provide scripts to monitor fragmentation in UNDO tablespace and method to overcome it.

From Oracle documentation

The following is a summary of the initialization parameters for automatic undo management mode:

Initialization Parameter Description
UNDO_MANAGEMENT If AUTO, use automatic undo management mode. If MANUAL, use manual undo management mode. The default is MANUAL.


UNDO_TABLESPACE An optional dynamic parameter specifying the name of an undo tablespace to use. This parameter should be used only when the database has multiple undo tablespaces and you want to direct the database instance to use a particular undo tablespace.

UNDO_RETENTION A dynamic parameter specifying the minimum length of time to retain undo. The default is 900 seconds. The setting of this parameter should take into account any flashback requirements of the system.


TYPES OF AUTOMATIC UNDO MANAGEMENT

When oracle introduced UNDO management it was based on segment management. Segment management is expensive on CPU, memory and IO so oracle add a new feature in later version of oracle 10g to do undo management in memory and called it in-memory Undo (IMU).

a) IN-MEMORY UNDO

Main benefit of IMU is that as there is no block level change of UNDO segment involves here so IMU will not generate any redo entry. Hence IMU improve the Undo Header contention and undo block contention by doing UNDO management in memory called IMU Node. Rather than writing a change to undo buffer ,undo is written to IMU Node. On the other hand memory structure require latches to maintain serial execution so make sure that enough IMU latches are created by changing processes parameter. Oracle use ‘In memory Undo Latch’ to access IMU structures in shared pool. So if you have high waits on this latch then you can increase the number of latches by increasing processes parameter or switch in-memory undo by setting in_memory_undo to false.

The following is a summary of the initialization parameters for automatic undo management mode:

Initialization Parameter Description

_in_memory_undo Default is TRUE and enabled. To disable it change parameter to FALSE.

_imu_pools Default is 3 on some system. This sets the number of IMU pools. It is not related to memory allocation for IMU.

_recursuve_imu_transactions This enables Oracle’s own SQL to use IMU. Default is FALSE.

_db_writer_flush_imu Allows Oracle the freedom to artificially age a transaction for increased automatic cache management. Default is

Initialization Parameter Description TRUE


There is no parameter that allow us to change memory allocation for IMU Node but changing shared_pool_size parameter can help adjusting the IMU memory allocation.

To findout how much memory is allocated run the command

On  UAT database

SQL> select * from v$sgastat where name like 'KTI-UNDO';


POOL NAME BYTES
------------ -------------------------- ----------
shared pool KTI-UNDO 8522272

SQL> select
2 n.name statistic,
3 v.value statvalue,
4 v.statistic# statnum
5 from
6 v$statname n,
7 v$sysstat v
8 where
9 n.statistic# = v.statistic# and
10 n.name like '%commit%'
11 order by
12 1 asc
13 /

Statistic Value Stat #
------------------------------------------------------- ---------------- ------
IMU commits 987,170 267



WARNING : Understand before taking action

As mentioned earlier that huge wait on “in memory Undo Latch “ and high amount of cpu usage shows that you have contention latch contention with IMU. As they are very CPU hungry if waiting for latch so you would like to resolve this issue as early as possible.

There are few reasons that you might have contention with IMU

* Few latches than actually require
* Not enough memory allocated for IMU Node. Check v$latch_children view.
* Not enough CPU core available i.e. if your system is already suffering from high CPU utilisation.

Few suggestions to try in these cases

* Increase the number of processes parameter so that it you can increase number of latches. Be careful and understand the other impact of increasing processes parameter.

* Increase the shared_pool_size. Again also understand the impact of increasing shared_poo_size on over database performance.

* Increase the number of CPU core if possible.

* Last resort is to disable the IMU and force oracle to use segment management by changing _in_memory_undo to FALSE.

b) SEGMENT BASED UNDO

Segment based Undo management is same as any table or index segment management and normally costly than IMU but on the other hand Oracle has more experience in handling segment based undo management than IMU. IMU is released in later version of oracle 10g.

Oracle also provides a tool cal UNDO advisor which checks the information in AWR and advice and helps you on setting up undo environment. Sample example of Undo advisor from oracle documentation is mentioned below

DECLARE
tid NUMBER;
tname VARCHAR2(30);
oid NUMBER;
BEGIN
DBMS_ADVISOR.CREATE_TASK('Undo Advisor', tid, tname, 'Undo Advisor Task');
DBMS_ADVISOR.CREATE_OBJECT(tname, 'UNDO_TBS', null, null, null, 'null', oid);
DBMS_ADVISOR.SET_TASK_PARAMETER(tname, 'TARGET_OBJECTS', oid);
DBMS_ADVISOR.SET_TASK_PARAMETER(tname, 'START_SNAPSHOT', 1);
DBMS_ADVISOR.SET_TASK_PARAMETER(tname, 'END_SNAPSHOT', 2);
DBMS_ADVISOR.execute_task(tname);
end;
/

Where start_snapshot is 1 and end_snapshot is 2. This information is also available in DBA_ADVISOR_* data dictionary views.

Oracle document further discusses the formula of calculating undo space requirement.

UndoSpace = UR * UPS + overhead
where:
UndoSpace is the number of undo blocks
UR is UNDO_RETENTION in seconds. This value should take into consideration long-running queries and any flashback requirements.
UPS is undo blocks for each second
overhead is the small overhead for metadata (transaction tables, bitmaps, and so forth)
As an example, if UNDO_RETENTION is set to 2 hours, and the transaction rate (UPS) is 200 undo blocks for each second, with a 4K block size, the required undo space is computed as follows:

(2 * 3600 * 200 * 4K) = 5.8GBs
Such computation can be performed by using information in the V$UNDOSTAT view. In the steady state, you can query the view to obtain the transaction rate. The overhead figure can also be obtained from the view.

DEFRAGMENT UNDO TABLESPACE

I have already touched this topic in other white paper “ Free Space Fragmentation by Tablespace” but I feel that it is appropriate to mention it here again.

As oracle create different size segments in UNDO tablespace normally and also extend and shrink according to the need of system, this cause fragmentation of UNDO tablespace. Oracle manages defragmentation of UNDO tablespace by SMON processes but if system is highly transactional then it is unlikely that SMON will have a chance to properly defragment UNDO tablespace. It is a common observation then whenever we receive error message of snapshot too old or run out of space from undo tablespace, DBAs add more space as resolution. I have done one test to defragment undo tablespace and happy to share the results.


On one UAT database , UNDO tablespace was fragmented by measuring FSFI and size of tablespace was 60Gb. I created secondary undo tablespace and switch new undo tablespace (UNDO2) as default undo tablespace. This allow oracle (SMON) to clear down all unused segments down on the first undo tablespace (UNDO). Also I found that we are now using 20Gb of space for undo compare to 60Gb this is 2/3rd of space back to the system. Once original undo tablespace (UNDO) is clear down then you can switch it back to default undo tablespace and drop the other undo tablespace (UNDO2).

To switch undo tablespace use
ALTER SYSTEM SET UNDO_TABLESPACE = ‘UNDO2’;


If you are using spfile then you can add spfile clause also. If you are using pfile then change it manually and bounce the database.

To monitor tablespace fragmentation
rem
rem Name: free_space.sql
rem
rem FUNCTION: Provide data on tablespace extent status
rem FUNCTION: Use 0 = OK , 1= Warning , 2= Critical
rem FUNCTION: includes fsfi from DBA Handbook
rem
SET FEED OFF
SET FLUSH OFF
SET VERIFY OFF
set pages 58 LINES 132

COLUMN tablespace HEADING Name FORMAT a25
COLUMN files HEADING '#Files' FORMAT 9,999
COLUMN pieces HEADING 'Frag' FORMAT 9,999
COLUMN free_bytes HEADING 'Free
Byte' FORMAT 9,999,999,999,999
COLUMN free_blocks HEADING 'Free
Blk' FORMAT 999,999,999
COLUMN largest_bytes HEADING 'Biggest
Bytes' FORMAT 9,999,999,999,999
COLUMN largest_blks HEADING 'Biggest
Blks' FORMAT 999,999,999
COLUMN ratio HEADING ' Percent' FORMAT 999.999
COLUMN average_fsfi HEADING 'Average
FSFI' FORMAT 999.999
COLUMN alert_level HEADING 'Alert
level' FORMAT 99
SELECT
tablespace,
files,
pieces,
free_bytes,
free_blocks,
largest_bytes,
largest_blks,
ratio,
average_fsfi,
case when average_fsfi < 30 then 2
when average_fsfi between 30 and 40 then 1
else 0
end alert_level
FROM
(SELECT
tablespace,
COUNT(*) files,
SUM(pieces) pieces,
SUM(free_bytes) free_bytes,
SUM(free_blocks) free_blocks,
SUM(largest_bytes) largest_bytes,
SUM(largest_blks) largest_blks,
SUM(largest_bytes)/sum(free_bytes)*100 ratio,
SUM(fsfi)/COUNT(*) average_fsfi
FROM
(SELECT tablespace_name tablespace, file_id, COUNT(*) pieces,
SUM(bytes) free_bytes, SUM(blocks) free_blocks,
MAX(bytes) largest_bytes, MAX(blocks) largest_blks,
SQRT(MAX(blocks)/SUM(blocks))*(100/SQRT(SQRT(COUNT(blocks)))) fsfi
FROM sys.dba_free_space
GROUP BY tablespace_name, file_id, relative_fno)
GROUP BY tablespace)
/

CONCLUSION

UNDO tablespace tuning is one of the areas commonly missed by DBAs as most of the time tuning is fire fighting activity. This document will help you to analyse your undo tablespace management and tuning scientifically on the basis of some analysis. There is so much more to add on this topic that maybe I’ll add in later version of this document.

Key point is when you do tuning is to “Understand what you doing” and look at the bigger picture.

2 comments: