Wednesday, January 12, 2011

ORA-01555: snapshot too old: rollback segment number with name "" too small



Quick checks and solutions:



1) Increase size of rollback segment which will reduce the likelihood of overwriting
rollback information that is needed.

2) Make sure the UNDO_RETENTION parameter value is large enough (longer than the longest-running query).

3) If possible, tune the longest-running queries.

4) Reduce the number of commits.
(ORA-01555 is frequently caused by a COMMIT inside a PL/SQL loop.)

5) Run the processing against a range of data rather than the whole table.

6) Add additional rollback segments. This will allow the updates etc. to be spread
across more rollback segments thereby reducing the chances of overwriting required
rollback information.

7) If fetching across commits, the code can be changed so that this is not done.

8) Ensure that the outer select does not revisit the same block at different times
during the processing.





Analysis:
=========

--- Find the length of the longest-running query (MAXQUERYLEN is reported in seconds).
--------------------------------------------

SQL> select max(MAXQUERYLEN) from v$undostat;

MAX(MAXQUERYLEN)
----------------
81431

SQL>
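
To see which statement is behind a large MAXQUERYLEN, something like the sketch below may help. It assumes a release where v$undostat exposes the MAXQUERYID column (10g and later) and that the statement is still in the shared pool; the outer join simply returns an empty SQL_TEXT when it has already aged out.

```sql
-- Top 5 undo statistics intervals by longest query, with the SQL text if still cached.
SELECT *
  FROM (SELECT u.begin_time,
               u.maxquerylen,          -- seconds
               u.maxqueryid,           -- SQL_ID of the longest query in the interval
               s.sql_text
          FROM v$undostat u,
               v$sqlarea  s
         WHERE s.sql_id (+) = u.maxqueryid
         ORDER BY u.maxquerylen DESC)
 WHERE ROWNUM <= 5;
```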


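If undo_retention is set higher than the length of the longest-running query (81431 seconds here),
ORA-01555 "snapshot too old" errors become much less likely, provided the undo tablespace is
large enough to keep that much undo.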
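
A minimal sketch of the change itself, under some assumptions: the value 86400 is only an illustration chosen to exceed the 81431 seconds seen above, SCOPE = BOTH assumes the instance was started with an spfile, and UNDOTBS1 is an assumed undo tablespace name. RETENTION GUARANTEE (10g and later) is optional and has a cost: DML can fail when the undo tablespace fills up, so treat it as something to test rather than a default.

```sql
-- Check the current setting (SQL*Plus command).
SHOW PARAMETER undo_retention

-- Raise retention above the longest query observed; 86400 is an illustrative value.
ALTER SYSTEM SET undo_retention = 86400 SCOPE = BOTH;

-- Optional: ask Oracle never to overwrite unexpired undo in this tablespace.
-- UNDOTBS1 is an assumed name; check DBA_TABLESPACES for the real one.
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;
```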




**** Run the following query to estimate the optimal undo_retention for the current undo tablespace size.
----------------------------------------------------------------------

SQL> SELECT d.undo_size/(1024*1024) "ACTUAL UNDO SIZE [MByte]",
            SUBSTR(e.value,1,25) "UNDO RETENTION [Sec]",
            ROUND((d.undo_size / (to_number(f.value) *
                   g.undo_block_per_sec))) "OPTIMAL UNDO RETENTION [Sec]"
       FROM (
            SELECT SUM(a.bytes) undo_size
              FROM v$datafile a,
                   v$tablespace b,
                   dba_tablespaces c
             WHERE c.contents = 'UNDO'
               AND c.status = 'ONLINE'
               AND b.name = c.tablespace_name
               AND a.ts# = b.ts#
            ) d,
            v$parameter e,
            v$parameter f,
            (
            SELECT MAX(undoblks/((end_time-begin_time)*3600*24))
                   undo_block_per_sec
              FROM v$undostat
            ) g
      WHERE e.name = 'undo_retention'
        AND f.name = 'db_block_size'
/

ACTUAL UNDO SIZE [MByte] UNDO RETENTION [Sec]      OPTIMAL UNDO RETENTION [Sec]
------------------------ ------------------------- ----------------------------
                   23552 900                                              12993
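
The arithmetic behind that result can be written out as below. This is only a worked sketch: the block size is not shown in the output above, so the 8192-byte value in the middle line is an assumption, and the "peak undo blocks per second" is the worst interval in v$undostat, so the estimate is deliberately pessimistic. The last line uses only numbers from the output above and suggests that, at the peak rate, covering the 81431-second query would need far more undo space than the current 23552 MB.

```latex
\[
\text{optimal retention [s]}
  = \frac{\text{undo size [bytes]}}
         {\text{db\_block\_size [bytes]} \times \text{peak undo blocks per second}}
\]

% With an ASSUMED 8192-byte block size (not shown in the source output):
\[
\frac{23552 \times 1048576}{8192} = 3014656 \ \text{undo blocks},
\qquad
\frac{3014656}{12993} \approx 232 \ \text{blocks/s (implied peak rate)}
\]

% Undo size needed to cover the longest query seen (81431 s) at the same peak rate;
% this ratio does not depend on the block size:
\[
23552\ \text{MB} \times \frac{81431}{12993} \approx 147600\ \text{MB}
\]
```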





SQL> SELECT *
       FROM (SELECT begin_time, txncount, maxquerylen, ssolderrcnt,
                    nospaceerrcnt, unxpstealcnt, expstealcnt
               FROM v$undostat
              ORDER BY begin_time DESC)
      WHERE ROWNUM <= 30 ;

BEGIN_TIM   TXNCOUNT MAXQUERYLEN SSOLDERRCNT NOSPACEERRCNT UNXPSTEALCNT EXPSTEALCNT
--------- ---------- ----------- ----------- ------------- ------------ -----------
12-JAN-11       6596          47           0             0            0           9
12-JAN-11       2382         242           0             0            0           0
12-JAN-11       2441        4464           0             0            0           0
12-JAN-11       2183        3855           0             0            0           0
12-JAN-11       2100        3250           0             0            0           0
12-JAN-11       2139        2641           0             0            0           0
12-JAN-11       7795        4453           0             0            0           0
12-JAN-11       2489        3849           0             0            0           0
12-JAN-11       2426        3248           0             0            0           0
12-JAN-11       2261        2647           0             0            0           0
12-JAN-11       2544        2039           0             0            0           0
12-JAN-11       2393        1143           0             0            0           0
12-JAN-11       7963         844           0             0            0           4
12-JAN-11       2030         238           0             0            0           2
12-JAN-11       1272         541           0             0            0           0
12-JAN-11       1223          30           0             0            0           1
12-JAN-11        720         233           0             0            0           0
12-JAN-11        600        1739           0             0            0           0
12-JAN-11       6023        1131           0             0            0           2
12-JAN-11        325         522           0             0            0           0
12-JAN-11        244         750           0             0            0           0
12-JAN-11        219         270           0             0            0           0
12-JAN-11        160           0           0             0            0           0
12-JAN-11        277           0           0             0            0           0
12-JAN-11       5850           0           0             0            0          31
12-JAN-11        175           0           0             0            0           0
12-JAN-11        164           0           0             0            0           0
12-JAN-11        140           0           0             0            0           1
12-JAN-11        185           0           0             0            0           0
12-JAN-11        134           0           0             0            0           0

30 rows selected.
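
In the 30 intervals listed, SSOLDERRCNT (the count of ORA-01555 errors per interval) is 0 throughout. A variation of the query that lists only the intervals where something went wrong might look like the sketch below. It assumes 10g or later for the TUNED_UNDORETENTION column, and the aliases interval_start and interval_end are just illustrative names; formatting the timestamps with TO_CHAR also avoids the date-only display seen above.

```sql
-- Only intervals with snapshot-too-old errors, out-of-space errors,
-- or attempts to steal unexpired undo.
SELECT TO_CHAR(begin_time, 'DD-MON-YY HH24:MI') AS interval_start,
       TO_CHAR(end_time,   'DD-MON-YY HH24:MI') AS interval_end,
       maxquerylen,
       tuned_undoretention,
       ssolderrcnt,
       nospaceerrcnt,
       unxpstealcnt
  FROM v$undostat
 WHERE ssolderrcnt   > 0
    OR nospaceerrcnt > 0
    OR unxpstealcnt  > 0
 ORDER BY begin_time;
```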

SQL>







===================================================
Introduction
===================================================

An ORA-01555 is never about running out of rollback. It is about rollback that was
generated being overwritten. A select statement will not cause rollback to be "held".
As soon as the transaction that generated the rollback commits - that rollback may be
reused and if it is and it is needed by some query, you will get an ORA-01555.

If you size your rollback adequately, you will not see this error.

ORA-01555 typically happens when people try to save space. They have small
rollback segments that could grow if needed (and that shrink back when OPTIMAL is set).

So, they start with, say, 10 or so 1 MB rollback segments. These rollback segments COULD
grow to 100 MB each if we let them (in this example); however, they will NEVER grow unless
you get a big transaction.

If your database does lots of little transactions, the rollback segments will never grow on their own.
They will stay small.

Now, someone needs to run a query that will take 5 minutes. On your system however the
rollback wraps every 2 minutes due to lots of little transactions going on. In this
system, ORA-01555 will happen frequently.

What you need to do here is size rollback so that it wraps less frequently (less frequently
than your long-running queries take to run). Here, if you sized the rollback so that you had ten 10 MB segments
(not so that they could GROW to 10 MB, but so that they START at 10 MB),
you would wrap maybe every 20 minutes. That gives
the 5-minute query plenty of time to complete before the rollback it needs is reused.
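
For databases still using manually managed rollback segments, the idea above could be checked and applied roughly as follows. This is a sketch only: rbs01 and the tablespace rbs are hypothetical names, WRAPS in v$rollstat is cumulative since instance startup (so sample it twice to estimate a rate), and how the STORAGE clause is honored depends on the tablespace's extent management. With automatic undo management none of the CREATE statement applies.

```sql
-- How often has each rollback segment wrapped, and how big is it now?
SELECT n.name, s.rssize, s.optsize, s.extents, s.wraps, s.extends, s.shrinks
  FROM v$rollname n,
       v$rollstat s
 WHERE s.usn = n.usn
 ORDER BY s.wraps DESC;

-- Hypothetical segment that STARTS at about 10 MB (10 extents of 1 MB)
-- instead of merely being allowed to grow to that size.
CREATE ROLLBACK SEGMENT rbs01
  TABLESPACE rbs
  STORAGE (INITIAL 1M NEXT 1M MINEXTENTS 10 MAXEXTENTS 100);

ALTER ROLLBACK SEGMENT rbs01 ONLINE;
```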






===============================================
ORA-01555 Explanation
===============================================

There are two fundamental causes of the error ORA-01555, both a result of Oracle
trying to attain a 'read consistent' image. These are:

*** The rollback information itself is overwritten, so that Oracle is unable to roll back
the (committed) transaction entries to attain a sufficiently old version of the
block.

*** The transaction slot in the rollback segment's transaction table (stored in the
rollback segment's header) is overwritten, and Oracle cannot roll back the transaction
header sufficiently to derive the original rollback segment transaction slot.

Both of these situations are discussed below with the series of steps that cause the
ORA-01555. In the steps, reference is made to 'QENV'. 'QENV' is short for 'Query
Environment', which can be thought of as the environment that existed when a query was
first started and to which Oracle is trying to attain a read consistent image. Associated
with this environment is the SCN (System Change Number) at that time; hence, QENV 50 is
the query environment with SCN 50.

CASE 1 - ROLLBACK OVERWRITTEN

This breaks down into two cases: another session overwriting the rollback that the
current session requires or the case where the current session overwrites the rollback
information that it requires. The latter is discussed in this article because this is
usually the harder one to understand.

Steps:

1. Session 1 starts query at time T1 and QENV 50

2. Session 1 selects block B1 during this query

3. Session 1 updates the block at SCN 51

4. Session 1 does some other work that generates rollback information.

5. Session 1 commits the changes made in steps '3' and '4'.
(Now other transactions are free to overwrite this rollback information)

6. Session 1 revisits the same block B1 (perhaps for a different row).

Now, Oracle can see from the block's header that it has been changed and that the change is
later than the required QENV (which was 50). Therefore we need to get an image of the
block as of this QENV.

If an old enough version of the block can be found in the buffer cache then we
will use this; otherwise we need to roll back the current block to generate another
version of the block as at the required QENV.

It is under this condition that Oracle may not be able to get the required
rollback information, because Session 1's own later changes have generated rollback
information that has overwritten it. Oracle then returns the ORA-01555 error.
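
The steps above correspond to the familiar "fetch across commit" pattern. The PL/SQL below is only an illustration of that anti-pattern, not code from any real system: the table orders and its columns id and status are hypothetical.

```sql
-- ANTI-PATTERN: the cursor stays open while the loop commits.
BEGIN
  FOR r IN (SELECT id FROM orders WHERE status = 'NEW') LOOP  -- steps 1-2: query opens at its QENV
    UPDATE orders
       SET status = 'DONE'
     WHERE id = r.id;                                          -- step 3: block changed after the QENV
    COMMIT;                                                    -- step 5: this undo may now be overwritten
  END LOOP;                                                    -- step 6: a later fetch may revisit the block
END;
/
```

Each fetch still has to see the data as of the moment the cursor was opened, so the longer the loop runs and the more it commits, the more likely it is that the undo needed for a later fetch has already been reused.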

CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN

1. Session 1 starts query at time T1 and QENV 50

2. Session 1 selects block B1 during this query

3. Session 1 updates the block at SCN 51

4. Session 1 commits the changes
(Now other transactions are free to overwrite this rollback information)

5. A session (Session 1, another session or a number of other sessions) then uses the
same rollback segment for a series of committed transactions.

These transactions each consume a slot in the rollback segment transaction table
such that it eventually wraps around (the slots are written to in a circular fashion) and
overwrites all the slots. Note that Oracle is free to reuse these slots since all
transactions are committed.

6. Session 1's query then visits a block that has been changed since the initial QENV
was established. Oracle therefore needs to derive an image of the block as at that point
in time.

Next Oracle attempts to look up the rollback segment header's transaction slot
pointed to by the top of the data block. It then realises that this has been overwritten
and attempts to roll back the changes made to the rollback segment header to get the
original transaction slot entry.

If it cannot roll back the rollback segment transaction table sufficiently it will
return ORA-01555 since Oracle can no longer derive the required version of the data block.


It is also possible to encounter a variant of the transaction slot being overwritten
when using block cleanout. This is briefly described below :

Session 1 starts a query at QENV 50. After this another process updates the blocks that
Session 1 will require. When Session 1 encounters these blocks it determines that the
blocks have changed and have not yet been cleaned out (via delayed block cleanout).
Session 1 must determine whether the rows in the block existed at QENV 50 or were
subsequently changed.

In order to do this, Oracle must look at the relevant rollback segment transaction table
slot to determine the committed SCN. If this SCN is after the QENV then Oracle must try
to construct an older version of the block, and if it is before then the block just needs
clean out to be good enough for the QENV.

If the transaction slot has been overwritten and the transaction table cannot be rolled
back to a sufficiently old version then Oracle cannot derive the block image and
will return ORA-01555.

(Note: Normally Oracle can use an algorithm for determining a block's SCN during block
cleanout even when the rollback segment slot has been overwritten. But in this case
Oracle cannot guarantee that the version of the block has not changed since the start of
the query).


=============================================
Solutions
=============================================

This section lists some of the solutions that can be used to avoid the ORA-01555 problems
discussed in this article. It addresses the cases where rollback segment information is
overwritten by the same session and when the rollback segment transaction table entry is
overwritten.

It is worth highlighting that if a single session experiences the ORA-01555 and it is not
one of the special cases listed at the end of this article, then the session must be
using an Oracle extension whereby fetches across commits are tolerated. This does not
follow the ANSI model, and in the rare cases where ORA-01555 is returned one of the
solutions below must be used.

CASE 1 - ROLLBACK OVERWRITTEN

1. Increase size of rollback segment which will reduce the likelihood of overwriting
rollback information that is needed.

2. Reduce the number of commits (same reason as 1).

3. Run the processing against a range of data rather than the whole table. (Same
reason as 1).

4. Add additional rollback segments. This will allow the updates etc. to be spread
across more rollback segments thereby reducing the chances of overwriting required
rollback information.

5. If fetching across commits, the code can be changed so that this is not done.

6. Ensure that the outer select does not revisit the same block at different times
during the processing. This can be achieved by :

- Using a full table scan rather than an index lookup
- Introducing a dummy sort so that we retrieve all the data, sort it and then
sequentially visit these data blocks.
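
As a rough sketch of points 2 and 5, here are two ways to avoid fetching across commits. The table orders and its columns id and status are hypothetical, and the batch size of 1000 is an arbitrary illustration, not a recommendation. Option B holds every id in memory, so it only suits result sets of moderate size.

```sql
-- Option A: one statement, one commit. No cursor is open across a commit.
UPDATE orders
   SET status = 'DONE'
 WHERE status = 'NEW';
COMMIT;

-- Option B: read all the keys first, then update in batches.
-- The SELECT has finished before the first COMMIT, so nothing is fetched across commits.
DECLARE
  TYPE id_tab IS TABLE OF orders.id%TYPE;
  l_ids id_tab;
BEGIN
  SELECT id BULK COLLECT INTO l_ids
    FROM orders
   WHERE status = 'NEW';

  FOR i IN 1 .. l_ids.COUNT LOOP
    UPDATE orders
       SET status = 'DONE'
     WHERE id = l_ids(i);

    IF MOD(i, 1000) = 0 THEN   -- commit every 1000 rows (illustrative batch size)
      COMMIT;
    END IF;
  END LOOP;

  COMMIT;                      -- commit the final partial batch
END;
/
```

Option A generates all of its undo in a single transaction, so it needs enough undo space for the whole update; Option B trades that for more commits.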

CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN

1. Use any of the methods outlined above except for '6'. This will allow transactions
to spread their work across multiple rollback segments, therefore reducing the likelihood
of rollback segment transaction table slots being consumed.

2. If it is suspected that the block cleanout variant is the cause, then force block
cleanout to occur prior to the transaction that returns the ORA-01555. This can be
achieved by issuing the following in SQL*Plus, SQL*DBA or Server Manager:

alter session set optimizer_goal = rule;
select count(*) from table_name;

If indexes are being accessed then the problem may be an index block and clean out
can be forced by ensuring that all the index is traversed. Eg, if the index is on a
numeric column with a minimum value of 25 then the following query will force cleanout of
the index :

select index_column from table_name where index_column > 24;
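
On releases where setting optimizer_goal = rule is not an option, optimizer hints are one possible way to get the same full traversal. This is a sketch under assumptions: index_name is a hypothetical index on index_column, hints are requests rather than guarantees (check the execution plan), and whether cleaned-out blocks are actually written back can depend on the read path used.

```sql
-- Visit every table block with a full scan. table_name is the placeholder
-- used in the text; t is just an alias.
SELECT /*+ FULL(t) */ COUNT(*)
  FROM table_name t;

-- Visit every entry of a (hypothetical) index on index_column.
-- The IS NOT NULL predicate lets the optimizer answer from the index alone.
SELECT /*+ INDEX(t index_name) */ COUNT(index_column)
  FROM table_name t
 WHERE index_column IS NOT NULL;
```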

2 comments:

Anonymous said...

Hi! This morning I ran some selects and there it is:
SQL> select blob_to_clob(content) from downloaded where id=220606;
ERROR:
ORA-01555: snapshot too old: rollback segment number with name "" too small

So I followed your blog and ran:
ALTER SYSTEM SET UNDO_RETENTION=1107651;
ALTER TABLESPACE UNDOTBS1 RESIZE 1024M

I can select the lobs length:
SQL> select length(content) from downloaded where id=220606;

LENGTH(CONTENT)
---------------
49173

But still can not select the lob itself:
SQL> select content from downloaded where id=220606;
ERROR:
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old

So what can I do? How can I retrieve the data?

Thanks
Christian

PS I use Oracle 11gR2 64 Bit on Suse Linux

Anonymous said...

Hi,

I have one question; I hope you can help and guide me on this problem.

I have a table that will contain 100 million records; right now it contains 50k. Let's assume that the table already holds 100 million records and that 10 million of them have a NULL value in the Downtime column.

For that I have created a procedure. In it I declared a variable "update_limit" and set it to 1000, so that after every 1000 updated records the changes are persisted to the database with a commit.

Is there any formula for finding the optimal value of the "update_limit" variable?

If someone asks me why I chose 1000, I have no answer, so I would like a formula that lets me justify the value.

Waiting for your reply..

Thanks
Hem001