Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Migrated to RAC and getting ‘row cache locks’ or ‘enq: SQ – contention’ ?

Posted by FatDBA on September 24, 2021

Hi Everyone,

Recently I was working on a performance issue where customer reported frequent slowness and hang issues with their newly migrated 12.2 2-Node RAC cluster. I was involved at the time when issue was already gone and now I had to dig out the data the history either from AWR or via DBA_HIST_XX views. I started glancing over AWR reports for the probe period (~ 2 hours). I saw especially node 1 was swamped with excessive ‘row cache lock’ wait events, and that’s too with very high average wait time of 7973.47 ms (~ 8 seconds per wait). Though similar waits were found happening on instance 2, but quite less as compared to node1 (take a look at the AWR snip below)

You can also see ‘enq: SQ – contention’ in place of ‘row cache locks’ as this got renamed.

Below is the snippet from AWR that states it spend ~ 99% of DB Time% on sequence loading.

While checking ‘enqueue stats’ I saw ‘SQ-Sequence Cache’ type enqueues with very high overall wait period of 545 seconds (~ 9 minutes).

Next target was to find out the source SQL or the statements waiting on these row cache lock waits. And as expected, it was a SQL that interacts with the sequence to generate the NEXTVAL and feed that information to another statements that inserts records to a frequently accessed application log table. You can think of that statement something like below …

-- Generating next available value from the sequence
SELECT TEST_SEQ.NEXTVAL FROM DUAL; 

And source being a sequence, I thought of generating the DDL to see all its options or properties. And as expected, this sequence has NOCACHE option because this was recently upgraded from a standalone 12.1 database to a new 12.2 2-Node RAC cluster. The main reason for specifying NOCACHE earlier was to avoid gaps in sequence number as the value is not lost when the instance abnormally terminates.

CREATE SEQUENCE  "DIXIT"."TEST_SEQ"  MINVALUE 1 MAXVALUE 9999999999999999999999999999 INCREMENT BY 1 START WITH 1673163 NOCACHE  NOORDER  NOCYCLE  NOKEEP  NOSCALE  GLOBAL ;

And we complety missed to modify sequences as per the best practice of having CACHE + NOORDER combination in case of RAC. With this combo, each instances caches a distinct set of numbers in the shared pool & Sequences will not be globally ordered.

When caching is used, then the dictionary cache (the row cache) is updated only once with the new high watermark, e.g. when a caching of 20 is used and a nextval is requested the first time, then the row cache value of the sequence is changed in the dictionary cache and increased by 20. The LAST_NUMBER of the DBA_SEQUENCES get increased with the cache value or 20. The extracted 20 values, stored in the shared pool, will be distributed to the sessions requesting the nextval of it.

When no caching is used, then the dictionary cache has to be updated for any nextval request. It means the row cache has to be locked and updated with a nextval request. Multiple sessions requesting a nextval will hence be blocked on a ‘row cache lock’ wait. Each instance allocates numbers by access to the database but cache fusion may delay sending current seq$ block to a requesting instance if the block is busy owing to many sequence number allocations from the instance owning the current block image.

But there is a caveat when you use CACHE option and that is that the gaps in the sequence numbering occur when the sequence cache is lost e.g. any shared pool flush or instance shutdown like an single instance databases. When the sequence caching is used and the cached values are flushed from the shared pool. The same happen in RAC as in single instance databases. Any flush on any shared pool is enough to invalidate the cache value on RAC systems. And I don’t see any problem having a gap in the sequence, if not using a banking application.

Let me explain it through an example ..

-- Will create a sequence, default is to cache 20 sequence values in memory.

SQL> create sequence mytest_seq start with 1 increment by 1;

Sequence created.

SQL> select mytest_seq.nextval from dual;

  NEXTVAL
----------

         1

SQL> select mytest_seq.nextval from dual;

  NEXTVAL
----------

         2


-- The database is terminated and after startup, the next value of the sequence is selected.


SQL> select mytest_seq.nextval from dual;

  NEXTVAL
----------

        21

-- The first 20 values were in the cache, but only the first two were actually used. 
-- When the instance got terminated, sequence values 3 through 20 were lost as they were in cache. 

So, we decided to use caching, considering the average modifications and sequence generation requests to the main table, we planned to go with 500 sequence to be cached that Oracle will pre-allocate and keep in the memory for faster access.

ALTER SEQUENCE TEST_SEQ cache 500; 

And yup, the issue got fixed as soon we made sufficient sequences numbers available in the cache and no more ‘row cache lock’ waits afterwards.

Hope It Helped!
Prashant Dixit

Advertisement

2 Responses to “Migrated to RAC and getting ‘row cache locks’ or ‘enq: SQ – contention’ ?”

  1. manikandan.ar said

    Very good catch. Really useful

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: