Are you suffering from excessive ‘cursor: mutex X’ & ‘cursor: mutex S’ waits after the upgrade ?

Posted by FatDBA on September 15, 2021

Hi Everyone,

Recently, I was contacted by one of my friend who was battling with some performance issues, since they moved from 12c to 19c. He was mostly strained about a particular problem with the new 19c database where he was getting excessive concurrency classed waits of “cursor: mutex X” (> 92% of the DB Time) and some “cursor: mutex S” events in the database. This was leading to frequent database hang instances.

As per the above snippet from AWR report for the period, ‘cursor: mutex X’ was consuming more than 170 ms per wait or an average wait time and was responsible for more than 91% of the total DB Time consumption.

Initially I though it was a case of classic hard parsing issue, as Cursor: mutex S wait usually occurs when Oracle is serializing parsing of multiple SQL statements. I mean there must be SQLs which are going through excessive hard parsing and has too many child cursors in the library cache. So, I immediately checked section ‘SQL Ordered by Version Count’ and saw one individual statement was there with 7,201 versions or Childs within a period of 2 hours.

Same was confirmed through ASH report too (see below pasted snippet). This particular SELECT statement was waiting on both on these two concurrency classed events specific to library cache.

I further drilled down on this issue to identify the cause behind this problem by querying V$SQL_SHARED_CURSOR (for reasons) to know why a particular child cursor was not shared with existing child cursors, and I was getting BIND_EQUIV_FAILURE as a reason. The database has the ACS(Adaptive Cursor Sharing) and CFB(Cardinality Feedback) enabled and seemed a ‘cursor leak’ issue.

I also noted huge sleeps for CP type mutexes on functions kkscsAddChildNode & kkscsPruneChild, below is the snippet from AWR, take a look at the first two in red.

And when I was about to prepare the strategy (i.e. specific plan purges etc.) to handle the situation, I thought to generate the hang analyze to identify if there are any known/familiar hang chains within stack traces. And I saw most of the chains running the same cursor from different processes waiting on ‘cursor: mutex X’ with below error stack … I mean there were multiple unique sessions waiting for a parent cursor mutex in exclusive mode on the same cursor under the following stack.

<-kgxExclusive<-kkscsAddChildNode<-kxscod<-kkscsCompareBinds<-kkscscid_bnd_eval<-kkscsCheckCriteria<-kkscsCheckCursor<-kkscsSearchChildList<-kksfbc<-

So, we had an error stack showing wait chains running the same cursor from different processes waiting on ‘cursor: mutex X’ and with BIND_EQUIV_FAILURE=Y in V$SQL_SHARED_CURSOR and CFB & ACS enabled, it was appearing that this was happening due to some bug.

Oracle support confirmed my doubt. They affirmed that this was happening all due to two unpublished bugs 28889389 and 28794230. For first one we need to apply patch 28889389, which has the optimized code for cursor mutex while searching the parent cursor for the match, for second one 28794230, they recommended few alternatives .given below …

_optimizer_use_feedback=false
_optimizer_adaptive_cursor_sharing=false
_optimizer_extended_cursor_sharing_rel=none

But even after setting above three undocumented parameters, which is to disable cardinality feedback and adaptive & extended cursor sharing, we only saw ~30% reduction in total waits. Later on they recommended us to apply the optimizer related bug fix control

_fix_control='23596611:OFF

and that completely resolved the issue.

Hope It Helped!
Prashant Dixit

This entry was posted on September 15, 2021 at 5:54 PM and is filed under Uncategorized. Tagged: Database, oracle, performance, troubleshooting. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Likes