Recently someone shared me an AWR report from a production 19c system, and he was very tensed about one of the strange looking wait event called ‘TCP Socket (KGAS)’. He was strained because the event was coming with a very high average wait time of 7863.68ms (7.86 seconds), and was consuming around 98.0% of the total DB Time.
Luckily I’d encountered something similar in the past for one of the customer, where the application team was unable to send the mail as DBMS scheduler, and it was stuck for a long time with wait event “TCP Socket(KGAS)” where problem was not with the scheduler, but was an underlying network or third-party application problem.
So, today’s post is all about the wait event, what it is, how to resolve it etc.
KGAS is a element in the server which handles TCP/IP sockets which is typically used in dedicated connections i.e. by some PLSQL built in packages such as UTL_HTTP and UTL_TCP. A session is waiting for an external host to provide requested data over a network socket. The time that this wait event tracks does not indicate a problem, and even a long wait time is not a reason to contact Oracle Support. It naturally takes time for data to flow between hosts over a network, and for the remote aspect of an application to process any request made to it. An application that communicates with a remote host must wait until the data it will read has arrived.
From an application/network point of view, delays in establishing a network connection may produce unwanted delays for users. We should make sure that the application makes network calls efficiently and that the network is working well such that these delays are minimized.
From the database point of view, these waits can safely be ignored; the wait event does not represent a database issue. It merely reports the total elapsed time for a network connection to be established or for data to arrive from over the network. The database waits for the connection to be established and reports the time taken. Its always good to check with the network or the third-party application vendors to investigate the underlying socket.
But in case of systemwide waits – If the TIME spent waiting for this event is significant then it is best to determine which sessions are showing the wait and drill into what those sessions are doing as the wait is usually related to whatever application code is executing eg: What part of the application may be using UTL_HTTP or similar and is experiencing waits. This statement can be used to see which sessions may be worth tracing
SELECT sid, total_waits, time_waited
FROM v$session_event WHERE event='TCP Socket (KGAS)' and total_waits>0 ORDER BY 3,2;
In order to reduce these waits or to help find the origin of the socket operations try:
Check the current SQL/module/action of V$SESSION for sessions that are waiting on the event at the time that they are waiting to try and identify any common area of application code waiting on the event.
Get an ERRORSTACK level 3 dump of some sessions waiting on the event. This should help show the exact PLSQL and C call stacks invoking the socket operation if the dump is taken when the session is waiting. Customers may need assistance from Oracle Support in order to get and interpret such a dump but it can help pinpoint the relevant application code.
Trace sessions incurring the waits including wait tracing to try and place the waits in the context of the code executing around the waits. eg: Use event 10046 level 8 or DBMS_MONITOR.SESSION_TRACE_ENABLE.
Use DBA_DEPENDENCIES to find any application packages which may ultimately be using UTL_HTTP or UTL_TCP underneath for some operation.
Example: Execute the following SQL from a session on a dedicated connection and then check the resulting trace file to see “TCP Socket (KGAS)” waits:
alter session set events '10046 trace name context forever, level 8';
Last week we faced an interesting issue with one of the production system which was recently migrated from Oracle 12.2. to 19.15. The setup was running on a VMWare machine with limited resources. It all started when the application team started reporting slowness in their daily scheduled jobs and other ad-hoc operations, when checked at the database layer, it was all ‘enq: CR – block range reuse ckpt‘ wait event. Same can be seen in the below ORATOP output, where the BLOCKER ID 3817 is the CKPT or the checkpoint process.
The strange part was, the blocker was CKPT process and it was all for a particular SQL ID (an INSERT operation), see below in next oratop screen fix.
As far as this wait event (other classed), This comes just before you delete or truncate a table, where we need a level segment checkpoint. This is because it must maintain the consistency of the blocks there may be in the buffer memory and what’s on the disc. As per the definition, this wait event happens due to contention on blocks caused by multiple processes trying to update same blocks. This may seem issues from the application logic resulting into this concurrency bottleneck, but interestingly this was happening on a simple INSERT operation, not a DELETE or TRUNCATE.
Same can be seen in the AWR and ASH reports too! There are CBC (Cache Buffer Chains) latching and latch free events too along with ‘enq: CR – block range reuse ckpt‘, but the initial focus was to understand the event and its reasons. As far as ‘direct path read temp‘ it was happening due to couple of SELECT statement which we resolved after attaching a better plan with the SQLs.
Wait event source (from ASH)
SQL Text was quite simple, an INSERT INTO statement.
I’ve tried first with the CKPT process traces, just to see if there is anything interesting or useful caught within the process logs or traces. The trace was short and has only got some strange and obscure looking content, but which at least gave us an idea that the reuse block checkpoint of the RBR was failed due to enqueue, and its entry was failed due to abandoned parent. Still, that doesn’t helped us anything, we were unsure about the reason.
--> Info in CKPT trace file ---> XXXXX_ckpt_110528.trc.
RBR_CKPT: adding rbr ckpt failed for 65601 due to enqueue
RBR_CKPT: rbr mark the enqueue for 65601 as failed
RBR_CKPT: adding ckpt entry failed due to abandoned parent 0x1b57b4a88
RBR_CKPT: rbr mark the enqueue for 65601 as failed
There were few things logged in the alert.log, multiple deadlocks (ORA 0060), too many parse errors for one SELECT statement and some checkpoint incomplete errors (log switching was high >35)
-- deadlocks in alert log.
Errors in file /opt/u01/app/oracle/diag/rdbms/xxxxxx/xxxx/trace/xxxx_ora_73010.trc:
2022-04-19T13:41:22.489551+05:30 ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
ORA-00060: Deadlock detected. See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors. More info in file /opt/u01/app/oracle/diag/rdbms/pwfmfsm/PWFMFSM/trace/PWFMFSM_ora_73010.trc.
-- From systemstatedumps
[Transaction Deadlock]
The following deadlock is not an ORACLE error. It is a ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
------------Blocker(s)----------- ------------Waiter(s)------------
Resource Name process session holds waits serial process session holds waits serial
TX-03AC0014-0000B33A-00000000-00000000 1562 691 X 1632 946 5441 X 56143
TX-01460020-0001A5C2-00000000-00000000 946 5441 X 56143 1562 691 X 1632
-- too many parse errors
2022-04-19T13:57:26.261176+05:30
WARNING: too many parse errors, count=2965 SQL hash=0x8ce1e2ff ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
PARSE ERROR: ospid=68036, error=942 for statement:
2022-04-19T13:57:26.272805+05:30
SELECT * FROM ( SELECT xxxxxxxxxxxxxxxxxxxxxx FROM xxxxxxxxxxxxxxxxxxxxxx AND xxxxxxxxxxxxxxxxxxxxxx ORDER BY xxxxxxxxxxxxxxxxxxxxxx ASC ) WHERE ROWNUM <= 750
Additional information: hd=0x546ba1be8 phd=0x61edab798 flg=0x20 cisid=113 sid=113 ciuid=113 uid=113 sqlid=ccmkzhy6f3srz
...Current username=xxxxxxxxxxxxxxxxxxxxxx
...Application: xxxxxxxxxxxxxxxxxxxxxx.exe Action:
-- Checkpoint incomplete
2022-04-19T15:03:16.964470+05:30 ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
Thread 1 cannot allocate new log, sequence 186456
Checkpoint not complete
Current log# 11 seq# 186455 mem# 0: +ONLINE_REDO/xxxxxxxxxxxxxxxxxxxxxx/ONLINELOG/group_11.256.1087565999
2022-04-19T15:03:17.785113+05:30
But the alert log was not sufficient to give us any concrete evidences or reasons for CKPT bloking user sessions. So, next we decided to generate the HANGANALYZE and SYSTEMSTATEDUMPs to understand what’s all happening under the hood, through its wait chains. We noticed few interesting things there
Wait chain 1 where a session was waiting on ‘buffer busy waits‘ while doing the “bitmapped file space header” which talks about the space management blocks (nothing to with bitmap indexes) and was related with one SELECT statement.
Wait chain 2 where a session was found waiting on ‘enq: CR – block range reuse ckpt‘ event and was blocked by CKPT process (3817) which was further waiting on ‘control file sequential read‘
Wait chain 4 where SID 1670, was found waiting on ‘buffer busy waits‘ while doing ‘bitmapped file space header’ operations.
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 46794
process id: 2014, oracle@monkeymachine.prod.fdt.swedish.se
session id: 15
session serial #: 19322
module name: 0 (xxxx.exe)
}
is waiting for 'buffer busy waits' with wait info:
{
p1: 'file#'=0xca
p2: 'block#'=0x2
p3: 'class#'=0xd ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄ bitmapped file space header
time in wait: 2 min 20 sec
timeout after: never
wait id: 10224
blocking: 0 sessions
current sql_id: 1835193535
current sql: SELECT * FROM ( SELECT xxxxx FROM task JOIN request ON xxxxx = xxxxx JOIN xxxxx ON xxxxx = xxxxx JOIN c_task_assignment_view ON xxxxx = xxxxx
.
.
.
and is blocked by
=> Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 23090
process id: 3231, oracle@monkeymachine.prod.fdt.swedish.se
session id: 261
session serial #: 39729
module name: 0 (xxxx.exe)
}
which is waiting for 'buffer busy waits' with wait info:
{
p1: 'file#'=0xca
p2: 'block#'=0x2
p3: 'class#'=0xd ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄ bitmapped file space header
time in wait: 18.999227 sec ◄◄◄
timeout after: never
wait id: 47356
blocking: 25 sessions ◄◄◄
current sql_id: 0
current sql: <none>
short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+223<-__sighandler()<-read()+14<-snttread()+16<-nttfprd()+354<-nsbasic_brc()+399<-nioqrc()+438<-opikndf2()+999<-opitsk()+910<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
wait history:
* time between current wait and wait #1: 0.000084 sec
1. event: 'SQL*Net message from client'
time waited: 0.008136 sec
wait id: 47355 p1: 'driver id'=0x28444553
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000043 sec
2. event: 'SQL*Net message to client'
time waited: 0.000002 sec
wait id: 47354 p1: 'driver id'=0x28444553
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000093 sec
3. event: 'SQL*Net message from client'
time waited: 2.281674 sec
wait id: 47353 p1: 'driver id'=0x28444553
p2: '#bytes'=0x1
}
and may or may not be blocked by another session.
.
.
.
Chain 2:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 10122
process id: 2850, oracle@monkeymachine.prod.fdt.swedish.se
session id: 97
session serial #: 47697
module name: 0 (xxxx.exe)
}
is waiting for 'enq: CR - block range reuse ckpt' with wait info: ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
{
p1: 'name|mode'=0x43520006
p2: '2'=0x10b22
p3: '0'=0x1
time in wait: 2.018004 sec
timeout after: never
wait id: 81594
blocking: 0 sessions
current sql_id: 1335044282
current sql: <none>
.
.
.
and is blocked by
=> Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 110528
process id: 24, oracle@monkeymachine.prod.fdt.swedish.se
session id: 3817 ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄ CKPT background process
session serial #: 46215
}
which is waiting for 'control file sequential read' with wait info: ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
{
p1: 'file#'=0x0
p2: 'block#'=0x11e
p3: 'blocks'=0x1
px1: 'disk number'=0x4
px2: 'au'=0x34
px3: 'offset'=0x98000
time in wait: 0.273981 sec
timeout after: never
wait id: 17482450
blocking: 45 sessions ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
current sql_id: 0
current sql: <none>
short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+223<-__sighandler()<-semtimedop()+10<-skgpwwait()+192<-ksliwat()+2199<-kslwaitctx()+205<-ksarcv()+320<-ksbabs()+602<-ksbrdp()+1167<-opirip()+541<-opidrv()+581<-sou2o()+165<-opimai_real()+173<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
wait history:
* time between current wait and wait #1: 0.000012 sec
1. event: 'control file sequential read'
time waited: 0.005831 sec
wait id: 17482449 p1: 'file#'=0x0
p2: 'block#'=0x11
p3: 'blocks'=0x1
* time between wait #1 and #2: 0.000012 sec
2. event: 'control file sequential read'
time waited: 0.011667 sec
wait id: 17482448 p1: 'file#'=0x0
p2: 'block#'=0xf
p3: 'blocks'=0x1
* time between wait #2 and #3: 0.000017 sec
3. event: 'control file sequential read'
time waited: 0.009160 sec
wait id: 17482447 p1: 'file#'=0x0
p2: 'block#'=0x1
p3: 'blocks'=0x1
}
.
.
.
Chain 4:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 46479
process id: 1036, oracle@monkeymachine.prod.fdt.swedish.se
session id: 1670
session serial #: 6238
module name: 0 (xxxx.exe)
}
is waiting for 'buffer busy waits' with wait info:
{
p1: 'file#'=0xca
p2: 'block#'=0x2
p3: 'class#'=0xd ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄ bitmapped file space header
time in wait: 18.954206 sec ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
timeout after: never
wait id: 20681
blocking: 0 sessions
current sql_id: 343919375
current sql: SELECT * FROM ( SELECT xxxxx FROM task JOIN request ON xxxxx = xxxxx JOIN xxxxx ON xxxxx = xxxxx JOIN c_task_assignment_view ON xxxxx = xxxxx
.
.
.
and is blocked by
=> Oracle session identified by:
{
instance: 1 (kpkpkpkpk.kpkpkpkpk)
os id: 44958
process id: 523, oracle@monkeymachine.prod.fdt.swedish.se
session id: 4681
session serial #: 41996
module name: 0 (xxxx.exemonkeymachine.prod.fdt.swedish.se (TNS)
}
which is waiting for 'buffer busy waits' with wait info: ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
{
p1: 'file#'=0xca
p2: 'block#'=0x2
p3: 'class#'=0xd ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄ bitmapped file space header
time in wait: 18.959429 sec ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
timeout after: never
wait id: 153995
blocking: 101 sessions ◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄◄
current sql_id: 343919375
current sql: INSERT INTO xx_xxx_xx(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) VALUES (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+223<-__sighandler()<-semtimedop()+10<-skgpwwait()+192<-ksliwat()+2199<-kslwaitctx()+205<-ksqcmi()+21656<-ksqcnv()+809<-ksqcov()+95<-kcbrbr_int()+2476<-kcbrbr()+47<-ktslagsp()+2672<-ktslagsp_main()+945<-kdliAllocCache()+37452<-kdliAllocBlocks()+1342<-kdliAllocChunks()+471<-kdliWriteInit()+1249<-kdliWriteV()+967<-kdli_fwritev()+904<-kdlxNXWrite()+577<-kdlx_write()+754<-kdlxdup_write1()+726<-kdlwWriteCallbackOld_pga()+1982<-kdlw_write()+1321<-kdld_write()+410<-kdl
wait history:
* time between current wait and wait #1: 0.756195 sec
1. event: 'direct path write temp'
time waited: 0.406543 sec
wait id: 153994 p1: 'file number'=0xc9
p2: 'first dba'=0xb28fc
p3: 'block cnt'=0x4
* time between wait #1 and #2: 0.000001 sec
2. event: 'ASM IO for non-blocking poll'
time waited: 0.000000 sec
wait id: 153993 p1: 'count'=0x4
p2: 'where'=0x2
p3: 'timeout'=0x0
* time between wait #2 and #3: 0.000002 sec
3. event: 'ASM IO for non-blocking poll'
time waited: 0.000001 sec
wait id: 153992 p1: 'count'=0x4
p2: 'where'=0x2
p3: 'timeout'=0x0
}
and may or may not be blocked by another session.
.
.
Though, we wanted to try couple of hidden parameters to enable fast object level truncation and checkpointing, as they had helped us a lot in the past in similar scenarios, but had to involve Oracle support and after carefully analyzing the situation, they too agreed and want us to try them as they started suspecting it as an aftermath of a known bug of 19c.
[oracle@oracleontario ~]$ !sql
sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Sun Apr 24 00:24:47 2022
Version 19.15.0.0.0
Copyright (c) 1982, 2022, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0
SQL>
SQL> @hidden
Enter value for param: db_fast_obj_truncate
old 5: and a.ksppinm like '%¶m%'
new 5: and a.ksppinm like '%db_fast_obj_truncate%'
Parameter Session Value Instance Value descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_db_fast_obj_truncate TRUE TRUE enable fast object truncate
_db_fast_obj_ckpt TRUE TRUE enable fast object checkpoint
SQL> ALTER SYSTEM SET "_db_fast_obj_truncate" = false sid = '*';
System altered.
SQL>
SQL> ALTER SYSTEM SET "_db_fast_obj_ckpt" = false sid = '*';
System altered.
SQL>
And soon after setting them, we saw a drastic drop in the waits and system seemed better, much better. But it all happened during an off-peak hour, so there wasn’t much of a workload to see anything odd.
And as we suspected, the issue repeated itself, and next day during peak business hours we started seeing the same issue, same set of events back into existence. This time the ‘latch: cache buffers chains‘ was quite high and prominent which was not that much earlier.
Initially we tried to fix some of the expensive statements on ‘logical IOs’ or memory reads, but that hardly helped. The issue persisted even after setting a higher value for db_block_hash_latches and decreasing cursor_db_buffers_pinned. AWR continues to show ‘latch: cache buffers chains’ in the top ten, foreground timed events, and ‘latch free‘ in first place in the background timed events. Oracle support confirmed the behavior was due to published bug 33025005 where excessive Latch CBC waits were seen after upgrading from 12c to 19c, and suggested to apply patch 33025005 and then to set hidden parameter “_post_multiple_shared_waiters” to value FALSE (in MEMORY only to test) which disables multiple shared waiters in the database.
-- After applying patch 33025005
SQL> ALTER SYSTEM SET "_post_multiple_shared_waiters" = FALSE SCOPE = MEMORY;
System altered.
And even after applying the patch and setting the recommended undocumented parameter, the issue persisted and we were totally clueless.
And as a last resort, we tried to flush the buffer cache of the database, and bingo that crude method of purging the cache helped to drastically to reduce the load on ‘CBC Latching‘ and for ‘enq: cr block range reuse ckpt‘, and the system ran fine soon after the flush of the DB Buffer cache.
So, nothing worked for us, we changed multiple checkpointing and shared writers related parameters in the database, applied a bug fix patch (33025005), but of no use. Finally, the flush of buffer cache worked for us! Oracle support agreed that this was happening due to a new/unpublished bug (33125873 or 31844316) which is not yet fixed in 19.15 and will be included in 23.1, and they are in status 11 that means still being worked by Development so there is no fix for it yet.
Recently while looking into a system (was running on 19.3.0.0.0 standalone) where ‘log file sync’ was bugging the database, and after we tried all other possible solutions, we thought to increase the priroty of the LGWR background process to see if that helps.
Increasing the LGWR priority is putting the LGWR process in the Round-Robin (SCHED_RR) class. You can increase process’s priority both using OS (renice, nice commands) or Database methods, but this post is about setting the priority using ‘_high_priority_process’ an undocumented/hidden parameter that prioritizes your database managed processes.
I am using Oracle 19.3 for the test where the LGWR is not by default comes with any priority in the DB, starting from 21.3.0.0.0 LGWR process is part of _high_priority_processes group along with VKTM & LMS* processes. Note: This being a hidden/undocumented parameter I advise to consult with Oracle support before going and changing the parameter value. Try other possible ways to reduce log file sync, before jumping into this crude method of prioritizing LGWR over others.
[oracle@oracleontario ~]$ !sql
sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Sun Apr 10 03:36:06 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> @hidden
Enter value for param: high_prio
old 5: and a.ksppinm like '%¶m%'
new 5: and a.ksppinm like '%priority_processes%'
Parameter Session Value Instance Value descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_highest_priority_processes VKTM VKTM Highest Priority Process Name Mask
_high_priority_processes LMS*|VKTM LMS*|VKTM High Priority Process Name Mask
And by default in Oracle version 19.3.0.0 the parameter is set to prioritize VKTM (Virtual keeper of time) and LMS (Lock Manager, a RAC process). Let me check VKTM’s current priority class, and it is set to RR class (SCHED_RR scheduling class) for the process as its defined via _high_priority_processes parameter.
Let’s change the priority and reboot the database to persistent the change!
SQL> alter system set "_high_priority_processes"='LMS*|VKTM|LGWR' scope=spfile;
System altered.
SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1593831936 bytes
Fixed Size 8897024 bytes
Variable Size 1107296256 bytes
Database Buffers 469762048 bytes
Redo Buffers 7876608 bytes
Database mounted.
Database opened.
SQL> @hidden
Enter value for param: high_priority_processes
old 5: and a.ksppinm like '%¶m%'
new 5: and a.ksppinm like '%high_priority_processes%'
Parameter Session Value Instance Value descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_high_priority_processes LMS*|VKTM|LGWR LMS*|VKTM|LGWR High Priority Process Name Mask
At the same time I can see the same was logged into the Alert log file.
2022-04-10T03:54:31.488732-04:00
LGWR started with pid=8, OS id=26058 at elevated (RT) priority
So, we have reniced the priority of LGWR on the system, I mean the higher value of priority actually makes the process lower priority; it means the process demands fewer system resources (and therefore is a “nicer” process). Now lets check the scheduling class of the process at the OS, it should be now changed to RR from TS.
Last week I was part of one system/database stability short term assignment where customer running a critical Telco application on 2-Node RAC Cluster (RHEL) on 11.2.0.3.1 reported slowness in few of the critical data processing modules, which in turn had slowed down their entire system. Todays post is about buggy behavior of an adaptive feature which had caused a huge mess in the system in the form of ‘Log File Sync‘ waits.
After few of the initial calls with the customer, we come to know that the database has a history of high ‘log file sync‘ waits, but they simply come and go, and it seemed that the events were never handled correctly or being analyzed! And interestingly, restarting the cluster resolves the issue for few hours, sometimes for few days. LFS event was quite prominently seen in this database and on an average found consuming > 38% of the total DB Time% available with very high average wait times (248 ms).
Below are few of the database statistics captured from the problem time.
About the event, When a user session commits, all redo records generated by that session’s transaction need to be flushed from memory to the redo logfile to insure changes to the database made by that transaction become permanent. The time between the user session posting the LGWR and the LGWR posting the user after the write has completed is the wait time for ‘log file sync’ that the user session will show. This event is known as the time lost as a result of the LGWR process waiting while users initiate a Transaction Commit or Rollback.
Next in order to get more idea about the event, I ran the lfsdiag.sql (Oracle provided script to catch diag info on it – Doc ID 1064487.1)). The script will look at the important parameters involved in log file sync waits, wait histogram data, and at the worst average LFS times in the active session history data and AWR data and dump information to help determine why those times were the highest.
Below are the ASH LFS background process waits caught during its worst minute and it had some really bad stats captured for the worst minute for the same time when customer had worst application performance.
Another intersting section was the Histogram data for LFS and other related events. Here was can see the LFS waits at “wait_time_milli” and specially the high wait times to correlate them with other wait events. From below stats its evident that Node 2 of this RAC cluster was severely impacted with the LFS waits., with very high wait counts and wait times (ms), and lot of ‘gcs log flush sync’ along with LFS events that pushes LGWR process to write data to the disk.
The stats are very bad, with highest wait time of 1048576 ms on Node 2 and average of 99864 ms (1.6 mins), both ‘log file parallel write’ and ‘gcs log flush sync’ were quite high too.
As a possible solution, we tried few of the quick ones i.e. batching redo (commit_logging = batch) for the log writer (I know it has its own risks) to reduce LFS, but that didn’t worked either.
Next thing I’d generated the system state dump to understand the system and situation bettern, and the LFS events were caught in the system state dump as well with wait chains pointing to ‘rdbms ipc message'<=’log file sync’.
Process traces are always considered a wealth of diagnostic information, So I’d checked the LGWR process traces and thats where I saw some strangeness with frequent entries related with switching between post/wait and polling method which is an adaptive way to control switching between post/wait (older way) and polling (new method) for log file syncs. This gave me little hint about the possible reason on why so many LFS waits.
Talking about the adaptive log file sync, there are 2 methods by which LGWR and foreground processes can communicate in order to acknowledge that a commit has completed: Post/wait: traditional method available in previous Oracle releases LGWR explicitly posts all processes waiting for the commit to complete. The advantage of the post/wait method is that sessions should find out almost immediately when the redo has been flushed to disk. Polling: Foreground processes sleep and poll to see if the commit is complete, this was introduced to free high CPU usage by the LGWR.
This behavior is controlled by the parameter “_use_adaptive_log_file_sync” and was introduced in 11gR2 and controls whether adaptive switching between post/wait and polling is enabled. In 11.2.0.1 and 11.2.0.2 the default value for the parameter is false. From 11.2.0.3, the default value has been changed to true.
-- LGWR traces
WARNING:io_submit failed due to kernel limitations MAXAIO for process=128 pending aio=128
WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=1048576 AIO-NR=169402
*** 2022-04-07 06:03:31.916
Warning: log write broadcast wait time 2612477ms (SCN 0xad1.f8c170fe)
*** 2022-04-07 06:03:31.916
Warning: log write broadcast wait time 2598008ms (SCN 0xad1.f8c21251)
kcrfw_update_adaptive_sync_mode: post->poll long#=33 sync#=202 sync=5963 poll=8730 rw=383 rw+=383 ack=3982 min_sleep=1135
*** 2022-04-07 07:46:20.018
Log file sync switching to polling --------------------->>>>>>> It shows current method is polling
Current scheduling delay is 1 usec
Current approximate redo synch write rate is 67 per sec
*** 2022-04-07 07:47:13.877
kcrfw_update_adaptive_sync_mode: poll->post current_sched_delay=0 switch_sched_delay=1 current_sync_count_delta=63 switch_sync_count_delta=202
*** 2022-04-07 07:47:13.877
Log file sync switching to post/wait ------------------>>>>>>>> It shows current method is post/wait
Current approximate redo synch write rate is 21 per sec
-- Below stats shows that the POLLING is happening on the database for LGWR wrtes
SQL> select name,value from v$sysstat where name in ('redo synch poll writes','redo synch polls');
NAME VALUE
---------------------------------------------------------------- ----------
redo synch poll writes 10500129
redo synch polls 10773618
As there were no other symptoms of issues with I/O or in other areas, the problem could be with excessive switching between post/wait and polling wait methods.
After consulting with Oracle support, we found the issue was happening due to BUG (25178179) and the issue gets severe when only while using log file sync “polling mode“. To prevent the problem from happening, they suggested us to turn off the _use_adaptive_log_file_sync by setting it to FALSE in either in the spfile and restarting the database or dynamically in memory. This will force the log file sync waits to use (the traditional) “post/wait mode” rather than the automatically switching between “post/wait mode” and “polling mode” based on performance statistics. Changing the said parameter to FALSE disabled adaptive LFS in the database and that resolved the issue and system performance was restored.
-- Set the parameter _use_adaptive_log_file_sync = false and restart the database:
SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync" = FALSE;
-- In cases where a restart is not feasible, then you can set in memory and also in the SP file for when a restart does occur:
SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE; -- to set in memory and spfile as by default scope=both
SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE scope=sfile sid='*'; -- to set on spfile, so parameter is used at next restart.
Though In the vast majority of cases adaptive log file sync improves the overall performance of log file synchronization, but there are few bugs associated with this feature i.e. 13707904, 13074706 and 25178179.
‘Outline Data’ section displays the list of hints that would be needed to replicate the execution plan, even if the statistics change., but in case of a complex execution plans it comes with lot of strange looking and obscure terms used. As there isn’t any published explanation of stored outline hints, so today’s post is to decipher few of the terms that you see in an outline data of an execution plan.
Okay, coming back to the post, let me quickly generate the outline data for one of the test SQL and will try to explain about each of the hints, query blocks, aliases and other representations used.
SQL>
SQL> explain plan for
SELECT d.department_name,e.employee_name
FROM departments d
LEFT OUTER JOIN employees e ON d.department_id = e.department_id
WHERE d.department_id >= 30
ORDER BY d.department_name, e.employee_name;
Explained.
SQL> select * from table(dbms_xplan.display('PLAN_TABLE',NULL,'+alias +outline'));
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
Plan hash value: 3871261979
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 168 | 7 (15)| 00:00:01 |
| 1 | SORT ORDER BY | | 4 | 168 | 7 (15)| 00:00:01 |
|* 2 | HASH JOIN OUTER | | 4 | 168 | 6 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| DEPARTMENTS | 2 | 44 | 3 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| EMPLOYEES | 6 | 120 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$2BFA4EE4
3 - SEL$2BFA4EE4 / D@SEL$1
4 - SEL$2BFA4EE4 / E@SEL$1
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_HASH(@"SEL$2BFA4EE4" "E"@"SEL$1")
LEADING(@"SEL$2BFA4EE4" "D"@"SEL$1" "E"@"SEL$1")
FULL(@"SEL$2BFA4EE4" "E"@"SEL$1")
FULL(@"SEL$2BFA4EE4" "D"@"SEL$1")
OUTLINE(@"SEL$1")
OUTLINE(@"SEL$2")
ANSI_REARCH(@"SEL$1")
OUTLINE(@"SEL$8812AA4E")
ANSI_REARCH(@"SEL$2")
OUTLINE(@"SEL$948754D7")
MERGE(@"SEL$8812AA4E" >"SEL$948754D7")
OUTLINE_LEAF(@"SEL$2BFA4EE4")
ALL_ROWS
DB_VERSION('19.1.0')
OPTIMIZER_FEATURES_ENABLE('19.1.0')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID"(+))
3 - filter("D"."DEPARTMENT_ID">=30)
4 - filter("E"."DEPARTMENT_ID"(+)>=30)
As per the above outline data captured for the execution plan, the very first entry is USE_HASH(@”SEL$2BFA4EE4″ “E”@”SEL$1”) This represents the use of HASH join methods used in the query. Other join method outline options you might see are USE_NL and USE_MERGE. Here in our example we’ve used the LEFT OUTER JOIN which will return all valid rows from the table on the left side of the JOIN keyword, that’s table EMPLOYEE (alias ‘E’) along with the values from the table on the right side.
i.e LEFT OUTER JOIN employees e ON d.department_id = e.department_id
LEADING(@”SEL$2BFA4EE4″ “D”@”SEL$1” “E”@”SEL$1”) The LEADING hints specifies the exact join order for the SQL that is followed by the optimizer in the plan. It displays the join order as an ordered list of table aliases and query block names. The aliases appears in the ordering in which we access those tables in the query.
The first part of the hint @”SEL$2BFA4EE4″ represents the main SELECT block and if you closely see the full hint definition you will see “D”@”SEL$1”, where “SEL$1” is the query block name for table alias ‘D’ for Department table, which is followed by “E”@”SEL$1”, that represents query block name “SEL$1” on second table joined ‘E’ for Employee table.
FULL(@”SEL$2BFA4EE4″“E”@”SEL$1”) & FULL(@”SEL$2BFA4EE4″ “D”@”SEL$1”) are what you see next and these two hints for FULL TABLE SCANS on table alias “E”, that is EMPLOYEE table, followed by FTS on table alias “D”, that is DEPARTMENT table. Same you can see in the execution plan too.
OUTLINE(@”SEL$1″) & OUTLINE(@”SEL$2″) These two OUTLINE hints correspond to initial and the intermediate query blocks.
ANSI_REARCH(@”SEL$1″) This is the hint that instructs the optimizer to re-architecture of ANSI left, right, and full outer joins. In our case this was to re-arch the left outer join.
MERGE(@”SEL$8812AA4E” >”SEL$948754D7″) is the MERGE query block hint. SEL$8812AA4E and SEL$948754D7 are the transformed query blocks.
OUTLINE_LEAF(@”SEL$2BFA4EE4″) This hint builds an outline leaf for the specified query block. In our example it represents that the query block SEL$2BFA4EE4 is a “final” query block that has actually been subject to independent optimization. Outline leaf cannot be transformed. You can see multiple outline_leaf hints for the query blocks.
IGNORE_OPTIM_EMBEDDED_HINTS is a special hint instructs the CBO to ignore most of all the other supplied hints.
ALL_ROWS This hint instructs to optimize the query/statement block for best throughput with lowest resource utilization.
DB_VERSION(‘19.1.0’) & OPTIMIZER_FEATURES_ENABLE(‘19.1.0’) these two hints allows for the CBO to process the SQL on the said version of the database, 19.1.0 in our case.
Few others that you might see in OUTLINE DATA sections and are easy to decipher are INDEX_RS_ASC (Index Range Scan in ascending order) and it happens when the INDEX RANGE SCAN is used as an access path/method in SQL execution plan. Few others that you can see in case of Nested Loop Joins are NLJ_BATCHING which happens when Oracle batches multiple physical I/O requests and process them using a vector I/O (array) instead of processing them one at a time, and batching improves performance because it uses asynchronous reads.. Few others that you can see in case of NL’s are USE_NL, NLJ_PREFETCH etc.
INDEX_FFS in case of Index Fast Full Scans and the list is long …
Today’s post if about a brilliant tool/script that I frequently use to get the outputs/graphs for customer meetings, presentations and for quick analysis of any performance problem …
This is one of the awesome Oracle provided tool/script that generates a heat map of latency times for a given wait event. The script takes the output from DBA_HIST_EVENT_HISTOGRAM and produces a heat map and a JPG version of the heat map is also produced representing the wait time latency for the given wait event over a specific time frame identified by range of snapshots.
Source : Script to Display Latency Wait Time From DBA_HIST_EVENT_HISTOGRAM Using Heat Map (Doc ID 1931492.1)
Note: This script queries ASH views, specially DBA_HIST_EVENT_HISTOGRAM which requires license as its part of the Diagnostics Pack on Oracle EE
The scripts takes the following input parameters:
Snapshot id range (first snapshot id and last snapshot id)
Recently I had to part-time support a system where customer reported slowness issues with few of the critical SQLs which uses few IN-Memory tables and were impacted exactly since they moved from Oracle 12.1 to 12.2. We tried few things but none of them worked, and with that I started suspecting if the change in CBO version from 12.1 to 12.2 the reason behind those problems with SQLs.
So, today’s post will give you some idea about such scenarios when you’re not sure about the things that broke the performance after the upgrade, and you have to do the try and error approach to find out the cause. So, I decided to give it a try with one of those SQLs to set the OFE (Optimizer Feature Enable) to earlier DB version 12.1 and verify query performance. So I started my test with the optimizer_features_enable parameter to set it to the older version.
SQL> alter session set optimizer_features_enable='12.1.0.2';
And the query ran fine, At this stage we had some sort of understanding of the problem as it was clear that some changes in 12.2 that has trigger this problem. But now I have to identify among all those hidden/underscore parameters and fix control’s that one final parameter which is influencing the optimizer for that odd behavior. So, I started something like this … We extract all the fixes and underscore parameters introduced in oracle 12.2 and I started with the hidden Parameters followed by Fix Control settings in 12.2.0.1 Optimizer environment to come to 12.1.0.2 OFE level
-- Hidden parameters added into Oracle version 12.2
-- Setting them back to Oracle version 12.1
alter session set "_optimizer_undo_cost_change"="12.1.0.2";
alter session set "_optimizer_cbqt_or_expansion"=off;
alter session set "_optimizer_ads_use_partial_results"=false;
alter session set "_query_rewrite_use_on_query_computation"=false;
alter session set "_px_scalable_invdist_mcol"=false;
alter session set "_optimizer_eliminate_subquery"=false;
alter session set "_sqlexec_hash_based_distagg_ssf_enabled"=false;
alter session set "_optimizer_union_all_gsets"=false;
alter session set "_optimizer_enhanced_join_elimination"=false;
alter session set "_optimizer_multicol_join_elimination"=false;
alter session set "_key_vector_create_pushdown_threshold"=0;
alter session set "_optimizer_enable_plsql_stats"=false;
alter session set "_recursive_with_parallel"=false;
alter session set "_recursive_with_branch_iterations"=1;
alter session set "_px_dist_agg_partial_rollup_pushdown"=off;
alter session set "_optimizer_key_vector_pruning_enabled"=false;
alter session set "_pwise_distinct_enabled"=false;
alter session set "_vector_encoding_mode"=off;
alter session set "_ds_xt_split_count"=0;
alter session set "_ds_sampling_method"=NO_QUALITY_METRIC;
alter session set "_optimizer_ads_use_spd_cache"=false;
alter session set "_optimizer_use_table_scanrate"=OFF;
alter session set "_optimizer_use_xt_rowid"=false;
alter session set "_xt_sampling_scan_granules"=off;
alter session set "_optimizer_band_join_aware"=false;
alter session set "_optimizer_vector_base_dim_fact_factor"=0;
alter session set "_ds_enable_view_sampling"=false;
alter session set "_optimizer_inmemory_use_stored_stats"=NEVER;
alter session set "_mv_access_compute_fresh_data"=off;
alter session set "_bloom_filter_ratio"=30;
alter session set "_optimizer_control_shard_qry_processing"=65535;
alter session set "_optimizer_interleave_or_expansion"=false;
Placed all above underscore parameters with the query and ran it again and observed the benefits (make sure to flush the previous plans from the shared pool). In my case this actually worked, but question was which underscore parameter helped. Do same with all fixes (controls).
-- Fix Controls added in Oracle version 12.2
-- Setting them back to Oracle version 12.1
alter session set "_fix_control"="16515789:0";
alter session set "_fix_control"="17491018:0";
alter session set "_fix_control"="17986549:0";
alter session set "_fix_control"="18115594:0";
alter session set "_fix_control"="18182018:0";
alter session set "_fix_control"="18302923:0";
alter session set "_fix_control"="18377553:0";
alter session set "_fix_control"="5677419:0";
alter session set "_fix_control"="18134680:0";
alter session set "_fix_control"="18636079:0";
alter session set "_fix_control"="18415557:0";
alter session set "_fix_control"="18385778:0";
alter session set "_fix_control"="18308329:0";
alter session set "_fix_control"="17973658:0";
alter session set "_fix_control"="18558952:0";
alter session set "_fix_control"="18874242:0";
alter session set "_fix_control"="18765574:0";
alter session set "_fix_control"="18952882:0";
alter session set "_fix_control"="18924221:0";
alter session set "_fix_control"="18422714:0";
alter session set "_fix_control"="18798414:0";
alter session set "_fix_control"="18969167:0";
alter session set "_fix_control"="19055664:0";
alter session set "_fix_control"="18898582:0";
alter session set "_fix_control"="18960760:0";
alter session set "_fix_control"="19070454:0";
alter session set "_fix_control"="19230097:0";
alter session set "_fix_control"="19063497:0";
alter session set "_fix_control"="19046459:0";
alter session set "_fix_control"="19269482:0";
alter session set "_fix_control"="18876528:0";
alter session set "_fix_control"="19227996:0";
alter session set "_fix_control"="18864613:0";
alter session set "_fix_control"="19239478:0";
alter session set "_fix_control"="19451895:0";
alter session set "_fix_control"="18907390:0";
alter session set "_fix_control"="19025959:0";
alter session set "_fix_control"="16774698:0";
alter session set "_fix_control"="19475484:0";
alter session set "_fix_control"="19287919:0";
alter session set "_fix_control"="19386746:0";
alter session set "_fix_control"="19774486:0";
alter session set "_fix_control"="18671960:0";
alter session set "_fix_control"="19484911:0";
alter session set "_fix_control"="19731940:0";
alter session set "_fix_control"="19604408:0";
alter session set "_fix_control"="14402409:0";
alter session set "_fix_control"="16486095:0";
alter session set "_fix_control"="19563657:0";
alter session set "_fix_control"="19632232:0";
alter session set "_fix_control"="19889960:0";
alter session set "_fix_control"="17208933:0";
alter session set "_fix_control"="19710102:0";
alter session set "_fix_control"="18697515:0";
alter session set "_fix_control"="18318631:0";
alter session set "_fix_control"="20078639:0";
alter session set "_fix_control"="19503668:0";
alter session set "_fix_control"="20124288:0";
alter session set "_fix_control"="19847091:0";
alter session set "_fix_control"="12618642:0";
alter session set "_fix_control"="19779920:0";
alter session set "_fix_control"="20186282:0";
alter session set "_fix_control"="20186295:0";
alter session set "_fix_control"="20265690:0";
alter session set "_fix_control"="16047938:0";
alter session set "_fix_control"="19507904:0";
alter session set "_fix_control"="18915345:0";
alter session set "_fix_control"="20329321:0";
alter session set "_fix_control"="20225191:0";
alter session set "_fix_control"="18776755:0";
alter session set "_fix_control"="19882842:0";
alter session set "_fix_control"="20010996:0";
alter session set "_fix_control"="20379571:0";
alter session set "_fix_control"="20129763:0";
alter session set "_fix_control"="19899588:0";
alter session set "_fix_control"="10098852:0";
alter session set "_fix_control"="19663421:0";
alter session set "_fix_control"="20465582:0";
alter session set "_fix_control"="16732417:0";
alter session set "_fix_control"="20732410:0";
alter session set "_fix_control"="20289688:0";
alter session set "_fix_control"="20543684:0";
alter session set "_fix_control"="20506136:0";
alter session set "_fix_control"="20830312:0";
alter session set "_fix_control"="19768896:0";
alter session set "_fix_control"="19814541:0";
alter session set "_fix_control"="17443547:0";
alter session set "_fix_control"="19123152:0";
alter session set "_fix_control"="19899833:0";
alter session set "_fix_control"="20754928:0";
alter session set "_fix_control"="20808265:0";
alter session set "_fix_control"="20808192:0";
alter session set "_fix_control"="20340595:0";
alter session set "_fix_control"="18949550:0";
alter session set "_fix_control"="14775297:0";
alter session set "_fix_control"="17497847:0";
alter session set "_fix_control"="20232513:0";
alter session set "_fix_control"="20587527:0";
alter session set "_fix_control"="19186783:0";
alter session set "_fix_control"="19653920:0";
alter session set "_fix_control"="21211786:0";
alter session set "_fix_control"="21057343:0";
alter session set "_fix_control"="21503478:0";
alter session set "_fix_control"="21476032:0";
alter session set "_fix_control"="20859246:0";
alter session set "_fix_control"="21639419:0";
alter session set "_fix_control"="20951803:0";
alter session set "_fix_control"="21683982:0";
alter session set "_fix_control"="20216500:0";
alter session set "_fix_control"="20906162:0";
alter session set "_fix_control"="20854798:0";
alter session set "_fix_control"="21509656:0";
alter session set "_fix_control"="21833220:0";
alter session set "_fix_control"="21802552:0";
alter session set "_fix_control"="21452843:0";
alter session set "_fix_control"="21800590:0";
alter session set "_fix_control"="21273039:0";
alter session set "_fix_control"="16750133:0";
alter session set "_fix_control"="22013607:0";
alter session set "_fix_control"="22152372:0";
alter session set "_fix_control"="22077191:0";
alter session set "_fix_control"="22123025:0";
alter session set "_fix_control"="16913734:0";
alter session set "_fix_control"="8357294:0";
alter session set "_fix_control"="21979983:0";
alter session set "_fix_control"="22158526:0";
alter session set "_fix_control"="21971099:0";
alter session set "_fix_control"="22090662:0";
alter session set "_fix_control"="21300129:0";
alter session set "_fix_control"="21339278:0";
alter session set "_fix_control"="20270511:0";
alter session set "_fix_control"="21424812:0";
alter session set "_fix_control"="22114090:0";
alter session set "_fix_control"="22159570:0";
alter session set "_fix_control"="22272439:0";
alter session set "_fix_control"="22372694:0";
alter session set "_fix_control"="22514195:0";
alter session set "_fix_control"="22520315:0";
alter session set "_fix_control"="22649054:0";
alter session set "_fix_control"="8617254:0";
alter session set "_fix_control"="22020067:0";
alter session set "_fix_control"="22864730:0";
alter session set "_fix_control"="21099502:0";
alter session set "_fix_control"="22904304:0";
alter session set "_fix_control"="22967807:0";
alter session set "_fix_control"="22879002:0";
alter session set "_fix_control"="23019286:0";
alter session set "_fix_control"="22760704:0";
alter session set "_fix_control"="20853506:0";
alter session set "_fix_control"="22513493:0";
alter session set "_fix_control"="22518491:0";
alter session set "_fix_control"="23103096:0";
alter session set "_fix_control"="22143411:0";
alter session set "_fix_control"="23180670:0";
alter session set "_fix_control"="23002609:0";
alter session set "_fix_control"="23210039:0";
alter session set "_fix_control"="23102649:0";
alter session set "_fix_control"="23071621:0";
alter session set "_fix_control"="23136865:0";
alter session set "_fix_control"="23176721:0";
alter session set "_fix_control"="23223113:0";
alter session set "_fix_control"="22258300:0";
alter session set "_fix_control"="22205301:0";
alter session set "_fix_control"="23556483:0";
alter session set "_fix_control"="21305617:0";
alter session set "_fix_control"="22533539:0";
alter session set "_fix_control"="23596611:0";
alter session set "_fix_control"="22937293:0";
alter session set "_fix_control"="23565188:0";
alter session set "_fix_control"="24654471:0";
alter session set "_fix_control"="24845754:0";
ALTER session set "_fix_control"='5483301:OFF;
In my case none of fix control gives any benefits. This was now clear there was some feature in 12.2 which when disabled in the form of underscore parameter helped to get us back to previous state. But the problem is, they are total 32 different parameters, so I thought to divide the parameter list into half and execute the query after setting first half parameters and see if I get the correct result, if not then tried again setting next half parameter list and executed the query. And with that trial and error approach, I was able to reach that one parameter which caused issues with those set of SQLs. It was ‘_optimizer_inmemory_use_stored_stats‘ which was causing issues with those IN-Memory tables references by those SQLs, as the optimizer NEVER uses the stored statistics for in-memory tables in 12.1, but the same parameter in 12.2 is with default value of AUTO and that causes the issue.
Recently I was asked to test performance of an Active Dataguard or Physical standby database where the DWH/reporting type SELECT workload was running, and slowness was reported by the user. In order to understand the system better its always good to generate the AWR reports, but this being the standby database which is opened in READ ONLY mode, means you cannot directly call the AWR report related scripts on the standby as they will fail with the error ‘database opened in read only’ mode.
This post is all about enabling AWR reporting’s in the standby database. I am doing this test on 12.2.0.1.0 ADG.
About test databases, we have the primary (TESLPH) is opened in Read Write mode and the standby (TESLPRBH) in Read Only Mode With real time Apply.
SQL> select db_unique_name,open_mode, database_role from v$database;
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
------------------------------ -------------------- ----------------
TESLPH READ WRITE PRIMARY
SQL> select db_unique_name,open_mode, database_role from v$database;
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
------------------------------ -------------------- ----------------
TESLPRBH READ ONLY WITH APPLY PHYSICAL STANDBY
Next, you have to enable the SYS$UMF account. The RMF is used for collecting performance statistics for an Oracle Database. The SYS$UMF user is the default database user that has all the privileges to access the system-level RMF views and tables. All the AWR related operations in RMF can be performed only by the SYS$UMF user. The SYS$UMF user is locked by default and it must be unlocked before deploying the RMF topology. You need to provide password for the SYS$UMF user when creating database links in the RMF topology. If the password for the SYS$UMF user is changed, all the database links in the RMF topology must be recreated.
Next you have to make sure if the “_umf_remote_enabled” underscore parameter is set to TRUE else you will receieve “ORA-20501: Remote UMF is disabled” when doing any UMF related operation on the database.
-- On both PRIMARY and STANDBY Databases.
SQL> select username,common,account_status from dba_users where username ='SYS$UMF';
USERNAME COM ACCOUNT_STATUS
--------------- --- --------------------------------
SYS$UMF YES OPEN
-- On both PRIMARY and STANDBY Databases.
SQL> alter system set "_umf_remote_enabled"=TRUE scope=BOTH;
System altered.
Next you need to create two database links for ‘primary to the standby‘ and ‘standby to the primary’ communication using SYS$UMF and DB UNIQUE NAME of the databases. Will test the connectivity of both of them to see if they are working okay.
-- On PRIMARY database.
SQL> CREATE DATABASE LINK "PRIMARY_TO_STANDBY_DBLINK" CONNECT TO "SYS$UMF" IDENTIFIED BY VALUES 'oracle90' USING 'TESLPRBH';
Database link created.
SQL> CREATE DATABASE LINK "STANDBY_TO_PRIMARY_DBLINK" CONNECT TO "SYS$UMF" IDENTIFIED BY VALUES 'oracle90' USING 'TESLPH';
Database link created.
SQL> select db_link, username, host from dba_db_links
DB_LINK USERNAME HOST
------------------------------ --------------- ----------------------------------------
STANDBY_TO_PRIMARY_DBLINK SYS$UMF TESLPH
PRIMARY_TO_STANDBY_DBLINK SYS$UMF TESLPRBH
SQL> select db_unique_name from v$database@STANDBY_TO_PRIMARY_DBLINK;
DB_UNIQUE_NAME
------------------------------
TESLPH
1 row selected.
SQL> select db_unique_name from v$database@PRIMARY_TO_STANDBY_DBLINK;
DB_UNIQUE_NAME
------------------------------
TESLPRBH
1 row selected.
-- On PRIMARY database.
SQL> CREATE DATABASE LINK "PRIMARY_TO_STANDBY_DBLINK" CONNECT TO "SYS$UMF" IDENTIFIED BY VALUES 'oracle90' USING 'TESLPRBH';
Database link created.
SQL> CREATE DATABASE LINK "STANDBY_TO_PRIMARY_DBLINK" CONNECT TO "SYS$UMF" IDENTIFIED BY VALUES 'oracle90' USING 'TESLPH';
Database link created.
SQL> select db_link, username, host from dba_db_links
DB_LINK USERNAME HOST
------------------------------ --------------- ----------------------------------------
STANDBY_TO_PRIMARY_DBLINK SYS$UMF TESLPH
PRIMARY_TO_STANDBY_DBLINK SYS$UMF TESLPRBH
-- Lets try the connectivity using DB Links that we have created
SQL> select db_unique_name from v$database@STANDBY_TO_PRIMARY_DBLINK;
DB_UNIQUE_NAME
------------------------------
TESLPH
1 row selected.
SQL> select db_unique_name from v$database@PRIMARY_TO_STANDBY_DBLINK;
DB_UNIQUE_NAME
------------------------------
TESLPRBH
1 row selected.
Next we have to add the primary database node to the UMF repository, for that you have to run below command on both the primary & standby databases.
I am assigning primary site name as –> ‘primary_site’ and standby database site name as –> ‘standby_site’
-- On PRIMARY database
SQL> exec dbms_umf.configure_node ('primary_site');
PL/SQL procedure successfully completed.
SQL>
-- On STANDBY database.
SQL> exec dbms_umf.configure_node('standby_site','STANDBY_TO_PRIMARY_DBLINK');
PL/SQL procedure successfully completed.
SQL>
-- Use in case want to UNCONFIGURE the node.
SQL> exec DBMS_UMF.UNCONFIGURE_NODE;
Next we need to create the UMF topology, I am creating it as ‘Topology_1‘ on primary database. You can only have a single UMF topology in the database, if you try to create more you will get “ORA-20506: Maximum number of topologies exceeded”
-- On PRIMARY Database
SQL> exec DBMS_UMF.create_topology ('Topology_1');
PL/SQL procedure successfully completed.
-- Lets query if the toplogy is created with no errors and is ACTIVE.
SQL> select * from dba_umf_topology;
TOPOLOGY_NAME TARGET_ID TOPOLOGY_VERSION TOPOLOGY
------------------------------ ---------- ---------------- --------
Topology_1 1530523744 1 ACTIVE
1 row selected.
SQL> select * from dba_umf_registration;
TOPOLOGY_NAME NODE_NAME NODE_ID NODE_TYPE AS_SO AS_CA STATE
------------------------------ ------------------------------ ---------- ---------- ----- ----- --------------------
Topology_1 primary_site 1530523744 0 FALSE FALSE OK
1 row selected.
-- Run in case want to DROP the topology
SQL> exec DBMS_UMF.drop_topology('NAME-OF-TOPOLOGY');
Next you have to add the standby node to the topology, for that you need to use register_node procedure and need to mention both of the DBLinks that we have created earlier. This you have to run on PRIMARY node.
Now everything is set, no errors and we are all good to create some manual snaps. Let’s generate some remote snapshots, for that you have to run below command from PRIMARY database.
Now time to call the AWR report base scripts to generate the AWR reports for the standby database, you can call them on either Primary of the Standby server. For example, below I called the AWR report from the primary database and passed DBID and instance number of the standby database and have got the report for the standby database.
SQL> SQL> @?/rdbms/admin/awrrpti.sql
Specify the Report Type
~~~~~~~~~~~~~~~~~~~~~~~
Enter value for report_type: text
Type Specified: text
Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DB Id Inst Num DB Name Instance Host
------------ ---------- --------- ---------- ------
4265600723 1 TESLPRBH TESLPRBH monkey02lx031
* 5576289360 1 TESLPRBH TESLPH monkey1903nm12
Enter value for dbid: 4265600723
Using 4265600723 for database Id
Enter value for inst_num: 1
Using 1 for instance number
Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for num_days: 1
Listing the last day's Completed Snapshots
Instance DB Name Snap Id Snap Started Snap Level
------------ ------------ ---------- ------------------ ----------
TESLPRBH TESLPRBH 1 13 Jan 2022 04:10 1
2 13 Jan 2022 04:11 1
3 13 Jan 2022 04:14 1
4 13 Jan 2022 04:14 1
5 13 Jan 2022 04:14 1
6 13 Jan 2022 04:15 1
7 13 Jan 2022 04:15 1
8 13 Jan 2022 04:17 1
9 13 Jan 2022 04:18 1
10 13 Jan 2022 04:18 1
11 13 Jan 2022 04:18 1
12 13 Jan 2022 04:18 1
Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap:
WORKLOAD REPOSITORY report for
DB Name DB Id Unique Name DB Role Edition Release RAC CDB
------------ ----------- ----------- ---------------- ------- ---------- --- ---
TESLPRBH 4265600723 TESLPRBH PHYSICAL STANDBY EE 12.2.0.1.0 NO NO
Instance Inst Num Startup Time
------------ -------- ---------------
TESLPRBH 1 13-Feb-21 05:16
Host Name Platform CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
monkey02lx0315b Linux x86 64-bit 4 4 4 15.49
Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 1 13-Jan-22 04:10:57 61 .5
End Snap: 12 13-Jan-22 04:18:27 60 .6
Elapsed: 7.50 (mins)
DB Time: 0.73 (mins)
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~~~~ --------------- --------------- --------- ---------
DB Time(s): 0.1 44.1 0.00 0.01
DB CPU(s): 0.1 24.1 0.00 0.01
Background CPU(s): 0.0 14.2 0.00 0.00
Redo size (bytes): 0.0 0.0
Logical read (blocks): 1,227.7 552,449.0
Block changes: 7.7 3,465.0
Physical read (blocks): 594.5 267,510.0
Physical write (blocks): 44.1 19,821.0
Read IO requests: 16.6 7,466.0
.......
................
.......................
SQLT (Oracle’s Tool written by Carlos Sierra, read MOS note 215187.1 for more details) is an important tool when it comes to doing in-depth and advance troubleshooting & analysis of a SQL statement, it’s pretty easy to install, configure and use, but things gets little tricky when it comes to running it for the SQL that is slow or in question on the read-only standby database. The problem is obvious as ADG or standby databases doesn’t allow WRITE operations and you will get error “ORA-16000: database open for read-only access” when try on standby database
The only problem being that SQLTXTRACT and SQLTXECUTE need read/write access to the database (to store data in the SQLT repository and to install packages and procedures). How is it possible for SQLT to help us if we can’t even store data about the performance on the database with the performance problem? This is where SQLTXTRSBY comes into play. XTRSBY solves this problem by using local users (on a read/write database) and creating procedures that use database links to the read-only database.
First step : Install SQLTXPLAIN on the primary and allow the DDL to be propagated to the standby database.
-- On primary database
-- DDL Will get propogated to the standby in few secons/mins.
[oracle@monkeyos1931jd install]$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Thu Jan 13 16:13:41 2022
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL>
SQL>
SQL> @1
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
_________________ _____________ ________________
DIXITP READ WRITE PRIMARY
SQL>
SQL> START sqcreate.sql
zip warning: name not matched: *_sq*.log
zip error: Nothing to do! (SQLT_installation_logs_archive.zip)
zip warning: name not matched: *_ta*.log
........
...........
..............
SQUTLTEST completed.
adding: 220113161601_10_squtltest.log (deflated 59%)
SQLT users must be granted SQLT_USER_ROLE before using this tool.
SQCREATE completed. Installation completed successfully.
SQL>
SQL>
Step 2 : Create a database link accessible to the SQLTXPLAIN schema linking to the standby database as changes will be fetched from the standby database SQLTXPLA in schema.
-- On Primary database to connect to the standby database.
SQL> create public database link dblink_tostandby connect to sqltxplain identified by oracle90 using 'DIXITSTAN';
Database link DBLINK_TOSTANDBY created.
SQL>
SQL>
SQL> select * from dba_db_links;
OWNER DB_LINK USERNAME HOST CREATED HIDDEN
_________ ____________________________ _____________ ___________ ____________ _________
PUBLIC DBLINK_TOSTANDBY SQLTXPLAIN DIXITSTAN 16-JAN-22 NO
SQL> select sysdate from dual@DBLINK_TOSTANDBY;
SYSDATE
____________
16-JAN-22
SQL>
Step 3: Let’s run some SQLs on standby database which we will using against the SQLT tool as an input to generate the SQLT report.
-- On Standby Database
SQL> @1
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
------------------------------ -------------------- ----------------
DIXITSTAN READ ONLY WITH APPLY PHYSICAL STANDBY
SQL>
SQL> select count(*) from bigtab where weight in ( select distinct( weight) from bigtab);
COUNT(*)
----------
2431464
SQL> select sql_id, sql_fulltext from v$sqlarea where sql_text like '%distinct( weight) from bigtab%';
SQL_ID SQL_FULLTEXT
------------- --------------------------------------------------------------------------------------------------------------------------------------------
6mg40znnrhzm8 select count(*) from bigtab where weight in ( select distinct( weight) from bigtab)
In the steps above we ran some arbitrary SQL and got the SQL ID for that SQL. Remember we ran the SQL on the standby database (where our reports might have run). We can’t store any data on the Data Guard Physical Standby database so now we have to capture information about the SQL from across the database link from the primary database.
Step 4: Will run the SQLT for SQLID that we have captured from STANDBY database, from PRIMARY database. The script “sqltxtrsby.sql” is present under RUN directory of the tool.
[oracle@monkeyos1931jd run]$ ls *sqltxtrsby*
sqltxtrsby.sql
[oracle@monkeyos1931jd run]$ !sql
sqlplus
SQL*Plus: Release 12.2.0.1.0 Production on Sun Jan 16 06:01:12 2022
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Enter user-name: / as sysdba
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL>
SQL> @1
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
_________________ _____________ ________________
DIXITP READ WRITE PRIMARY
SQL>
SQL> @sqltxtrsby 6mg40znnrhzm8 DBLINK_TOSTANDBY
PL/SQL procedure successfully completed.
Parameter 1:
SQL_ID or HASH_VALUE of the SQL to be extracted (required)
Paremeter 2:
SQLTXPLAIN password (required)
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
Parameter 3:
DBLINK to stand-by database (required)
Enter value for 3: DBLINK_TOSTANDBY
Values passed to sqltxtrsby:
SQL_ID_OR_HASH_VALUE: "6mg40znnrhzm8"
DB_LINK : "@DBLINK_TOSTANDBY"
PL/SQL procedure successfully completed.
SQLT_VERSION
----------------------------------------
SQLT version number: 19.1.200226
SQLT version date : 2020-02-26
Installation date : 2022-01-13/16:15:27
... please wait ...
adding: alert_DIXITP.log (deflated 86%)
NOTE:
You used the XTRSBY method connected as SYS.
.......
..............
...............
....
..................
File sqlt_s19812_xtrsby_6mg40znnrhzm8.zip for 6mg40znnrhzm8 has been created.
SQLTXTRSBY completed.
Now look out for sqlt_xxxx_main.html file, We’ll see the main sqlt_xxxx_main.html file, but fewer files than for a “normal” sqltxtract run: no 10053 trace file, no SQL profile script and no SQL Tuning Advisor reports. This is because the read-only status of the standby restricts what can be done.
Someone asked me lately if its possible to fix a plan for a bad/slow SELECT SQL in a Data Guard environment ? The guy was asking as he was using ‘coe_xfr_sql_profile.sql‘, the script that you can use to create custom SQL Profiles to correct optimizer cost estimates to encourage good execution plans for a SQL.
Well, the post is not about using it to fix SQL plans, but is to understand how to fix a plan for a SQL that is slow on standby environment. Because if you go and try to execute the script that SQLT generates you will get ‘ORA-00600: internal error’ as dataguard is always opened in the RO mode and you cannot write anything on the database, so the question is – How to fix the plan then for a standby database ?
Here in the example, I’ve got a ADG setup with name ‘TESTSTANDBY’ and one of the SELECT statement with SQL ID 0y521mggg73pk that is slow and it has 4 different plan hash values (PHV) and there is plan flip that happens and due to that sometimes it picks the wrong plan and that’s when it goes slow. Here 1469174702 is the PHV that is WORST and both 3847062775 and 2797333706 PHVs are BEST and we can any of the two.
SQL> @1
DB_UNIQUE_NAME OPEN_MODE DATABASE_ROLE
------------------------------ -------------------- ----------------
TESTSTANDBY READ ONLY WITH APPLY PHYSICAL STANDBY
SQL_ID SUM(EXECS) MIN_ETIME MAX_ETIME NORM_STDDEV
------------- ---------- ----------- ----------- -------------
0y521mggg73pk 32 .03 9.34 143.6554
SQL> @2
Enter value for sql_id: 0y521mggg73pk
SQL_ID PLAN_HASH_VALUE EXECS AVG_ETIME AVG_CPU_TIME AVG_LIO AVG_PIO
------------- --------------- ------------ ------------ ------------ -------------- ------------
0y521mggg73pk 3847062775 10 .032 .021 1,448.0 .0
0y521mggg73pk 2797333706 10 .032 .021 1,448.0 .0
0y521mggg73pk 1687052623 10 .034 .025 1,448.0 .0
0y521mggg73pk 1469174702 2 9.340 1.021 541,054.0 4,155.5
-- This is what you will receive if try to run it on STANDBY database.
SQL> @coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql
DECLARE
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [2652], [108], [108], [0], [0], [3891200], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 18876
ORA-06512: at "SYS.DBMS_SQLTUNE", line 9368
ORA-06512: at "SYS.DBMS_SQLTUNE", line 9328
ORA-06512: at line 37
So, what you have to do here, you need to move the file (do scp, ftp, cp whatever) ‘coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql‘ that is produced by the COE file, to the PRIMARY database and run that file from there and not on the standby.
-- Run the script on PRIMARY database
SQL>
SQL> @coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql
SQL> REM
SQL> REM $Header: 215187.1 coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql 11.4.4.4 2022/01/13 carlos.sierra $
SQL> REM
SQL> REM Copyright (c) 2000-2012, Oracle Corporation. All rights reserved.
SQL> REM
SQL> REM AUTHOR
SQL> REM carlos.sierra@oracle.com
SQL> REM
SQL> REM SCRIPT
SQL> REM coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql
SQL> REM
SQL> REM DESCRIPTION
SQL> REM This script is generated by coe_xfr_sql_profile.sql
SQL> REM It contains the SQL*Plus commands to create a custom
SQL> REM SQL Profile for SQL_ID 0y521mggg73pk based on plan hash
SQL> REM value 3847062775.
SQL> REM The custom SQL Profile to be created by this script
SQL> REM will affect plans for SQL commands with signature
SQL> REM matching the one for SQL Text below.
SQL> REM Review SQL Text and adjust accordingly.
SQL> REM
SQL> REM PARAMETERS
SQL> REM None.
SQL> REM
SQL> REM EXAMPLE
SQL> REM SQL> START coe_xfr_sql_profile_0y521mggg73pk_3847062775.sql;
SQL> REM
SQL> REM NOTES
SQL> REM 1. Should be run as SYSTEM or SYSDBA.
SQL> REM 2. User must have CREATE ANY SQL PROFILE privilege.
SQL> REM 3. SOURCE and TARGET systems can be the same or similar.
SQL> REM 4. To drop this custom SQL Profile after it has been created:
SQL> REM EXEC DBMS_SQLTUNE.DROP_SQL_PROFILE('coe_0y521mggg73pk_3847062775');
SQL> REM 5. Be aware that using DBMS_SQLTUNE requires a license
SQL> REM for the Oracle Tuning Pack.
SQL> REM 6. If you modified a SQL putting Hints in order to produce a desired
SQL> REM Plan, you can remove the artifical Hints from SQL Text pieces below.
SQL> REM By doing so you can create a custom SQL Profile for the original
SQL> REM SQL but with the Plan captured from the modified SQL (with Hints).
SQL> REM
SQL> WHENEVER SQLERROR EXIT SQL.SQLCODE;
SQL> REM
SQL> VAR signature NUMBER;
SQL> VAR signaturef NUMBER;
SQL> REM
SQL> DECLARE
2 sql_txt CLOB;
3 h SYS.SQLPROF_ATTR;
4 PROCEDURE wa (p_line IN VARCHAR2) IS
5 BEGIN
6 DBMS_LOB.WRITEAPPEND(sql_txt, LENGTH(p_line), p_line);
7 END wa;
8 BEGIN
9 DBMS_LOB.CREATETEMPORARY(sql_txt, TRUE);
10 DBMS_LOB.OPEN(sql_txt, DBMS_LOB.LOB_READWRITE);
11 -- SQL Text pieces below do not have to be of same length.
12 -- So if you edit SQL Text (i.e. removing temporary Hints),
13 -- there is no need to edit or re-align unmodified pieces.
14 wa(q'[/* SQL Analyze(63,1) */ DELETE FROM PRTINSTANCEITER PII WHERE EX]');
15 wa(q'[ISTS ( SELECT 1 FROM PRTPROCESS PP WHERE PP.INSTANCEID = PII.INS]');
16 wa(q'[TANCEID||':'||PII.LASTITERATION AND CURRENTTIME < :B1 AND (STATU]');
17 wa(q'[S = 4 OR STATUS = 5)) AND ROWNUM <=10000]');
18 DBMS_LOB.CLOSE(sql_txt);
19 h := SYS.SQLPROF_ATTR(
20 q'[BEGIN_OUTLINE_DATA]',
21 q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
22 q'[OPTIMIZER_FEATURES_ENABLE('12.2.0.1')]',
23 q'[DB_VERSION('12.2.0.1')]',
24 q'[ALL_ROWS]',
25 q'[OUTLINE_LEAF(@"SEL$3BA1AD7C")]',
26 q'[UNNEST(@"SEL$1")]',
27 q'[OUTLINE(@"DEL$1")]',
28 q'[OUTLINE(@"SEL$1")]',
29 q'[INDEX_RS_ASC(@"SEL$3BA1AD7C" "PP"@"SEL$1" ("PRTPROCESS"."CURRENTTIME" "PRTPROCESS"."STATUS"))]',
30 q'[BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$3BA1AD7C" "PP"@"SEL$1")]',
31 q'[FULL(@"SEL$3BA1AD7C" "PII"@"DEL$1")]',
32 q'[LEADING(@"SEL$3BA1AD7C" "PP"@"SEL$1" "PII"@"DEL$1")]',
33 q'[USE_HASH(@"SEL$3BA1AD7C" "PII"@"DEL$1")]',
34 q'[END_OUTLINE_DATA]');
35 :signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
36 :signaturef := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt, TRUE);
37 DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
38 sql_text => sql_txt,
39 profile => h,
40 name => 'coe_0y521mggg73pk_3847062775',
41 description => 'coe 0y521mggg73pk 3847062775 '||:signature||' '||:signaturef||'',
42 category => 'DEFAULT',
43 validate => TRUE,
44 replace => TRUE,
45 force_match => FALSE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );
46 DBMS_LOB.FREETEMPORARY(sql_txt);
47 END;
48 /
PL/SQL procedure successfully completed.
SQL> WHENEVER SQLERROR CONTINUE
SQL> SET ECHO OFF;
SIGNATURE
---------------------
3905242471056150906
SIGNATUREF
---------------------
3905242471056150906
... manual custom SQL Profile has been created
not spooling currently
COE_XFR_SQL_PROFILE_0y521mggg73pk_3847062775 completed
SQL>
SQL>
Perfect, it ran successfully on the PRIMARY database, and the same will be transferred on to the STANDBY database by its own. Let’s check if its created or moved to the STANDBY database or not. Let’s query dba_sql_profiles to see if its there …
-- Lets check it on Standby
SQL> select NAME, SQL_TEXT, CREATED, STATUS from dba_sql_profiles order by created;
NAME SQL_TEXT CREATED STATUS
------------------------------ -------------------------------------------------------------------------------- ------------------------------ --------
coe_0y521mggg73pk_3847062775 DELETE FROM BIGTABDIXITTAB PII WHERE EXISTS ( SELECT 1 13-JAN-22 06.28.03.000000 AM ENABLED
So, the issue is fixed on the standby and the SQL immediately picked the right plan that we asked it to follow.