Hi Guys,
Recently someone asked me about Oracle debug 10027 trace event which we use in case of a deadlock scenario i.e. ORA-0060. Though the deadlock itself creates a trace file in DIAG directory, but 10027 trace event gives you a better control over the amount and type of DIAG information generated in response to the deadlock case. I mean the default trace file for deadlock (ora-60) contains cached cursors, a deadlock graph, process state info, current SQL Statements of the session involved and session wait history.
Event 10027 may be used to augment the trace information with a system state dump or a call stack in an attempt to find the root cause of the deadlocks. The minimum amount of trace information is written at level 1, at this level it will hardly contain deadlock graph and current AQL statements of the sessions involved.
In todays post I will try to simulate a deadlock scenario in one of my test box and will generate the 10027 trace with level 2 to get more information. Level 2 will give you cached cursors, process state info, session wait history for all sessions and the system state which is not possible in case of level 1. I am going to try with the Level 4 here as I want to get system state dump and don’t want to complicate this scenario.
I am going to create two tables – TableB is child table of TableA and some supporting objects to simulate the deadlock case.
Will try to simulate a deadlock scenario when insert/update/delete happens on TableB we need to sum the amount(amt) and then update it in TableA.total_amt column.
[oracle@oracleontario ~]$ sqlplus dixdroid/oracle90
SQL*Plus: Release 19.0.0.0.0 - Production on Thu Feb 16 03:58:12 2023
Version 19.15.0.0.0
Copyright (c) 1982, 2022, Oracle. All rights reserved.
Last Successful login time: Thu Feb 16 2023 03:04:03 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0
SQL> create table tableA (pk_id number primary key, total_amt number);
Table created.
SQL> create table tableB (pk_id number primary key, fk_id number references tableA(pk_id) not null, amt number);
Table created.
SQL>
SQL> CREATE OR REPLACE PACKAGE global_pkg
IS
fk_id tableA.pk_id%TYPE;
END global_pkg;
/
Package created.
SQL>
SQL> CREATE OR REPLACE TRIGGER tableB_ROW_TRG
BEFORE INSERT OR UPDATE OR DELETE
ON tableB
FOR EACH ROW
BEGIN
IF INSERTING OR UPDATING
THEN
global_pkg.fk_id := :new.fk_id;
ELSE
global_pkg.fk_id := :old.fk_id;
END IF;
END tableB_ROW_TRG;
/
Trigger created.
SQL>
SQL>
SQL>
SQL> CREATE OR REPLACE TRIGGER tableB_ST_trg
AFTER INSERT OR UPDATE OR DELETE
ON tableB
BEGIN
IF UPDATING OR INSERTING
THEN
UPDATE tableA
SET total_amt =
(SELECT SUM (amt)
FROM tableB
WHERE fk_id = global_pkg.fk_id)
WHERE pk_id = global_pkg.fk_id;
ELSE
UPDATE tableA
SET total_amt =
(SELECT SUM (amt)
FROM tableB
WHERE fk_id = global_pkg.fk_id)
WHERE pk_id = global_pkg.fk_id;
END IF;
END tableB_ST_trg;
Trigger created.
SQL>
SQL>
SQL> insert into tableA values (1, 0);
1 row created.
SQL> insert into tableA values (2, 0);
1 row created.
SQL> insert into tableB values (123, 1, 100);
1 row created.
SQL> insert into tableB values (456,1, 200);
1 row created.
SQL> insert into tableB values (789, 1, 100);
1 row created.
SQL> insert into tableB values (1011, 2, 50);
1 row created.
SQL> insert into tableB values (1213,2, 150);
1 row created.
SQL> insert into tableB values (1415, 2, 50);
1 row created.
SQL> commit;
Commit complete.
SQL>
Lets query the table and see the record count and next will delete an entry from tableB and won’t commit. At the same time will check the locks — TM lock on tableA.
SQL>
SQL> select * from tableA;
PK_ID TOTAL_AMT
---------- ----------
1 400
2 250
SQL> delete tableB where pk_id = 1415;
1 row deleted.
SQL> SELECT sid,
(SELECT username
FROM v$session s
WHERE s.sid = v$lock.sid)
uname,
TYPE,
id1,
id2,
(SELECT object_name
FROM user_objects
WHERE object_id = v$lock.id1)
nm
FROM v$lock
WHERE sid IN (SELECT sid
FROM v$session
WHERE username IN (USER));
SID UNAME TY ID1 ID2 NM
---------- ------------------------------ -- ---------- ---------- ------------------------------
440 DIXDROID AE 134 0
440 DIXDROID TX 589853 3078
440 DIXDROID TM 82284 0 TABLEB
440 DIXDROID TM 82282 0 TABLEA
I will now connect to another session (sesion 2) and delete from tableB and that will induce a locking scenario in the database and session 2 will go into hung/waiting state.
---- from session 2:
SQL>
SQL>
SQL> delete tableB where pk_id = 1213;
.....
.........
............. <HUNG > <HUNG > <HUNG > <HUNG > <HUNG >
Lets see more stats on the blocking session.
SQL>
SQL> SELECT (SELECT username
FROM v$session
WHERE sid = a.sid)
blocker,
a.sid,
' is blocking ',
(SELECT username
FROM v$session
WHERE sid = b.sid)
blockee,
b.sid
FROM v$lock a, v$lock b
WHERE a.block = 1 AND b.request > 0 AND a.id1 = b.id1 AND a.id2 = b.id2;
BLOCKER SID 'ISBLOCKING' BLOCKEE SID
-------------------- ---------- ------------- ------------------------------ ----------
DIXDROID 440 is blocking DIXDROID 427
And as expected 440 SID is now blocking 427. Now lets go back to the original (session 1) and will try to delete from tableB again and this will snipe the session 2 which is still in hung/wait state and will throw a deadlock error (ORA-00060). Though we have already collected blocking information from v$lock + v$session but to get more clarify about blocking sessions, I will set the 10027 trace event with level 2.
This will increase our chance of finding the root cause of the deadlocks. If the setting shall persist across instance startups, you need to use the initialization parameter EVENT.
i.e. EVENT=”10027 trace name contex forever, level 2″ otherwise use ALTER SYSTEM version of it.
------------------------------------------------------------
-- Execute below statement in session1
------------------------------------------------------------
delete tableB where pk_id = 1213;
---------------------------------------------------------------------------------------
-- Oracle is throwing deadlock error as below in session2
---------------------------------------------------------------------------------------
SQL>
SQL> ALTER SYSTEM SET EVENTS '10027 trace name context forever, level 2';
System altered.
SQL> delete tableB where pk_id = 1213;
delete tableB where pk_id = 1213
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "DIXDROID.TABLEB_ST_TRG", line 11
ORA-04088: error during execution of trigger 'DIXDROID.TABLEB_ST_TRG'
-- same captured in alert log too.
2023-02-16T04:02:57.648156-05:00
Errors in file /u01/app/oracle/diag/rdbms/dixitdb/dixitdb/trace/dixitdb_ora_29389.trc:
2023-02-16T04:03:00.019794-05:00
ORA-00060: Deadlock detected. See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors. More info in file /u01/app/oracle/diag/rdbms/dixitdb/dixitdb/trace/dixitdb_ora_29389.trc.
And we done, we were able to simulate the deadlock case in the database. Now lets dig into the trace file generated by the deadlock along with information flushed by 10027 trace event. It has all crucial information associated with the deadlock.
2023-02-16 04:13:34.115*:ksq.c@13192:ksqdld_hdr_dump():
DEADLOCK DETECTED ( ORA-00060 )
See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors
[Transaction Deadlock]
Deadlock graph:
------------Blocker(s)----------- ------------Waiter(s)------------
Resource Name process session holds waits serial process session holds waits serial
TX-00080009-00000C62-00000000-00000000 45 427 X 13193 24 440 X 22765
TX-0004001A-00000C1E-00000000-00000000 24 440 X 22765 45 427 X 13193
----- Information for waiting sessions -----
Session 427:
sid: 427 ser: 13193 audsid: 3390369 user: 115/DIXDROID
flags: (0x41) USR/- flags2: (0x40009) -/-/INC
flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
pid: 45 O/S info: user: oracle, term: UNKNOWN, ospid: 30174
image: oracle@oracleontario.ontadomain (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/1, ospid: 30173
machine: oracleontario.ontadomain program: sqlplus@oracleontario.ontadomain (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
current SQL:
UPDATE TABLEA SET TOTAL_AMT = (SELECT SUM (AMT) FROM TABLEB WHERE FK_ID = :B1 ) WHERE PK_ID = :B1
Session 440:
sid: 440 ser: 22765 audsid: 3380369 user: 115/DIXDROID
flags: (0x41) USR/- flags2: (0x40009) -/-/INC
flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
pid: 24 O/S info: user: oracle, term: UNKNOWN, ospid: 29847
image: oracle@oracleontario.ontadomain (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/2, ospid: 29846
machine: oracleontario.ontadomain program: sqlplus@oracleontario.ontadomain (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
current SQL:
delete tableB where pk_id = 1213
.....
.......
----- Current SQL Statement for this session (sql_id=duw5q5rpd5xvs) -----
UPDATE TABLEA SET TOTAL_AMT = (SELECT SUM (AMT) FROM TABLEB WHERE FK_ID = :B1 ) WHERE PK_ID = :B1
----- PL/SQL Call Stack -----
object line object
handle number name
0x8ed34c98 11 DIXDROID.TABLEB_ST_TRG
.......
.........
----- VKTM Time Drifts Circular Buffer -----
session 427: DID 0001-002D-000000D9 session 440: DID 0001-0018-0000001C
session 440: DID 0001-0018-0000001C session 427: DID 0001-002D-000000D9
Rows waited on:
Session 427: obj - rowid = 0001416A - AAAUFqAAHAAAMgvAAB
(dictionary objn - 82282, file - 7, block - 51247, slot - 1)
Session 440: obj - rowid = 0001416C - AAAUFsAAHAAAMg/AAE
(dictionary objn - 82284, file - 7, block - 51263, slot - 4)
.....
.......
Current Wait Stack:
0: waiting for 'enq: TX - row lock contention'
name|mode=0x54580006, usn<<16 | slot=0x4001a, sequence=0xc1e
wait_id=69 seq_num=70 snap_id=1
wait times: snap=18.173838 sec, exc=18.173838 sec, total=18.173838 sec
wait times: max=infinite, heur=18.173838 sec
wait counts: calls=6 os=6
in_wait=1 iflags=0x15a0
There is at least one session blocking this session.
Dumping 1 direct blocker(s):
inst: 1, sid: 440, ser: 22765
Dumping final blocker:
inst: 1, sid: 440, ser: 22765
There are 1 sessions blocked by this session.
Dumping one waiter:
inst: 1, sid: 440, ser: 22765
wait event: 'enq: TX - row lock contention'
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x80009
p3: 'sequence'=0xc62
row_wait_obj#: 82284, block#: 51263, row#: 4, file# 7
min_blocked_time: 0 secs, waiter_cache_ver: 33942
Wait State:
fixed_waits=0 flags=0x22 boundary=(nil)/-1
Session Wait History:
elapsed time of 0.004677 sec since current wait
........
............
24: USER ospid 29847 sid 440 ser 22765, waiting for 'enq: TX - row lock contention'
Cmd: DELETE
Blocked by inst: 1, sid: 427, ser: 13193
Final Blocker inst: 1, sid: 427, ser: 13193
.....
.............
45: USER ospid 30174 sid 427 ser 13193, waiting for 'enq: TX - row lock contention'
Cmd: UPDATE
Blocked by inst: 1, sid: 440, ser: 22765
Final Blocker inst: 1, sid: 440, ser: 22765
.........
.............
The history is displayed in reverse chronological order.
sample interval: 1 sec, max history 120 sec
---------------------------------------------------
[19 samples, 04:13:16 - 04:13:34]
waited for 'enq: TX - row lock contention', seq_num: 70
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x4001a
p3: 'sequence'=0xc1e
[14 samples, 04:13:01 - 04:13:15]
idle wait at each sample
[session created at: 04:13:01]
---------------------------------------------------
Sampled Session History Summary:
longest_non_idle_wait: 'enq: TX - row lock contention'
[19 samples, 04:13:16 - 04:13:34]
---------------------------------------------------
.........
..........
Virtual Thread:
kgskvt: 0x9b0e7f10, sess: 0x9ccdd4b8, pdb: 0, sid: 427 ser: 13193
vc: (nil), proc: 0x9db42028, idx: 427
consumer group cur: OTHER_GROUPS (pdb 0) (upd? 0)
mapped: DEFAULT_CONSUMER_GROUP, orig: (pdb 0)
vt_state: 0x2, vt_flags: 0xE030, blkrun: 0, numa: 1
inwait: 1, wait event: 307, posted_run: 0
lastmodrngcnt: 0, lastmodrngcnt_loc: '(null)'
lastmodrblcnt: 0, lastmodrblcnt_loc: '(null)'
location where insched last set: kgskbwt
location where insched last cleared: kgskbwt
location where inwait last set: kgskbwt
location where inwait last cleared: NULL
is_assigned: 1, in_scheduler: 0, insched: 0
vt_active: 0 (pending: 1)
vt_pq_active: 0, dop: 0, pq_servers (cur: 0 cg: 0)
ps_allocs: 0, pxstmts (act: 0, done: 0 cg: 0)
used quanta (usecs):
stmt: 57272, accum: 0, mapped: 0, tot: 57272
exec start consumed time lapse: 164377 usec
exec start elapsed time lapse: 18179246 usec
idle time: 0 ms, active time: 57272 (cg: 57272) usec
last updnumps: 0 usec, active time (pq: 0 ps: 0) ms
cpu yields: stmt: 0, accum: 0, mapped: 0, tot: 0
cpu waits: stmt: 0, accum: 0, mapped: 0, tot: 0
cpu wait time (usec): stmt: 0, accum: 0, mapped: 0, tot: 0
ASL queued time outs: 0, time: 0 (cur 0, cg 0)
PQQ queued time outs: 0, time: 0 (cur 0, cg 0)
Queue timeout violation: 0
calls aborted: 0, num est exec limit hit: 0
KTU Session Commit Cache Dump for IDLs:
KTU Session Commit Cache Dump for Non-IDLs:
----------------------------------------
........
.............
Hope It Helped!
Prashant Dixit