Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Script to measure the Source & Contribution of any Wait Event within AWR snapshots.

Posted by FatDBA on January 1, 2018

🤘 Happy New Year Everyone! 🤘

So, here i am with the maiden post of this year.
Today i will discuss about a script that might be handy and useful while you are investigating any performance problem with the database.
In fact this is one of my favorite script which i always considers to run at a point when we have identified the leading wait event and the time-frame of the spike or for any transient variation in system behavior. This script helps you to measure the source and contribution or the impact of any specific wait event.

The script is a join between ASH views and historical views (dba_hist_active_sess_history AND dba_hist_snapshot) and provides you stats based on Average Number of Active Sessions (AAS) were waiting on this event during the period and with what total contribution in terms of percentage.

Lets work on a use case.
Suppose you have a system with huge User IOs happening, as conformed by any report or tool (AWR, ASH, Via any Script, OEM etc.)

For example using AWR we found one of the User IO wait class event ‘db file scattered read’ was leading the ‘Top Foreground Wait Events’ charts on a specific time when we have the high User IO load on system.

                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
db file scattered read              10,282       82.8    41.50ms  43.9 User I/O

So, now you have identified the leading wait event and the timeframe with maximum intensity or frequency, Lets call the script.

Script Inputs:
Example:

Enter value for event_class:
User I/O
Enter value for event_name: db file scattered read
Enter value for begin_snap: 193
Enter value for end_snap: 200
Enter value for dbid: 2896132084

WITH
events AS (
SELECT /*+ MATERIALIZE NO_MERGE */
       SUBSTR(TRIM(h.sql_id||' '||h.program||' '||
       CASE h.module WHEN h.program THEN NULL ELSE h.module END), 1, 128) source,
       h.dbid,
       COUNT(*) samples
  FROM dba_hist_active_sess_history h,
       dba_hist_snapshot s
 WHERE h.wait_class = TRIM('&Event_Class') AND h.event = TRIM('&Event_Name')
   AND h.snap_id BETWEEN &Begin_Snap AND &End_Snap
   AND h.dbid = &dbid
   AND s.snap_id = h.snap_id
   AND s.dbid = h.dbid
   AND s.instance_number = h.instance_number
 GROUP BY
       h.sql_id,
       h.program,
       h.module,
       h.dbid
 ORDER BY
       3 DESC
),
total AS (
SELECT SUM(samples) samples,
       SUM(CASE WHEN ROWNUM > 15 THEN samples ELSE 0 END) others
  FROM events
)
SELECT e.source,
       e.samples,
       ROUND(100 * e.samples / t.samples, 1) percent,
       (SELECT DBMS_LOB.SUBSTR(s.sql_text, 1000, 1) FROM dba_hist_sqltext s WHERE s.sql_id = SUBSTR(e.source, 1, 13) AND s.dbid = e.dbid AND ROWNUM = 1) sql_text
  FROM events e,
       total t
 WHERE ROWNUM  0.1
 UNION ALL
SELECT 'Others',
       others samples,
       ROUND(100 * others / samples, 1) percent,
       NULL sql_text
  FROM total
 WHERE others > 0
   AND ROUND(100 * others / samples, 1) > 0.1;

Below is the output of the query.

SOURCE                                                                      SAMPLES      PERCENT       SQL_TEXT
------------------------------------------------------------------------  ---------- ---------------  -------------------------------
5av23g8w7f3ka sqlplus@dixitlab.localdomain (TNS V1-V3)                          1          45.23       select * from dixit.bigtab
9a7gbkahasj1a sqlplus@dixitlab.localdomain (TNS V1-V4)                          1           5          select * from dixit.gianttable
OTHERS                                                                          14         49.77

Okay so result shows one of the SQL statement with SQL id 5av23g8w7f3ka was responsible for more than 45% of these waits.
So, now you have the evidence and can start troubleshooting this specific statement to reduce the USER IOs.

Hope It Helps
Prashant Dixit

Advertisement

3 Responses to “Script to measure the Source & Contribution of any Wait Event within AWR snapshots.”

  1. […] And you have identified the major sources contributing towards this row locking during the probe period of last 7 days till now. You can do a join on dba_hist_active_sess_history and dba_hist_snapshot to get this historical information — Read my previous article on how to get this past information from AWR repository. […]

  2. Vijay Shankar said

    Good articles thanks fir the post nice to learn

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: