Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Posts Tagged ‘performance’

Data Pump Troubleshooting Tips – My favorite 6

Posted by FatDBA on October 26, 2024

There are numerous utilities, options, and methods available for migrating and moving data between Oracle databases, yet Oracle Data Pump remains one of the most widely used tools. A significant number of DBAs are very comfortable with Data Pump, as it has been a trusted utility for a long time (originally as exp and imp). Its stability, user-friendliness, and robust capabilities make it a top choice for handling large data migrations, backup, and restore operations.

However, one area where DBAs still often face challenges is troubleshooting when issues arise. When a Data Pump job fails, performs poorly, or behaves unexpectedly, it can be unclear where to start, what logs to review, or what checks to perform. Many find it difficult to pinpoint the source of the problem and make adjustments to optimize performance or resolve issues.

Today’s post focuses on troubleshooting Data Pump performance and functionality issues, sharing the steps I typically follow when diagnosing problems. We’ll cover key areas to investigate, like log file analysis, parameter tuning, network considerations, and common bottlenecks. These steps aim to provide a practical guide to understanding and resolving Data Pump issues and optimizing your data movement processes.

Option 1: Generate an AWR Report to Assess Database Performance

Start by generating an AWR (Automatic Workload Repository) report to gain insight into the database’s overall performance during the relevant period. Adjusting the AWR snapshot interval to 15 minutes is recommended for a more granular view. This approach reduces the chances of averaging out short performance spikes, allowing you to capture transient issues more effectively.

exec dbms_workload_repository.modify_snapshot_settings(null, 15);
exec dbms_workload_repository.create_snapshot;

Option 2: Enable SQL Trace for Data Pump Processes or Specific SQL IDs
Optionally, you can enable SQL trace for the Data Pump processes (dm for the master process and dw for worker processes) or for specific SQL statements by SQL ID. This will help isolate SQL-level performance issues affecting the Data Pump job.

alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=8';
alter system set events 'sql_trace[SQL: 8krc88r46raff]';

Option 3: Run Data Pump Job with Detailed Trace Enabled
For enhanced tracing, run the Data Pump job with additional trace options, which provide more comprehensive output. Including metrics=yes, logtime=all, and trace=1FF0300 in the command enables detailed logging of both timing and activity metrics. Tracing can be enabled by specifying an 7 digit hexadecimal mask in the TRACE parameter of Export DataPump (expdp) or Import DataPump (impdp). The first three digits enable tracing for a specific Data Pump component, while the last four digits are usually: 0300.

expdp ... metrics=yes logtime=all trace=1FF0300
impdp ... metrics=yes logtime=all trace=1FF0300

Data Pump tracing can also be started with a line with EVENT 39089 in the initialization parameter file. This method should only be used to trace the Data Pump calls in an early state, e.g. if details are needed about the DBMS_DATAPUMP.OPEN API call. Trace level 0x300 will trace all Data Pump client processes.

-- Enable event
ALTER SYSTEM SET EVENTS = '39089 trace name context forever, level 0x300' ;
-- Disable event
ALTER SYSTEM SET EVENTS = '39089 trace name context off' ;

Option 4: Review Data Pump Trace Files
Locate and analyze the Data Pump trace files stored in the Oracle trace directory. The master control process file names typically contain *dm*, while worker process files include *dw*. These files provide insights into the processes, job details, and potential error sources during execution.

Option 5: Activate SQL_TRACE on specific Data Pump process with higher trace level.
Lets assume we see that the Data Pump Master process (DM00) has SID: 143 and serial#: 50 and the Data Pump Worker process (DW01) has SID: 150 and serial#: 17. These details can be used to activate SQL tracing in SQL*Plus with DBMS_SYSTEM.SET_EV, e.g.:

-- In SQL*Plus, activate SQL tracing with DBMS_SYSTEM and SID/SERIAL#  
-- Syntax: DBMS_SYSTEM.SET_EV([SID],[SERIAL#],[EVENT],[LEVEL],'') 

-- Example to SQL_TRACE Worker process with level 4 (Bind values):   
execute sys.dbms_system.set_ev(150,17,10046,4,''); 

-- and stop tracing: 
execute sys.dbms_system.set_ev(150,17,10046,0,'');  


-- Example to SQL_TRACE Master Control process with level 8 (Waits):  
execute sys.dbms_system.set_ev(143,50,10046,8,'');  

-- and stop tracing:  
execute sys.dbms_system.set_ev(143,50,10046,0,'');

Option 6: Use the Data Pump Log Analyzer

I’ve personally used the Data Pump Log Analyzer for some time and have found it to be incredibly user-friendly, making it simple to understand the performance and runtime statistics of Data Pump jobs. This tool is highly effective in streamlining troubleshooting efforts, quickly identifying bottlenecks, and delivering clear insights into job performance. It’s a fantastic addition to a DBA’s toolkit and provides valuable capabilities that aren’t typically found in standard scripts. The Data Pump Log Analyzer has been tested with Data Pump log files across various database versions, including those generated by Data Pump client (expdp/impdp), Zero Downtime Migration (ZDM), OCI Database Migration Service (DMS), and Data Pump API (DBMS_DATAPUMP).The Data Pump Log Analyzer is a Python-based command-line utility designed for in-depth analysis of Oracle Data Pump log files. It goes beyond basic log review by offering detailed, structured insights into key performance metrics, errors, and process details. This tool can be particularly useful for DBAs needing a quick and comprehensive view of Data Pump job behavior, helping with issue diagnosis and performance optimization. Link to read and download or a more detailed guide on it’s usage Link

With the Data Pump Log Analyzer, you get:

  • Detailed Operations and Processing Metrics: Granular information on data operations for pinpoint analysis.
  • Error and ORA- Code Analysis: Summaries and explanations of encountered errors for easier troubleshooting.
  • Object-Type Breakdown and Processing Times: Insight into performance by object type, aiding in performance tuning.
  • Data Pump Worker Performance: Analyzes individual worker processes for any lagging tasks.
  • Summarized Schema, Table, Partition Details: Overview of data handled by each schema, table, or partition.
  • Instance-Based Data Analysis (for Oracle 21c and later): Statistics by instance for performance evaluation in multitenant setups.
  • Flexible Output Options: Filter, sort, and export analysis results to text or HTML for efficient sharing and record-keeping.

One below is with basic syntax to get operational details.

$ python3 dpla.py import.log
========================
Data Pump Log Analyzer
========================
...
Operation Details
~~~~~~~~~~~~~~~~~
Operation: Import
Data Pump Version: 19.23.0.0.0
DB Info: Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0
Job Name: FATDBAJOB1
Status: COMPLETED
 Processing: -
Errors: 1301
 ORA- Messages: 1267
Start Time: 2024-08-21 01:30:45
End Time: 2024-08-21 11:43:11
Runtime: 35:03:06
Data Processing
~~~~~~~~~~~~~~~
Parallel Workers: 104
Schemas: 47
Objects: 224718
Data Objects: 188131
Overall Size: 19.11 TB

Use flag ‘-e’ to view all ORA- messages encountered during the Data Pump operation, or optionally you can filter our specific errors as well i.e. ‘-e ORA-39082 ORA-31684′.

python3 dpla.py import.log -e
========================
Data Pump Log Analyzer
========================
...
ORA- MESSAGES DETAILS
~~~~~~~~~~~~~~~~~~~~~
(sorted by count):
Message Count
--------------------------------------------------------------------------------------------------- ---------
ORA-39346: data loss in character set conversion for object COMMENT 919
ORA-39082: Object type PACKAGE BODY created with compilation warnings 136
ORA-39346: data loss in character set conversion for object PACKAGE_BODY 54
ORA-39082: Object type TRIGGER created with compilation warnings 36
ORA-39082: Object type PROCEDURE created with compilation warnings 29
ORA-31684: Object type USER already exists 27
ORA-39111: Dependent object type PASSWORD_HISTORY skipped, base object type USER already exists 27
ORA-39346: data loss in character set conversion for object PACKAGE 18
ORA-39082: Object type PACKAGE created with compilation warnings 10
ORA-39082: Object type VIEW created with compilation warnings 7
ORA-39346: data loss in character set conversion for object PROCEDURE 2
ORA-39082: Object type FUNCTION created with compilation warnings 2
--------------------------------------------------------------------------------------------------- ---------
Total 1267
--------------------------------------------------------------------------------------------------- ---------


Use flag ‘-o’ to see details about which types of database objects were involved in the Data Pump operation.

python3 dpla.py import.log -o
========================
Data Pump Log Analyzer
========================
...
Object                                  Count      Seconds      Workers     Duration
----------------------------------      ---------- -----------  ----------- ------------
SCHEMA_EXPORT/TABLE/TABLE_DATA             188296    6759219         128       6759219
CONSTRAINT                                    767      37253           1         37253
TABLE                                        2112       3225          51           156
COMMENT                                     26442        639         128            18
PACKAGE_BODY                                  197        125         128             5
OBJECT_GRANT                                 5279         25           1            25
TYPE                                          270          6           1             6
ALTER_PROCEDURE                               149          5           2             3
ALTER_PACKAGE_SPEC                            208          4           3             2
PACKAGE                                       208          3           3             1
PROCEDURE                                     149          2           2             1

...
---------------------------------- ---------- ----------- ----------- ------------
Total 224755 6800515 128 6796697
---------------------------------- ---------- ----------- ----------- ------------


Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , , , , , , , , | Leave a Comment »

AWR Report Improvements in Oracle 23AI : New Replication Performance Sections

Posted by FatDBA on October 20, 2024

Oracle 23AI is undoubtedly packed with a range of both small and significant features. Every now and then, I stumble upon some standout additions, and this time, it’s the improvements in AWR reports that caught my attention. One of the key enhancements is the inclusion of more replication-related information captured and presented in these reports.

With this new update, AWR queries and reports are now streamlined and enhanced, offering a simplified yet detailed view of the replication process. This makes troubleshooting replication performance issues much easier. You can now categorize problems into workload-related issues, database-side misconfigurations (such as slow replication SQL due to missing indexes or incorrect database parameter settings), or performance bottlenecks either within the database or in Oracle GoldenGate processes.

The enhanced replication section includes a more comprehensive “Replication System Resource Usage” area. This shows the system resource consumption for all Oracle GoldenGate replication processes, whether they are foreground or background, and breaks it down for each Extract and Replicat process.

Additionally, a dedicated section for replication-related Top SQL statistics has been added, making it much easier to identify performance issues tied to specific SQL statements.

There’s also a separate section for top wait events related to replication processes, enabling faster identification and resolution of replication-related performance bottlenecks.

Lastly, the replication-related sections have been reorganized to present statistics by individual Extracts and different Replicat types, offering clearer insights into replication performance.

These enhancements are really great when it comes to monitoring and improving the performance of replication processes.

Replication System Resource Usage:
Metrics for Oracle GoldenGate replication now include detailed information such as the process name, type, and the number of sessions for each sub-component. Performance statistics are categorized based on the specific functionality of these sub-components and grouped by the process name. This enhancement allows for more granular monitoring of resource usage, giving visibility into the performance of each Extract and Replicat process, along with their individual sub-components.

Replication Wait Events:
There is a dedicated section within the report for replication related wait events (both foreground and background types).

Replication Top SQLs: This highlights the SQL statements executed by various replication processes, organized by key performance metrics such as Elapsed Time, CPU Time, and Execution Count. These sections in the Top SQL report provide a detailed breakdown, allowing administrators to quickly identify the most resource-intensive SQL queries impacting replication performance.

In addition, the replication process name is included for each SQL statement, making it easy to trace which process—whether Extract or Replicat—is responsible for executing the query. This level of granularity helps streamline troubleshooting efforts, as it enables pinpointing of inefficient SQL statements and their impact on replication. The inclusion of these metrics ensures that administrators have the necessary visibility to optimize SQL execution within replication processes, leading to improved overall system performance.

Oracle GoldenGate Extract Performance Metrics : This section explains more in Extract and Capture processes. It provides some really valuable information like ..

  • Lag (in seconds) derived by the time when the most recent LCR was created and received (measured both at the beginning and end)
  • The amount of redo data mined (in bytes).
  • Redo Mined Rate
  • Number of bytes sent by the Capture process to the Extract process since the last time the Extract process attached to the Capture process.
  • Number of LCRs delivered to the Capture from Logminer …. and much more … ….

Oracle GoldenGate Integrated Replicat:

Oracle GoldenGate Replicat: This section presents comprehensive performance statistics for Oracle GoldenGate’s classic, coordinated, and parallel Replicat processes. The SQL operation statistics are aggregated and displayed for each individual Replicat process, providing an overall view of their performance.

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , , , | Leave a Comment »

Addressing Stuck Undo Segments : How to Safely Drop Problematic Undo Segments

Posted by FatDBA on October 14, 2024

Hi All,

This post discusses an intriguing issue we encountered recently on a 19.22 Oracle database following a CDB restart. After the restart, we observed a peculiar problem where all sessions performing DDL commands were getting locked and hung at the PDB level. This behavior was affecting the entire database, essentially halting all DDL operations.

During our analysis, we discovered that the SMON process was waiting on a latch, leading to high CPU resource consumption. Furthermore, we noticed that the MMON process was blocking SMON, causing additional delays. The alert log revealed multiple error messages, which further complicated the diagnosis.

This issue required a deep dive into Oracle’s background processes and system-level contention to resolve, as it was causing a significant disruption to database operations.

-- Fragments from alert log, smon/mmon process logs and standard diag traces.
kcbzib: encounter logical error ORA-1410, try re-reading from other mirror..
TRCMIR:kcf_reread     :start:3722012:0:+DATA/CDBMONKEY/AA82C21DD440449FE053B4146E0AA55B/DATAFILE/tablespace_test_dataaa.xxx.xxxx
TRCMIR:kcf_reread     :done :3722012:0:+DATA/CDBMONKEY/AA82C21DD440449FE053B4146E0AA55B/DATAFILE/tablespace_test_dataaa.xxx.xxxxx
kcbzibmlt: encounter logical error ORA-1410, try re-reading from other mirror..

---> SMON: Parallel transaction recovery tried
30317 error message received from server=1.70(P01Y) qref:0x8de103cf0 qrser:5121 qrseq:3 mh:0x97fdf9460
Parallel Transaction recovery caught exception 12801
Parallel Transaction recovery caught error 30317

*** 2024-08-19T20:38:23.297997-04:00 (PWS1E(3))
Parallel Transaction recovery caught exception 30319
Parallel Transaction recovery caught error 30319

*** 2024-08-19T20:38:50.613855-04:00 (PWS1E(3))
30317 error message received from server=1.57(P01L) qref:0x8de109fe8 qrser:11265 qrseq:3 mh:0x95fccd3c8
Parallel Transaction recovery caught exception 12801
Parallel Transaction recovery caught error 30317
Parallel Transaction recovery caught exception 30319

TEST1E(3):about to recover undo segment 98 status:6 inst:0
TEST1E(3):mark undo segment 98 as available status:6 ret:1

TEST1E(3):about to recover undo segment 46 status:6 inst:0
TEST1E(3):mark undo segment 46 as available status:6 ret:1

The logs and trace files also highlighted an issue with two specific undo segments, identified by segment numbers 98 and 46, from the UNDO tablespace. Upon further investigation, we found that both segments were in a ‘RECOVERING’ state. What was particularly concerning was that the recovery process for these segments was progressing extremely slowly, with the v$fast_start_transactions view showing an unusually high estimated recovery time.

In fact, based on the progress we monitored, it seemed like the recovery process wasn’t moving forward at all and appeared to be stuck in some kind of loop. This stagnation in recovery added to the overall system’s delay, compounding the performance issues we were already facing. It became clear that this problem was a significant bottleneck in restoring the database to normal operation.

SQL> select * from V$FAST_START_TRANSACTIONS;

USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL PID CPUTIME PARENTUSN PARENTSLT PARENTSEQ XID PXID RCVSERVERS CON_ID
---------- ---------- ---------- ---------------- -------------- --------------- ---------- ---------- ---------- ---------- 
46 46 2313064 RECOVERING 505 24992423 77 5586 0 0 0 10001000684B2300 0000000000000000 1 0
98 25 1352150 RECOVERING 0 226231 78 5586 0 0 0 30001900D6A11400 0000000000000000 1 0
	

SQL> SELECT segment_name, tablespace_name  FROM dba_rollback_segs  WHERE segment_id IN (98, 46);

SEGMENT_NAME		       TABLESPACE_NAME
------------------------------ ------------------------------
_SYSSMU46_5249279471$	       UNDOTEST1
_SYSSMU98_5249279471$	       UNDOTEST1

We attempted to take the segments offline and ultimately drop them, as they were associated with a materialized view (MV) refresh and a bulk insert statement. These operations were part of an ad-hoc activity, so it was acceptable for them to be missed. However, despite our efforts, the segments remained in a ‘PARTLY AVAILABLE’ state, leaving us with no option to drop or take them offline. This left us in a situation where we were essentially stuck, unable to proceed with dropping the segments or the associated tablespace. The inability to release these segments further complicated our recovery efforts.

We’d even checked the status of the those two undo segments using base table x$ktuxe and the KTUXESTA (Status) was coming as ‘DEAD’, means the transaction has failed but is still holding resources and that gave ius more confidence about what happened under the hood.

SQL> select min(sample_time), max(sample_time), sql_id, xid, count(1) from dba_hist_active_sess_history 
where xid in ('10001000684B2300','30001900D6A11400') group by sql_id, xid;

MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) SQL_ID XID COUNT(1)
--------------------------------------------------------------------------- -------------------
15-SEP-24 01.22.25.446 PM 15-SEP-24 05.51.22.340 PM 30001900D6A11400 3213
15-SEP-24 10.22.46.218 AM 15-SEP-24 01.22.15.440 PM ac5hhandj9fh1 30001980D6A11400 2158 --------------> 
13-SEP-24 08.31.54.374 PM 14-SEP-24 02.53.45.723 AM annqr822no0a1 10001090684B2300 4578 -------------->
14-SEP-24 02.53.55.731 AM 15-SEP-24 05.51.22.340 PM 10001000684B2300 27781

SQL> select sql_id, sql_text from dba_hist_sqltext where sql_id in ('annqr822no0a1','ac5hhandj9fh1o');

SQL_ID SQL_TEXT
------------- --------------------------------------------------------------------------------
annqr822no0a1 INSERT INTO monkey.ah_ah3_xaa_131C (
ac5hhandj9fh1o /* MV_REFRESH (INS) */INSERT /*+ BYPASS_RECURSIVE_CHECK */ INTO "monkey"."test_


SQL> ALTER ROLLBACK SEGMENT "_SYSSMU46_5249279471$" offline;

Rollback segment altered.

SQL> ALTER ROLLBACK SEGMENT "_SYSSMU98_5249279471$" offline;

Rollback segment altered.


SQL> SELECT segment_name, status, tablespace_name
FROM dba_rollback_segs
WHERE segment_name IN ('_SYSSMU98_5249279471$', '_SYSSMU46_5249279471$');
  2    3
SEGMENT_NAME		       STATUS		TABLESPACE_NAME
------------------------------ ---------------- ------------------------------
_SYSSMU46_5249279471$	       PARTLY AVAILABLE UNDOTEST1
_SYSSMU98_5249279471$	       PARTLY AVAILABLE UNDOTEST1


SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */ KTUXESTA Status, KTUXECFL Flags FROM x$ktuxe 
WHERE ktuxesta!='INACTIVE' AND ktuxeusn=98;

KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
98 25 1352150 ACTIVE DEAD

SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */ KTUXESTA Status, KTUXECFL Flags FROM x$ktuxe 
WHERE ktuxesta!='INACTIVE' AND ktuxeusn=46;

KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
46 46 2313064 ACTIVE DEAD


Given that this is a critical production system, we couldn’t afford to wait for a complete recovery of the affected undo segments. To mitigate the issue, we created a new undo tablespace and designated it as the default for the database. This action enabled us to resume normal operations while the recovery of the problematic segments continued in the background.

However, the underlying mystery remains: why are we unable to drop these segments in the production environment? To investigate further, we cloned the production database and set up a test instance. To our surprise, we replicated the same situation, where both segments 46 and 98 appeared again in a ‘PARTLY AVAILABLE’ state, providing no options for us to drop them.

In our exploration, we first enabled the FAST_START_PARALLEL_ROLLBACK parameter, which determines the number of processes that participate in parallel rollback during transaction rollbacks, typically following an instance failure or a large manual rollback. We set this parameter to HIGH, as it significantly accelerates the rollback process for large transactions, particularly in scenarios involving instance failures or extensive operations requiring manual rollback.

Additionally, we experimented with the undocumented parameter _OFFLINE_ROLLBACK_SEGMENTS, which is intended to control the state of rollback segments.
Note: When dealing with hidden or undocumented parameters, it’s crucial to consult with Oracle support or rely on prior experience, as these settings can lead to unforeseen consequences in production environments.

Ran below query to dynamically get alter statements for segments which we need to set offline.

SQL>  select 'ALTER SYSTEM SET "_OFFLINE_ROLLBACK_SEGMENTS"='||listagg(''''||segment_name||'''',',') WITHIN GROUP (ORDER BY segment_name)||' scope=spfile;' from dba_rollback_segs 
where tablespace_name='UNDOTBS1' and status ='NEEDS RECOVERY'; 

Alter System set "_OFFLINE_ROLLBACK_SEGMENTS"="_SYSSMU98_5249279471$" scope=spfile;
Alter System set "_OFFLINE_ROLLBACK_SEGMENTS"="_SYSSMU46_5249279471$" scope=spfile;

Shutdown the database and startup as normal after setting the above parameter. 

shutdown immediate;
startup;

and finally the drop statements. 
SQL> select 'drop rollback segment "'||segment_name||'";' from dba_rollback_segs 
where tablespace_name='UNDOTBS1' and status ='NEEDS RECOVERY';

drop rollback segment "_SYSSMU98_5249279471$";
drop rollback segment "_SYSSMU46_5249279471$";
 

Issue above two drop rollback segemnts from the dfatabase and bounce the database again anf finally drop the problematic undo tablespace. Do not forget to reset the ‘_OFFLINE_ROLLBACK_SEGMENTS’ parameter and a one more bounce again.

SQL>  shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>  startup;


SQL>  drop tablespace UNDOTEST1;
Tablespace dropped.


SQL>  Alter System reset "_OFFLINE_ROLLBACK_SEGMENTS";
System altered.

SQL>  shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>  startup;

Although it was a lengthy and demanding process involving numerous experiments, the results were ultimately positive. We encountered no errors and successfully dropped the problematic segments, freeing the database from the issues that had plagued it. This experience not only resolved our immediate concerns but also provided valuable insights into managing similar challenges in the future.

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , , , | Leave a Comment »

Detecting SQL Elapsed Time Variations – A Script to identify potential plan flips

Posted by FatDBA on October 11, 2024

In the world of database management, one of the most important aspect is understanding and optimizing SQL performance. A common challenge that we DBAs face is runtime variations, particularly in elapsed times of SQL executions. These variations often stem from plan flips, where the database optimizer chooses different execution plans for the same query under similar conditions. This can lead to unpredictable performance and inefficiencies. To tackle this issue, I have developed a script that systematically identifies SQL queries with significant runtime variations, providing valuable insights for performance tuning.

The primary objective of the script is to analyze SQL execution statistics to identify queries that exhibit runtime elapsed time variations. By focusing on these variations, we can pinpoint potential performance bottlenecks, allowing us to take corrective action before they impact application performance. The script achieves this through a series of well-structured SQL queries that aggregate execution statistics, helping to reveal patterns that might indicate plan flips.

The script utilizes cursors to fetch SQL execution statistics from the DBA_HIST_SQLSTAT and DBA_HIST_SNAPSHOT views. It only considers SQLs that have been executed a sufficient number of times, ensuring that we focus on queries that significantly impact performance. By calculating the standard deviation of elapsed times, the script identifies queries that experience substantial fluctuations. A high standard deviation in execution time often points to inconsistencies, potentially indicating plan flips.

The output from the script is structured and formatted for readability, presenting crucial metrics like SQL_ID, execution counts, minimum and maximum elapsed times, and normalized standard deviation values. This makes it easy to spot queries that may require further investigation or optimization.

Talking about the PHV/Plan flips, it occur when the optimizer selects different plans for executing the same query, often leading to varied performance. This can happen due to several reasons, including (some of the top ones) :

  • Changes in Statistics: Outdated or inaccurate statistics can mislead the optimizer into choosing suboptimal plans.
  • Variable Bind Values: Different bind values can cause the optimizer to generate different execution plans.
  • Dynamic Data Changes: Changes in the underlying data can affect the choice of execution plans.
  • Index changes, parameter ‘sniffings’, skewness of data and many more …


Once identified, we can further analyze and delve deeper on reasons behind the flips and if required can consider SQL Profiles or SQL Plan Baselines to stabilize execution plans for critical queries.

SET SERVEROUTPUT ON;
SET LINESIZE 155
COL execs FOR 999,999,999
COL min_etime FOR 999,999.99
COL max_etime FOR 999,999.99
COL avg_etime FOR 999,999.999
COL avg_lio FOR 999,999,999.9
COL norm_stddev FOR 999,999.9999
COL begin_interval_time FOR A30
COL node FOR 99999
SET PAGESIZE 50000 
SET LINESIZE 150
BREAK ON report
BREAK ON plan_hash_value ON startup_time SKIP 1
 
-- Main PL/SQL Block
DECLARE
    -- Cursor for the first SQL query to get SQL_IDs
    CURSOR sql_cursor IS
        SELECT sql_id
        FROM (
            SELECT sql_id, SUM(execs) AS execs,
                   MIN(avg_etime) AS min_etime,
                   MAX(avg_etime) AS max_etime,
                   stddev_etime / MIN(avg_etime) AS norm_stddev
            FROM (
                SELECT sql_id, plan_hash_value, execs, avg_etime,
                       stddev(avg_etime) OVER (PARTITION BY sql_id) AS stddev_etime
                FROM (
                    SELECT sql_id, plan_hash_value,
                           SUM(NVL(executions_delta, 0)) AS execs,
                           (SUM(elapsed_time_delta) / DECODE(SUM(NVL(executions_delta, 0)), 0, 1, SUM(executions_delta)) / 1000000) AS avg_etime
                    FROM DBA_HIST_SQLSTAT S
                    JOIN DBA_HIST_SNAPSHOT SS ON ss.snap_id = S.snap_id
                    WHERE ss.instance_number = S.instance_number
                      AND executions_delta > 0
                      AND elapsed_time_delta > 0
                      AND ss.begin_interval_time >= SYSDATE - 7  -- Last 7 days
                      AND s.snap_id > NVL('&earliest_snap_id', 0)
                    GROUP BY sql_id, plan_hash_value
                )
            )
            GROUP BY sql_id, stddev_etime
            HAVING stddev_etime / MIN(avg_etime) > NVL(TO_NUMBER('&min_stddev'), 2)
            AND MAX(avg_etime) > NVL(TO_NUMBER('&min_etime'), .1)
            ORDER BY norm_stddev
        );
 
    sql_record sql_cursor%ROWTYPE;
    found_sql_id BOOLEAN := FALSE;  -- Declare the variable here
 
BEGIN
    -- Print results of the additional query
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
	
    DBMS_OUTPUT.PUT_LINE('Top SQLs in last 7 days with change in elapsed times and with standard deviation >= 2.');
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
	
    DBMS_OUTPUT.PUT_LINE('SQL_ID         | Execs     | Min Elapsed Time | Max Elapsed Time | Norm Stddev');
    DBMS_OUTPUT.PUT_LINE('----------------|-----------|------------------|------------------|------------');
    -- DBMS_OUTPUT.PUT_LINE('............'); -- Add two blank lines after each result set
    -- DBMS_OUTPUT.PUT_LINE('............'); -- Add two blank lines after each result set

 
    FOR r IN (
        SELECT sql_id, SUM(execs) AS execs,
               MIN(avg_etime) AS min_etime,
               MAX(avg_etime) AS max_etime,
               stddev_etime / MIN(avg_etime) AS norm_stddev
        FROM (
            SELECT sql_id, plan_hash_value, execs, avg_etime,
                   stddev(avg_etime) OVER (PARTITION BY sql_id) AS stddev_etime
            FROM (
                SELECT sql_id, plan_hash_value,
                       SUM(NVL(executions_delta, 0)) AS execs,
                       (SUM(elapsed_time_delta) / DECODE(SUM(NVL(executions_delta, 0)), 0, 1, SUM(executions_delta)) / 1000000) AS avg_etime
                FROM DBA_HIST_SQLSTAT S
                JOIN DBA_HIST_SNAPSHOT SS ON ss.snap_id = S.snap_id
                WHERE ss.instance_number = S.instance_number 
                  AND executions_delta > 0
                  AND elapsed_time_delta > 0
                  AND ss.begin_interval_time >= SYSDATE - 7  -- Last 7 days
                  AND s.snap_id > NVL('&earliest_snap_id', 0)
                GROUP BY sql_id, plan_hash_value
            )
        )
        GROUP BY sql_id, stddev_etime
        HAVING stddev_etime / MIN(avg_etime) > NVL(TO_NUMBER('&min_stddev'), 2)
        AND MAX(avg_etime) > NVL(TO_NUMBER('&min_etime'), .1)
        ORDER BY norm_stddev  -- Sort by Norm Stddev
    ) LOOP
        DBMS_OUTPUT.PUT_LINE(
            RPAD(NVL(r.sql_id, 'N/A'), 15) || ' | ' ||
            LPAD(r.execs, 9) || ' | ' ||
            LPAD(r.min_etime, 17) || ' | ' ||
            LPAD(r.max_etime, 17) || ' | ' ||
            LPAD(r.norm_stddev, 12)
        );
    END LOOP;
 
    -- Print a separator
    DBMS_OUTPUT.PUT_LINE('----------------|-----------|------------------|------------------|------------');
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
	
 
    -- Print results of the first query
    DBMS_OUTPUT.PUT_LINE('Each of above identified top SQL with runtime details:');
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
    DBMS_OUTPUT.PUT_LINE('****************************************************************************************************'); -- Add two blank lines after each result set
	
    DBMS_OUTPUT.PUT_LINE('SQL_ID');
    DBMS_OUTPUT.PUT_LINE('-------');
 
    FOR sql_record IN sql_cursor LOOP
        DBMS_OUTPUT.PUT_LINE(sql_record.sql_id);
        found_sql_id := TRUE;  -- Set flag to true if we found any SQL_IDs
 
        -- Execute the second query for each SQL_ID
        DECLARE
            v_sql_id VARCHAR2(13) := sql_record.sql_id;  -- Assuming SQL_ID is 13 characters long
 
            CURSOR sql_details_cursor IS
                SELECT sql_id, plan_hash_value,
                       SUM(execs) AS execs,
                       SUM(etime) AS etime,
                       CASE
                           WHEN SUM(execs) > 0 THEN SUM(etime) / SUM(execs) 
                           ELSE 0 
                       END AS avg_etime,
                       CASE
                           WHEN SUM(execs) > 0 THEN SUM(cpu_time) / SUM(execs) 
                           ELSE 0 
                       END AS avg_cpu_time,
                       CASE
                           WHEN SUM(execs) > 0 THEN SUM(lio) / SUM(execs) 
                           ELSE 0 
                       END AS avg_lio,
                       CASE
                           WHEN SUM(execs) > 0 THEN SUM(pio) / SUM(execs) 
                           ELSE 0 
                       END AS avg_pio
                FROM (
                    SELECT ss.snap_id, ss.instance_number AS node, begin_interval_time, sql_id, plan_hash_value,
                           NVL(executions_delta, 0) AS execs,
                           elapsed_time_delta / 1000000 AS etime,
                           buffer_gets_delta AS lio,
                           disk_reads_delta AS pio,
                           cpu_time_delta / 1000000 AS cpu_time
                    FROM DBA_HIST_SQLSTAT S
                    JOIN DBA_HIST_SNAPSHOT SS ON ss.snap_id = S.snap_id AND ss.instance_number = S.instance_number
                    WHERE sql_id = v_sql_id
                )
                GROUP BY sql_id, plan_hash_value;
 
            sql_details_record sql_details_cursor%ROWTYPE;
 
        BEGIN
            DBMS_OUTPUT.PUT_LINE('Results for SQL_ID: ' || v_sql_id);
            DBMS_OUTPUT.PUT_LINE('Plan Hash Value | Execs     | Total Elapsed Time | Avg Elapsed Time | Avg CPU Time | Avg LIO | Avg PIO');
            DBMS_OUTPUT.PUT_LINE('----------------|-----------|--------------------|------------------|--------------|---------|-----------------');
            -- DBMS_OUTPUT.PUT_LINE('............'); -- Add two blank lines after each result set
            -- DBMS_OUTPUT.PUT_LINE('............'); -- Add two blank lines after each result set
 
            FOR sql_details_record IN sql_details_cursor LOOP
                DBMS_OUTPUT.PUT_LINE(
                    RPAD(NVL(sql_details_record.plan_hash_value, 0), 12) || ' | ' ||  -- Use RPAD for formatting
                    LPAD(sql_details_record.execs, 9) || ' | ' ||
                    LPAD(sql_details_record.etime, 20) || ' | ' ||
                    LPAD(sql_details_record.avg_etime, 17) || ' | ' ||
                    LPAD(sql_details_record.avg_cpu_time, 13) || ' | ' ||
                    LPAD(sql_details_record.avg_lio, 9) || ' | ' ||
                    LPAD(sql_details_record.avg_pio, 9)
                );
            END LOOP;
 
            DBMS_OUTPUT.PUT_LINE('----------------|-----------|--------------------|------------------|--------------|---------|------------------');
            DBMS_OUTPUT.PUT_LINE('.'); -- Add two blank lines after each result set
            DBMS_OUTPUT.PUT_LINE('.'); -- Add two blank lines after each result set
        EXCEPTION
            WHEN NO_DATA_FOUND THEN
                DBMS_OUTPUT.PUT_LINE('No details found for SQL_ID: ' || v_sql_id);
        END;
    END LOOP;
 
    -- If no SQL_IDs were found
    IF NOT found_sql_id THEN
        DBMS_OUTPUT.PUT_LINE('No SQL_IDs found in the first query.');
    END IF;
 
EXCEPTION
    WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('An error occurred: ' || SQLERRM);
END;
/

Output is below!

Results from the additional SQL query:
SQL_ID	       | Execs	   | Min Elapsed Time | Max Elapsed Time | Norm Stbaev
----------------|-----------|------------------|------------------|------------
gs13vrvs094aw	|	  6 |		.781872 |	   3.178186 | 2.1671704310
0446bazbat9w7	|	  9 | 4.487090285714285 |	  22.554485 | 2.8471852597
2sbdv4y1gh8bz	|	 14 |	       3.971391 |	  24.408304 | 3.6387955174
3svaaatxjqug1w	|	328 | .0508694430379746 |	   4.806558 | 49.192010432
cc1w6bkm0h6ad	|	 21 | .0244223333333333 | 226.5182148888888 | 6557.7393620
----------------|-----------|------------------|------------------|------------
Results from the first query:
SQL_ID
-------
gs13vrvs094aw
Results For SQL_ID: gs13vrvs094aw
Plan Hash Value | Execs     | Total Elapsed Time | avg Elapsed Time | avg CPU Time | avg LIO | avg PIO
----------------|-----------|--------------------|------------------|--------------|---------|--------
1162885451   |	       4 |	       3.127488 |	    .781872 |	   .7762915 |	98945.5 |	  0
3903115306   |	       2 |	       6.356372 |	   3.178186 |	   1.711877 |	  98947 |     97416
----------------|-----------|--------------------|------------------|--------------|---------|--------
0446bazbat9w7
Results For SQL_ID: 0446bazbat9w7
Plan Hash Value | Execs     | Total Elapsed Time | avg Elapsed Time | avg CPU Time | avg LIO | avg PIO
----------------|-----------|--------------------|------------------|--------------|---------|--------
2464618362   |	      58 |	     264.766837 | 4.564945465517241 | 2.08845824137 | 169552.56 | 160914.13
4156457883   |	      14 |	     322.396623 | 23.02833021428571 | 6.21442542857 | 284598.42 | 183070.85
----------------|-----------|--------------------|------------------|--------------|---------|--------
2sbdv4y1gh8bz
Results For SQL_ID: 2sbdv4y1gh8bz
Plan Hash Value | Execs     | Total Elapsed Time | avg Elapsed Time | avg CPU Time | avg LIO | avg PIO
----------------|-----------|--------------------|------------------|--------------|---------|--------
4156457883   |	      17 |	     391.210815 | 23.01241188235294 | 5.98843517647 | 270550.17 | 182924.47
2464618362   |	      86 |	     359.926889 | 4.185196383720930 | 1.91583161627 | 163897.87 | 143981.62
----------------|-----------|--------------------|------------------|--------------|---------|--------
3svaaatxjqug1w
Results For SQL_ID: 3svaaatxjqug1w
Plan Hash Value | Execs     | Total Elapsed Time | avg Elapsed Time | avg CPU Time | avg LIO | avg PIO
----------------|-----------|--------------------|------------------|--------------|---------|--------
0	     |	       0 |		 .11113 |		  0 |		  0 |	      0 |	  0
583453783    |	      14 |	      65.427144 | 4.673367428571428 | 4.49858178571 | 1657010.7 | 1518.6428
2113578721   |	       5 |	       8.748638 |	  1.7497276 |	  1.7271034 |	13059.8 |      65.8
1103538090   |	      95 |	     101.131242 | 1.064539389473684 | 1.05525206315 | 14415.842 | 8.2631578
2328623641   |	    2221 |	     107.231117 | .0482805074290859 | .047307577667 | 7646.0702 | 5.4416929
4064119247   |	       5 |	       8.352578 |	  1.6705156 |	  1.6569552 |	   8235 |	 .8
----------------|-----------|--------------------|------------------|--------------|---------|--------
cc1w6bkm0h6ad
Results For SQL_ID: cc1w6bkm0h6ad
Plan Hash Value | Execs     | Total Elapsed Time | avg Elapsed Time | avg CPU Time | avg LIO | avg PIO
----------------|-----------|--------------------|------------------|--------------|---------|--------
3180225096   |	      20 |	       6.046942 |	   .3023471 |	   .1745638 |	28967.6 |    1114.6
4118768231   |	    2581 |	  280148.114248 | 108.5424270623789 | 72.4430339841 | 1789335.7 | 39453.911
----------------|-----------|--------------------|------------------|--------------|---------|--------


Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , , | Leave a Comment »

Key Parameter Renaming for Auto Transaction Rollback Feature in 23ai

Posted by FatDBA on July 14, 2024

Hi Everyone,

With Oracle 23ai, the ‘auto transaction rollback‘ feature has undergone some parameter renaming. For more details on this feature, please refer to my previous blog post – Link

The parameter “TXN_AUTO_ROLLBACK_HIGH_PRIORITY_WAIT_TARGET”, which specifies the maximum number of seconds a HIGH priority transaction will wait for a row lock before the Priority Transactions feature rolls back a lower priority transaction holding the lock, is now renamed to PRIORITY_TXNS_HIGH_WAIT_TARGET“.

Similarly, “TXN_AUTO_ROLLBACK_MEDIUM_PRIORITY_WAIT_TARGET“, which specifies the maximum number of seconds a MEDIUM priority transaction will wait for a row lock before the Priority Transactions feature rolls back a lower priority transaction holding the lock, is now renamed to PRIORITY_TXNS_MEDIUM_WAIT_TARGET“.

Finally, TXN_AUTO_ROLLBACK_MODE, which specifies the mode for Priority Transactions (ROLLBACK or TRACK), is now renamed toPRIORITY_TXNS_MODE“.

Hope It Helped.
Prashant Dixit

Posted in Uncategorized | Tagged: , , , | Leave a Comment »

Optimizing PL/SQL Profiler for EBS After Oracle 19c PDB Upgrade

Posted by FatDBA on July 14, 2024

Hi All,

Today’s post covers an intriguing issue I encountered last year after upgrading a system to 19c PDB (19.6) from 12c Non-PDB for an EBS-based application. Our developers frequently run code against the database and use PL/SQL Profilers to identify performance bottlenecks. Running the profiler.sql script generates a detailed HTML report highlighting the top time-consuming operations based on the PL/SQL Profiler’s analysis. While I won’t delve into the specifics of its purpose or how to call the core scripts, you can find more information in my previous blog posts.

After the upgrade to 19c PDB, we noticed that the PL/SQL profiler, which had previously worked fine with the 12c Non-PDB database, started taking an excessively long time and would hang. We experimented with different arguments to determine if the issue was isolated to specific flags or options but found that the slowness was pervasive.

As this was a novel problem, we were the first to report it, prompting quick involvement from Oracle’s development, testing, and R&D divisions. Following several days of analysis with Oracle support, it was confirmed that the new database was affected by a buggy behavior while querying the ALL_ARGUMENTS view, which caused significant slowdowns in 19.6 PDB. The ALL_ARGUMENTS view, which lists the arguments of accessible procedures and functions, is frequently used by the profiler.

After intensive analysis, Oracle’s development team provided patch 31142749. Unfortunately, the issue persisted even after applying the patch. Subsequently, they suggested a solution using the CONTAINER_DATA parameter. This parameter restricts data fetching to the current container only when selecting from views defined as extended data link views, which typically fetch data from CDB$ROOT and can cause performance issues. For cases where fetching data from the current PDB suffices, this method avoids unnecessary overhead.

We applied the following fix:

ALTER SESSION SET CONTAINER_DATA=CURRENT_DICTIONARY;
This solution yielded excellent results, restoring the PL/SQL Profiler’s performance to the levels we experienced with 12c.

This experience underscores the importance of thorough testing and collaboration with vendor support when upgrading critical systems. It also highlights how new features and configurations in database management systems can impact performance in unexpected ways. By sharing this solution, I hope to assist others who might face similar challenges with their PL/SQL Profiler after upgrading to 19c PDB. If you have any questions or need further details, feel free to leave a comment or reach out to me directly. Stay tuned for more insights and solutions in my upcoming posts!

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , | Leave a Comment »

Toronto Oracle Users Group (TOUG) Spring Meetup 2024 Slide deck

Posted by FatDBA on May 16, 2024

I appreciate Toronto Oracle Users Group (TOUG) for providing me with this fantastic opportunity to present. I trust that the audience enjoyed my presentation “Unveiling My Top 5 Favorites: Oracle 23ai Performance!

The meetup was fantastic, featuring some great presentations from Simo Vilmulen (Accenture – Enkitec Group) and Oleksiy Razborshchuk (from Oracle Corp).

Attached is the slide deck I utilized during the session. Please don’t hesitate to inquire about any of the six features or improvements we covered.

Posted in Uncategorized | Tagged: , , | Leave a Comment »

Experiencing ORA-15554 Error During Real Application Testing, Despite Database Being in PREPARE Mode

Posted by FatDBA on February 24, 2024

Recently, I was working to simulate a production workload on a new server and analyze the potential impact of changes made to the Oracle Database. Armed with my favorite testing tool from Oracle, Real Application Testing (RAT), everything seemed to progress smoothly until I encountered a roadblock during the replay phase on the target system via workload clients (wrc replay clients). This incident occurred while conducting workload replay on Oracle Database 21c Release 3 within one of the Pluggable Databases (PDB).

Yes, you read it right – replaying workload within a PDB! Since Oracle Release 19c, a significant enhancement allows capturing and replaying workloads at the individual PDB level. In the past, this capability was confined to capturing and replaying multitenant databases at the root multitenant container database (CDB) level. If you’re curious about this enhancement, I delved into it in a previous blog post, providing comprehensive details and explanations. You can explore it here: Real Application Testing for Capture and Replay in a PDB: A Great Addition Made in 19c.

Now, before we delve into the intricacies of the encountered issue, let’s acknowledge an observation. There were instances where, amidst the excitement of working with new options, we totally forgot to enable PREPARE mode for REPLAY within the PDB, especially when working with versions 19c & above. It might seem trivial, but it’s a common oversight that can lead to unexpected hurdles. In such cases, the CDB might be in PREPARE mode, but the oversight within the PDB can be the root cause.

Assuming you’ve ensured that PREPARE mode is correctly set up for your PDB, and you’re still facing issues, let’s dive into troubleshooting this error. The next steps involve a meticulous investigation to understand what’s happening behind the scenes.

SQL> select id, name, status from dba_workload_replays;

        ID NAME                                     STATUS
---------- ---------------------------------------- ----------------------------------------
         1 REPLAY_MYDB12_3HRS_252024                PREPARE

[oracle@testbox-fatdba ~]$ wrc rattest/rattest@pdb1 mode=replay replaydir=/home/oracle/replaydir
Workload Replay Client: Release 21.3.0.0.0 - Production on Sat Feb 24 12:51:50 2024
Copyright (c) 1982, 2021, Oracle and/or its affiliates.  All rights reserved.

(wrc_main_507216.trc) ORA-15554: cannot start workload replay client because the database server is not in PREPARE mode

Lets see if we have to redo of prepare mode solves the problem.

SQL> execute DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(synchronization => 'SCN');
BEGIN DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(synchronization => 'SCN'); END;

*
ERROR at line 1:
ORA-20223: Invalid input. Database already in PREPARE mode.
ORA-06512: at "SYS.DBMS_WORKLOAD_REPLAY_I", line 5519
ORA-06512: at "SYS.DBMS_WORKLOAD_REPLAY", line 153
ORA-06512: at line 1

No, lets try and create a separate user and assign all required permissions to it and see if that solves the issue.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         3 PDB1                           READ WRITE NO
SQL>
SQL> create user appuser identified by oracle90;
User created.
SQL> grant ratuser to appuser;
Grant succeeded.
SQL>
SQL> grant create session to appuser;
Grant succeeded.
SQL>
SQL> Create role ratuser;
Role created.
SQL> Grant create session to ratuser;
Grant succeeded.
SQL> Grant execute on dbms_workload_repository to ratuser;
Grant succeeded.
SQL> Grant administer SQL tuning set to ratuser;
Grant succeeded.
SQL> Grant create job to ratuser;
Grant succeeded.
SQL> Grant execute on dbms_workload_capture to ratuser;
Grant succeeded.
SQL> Grant execute on dbms_workload_replay to ratuser;
Grant succeeded.
SQL> Grant oem_advisor to ratuser;
Grant succeeded.
SQL> Grant create session to ratuser;
Grant succeeded.
SQL> Grant become user to ratuser;
Grant succeeded.
SQL> Grant create any directory to ratuser;
Grant succeeded.
SQL> Grant select_catalog_role to ratuser;
Grant succeeded.
SQL>

SQL> grant ratuser to appuser;
Grant succeeded.

Let’s try now and see how it goes with this new application user that we just created with all right permissions.

[oracle@testbox-fatdba ~]$ wrc appuser/oracle90@pdb1 mode=replay replaydir=/home/oracle/replaydir
Workload Replay Client: Release 21.3.0.0.0 - Production on Sat Feb 24 13:14:19 2024
Copyright (c) 1982, 2021, Oracle and/or its affiliates.  All rights reserved.

Wait for the replay to start (13:14:19)

Yeah, and this is working fine.

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , | Leave a Comment »

Oracle’s Evolution: Sweet Little Dual Tables Exiled in 23c

Posted by FatDBA on December 28, 2023

Hi All,

Here comes what is likely the final post of the year 2023!

While there have been previous discussions on this topic, I’d like to delve deeper into the specific enhancements and features related to the use of the DUAL table in Oracle 23c and little more about this sweet little object in Oracle databases …

The DUAL table has been a part of the Oracle Database for a very long time. It has been a fundamental component of Oracle’s SQL language for decades. The DUAL table in Oracle Database serves a specific purpose and is often used for various tasks. Its primary function is to provide a one-row, one-column table that can be used for evaluating expressions or performing certain types of queries. DUAL is a table automatically created by Oracle Database along with the data dictionary. DUAL is in the schema of the user SYS but is accessible by the name DUAL to all users. It has one column, DUMMY, defined to be VARCHAR2(1), and contains one row with a value X.

Some of the very popular use cases of it are – When evaluating expressions (SELECT SYSDATE FROM DUAL), Providing Constants (SELECT 10 * 5 FROM DUAL), In PL/SQL blocks or scripts, the DUAL table can be used to store and retrieve scalar values, we often use the DUAL table during testing or debugging to check the result of an expression or function without the need for an actual table … I always remember it using as a quick way to check connectivity of the database etc. I mean while it seemingly mundane, the DUAL table is a small but significant component in the Oracle database ecosystem, serving a variety of purposes across different contexts.

One of the interesting performance tip related to DUAL tables that I remember is – Starting from Oracle Database 10g Release 1, when calculating an expression that excludes the DUMMY column in the DUAL table, logical I/O is not executed. This optimization is identified as FAST DUAL in the execution plan. However, if you explicitly SELECT the DUMMY column from DUAL, this optimization is bypassed, and logical I/O is performed.

A quick example :

SQL> explain plan for select * from dual;        ---> Logical IO will be performed.

select * from table(dbms_xplan.display(format=>'basic'));

----------------------------------
| Id | Operation | Name |
----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| DUAL |
----------------------------------

Yet, when choosing a standard expression from DUAL, the FAST DUAL row source is employed:

SQL> explain plan for select sysdate from dual;    ---> No Logical IO will be performed.

select * from table(dbms_xplan.display(format=>'basic'));

---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FAST DUAL | |
---------------------------------

Starting with Oracle 23c, there is no need to explicitly select expressions from the DUAL table; you can simply omit the FROM clause altogether. Lets do a quick demo.

---------------------------------
-- Prior 23c release
---------------------------------
SQL > select sysdate from dual 

SYSDATE
---------
28-DEC-23

SQL > select sysdate;
ORA-00923 FROM keyword not found where expected


-- One more test
SQL> create sequence emp2;
Sequence Created

SQL> select emp2.nextval from dual;

NEXTVAL
---------
1

SQL> select emp2.nextval;
ORA-00923 FROM keyword not found where expected




---------------------------------
-- In Oracle 23c 
---------------------------------
SQL > select sysdate;
SYSDATE
---------
28-DEC-23

SQL> create sequence emp2;
Sequence Created

SQL> select emp2.nextval from dual;

NEXTVAL
---------
1

SQL> select emp2.nextval;

NEXTVAL
---------
2

Hope It Helped
Prashant Dixit

Posted in Uncategorized | Tagged: , , , | 2 Comments »

A simple script to automatically catch those intermittent database performance issues …

Posted by FatDBA on December 16, 2023

Hi All,

Occasionally, anticipating the occurrence of a problem becomes unpredictable, and simultaneously, dealing manually with a transient issue to extract necessary information may not always be feasible (mainly due to the rapid resolution of the problem before we can initiate the extraction process). Additionally, there is a requirement for adaptability in the content we aim to extract.

The ensuing approach allows us to attain our primary objectives without resorting to invasive measures. There could be multiple use cases, few of the scenarios would be :
Case 1 : With little changes to it, you can make it employable for gathering solely the system state upon encountering a specific wait event.
Case 2: Can be utilized for gathering 10046, error stack, etc., for the obstructing session when a designated wait event is triggered concurrently with the execution of a known SQL statement (SQL_ID).
Case 3: Utilizable for obtaining hang analysis, system state, and additionally, 10046, error stack for the session causing the obstruction when a specific wait event is encountered.

Let me explain how we can utilize this quick, short but handy code to generate the diagnostic dump like hanganalyze / systemstate / errorstack / 10046 trace etc if any session is found to be waiting on any specific wait event for any specific period of time.

This is the simple script to collect diagnostic information with few of the customizable parameters or inputs i.e. ‘waitevent’ which is for the event we know we have the problem with, ‘seconds’ is the amount of time we think a session can wait on that session before we decide it is stuck and want to trigger dumping and ‘tid’ which is the session we are going to attach to run oradebug against. This could be the waiting session, a blocking session.

In the example, I have modified the code to collect diag info if any session is found to be waiting on “enq: TX – row lock contention” waits above 10 seconds.

set serverout on
spool logit.sql
declare
waitevent varchar2(100):='enq: TX - row lock contention';
seconds number :=10;      
w1 v$session%rowtype;
w2 v$session%rowtype;
tid number := 0;

begin

loop
dbms_lock.sleep(seconds);
begin
select * into w1 
from v$session 
where event=waitevent and wait_time=0 and rownum=1;
dbms_lock.sleep(seconds);
select * into w2
from v$session
where wait_time=0
and sid=w1.sid
and event=waitevent
and seq#=w1.seq# ;
select spid into tid from v$process
where addr=(select paddr from v$session where sid=w2.blocking_session);
dbms_output.put_line('oradebug setospid '||tid);
dbms_output.put_line('oradebug unlimit');
dbms_output.put_line('oradebug hanganalyze 3');  /**If RAC then use -G all**/
dbms_output.put_line('oradebug tracefile_name');
dbms_output.put_line('exit');
exit;
exception
when no_data_found then null;
end;
end loop;
end;
/
exit


-- Result : 
SQL>  
oradebug setospid 6794
oradebug unlimit
oradebug hanganalyze 3
oradebug tracefile_name
exit

PL/SQL procedure successfully completed.

SQL> 

The subsequent instance is intricate yet showcases versatility of this simple script. In this scenario, I encounter an intermittent issue involving sessions experiencing wait times due to TX enqueue row lock contention. While I may not be overly concerned about these sessions, it is crucial for me to understand the actions of the session causing the blockage. Given that I am aware of the SQL being executed during that period, I can leverage this information to ensure accurate trapping of the relevant event.
Here is something that you can do to alter it accordingly.

Let me create a row locking situation and see how it captures stats for a particular SQLID. Below is the row locking case that I have created to test how the scripts catches details for this SQLID ‘0jy18x4pff06k’

This is the modified version of the script which catches details for the SQL waiting on any specific wait event.

set serverout on
spool logit.sql
declare
waitevent varchar2(100):='enq: TX - row lock contention';
seconds number :=10;
w1 v$session%rowtype;
w2 v$session%rowtype;
tid number := 0;

begin

loop
dbms_lock.sleep(seconds);
begin
select * into w1
from v$session
where event=waitevent and wait_time=0 and sql_id='0jy18x4pff06k' and rownum=1;
dbms_lock.sleep(seconds);
select * into w2
from v$session
where wait_time=0
and sid=w1.sid
and event=waitevent
and seq#=w1.seq# ;
select spid into tid from v$process
where addr=(select paddr from v$session where sid=w2.blocking_session);
dbms_output.put_line('oradebug setospid '||tid);
dbms_output.put_line('oradebug unlimit');
dbms_output.put_line('oradebug dump errorstack 3');
dbms_output.put_line('oradebug dump errorstack 3');
dbms_output.put_line('oradebug dump errorstack 3');
dbms_output.put_line('oradebug event 10046 trace name context forever, level 8');
dbms_output.put_line('oradebug event 10200 trace name context forever, level 1');
dbms_output.put_line('oradebug event 10224 trace name context forever, level 1');
dbms_output.put_line('execute dbms_lock.sleep(20);');
dbms_output.put_line('oradebug event 10046 trace name context off');
dbms_output.put_line('oradebug event 10200 trace name context off');
dbms_output.put_line('oradebug event 10224 trace name context off');
dbms_output.put_line('exit');
exit;
exception
when no_data_found then null;
end;
end loop;
end;
/
exit


-- Result : 
SQL>  
oradebug setospid 6794
oradebug unlimit
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug event 10046 trace name context forever, level 8
oradebug event 10200 trace name context forever, level 1
oradebug event 10224 trace name context forever, level 1
execute dbms_lock.sleep(20);
oradebug event 10046 trace name context off
oradebug event 10200 trace name context off
oradebug event 10224 trace name context off
exit

PL/SQL procedure successfully completed.

SQL> 

The next what we can do to trap any such wait events of interest when it exceeds the behnchmark, you can put eveything in a shell script and run it in the loop and execute the oradebug code. Once finished you can look it in the diagnostic_dest for the dump file. For 11g and above, look under subdirectory trace under home directory of ADR.

Next once you have put it in the shell script format, you can call it in the nohup mode and the script will run in background and sit in a loop for the defined condition to hit.

[oracle@oracleontario ~]$ more exec
#!/bin/bash
sqlplus -s '/ as sysdba' @waitdumpv1
[oracle@oracleontario ~]$
[oracle@oracleontario ~]$ chmod 777 exec
[oracle@oracleontario ~]$
[oracle@oracleontario ~]$

[oracle@oracleontario ~]$ nohup ./exec > waitdump2.log
nohup: ignoring input and redirecting stderr to stdout
[oracle@oracleontario ~]$

[oracle@oracleontario ~]$ more waitdump1.log
[oracle@oracleontario ~]$ nohup ./exec > waitdump2.log
nohup: ignoring input and redirecting stderr to stdout
[oracle@oracleontario ~]$ more waitdump2.log
oradebug setospid 6794
oradebug unlimit
oradebug hanganalyze 3
oradebug tracefile_name
exit

PL/SQL procedure successfully completed.

[oracle@oracleontario ~]$
[oracle@oracleontario ~]$

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: , , , , | Leave a Comment »