Hi Guys,
Last weekend I was asked to examine a performance problem where customer was observing sporadic resource spikes on CPU & PIOs (Physical IO scans), and had no idea how to handle the situation. They already have identified the problematic SQL and asked me to take a look, while doing initial investigation on the system, I found the resource manager was enabled and was killing the ill SQL every time when it tries to breach the limits set on CPU & other resources.
I spent some time understanding query behavior and performance, and I found this a scenario of SQL plan (PHV) flip where the query optimizer toggles between a good and a worst plan. This being a time sensitive issue and customer needs a quick fix before I identify the reason behind the PHV flip, we have to think of any technique on how we can stop the SQL execution with bad PHVs. This was an Oracle EE 19c system running on Exadata, I immediately proposed for a solution using new feature of ‘SQL Quarantine‘, which helps to prevent reuse of same execution plan which was terminated by resource managers due to resource limits.
Let me show you how to do that, here I will demonstrate how to quarantine a SQL & all its execution plans.
Well this is just a use case, you can also quarantine the SQL for all its PHVs or execution plans or even through the SQL Text. Here I will show how you can quarantine a SQL for all its execution plans.
Note: This feature is currently available only on Enterprise Edition on Engineered Systems, like Exadata and Exadata Cloud Service, but for this demo, I will be enabling the ‘_exadata_feature_on‘ parameter for demo purposes on my sandbox setup.
** Please don’t touch this parameter in your production environments if you’re not running on Oracle EE on the top of Exadata or Exadata Cloud services.
[oracle@canttowinsec ~]$ !sql
sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Jul 3 23:33:41 2021
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
SQL>
SQL> @hidden
Enter value for param: exadata_feature_on
old 5: and a.ksppinm like '%¶m%'
new 5: and a.ksppinm like '%exadata_feature_on%'
Parameter Session Value Instance Value descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_exadata_feature_on FALSE FALSE Exadata Feature On
SQL>
SQL>
SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL>
SQL> startup
ORACLE instance started.
Total System Global Area 922745216 bytes
Fixed Size 8903040 bytes
Variable Size 843055104 bytes
Database Buffers 67108864 bytes
Redo Buffers 3678208 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL>
SQL>
SQL> @hidden
Enter value for param: exadata_feature_on
old 5: and a.ksppinm like '%¶m%'
new 5: and a.ksppinm like '%exadata_feature_on%'
Parameter Session Value Instance Value descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_exadata_feature_on TRUE TRUE Exadata Feature On
SQL>
-- For this demo I have created one table with 1000000 rows.
SQL> select count(*) from bigtab;
COUNT(*)
----------
1000000
Next check the SQLID, PHV and TEXT of the SQL for which you want to setup this QUARANTINE feature and will create the quarantine configuration for it. You can set it for SQL ID, SQL ID + PHV or for SQL TEXT.
SQL> select sql_id, plan_hash_value, sql_text from v$sqlarea where sql_text like '%select * from bigtab%';
SQL_ID PLAN_HASH_VALUE
------------- ---------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
6fwqzurbc8y7k 441133017
select * from bigtab
SQL>
-- You can create CREATE_QUARANTINE BY SQL_ID
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '&sql_id')
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '&sql_id', PLAN_HASH_VALUE => '&PLAN_HASH_VALUE');
-- You can create CREATE_QUARANTINE_BY_SQL_TEXT.
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_TEXt(SQL_TEXT => to_clob('select count(*) from emp'));
-- This command creates a quarantine configuration for SQL ID '6fwqzurbc8y7k' and all its execution plans.
SQL>
SQL> DECLARE
quarantine_sql VARCHAR2(30);
BEGIN
quarantine_sql := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '6fwqzurbc8y7k');
END;
/
PL/SQL procedure successfully completed.
SQL>
-- Check if quarantine config/profile is created.
SQL> select NAME, SQL_TEXT, ELAPSED_TIME, cpu_time,CREATED, ENABLED from dba_sql_quarantine;
NAME SQL_TEXT ELAPSED_TIME CPU_TIME CREATED ENA
------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---
SQL_QUARANTINE_2akx3bq44wvc5 select * from bigtab ALWAYS ALWAYS 04-JUL-21 12.09.25.567422 AM YES
Now next we will use the DBMS_SQLQ.ALTER_QUARANTINE procedure to put limits on resource usage i.e. Elapsed time, CPU, IO (MBs), Number of physical (PIOs) requests, Number of logical (LIOs) requests.
SQL>
SQL> BEGIN
DBMS_SQLQ.ALTER_QUARANTINE(
QUARANTINE_NAME => 'SQL_QUARANTINE_2akx3bq44wvc5',
PARAMETER_NAME => 'CPU_TIME',
PARAMETER_VALUE => '5');
DBMS_SQLQ.ALTER_QUARANTINE(
QUARANTINE_NAME => 'SQL_QUARANTINE_2akx3bq44wvc5',
PARAMETER_NAME => 'ELAPSED_TIME',
PARAMETER_VALUE => '10');
END;
/ 2 3 4 5 6 7 8 9 10 11 12
PL/SQL procedure successfully completed.
SQL>
-- let's verify limits that we have set against this SQL Quarantine profile.
NAME SQL_TEXT ELAPSED_TIME CPU_TIME CREATED ENA
------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---
SQL_QUARANTINE_2akx3bq44wvc5 select * from bigtab 10 5 04-JUL-21 12.12.41.918609 AM YES
Next we will setup the RESOURCE MANAGER.
-- Create a pending area.
SQL> begin
dbms_resource_manager.create_pending_area();
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Create a consumer group with name 'DIXITPOC_GROUP'.
SQL> begin
dbms_resource_manager.create_consumer_group(CONSUMER_GROUP=>'DIXITPOC_GROUP',COMMENT=>'To test SQL quarantine feature for one SQL');
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Now bind the consumer group with the user 'DIXIT', I mean this could be any group through which you will execute the query.
SQL> begin
dbms_resource_manager.set_consumer_group_mapping(attribute => 'ORACLE_USER',value => 'DIXIT',consumer_group =>'DIXITPOC_GROUP' );
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Create a resource plan with name 'POC_FOR_QUARANTINE'.
SQL> begin
dbms_resource_manager.create_plan(plan => 'POC_FOR_QUARANTINE',comment => 'Kill SQL and its other execution plans when exceeding CPU and Elapsed seconds');
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Create a plan directive by allocating resource plan 'POC_FOR_QUARANTINE' to consumer group 'DIXITPOC_GROUP'. I am setting execution limit of 5 seconds for the SQL.
SQL> begin
dbms_resource_manager.create_plan_directive(
plan => 'POC_FOR_QUARANTINE',
group_or_subplan => 'DIXITPOC_GROUP',
comment => 'Kill SQL and its other execution plans when exceeding CPU and Elapsed seconds',
switch_group=>'CANCEL_SQL',
switch_time => 5,
switch_estimate=>false);
end;
/ 2 3 4 5 6 7 8 9 10
PL/SQL procedure successfully completed.
-- Allocate full resources to rest of the sessions which are not part of this plan.
SQL> begin
dbms_resource_manager.create_plan_directive(PLAN=> 'POC_FOR_QUARANTINE', GROUP_OR_SUBPLAN=>'OTHER_GROUPS',COMMENT=>'leave rest as is');
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Validate and submit pending area.
SQL> begin
dbms_resource_manager.validate_pending_area();
end;
/ 2 3 4
PL/SQL procedure successfully completed.
SQL> begin
dbms_resource_manager.submit_pending_area();
end;
/ 2 3 4
PL/SQL procedure successfully completed.
SQL>
-- Grant switch privilege to the DIXIT user which will switch it to 'DIXITPOC_GROUP' group.
SQL> begin
dbms_resource_manager_privs.grant_switch_consumer_group('DIXIT','DIXITPOC_GROUP',false);
end;
/ 2 3 4
PL/SQL procedure successfully completed.
-- Create initial consumer group for DIXIT user for 'DIXITPOC_GROUP' group.
SQL> begin
dbms_resource_manager.set_initial_consumer_group('DIXIT','DIXITPOC_GROUP');
end;
/ 2 3 4
PL/SQL procedure successfully completed.
SQL>
Next set the resource manager at the database level, we will have to set the RESOURCE_MANAGER_PLAN to value or name of the plan ‘POC_FOR_QUARANTINE’.
SQL> show parameter resource
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
resource_limit boolean TRUE
resource_manage_goldengate boolean FALSE
resource_manager_cpu_allocation integer 5
resource_manager_plan string
SQL>
SQL> alter DIXITtem set RESOURCE_MANAGER_PLAN = 'POC_FOR_QUARANTINE';
DIXITtem altered.
SQL> show parameter resource
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
resource_limit boolean TRUE
resource_manage_goldengate boolean FALSE
resource_manager_cpu_allocation integer 5
resource_manager_plan string POC_FOR_QUARANTINE
SQL>
Execute the SQL for the first time, you will get ‘active time limit exceeded’ due to RM.
SQL>
SQL> select * from emp;
ERROR at line 2:
ORA-00040: active time limit exceeded - call aborted
Now, if you run the SQL for the second time, you will see your quarantine plan/profile will be used and will abort the execution of the SQL.
SQL>
SQL> select * from emp;
ERROR at line 2:
ORA-56955: quarantined plan used
Hope It Helped
Prashant Dixit