Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Quarantine a SQL ? This can save from a system performance pandemic …

Posted by FatDBA on July 6, 2021

Hi Guys,

Last weekend I was asked to examine a performance problem where customer was observing sporadic resource spikes on CPU & PIOs (Physical IO scans), and had no idea how to handle the situation. They already have identified the problematic SQL and asked me to take a look, while doing initial investigation on the system, I found the resource manager was enabled and was killing the ill SQL every time when it tries to breach the limits set on CPU & other resources.

I spent some time understanding query behavior and performance, and I found this a scenario of SQL plan (PHV) flip where the query optimizer toggles between a good and a worst plan. This being a time sensitive issue and customer needs a quick fix before I identify the reason behind the PHV flip, we have to think of any technique on how we can stop the SQL execution with bad PHVs. This was an Oracle EE 19c system running on Exadata, I immediately proposed for a solution using new feature of ‘SQL Quarantine‘, which helps to prevent reuse of same execution plan which was terminated by resource managers due to resource limits.

Let me show you how to do that, here I will demonstrate how to quarantine a SQL & all its execution plans.

Well this is just a use case, you can also quarantine the SQL for all its PHVs or execution plans or even through the SQL Text. Here I will show how you can quarantine a SQL for all its execution plans.

Note: This feature is currently available only on Enterprise Edition on Engineered Systems, like Exadata and Exadata Cloud Service, but for this demo, I will be enabling the ‘_exadata_feature_on‘ parameter for demo purposes on my sandbox setup.

** Please don’t touch this parameter in your production environments if you’re not running on Oracle EE on the top of Exadata or Exadata Cloud services.

[oracle@canttowinsec ~]$ !sql
sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Jul 3 23:33:41 2021
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


SQL>
SQL> @hidden
Enter value for param: exadata_feature_on
old   5:    and a.ksppinm like '%&param%'
new   5:    and a.ksppinm like '%exadata_feature_on%'

Parameter                                     Session Value             Instance Value            descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_exadata_feature_on                           FALSE                     FALSE                     Exadata Feature On

SQL>
SQL>


SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL>


SQL> startup
ORACLE instance started.

Total System Global Area  922745216 bytes
Fixed Size                  8903040 bytes
Variable Size             843055104 bytes
Database Buffers           67108864 bytes
Redo Buffers                3678208 bytes
Database mounted.
Database opened.
SQL>
SQL>
SQL>



SQL>
SQL> @hidden
Enter value for param: exadata_feature_on
old   5:    and a.ksppinm like '%&param%'
new   5:    and a.ksppinm like '%exadata_feature_on%'

Parameter                                     Session Value             Instance Value            descr
--------------------------------------------- ------------------------- ------------------------- ------------------------------------------------------------
_exadata_feature_on                           TRUE                      TRUE                      Exadata Feature On

SQL>




-- For this demo I have created one table with 1000000 rows. 

SQL> select count(*) from bigtab;

  COUNT(*)
----------
   1000000

Next check the SQLID, PHV and TEXT of the SQL for which you want to setup this QUARANTINE feature and will create the quarantine configuration for it. You can set it for SQL ID, SQL ID + PHV or for SQL TEXT.

SQL> select sql_id, plan_hash_value, sql_text from v$sqlarea where sql_text like '%select * from bigtab%';

SQL_ID        PLAN_HASH_VALUE
------------- ---------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
6fwqzurbc8y7k       441133017
select * from bigtab

SQL>


-- You can create CREATE_QUARANTINE BY SQL_ID 
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '&sql_id')
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '&sql_id', PLAN_HASH_VALUE => '&PLAN_HASH_VALUE');

-- You can create CREATE_QUARANTINE_BY_SQL_TEXT.
quarantine_config := DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_TEXt(SQL_TEXT => to_clob('select count(*) from emp'));



-- This command creates a quarantine configuration for SQL ID '6fwqzurbc8y7k' and all its execution plans.
SQL>
SQL> DECLARE
quarantine_sql VARCHAR2(30);
BEGIN
quarantine_sql :=   DBMS_SQLQ.CREATE_QUARANTINE_BY_SQL_ID(SQL_ID => '6fwqzurbc8y7k');
END;
/  

PL/SQL procedure successfully completed.

SQL>



-- Check if quarantine config/profile is created.

SQL> select NAME, SQL_TEXT, ELAPSED_TIME, cpu_time,CREATED, ENABLED from dba_sql_quarantine;


NAME                           SQL_TEXT                       ELAPSED_TIME         CPU_TIME                       CREATED                        ENA
------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---
SQL_QUARANTINE_2akx3bq44wvc5   select * from bigtab           ALWAYS               ALWAYS                         04-JUL-21 12.09.25.567422 AM   YES


Now next we will use the DBMS_SQLQ.ALTER_QUARANTINE procedure to put limits on resource usage i.e. Elapsed time, CPU, IO (MBs), Number of physical (PIOs) requests, Number of logical (LIOs) requests.

SQL>
SQL> BEGIN
    DBMS_SQLQ.ALTER_QUARANTINE(
       QUARANTINE_NAME => 'SQL_QUARANTINE_2akx3bq44wvc5',
       PARAMETER_NAME  => 'CPU_TIME',
       PARAMETER_VALUE => '5');

    DBMS_SQLQ.ALTER_QUARANTINE(
       QUARANTINE_NAME => 'SQL_QUARANTINE_2akx3bq44wvc5',
       PARAMETER_NAME  => 'ELAPSED_TIME',
       PARAMETER_VALUE => '10');
END;
/  2    3    4    5    6    7    8    9   10   11   12

PL/SQL procedure successfully completed.

SQL>


-- let's verify limits that we have set against this SQL Quarantine profile.

NAME                           SQL_TEXT                       ELAPSED_TIME         CPU_TIME                       CREATED                        ENA
------------------------------ ------------------------------ -------------------- ------------------------------ ------------------------------ ---
SQL_QUARANTINE_2akx3bq44wvc5   select * from bigtab           10                   5                              04-JUL-21 12.12.41.918609 AM   YES

Next we will setup the RESOURCE MANAGER.

-- Create a pending area.
SQL> begin
 dbms_resource_manager.create_pending_area();
 end;
 /  2    3    4

PL/SQL procedure successfully completed.

-- Create a consumer group with name 'DIXITPOC_GROUP'.
SQL> begin
 dbms_resource_manager.create_consumer_group(CONSUMER_GROUP=>'DIXITPOC_GROUP',COMMENT=>'To test SQL quarantine feature for one SQL');
 end;
 /  2    3    4

PL/SQL procedure successfully completed.


-- Now bind the consumer group with the user 'DIXIT', I mean this could be any group through which you will execute the query.
SQL> begin
 dbms_resource_manager.set_consumer_group_mapping(attribute => 'ORACLE_USER',value => 'DIXIT',consumer_group =>'DIXITPOC_GROUP' );
 end;
  /  2    3    4

PL/SQL procedure successfully completed.


-- Create a resource plan with name 'POC_FOR_QUARANTINE'.
SQL> begin
dbms_resource_manager.create_plan(plan => 'POC_FOR_QUARANTINE',comment => 'Kill SQL and its other execution plans when exceeding CPU and Elapsed seconds');
end;
/  2    3    4

PL/SQL procedure successfully completed.


-- Create a plan directive by allocating resource plan 'POC_FOR_QUARANTINE' to consumer group 'DIXITPOC_GROUP'. I am setting execution limit of 5 seconds for the SQL.
SQL> begin
 dbms_resource_manager.create_plan_directive(
 plan => 'POC_FOR_QUARANTINE',
 group_or_subplan => 'DIXITPOC_GROUP',
 comment => 'Kill SQL and its other execution plans when exceeding CPU and Elapsed seconds',
 switch_group=>'CANCEL_SQL',
 switch_time => 5,
 switch_estimate=>false);
 end;
 /  2    3    4    5    6    7    8    9   10

PL/SQL procedure successfully completed.




-- Allocate full resources to rest of the sessions which are not part of this plan.
SQL> begin
 dbms_resource_manager.create_plan_directive(PLAN=> 'POC_FOR_QUARANTINE', GROUP_OR_SUBPLAN=>'OTHER_GROUPS',COMMENT=>'leave rest as is');
 end;
 /  2    3    4

PL/SQL procedure successfully completed.



-- Validate and submit pending area.
SQL> begin
 dbms_resource_manager.validate_pending_area();
 end;
 /  2    3    4

PL/SQL procedure successfully completed.

SQL> begin
 dbms_resource_manager.submit_pending_area();
 end;
 /  2    3    4

PL/SQL procedure successfully completed.

SQL>


-- Grant switch privilege to the DIXIT user which will switch it to 'DIXITPOC_GROUP' group.
SQL> begin
dbms_resource_manager_privs.grant_switch_consumer_group('DIXIT','DIXITPOC_GROUP',false);
end;
/  2    3    4

PL/SQL procedure successfully completed.

-- Create initial consumer group for DIXIT user for 'DIXITPOC_GROUP' group.
SQL> begin
dbms_resource_manager.set_initial_consumer_group('DIXIT','DIXITPOC_GROUP');
end;
/  2    3    4

PL/SQL procedure successfully completed.

SQL>

Next set the resource manager at the database level, we will have to set the RESOURCE_MANAGER_PLAN to value or name of the plan ‘POC_FOR_QUARANTINE’.

SQL> show parameter resource

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
resource_limit                       boolean     TRUE
resource_manage_goldengate           boolean     FALSE
resource_manager_cpu_allocation      integer     5
resource_manager_plan                string
SQL> 


SQL> alter DIXITtem set RESOURCE_MANAGER_PLAN = 'POC_FOR_QUARANTINE';
DIXITtem altered.

SQL> show parameter resource

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
resource_limit                       boolean     TRUE
resource_manage_goldengate           boolean     FALSE
resource_manager_cpu_allocation      integer     5
resource_manager_plan                string      POC_FOR_QUARANTINE
SQL>

Execute the SQL for the first time, you will get ‘active time limit exceeded’ due to RM.

SQL>
SQL> select * from emp;

ERROR at line 2:
ORA-00040: active time limit exceeded - call aborted

Now, if you run the SQL for the second time, you will see your quarantine plan/profile will be used and will abort the execution of the SQL.

SQL>
SQL> select * from emp;

ERROR at line 2:
ORA-56955: quarantined plan used

Hope It Helped
Prashant Dixit

Advertisement

One Response to “Quarantine a SQL ? This can save from a system performance pandemic …”

  1. […] you want to read more about SQL Quarantine, please click this link to my earlier post with a demo about […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: