Tales From A Lazy Fat DBA

How to fix SQL Plan issues using OUTLINE DATA.

Posted by FatDBA on November 30, 2017

Hi Mates,
Today I would like to discuss SQL plan stability, especially situations where a SQL statement performs poorly in one environment and absolutely fine in others.
I have run into such issues a lot over the past few years while working as a performance consultant on many projects, and I recently encountered a similar one that I would like to walk through.
In short, this is the situation for the problematic SQL:

They say the same request takes 268 seconds in SIT:
• I ran it in SIT again –> 500 secs
• I also ran it in DEV –> 20 secs
• I ran it in preSIT –> 5 secs

Below are the details for the given Ill-SQL.

SQL Text:
SELECT a.port_inst_id AS mdf_es_port_inst_id,
       a.status AS mdf_es_port_status,
       d.port_inst_id AS mdf_ls_port_inst_id,
       d.status AS mdf_ls_port_status,
       regexp_substr( c.descr, :"SYS_B_00", :"SYS_B_01", :"SYS_B_02" ) AS naming_area,
       a.status AS mdf_es_port_status2,
       DECODE( a.reservation_inst_id, NULL, :"SYS_B_03", :"SYS_B_04" ) AS mdf_es_active_reservation,
       a.port_access_id AS lic_identity,
       egi_get_uda_value( a.equip_inst_id, :"SYS_B_05", :"SYS_B_06", :"SYS_B_07" ) AS kc,
       b.name AS coordinate,
       a.port_hum_id egi_port_name,
       egi_get_uda_value( d.equip_inst_id, :"SYS_B_08", :"SYS_B_09", :"SYS_B_10" ) AS cabinet_name,
       e.name AS termination_block_name,
       d.port_access_id AS pair_name,
       egi_get_uda_value( d.equip_inst_id, :"SYS_B_11", :"SYS_B_12", :"SYS_B_13" ) AS signing_system,
       egi_get_uda_value( a.port_inst_id, :"SYS_B_14", :"SYS_B_15", :"SYS_B_16" ) AS processor_exchange,
       a.circ_path_inst_id AS es_circ_path_inst_id,
       a.next_path_inst_id AS es_next_path_inst_id,
       d.circ_path_inst_id AS ls_circ_path_inst_id,
       d.next_path_inst_id AS ls_next_path_inst_id,
       DECODE( d.role, :"SYS_B_17", :"SYS_B_18", :"SYS_B_19", :"SYS_B_20", :"SYS_B_21", :"SYS_B_22", :"SYS_B_23", :"SYS_B_24", NULL ) AS cabling_type
FROM epa a, card_inst b, equip_inst c, epa d, card_inst e, xxx_xxxx_xxxxxxxx f
WHERE a.port_inst_id = f.port_inst_id
  AND f.val_attr_inst_id = ( SELECT val_attr_inst_id FROM xxx_xxxx_xxxxxxxx WHERE group_name = :"SYS_B_25" AND attr_name = :"SYS_B_26" )
  AND a.card_inst_id = b.card_inst_id
  AND b.type = :"SYS_B_27"
  AND a.equip_inst_id = c.equip_inst_id
  AND a.z_wired_port_inst_id = d.port_inst_id (+)
  AND d.card_inst_id = e.card_inst_id (+)
  AND regexp_substr( a.port_access_id, :"SYS_B_28", :"SYS_B_29", :"SYS_B_30" ) = :"SYS_B_31"
  AND to_number(regexp_substr( a.port_access_id, :"SYS_B_32", :"SYS_B_33", :"SYS_B_34" ) ) >= to_number(:"SYS_B_35")
  AND to_number(regexp_substr( a.port_access_id, :"SYS_B_36", :"SYS_B_37", :"SYS_B_38" ) ) <= to_number(:"SYS_B_39")
  AND f.attr_value = :"SYS_B_40"
ORDER BY :"SYS_B_41"

I see around 5 different plans (PHVs) for this statement. The last two, 18733286 and 2202641467, are the worst, and the first one, PHV 1663132818, is the best of them all.
1663132818 has by far the lowest elapsed time, CPU usage, IO time, LIOs, disk reads, rows processed and cost.
** From the DBA_HIST_SQLSTAT view.
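
If you pull the per-snapshot rows for the statement out of DBA_HIST_SQLSTAT, the comparison between plans can be sketched roughly like this. It is only a minimal sketch: it assumes you have already fetched (PLAN_HASH_VALUE, EXECUTIONS_DELTA, ELAPSED_TIME_DELTA) tuples by whatever client you use, the function name rank_plans is made up for illustration, and the numbers in the usage note are invented, not the figures from this case.

```python
from collections import defaultdict


def rank_plans(rows):
    """Rank plans by average elapsed seconds per execution.

    rows: iterable of (plan_hash_value, executions_delta, elapsed_time_delta)
    as read from DBA_HIST_SQLSTAT, where ELAPSED_TIME_DELTA is in microseconds.
    Returns a list of (plan_hash_value, avg_elapsed_seconds), best plan first.
    """
    totals = defaultdict(lambda: [0, 0])  # phv -> [executions, elapsed_us]
    for phv, execs, elapsed_us in rows:
        totals[phv][0] += execs
        totals[phv][1] += elapsed_us

    ranked = []
    for phv, (execs, elapsed_us) in totals.items():
        if execs == 0:
            continue  # no completed executions recorded for this plan
        ranked.append((phv, elapsed_us / execs / 1_000_000))

    ranked.sort(key=lambda item: item[1])
    return ranked
```

For example, rank_plans([(111, 10, 50_000_000), (222, 2, 1_000_000_000)]) puts the hypothetical plan 111 first. Elapsed time per execution is just one yardstick; the same grouping works for CPU time, buffer gets or disk reads.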

Let’s collect the OUTLINE and query block details of the statement. They can be collected using:
select * from table(dbms_xplan.display_awr('SQL_ID', PHV, format => 'ADVANCED'));

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$841DDE77
  10 - SEL$841DDE77 / VAL_ATTR_NAME@SEL$2
  11 - SEL$841DDE77 / VAL_ATTR_NAME@SEL$2
  12 - SEL$841DDE77 / F@SEL$1
  13 - SEL$841DDE77 / F@SEL$1
  14 - SEL$841DDE77 / A@SEL$1
  15 - SEL$841DDE77 / A@SEL$1
  16 - SEL$841DDE77 / C@SEL$1
  17 - SEL$841DDE77 / C@SEL$1
  18 - SEL$841DDE77 / D@SEL$1
  19 - SEL$841DDE77 / D@SEL$1
  20 - SEL$841DDE77 / E@SEL$1
  21 - SEL$841DDE77 / E@SEL$1
  22 - SEL$841DDE77 / B@SEL$1
  23 - SEL$841DDE77 / B@SEL$1

Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
DB_VERSION('12.1.0.2')
OPT_PARAM('optimizer_index_cost_adj' 10)
FIRST_ROWS(1)
OUTLINE_LEAF(@"SEL$841DDE77")
MERGE(@"SEL$683B0107")
OUTLINE(@"SEL$C772B8D1")
UNNEST(@"SEL$2")
OUTLINE(@"SEL$683B0107")
OUTLINE(@"SEL$7511BFD2")
OUTLINE(@"SEL$2")
OUTLINE(@"SEL$1")
INDEX_RS_ASC(@"SEL$841DDE77" "VAL_ATTR_NAME"@"SEL$2" ("VAL_ATTR_NAME"."GROUP_NAME" "VAL_ATTR_NAME"."ATTR_NAME"))
INDEX_RS_ASC(@"SEL$841DDE77" "F"@"SEL$1" ("PORT_ATTR_SETTINGS"."ATTR_VALUE"))
BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$841DDE77" "F"@"SEL$1")
INDEX_RS_ASC(@"SEL$841DDE77" "A"@"SEL$1" ("EPA"."PORT_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "C"@"SEL$1" ("EQUIP_INST"."EQUIP_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "D"@"SEL$1" ("EPA"."PORT_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "E"@"SEL$1" ("CARD_INST"."CARD_INST_ID"))
INDEX(@"SEL$841DDE77" "B"@"SEL$1" ("CARD_INST"."CARD_INST_ID"))
LEADING(@"SEL$841DDE77" "VAL_ATTR_NAME"@"SEL$2" "F"@"SEL$1" "A"@"SEL$1" "C"@"SEL$1" "D"@"SEL$1" "E"@"SEL$1" "B"@"SEL$1")
USE_NL(@"SEL$841DDE77" "F"@"SEL$1")
USE_NL(@"SEL$841DDE77" "A"@"SEL$1")
USE_NL(@"SEL$841DDE77" "C"@"SEL$1")
USE_NL(@"SEL$841DDE77" "D"@"SEL$1")
USE_NL(@"SEL$841DDE77" "E"@"SEL$1")
USE_NL(@"SEL$841DDE77" "B"@"SEL$1")
NLJ_BATCHING(@"SEL$841DDE77" "B"@"SEL$1")
END_OUTLINE_DATA
*/
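
When the outline gets flattened into a single line by a tool or a copy/paste, it is painful to read and easy to break while editing. A small helper along these lines can split it back into one hint per line. This is just a sketch: the name split_outline_hints is made up, and it assumes that quoted names inside the hints never contain parentheses, which holds for the outline above.

```python
def split_outline_hints(outline):
    """Split a flattened '/*+ ... */' outline into a list of individual hints.

    Whitespace only separates hints when we are outside all parentheses,
    so arguments such as OPT_PARAM('x' 10) stay in one piece.
    """
    text = outline.strip()
    if text.startswith("/*+"):
        text = text[3:]
    if text.endswith("*/"):
        text = text[:-2]

    hints, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0:
            if current:  # end of one hint
                hints.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        hints.append("".join(current))
    return hints
```

Joining the result with newlines gives a hint block that can be pasted between /*+ and */ in the statement.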

Let’s use the outline of the good plan as a hint to force the good execution plan to be used and check the execution time for the query.
Run the query from SQLPLUS using the hint:

SELECT
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
DB_VERSION('12.1.0.2')
OPT_PARAM('optimizer_index_cost_adj' 10)
FIRST_ROWS(1)
OUTLINE_LEAF(@"SEL$841DDE77")
MERGE(@"SEL$683B0107")
OUTLINE(@"SEL$C772B8D1")
UNNEST(@"SEL$2")
OUTLINE(@"SEL$683B0107")
OUTLINE(@"SEL$7511BFD2")
OUTLINE(@"SEL$2")
OUTLINE(@"SEL$1")
INDEX_RS_ASC(@"SEL$841DDE77" "VAL_ATTR_NAME"@"SEL$2" ("VAL_ATTR_NAME"."GROUP_NAME" "VAL_ATTR_NAME"."ATTR_NAME"))
INDEX_RS_ASC(@"SEL$841DDE77" "F"@"SEL$1" ("PORT_ATTR_SETTINGS"."ATTR_VALUE"))
BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$841DDE77" "F"@"SEL$1")
INDEX_RS_ASC(@"SEL$841DDE77" "A"@"SEL$1" ("EPA"."PORT_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "C"@"SEL$1" ("EQUIP_INST"."EQUIP_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "D"@"SEL$1" ("EPA"."PORT_INST_ID"))
INDEX_RS_ASC(@"SEL$841DDE77" "E"@"SEL$1" ("CARD_INST"."CARD_INST_ID"))
INDEX(@"SEL$841DDE77" "B"@"SEL$1" ("CARD_INST"."CARD_INST_ID"))
LEADING(@"SEL$841DDE77" "VAL_ATTR_NAME"@"SEL$2" "F"@"SEL$1" "A"@"SEL$1" "C"@"SEL$1" "D"@"SEL$1" "E"@"SEL$1" "B"@"SEL$1")
USE_NL(@"SEL$841DDE77" "F"@"SEL$1")
USE_NL(@"SEL$841DDE77" "A"@"SEL$1")
USE_NL(@"SEL$841DDE77" "C"@"SEL$1")
USE_NL(@"SEL$841DDE77" "D"@"SEL$1")
USE_NL(@"SEL$841DDE77" "E"@"SEL$1")
USE_NL(@"SEL$841DDE77" "B"@"SEL$1")
NLJ_BATCHING(@"SEL$841DDE77" "B"@"SEL$1")
END_OUTLINE_DATA
*/
A.PORT_INST_ID AS MDF_ES_PORT_INST_ID,
A.STATUS AS MDF_ES_PORT_STATUS,
D.PORT_INST_ID AS MDF_LS_PORT_INST_ID,
D.STATUS AS MDF_LS_PORT_STATUS,
REGEXP_SUBSTR(C.DESCR,'[^ ]+', 1,1) AS NAMING_AREA,
A.STATUS AS MDF_ES_PORT_STATUS2,
DECODE(A.RESERVATION_INST_ID, NULL, 'N', 'Y') AS MDF_ES_ACTIVE_RESERVATION,
A.PORT_ACCESS_ID AS LIC_IDENTITY,
EGI_GET_UDA_VALUE(A.EQUIP_INST_ID,'EQUIPMENT','Eqp Legacy Names','TAD-N') AS KC,
B.NAME AS COORDINATE,
A.PORT_HUM_ID EGI_PORT_NAME,
EGI_GET_UDA_VALUE(D.EQUIP_INST_ID,'EQUIPMENT','Eqp Legacy Names','TAD-N') AS CABINET_NAME,
E.NAME AS TERMINATION_BLOCK_NAME,
D.PORT_ACCESS_ID AS PAIR_NAME,
EGI_GET_UDA_VALUE(D.EQUIP_INST_ID,'EQUIPMENT','Eqp Legacy Names','SIGNSYST') AS SIGNING_SYSTEM,
EGI_GET_UDA_VALUE(A.PORT_INST_ID,'PORT','Copper Port Info','Processor Exchange') AS PROCESSOR_EXCHANGE,
A.CIRC_PATH_INST_ID AS ES_CIRC_PATH_INST_ID,
A.NEXT_PATH_INST_ID AS ES_NEXT_PATH_INST_ID,
D.CIRC_PATH_INST_ID AS LS_CIRC_PATH_INST_ID,
D.NEXT_PATH_INST_ID AS LS_NEXT_PATH_INST_ID,
DECODE(D.ROLE,'Primary','P','Secondary','S','Spare','R','Junction','F',NULL) AS CABLING_TYPE
FROM EPA A, CARD_INST B, EQUIP_INST C, EPA D, CARD_INST E, PORT_ATTR_SETTINGS F
WHERE A.PORT_INST_ID = F.PORT_INST_ID
AND F.VAL_ATTR_INST_ID = (SELECT VAL_ATTR_INST_ID FROM VAL_ATTR_NAME WHERE GROUP_NAME = 'Copper Port Info' AND ATTR_NAME = 'Processor Exchange')
AND A.CARD_INST_ID = B.CARD_INST_ID
AND B.TYPE = 'ES'
AND A.EQUIP_INST_ID = C.EQUIP_INST_ID
AND A.Z_WIRED_PORT_INST_ID = D.PORT_INST_ID(+)
AND D.CARD_INST_ID = E.CARD_INST_ID(+)
AND REGEXP_SUBSTR(A.PORT_ACCESS_ID,'[^-]+', 1,1) = :B4
AND TO_NUMBER(REGEXP_SUBSTR(A.PORT_ACCESS_ID,'[^-]+', 1,2)) >= TO_NUMBER(:B3)
AND TO_NUMBER(REGEXP_SUBSTR(A.PORT_ACCESS_ID,'[^-]+', 1,2)) <= TO_NUMBER(:B2)
AND F.ATTR_VALUE = :B1
ORDER BY 8

I’ve executed the query in SIT and it’s taking no more than 5 seconds now with those OUTLINES, that is reeeeally good 😊

Now that the best plan is identified and tested, the question comes: “HOW TO FIX THIS PERMANENTLY ??”

This can be done using an Oracle-provided script named “coe_xfr_sql_profile.sql”, which is part of the popular SQLTXPLAIN/SQLT tool.
Download it from the Metalink (My Oracle Support) website (Direct Link: https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=458593899434965&parent=DOCUMENT&sourceId=1955195.1&id=215187.1&_afrWindowMode=0&_adf.ctrl-state=11co9htqxp_387#aref_section13)

To get the CBO to adopt the good execution plan, we can follow the steps below.
• Run the script coe_xfr_sql_profile.sql as a SYSDBA user, providing the sql_id and the good Plan Hash Value (PHV)
SQL> START coe_xfr_sql_profile.sql SQL_ID PHV

• The above step generates a script named in the format “coe_xfr_sql_profile_SQL_ID_PLAN_HASH_VALUE.sql”, i.e. including the sql_id and plan hash value.
Example: “coe_xfr_sql_profile_SQLID_PHV.sql”

• If you would like the profile to apply regardless of the literals used, you can change the force_match parameter in the generated script from FALSE to TRUE: force_match => TRUE
This ensures the profile is used even when different literals appear in the SQL query.

• Run the generated script as a SYSDBA user. It creates a SQL profile for the sql_id that carries the hints of the desired plan, which encourages the optimizer to use the plan behind the stated plan hash value.

• Next we can check whether the profile has been picked up
select SQL_ID, SQL_PROFILE, PLAN_HASH_VALUE from V$SQL where SQL_ID='SQL_ID';

• If the profile needs to be disabled or dropped, do the following.
The profile can be disabled or dropped as follows, using the SQL_PROFILE name returned from the query above:

EXEC DBMS_SQLTUNE.ALTER_SQL_PROFILE(Name => '<SQL_PROFILE_NAME>', Attribute_Name => 'STATUS', Value => 'DISABLED');

To drop the profile:
EXEC DBMS_SQLTUNE.DROP_SQL_PROFILE(Name => '<SQL_PROFILE_NAME>');

This way we can test and fix plan stability issues with any SQL statement!

Hope It Helps!
Prashant Dixit

Posted in Advanced | Tagged: performance, SQL, Tuning | 4 Comments »

MySQL ERROR 1054 (42S22): Unknown column ‘Password’ in ‘field list’ – Version 5.7

Posted by FatDBA on November 27, 2017

mysql> update mysql.user set Password = PASSWORD('mysql') where user ='root';
ERROR 1054 (42S22): Unknown column 'Password' in 'field list'

WHY ??????
This was working fine on other MySQL instances where I had earlier versions installed, so why not this one? Puzzled, perplexed!
Let me check the version information of this instance.

Well, starting with MySQL 5.7 the Password column of the mysql.user table has been removed and replaced with ‘authentication_string’.
So the new syntax for this password reset looks like this …

mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>

mysql> update user set authentication_string=password('mysql') where user='root';
Query OK, 2 rows affected, 1 warning (0.00 sec)
Rows matched: 3 Changed: 2 Warnings: 1
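
If the same reset has to work across a mix of old and new instances, the column choice can be driven by the server version. Below is a minimal sketch of that idea; the function name reset_statement is made up, it only covers MySQL proper (not forks), and it hands back a parameterised statement for whatever client library you use rather than executing anything. The 8.0 branch is there because the PASSWORD() function no longer exists in MySQL 8.0, where ALTER USER is the way to go.

```python
def reset_statement(version):
    """Return the UPDATE statement matching a MySQL version string like '5.7.20'.

    The two %s markers are placeholders for the new password and the user
    name, to be bound by the client library.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) >= (8, 0):
        # PASSWORD() was removed in MySQL 8.0
        raise ValueError("PASSWORD() is not available in MySQL 8.0+; use ALTER USER")
    # 5.7 dropped the Password column in favour of authentication_string
    column = "authentication_string" if (major, minor) >= (5, 7) else "Password"
    return f"UPDATE mysql.user SET {column} = PASSWORD(%s) WHERE user = %s"
```

Remember that a direct UPDATE on the grant tables only takes effect after FLUSH PRIVILEGES or a restart.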

Hope That Helps
Prashant Dixit

Posted in Basics | Tagged: mysql | 2 Comments »

MYSQL startup error: [ERROR] Fatal error: mysql.user table is damaged.

Posted by FatDBA on November 15, 2017

Hi Mates,

While working with a client on a brand new installation, I encountered a weird problem starting the MySQL (5.7.20) daemon on RHEL6: the mysqld service failed to start, with the errors below captured in the error log.

[root@dixitlab ~]# service mysqld start
MySQL Daemon failed to start.
Starting mysqld: [FAILED]

Snippet from the error Logs:

2017-11-15T10:21:03.957212Z 0 [Note] InnoDB: File ‘./ibtmp1’ size is now 12 MB.
2017-11-15T10:21:11.147615Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2017-11-15T10:21:11.147902Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2017-11-15T10:21:11.291204Z 0 [Note] InnoDB: Creating sys_virtual system tables.
2017-11-15T10:21:11.300921Z 0 [Note] InnoDB: sys_virtual table created
2017-11-15T10:21:11.301245Z 0 [Note] InnoDB: Waiting for purge to start
2017-11-15T10:21:11.354201Z 0 [Note] InnoDB: 5.7.20 started; log sequence number 0
2017-11-15T10:21:11.354623Z 0 [Note] Plugin ‘FEDERATED’ is disabled.
2017-11-15T10:21:11.354976Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 9560ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2017-11-15T10:21:11.355390Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2017-11-15T10:21:11.569467Z 0 [Warning] System table ‘plugin’ is expected to be transactional.
2017-11-15T10:21:11.570388Z 0 [Note] Salting uuid generator variables, current_pid: 29102, server_start_time: 1510741261, bytes_sent: 0,
2017-11-15T10:21:11.570971Z 0 [Note] Generated uuid: ‘b3e664f7-c9ee-11e7-9b23-000c29593ffb’, server_start_time: 8191484773744281275, bytes_sent: 44900352
2017-11-15T10:21:11.571109Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: b3e664f7-c9ee-11e7-9b23-000c29593ffb.
2017-11-15T10:21:11.573332Z 0 [Warning] Gtid table is not ready to be used. Table ‘mysql.gtid_executed’ cannot be opened.
2017-11-15T10:21:11.573745Z 0 [Warning] Failed to set up SSL because of the following SSL library error: SSL context is not usable without certificate and private key
2017-11-15T10:21:11.574116Z 0 [Note] Server hostname (bind-address): ‘*’; port: 3306
2017-11-15T10:21:11.574540Z 0 [Note] IPv6 is available.
2017-11-15T10:21:11.574745Z 0 [Note] – ‘::’ resolves to ‘::’;
2017-11-15T10:21:11.574891Z 0 [Note] Server socket created on IP: ‘::’.

2017-11-15T10:21:11.580607Z 0 [ERROR] Fatal error: mysql.user table is damaged. Please run mysql_upgrade.
2017-11-15T10:21:11.580879Z 0 [ERROR] Aborting
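
On a busy error log the fatal lines are easy to miss among the notes and warnings. A quick filter like the one below pulls out only the [ERROR] entries. It is a sketch that assumes the 5.7-style line layout seen above (timestamp, thread id, [Level], message); the name errors_in is made up for illustration.

```python
import re

# timestamp, thread id, [Level], message - the layout used in the log above
LOG_LINE = re.compile(r"^(\S+)\s+(\d+)\s+\[(\w+)\]\s+(.*)$")


def errors_in(log_text):
    """Return (timestamp, message) for every [ERROR] line in the log text."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group(3) == "ERROR":
            found.append((match.group(1), match.group(4)))
    return found
```

Fed with the snippet above, it returns just the two entries at the end: the "mysql.user table is damaged" line and "Aborting".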

So after taking a look at the error log, it’s quite clear that the startup failed with a ‘Fatal error’, which aborted the entire startup process for the instance with the message “mysql.user table is damaged”. The log also suggests a fix, running mysql_upgrade, but since the instance failed to start it was not possible to execute that command.

Here is what happened when I tried to execute mysql_upgrade:

bash-4.1$ mysql_upgrade
mysql_upgrade: Got error: 2002: Can’t connect to local MySQL server through socket ‘/var/lib/mysql/mysql.sock’ (2) while connecting to the MySQL server
Upgrade process encountered error and will not continue.

*******SOLUTION*********
To get out of this chicken-and-egg situation, I started the server with the skip-grant-tables option.
This can be done by adding the ‘skip-grant-tables’ line to my.cnf (the configuration file) within the [mysqld] section.

bash-4.1$ su -
Password:
[root@dixitlab ~]#
[root@dixitlab ~]# vi /etc/my.cnf

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
innodb_data_file_path = ibdata1:10M:autoextend
skip-grant-tables
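
Since the option has to go in and then come out again once mysql_upgrade is done, the edit can also be scripted. The sketch below works on the text of my.cnf only; the function name set_skip_grant_tables is made up, and it assumes the option is spelled exactly skip-grant-tables on its own line.

```python
def set_skip_grant_tables(cnf_text, enabled):
    """Return my.cnf text with skip-grant-tables added to or removed from [mysqld]."""
    # Drop any existing occurrence first, so the function is safe to call twice.
    lines = [line for line in cnf_text.splitlines()
             if line.strip() != "skip-grant-tables"]
    if enabled:
        for index, line in enumerate(lines):
            if line.strip().lower() == "[mysqld]":
                lines.insert(index + 1, "skip-grant-tables")
                break
        else:
            raise ValueError("no [mysqld] section found")
    return "\n".join(lines) + "\n"
```

Call it with enabled=True before the first start and with enabled=False before the final restart, writing the returned text back to the file each time.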

Now, let’s try to start the MySQL server.

[root@dixitlab ~]# service mysqld start
Starting mysqld: [ OK ]
[root@dixitlab ~]#

Boom! It worked. Now quickly try to run the mysql_upgrade step to fix the initial problem.

-bash-4.1$ mysql_upgrade
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
mysql.db OK
mysql.engine_cost OK
mysql.event OK
mysql.func OK
mysql.general_log OK
mysql.gtid_executed OK
mysql.help_category OK
mysql.help_keyword OK
mysql.help_relation OK
mysql.help_topic OK
mysql.host OK
mysql.innodb_index_stats OK
mysql.innodb_table_stats OK
mysql.ndb_binlog_index OK
mysql.plugin OK
mysql.proc OK
mysql.procs_priv OK
mysql.proxies_priv OK
mysql.server_cost OK
mysql.servers OK
mysql.slave_master_info OK
mysql.slave_relay_log_info OK
mysql.slave_worker_info OK
mysql.slow_log OK
mysql.tables_priv OK
mysql.time_zone OK
mysql.time_zone_leap_second OK
mysql.time_zone_name OK
mysql.time_zone_transition OK
mysql.time_zone_transition_type OK
mysql.user OK
Upgrading the sys schema.
Checking databases.
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed.
-bash-4.1$
-bash-4.1$

Now that it is done, let’s revert the change we made to the configuration file: remove the skip-grant-tables entry from my.cnf and restart the mysqld service.

[root@dixitlab ~]# vi /etc/my.cnf
[root@dixitlab ~]#
[root@dixitlab ~]#
[root@dixitlab ~]# service sqld restart
sqld: unrecognized service
[root@dixitlab ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@dixitlab ~]#

Let’s try to connect to the database now.

bash-4.1$
bash-4.1$ mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.20 MySQL Community Server (GPL)

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the current input statement.

mysql>

Hope This Helps
Prashant Dixit

Posted in Advanced | Tagged: mysql | 1 Comment »

Why my ASM Command Line (ASMCMD) is so slow, How to make ASMCMD run faster ?

Posted by FatDBA on November 1, 2017

ASMCMD is a command-line utility that you can use to easily view and manipulate files and directories within Automatic Storage Management (ASM) disk groups. It can list the contents of disk groups, perform searches, create and remove directories and aliases, display space utilization, and more.

But at times I have noticed errors or slowness in command execution with ASMCMD, and I believe you have faced the same in the past. The problem with ASMCMD errors is that they are not very detailed and are rather obscure, which makes troubleshooting more complicated and directionless.

Given below are a few of the methods that I follow to handle performance issues with the asmcmd command line.

1. Use ORADEBUG
What happens when you connect with ASMCMD ?
It connects to the ASM instance with the SYSASM privilege, and at the same moment a local background process with BEQ in its name is spawned.
Once you identify that process using the ps -ef command, you can attach ORADEBUG to it and dump its errorstack.

2. Truss or STRACE of ASMCMD and its processes.

example:

$ strace -aeft -o /dixit/labtest/asmcmdtrbsst.log asmcmd
ASMCMD>

3. Set DBI_TRACE for ASMCMD Perl tracing
Asmcmd is a shell-script wrapper around the asmcmdcore script, which is a Perl program. If you are a Perl programmer, you can easily extend this script to add additional commands and security checks. We can use the DBI_TRACE environment variable to collect more diagnostic information from the ASM command line.

$ export DBI_TRACE=1
$ asmcmd
ASMCMD>

Hope That Helps
Prashant Dixit

Posted in Advanced, troubleshooting | Tagged: ASM, troubleshooting | Leave a Comment »

CKPT process blocking table gather stats session intermittently … Why ?

Posted by FatDBA on November 1, 2017

Hi Folks,
Today I would like to share an experience from a customer's production system, where we hit a weird situation: the gather stats session was intermittently blocked by the CKPT background process, sometimes staying stuck for more than 30 minutes.

We were seeing the “enq: RO – fast object reuse” wait when gathering schema/table statistics in parallel using the DBMS_STATS package with DEGREE>1

During the analysis I generated a System State dump and saw a clear blocking situation on the resource Enq RO-00010059-00000001 .

Snippet from SS Dump.

Resource Holder State
Enq RO-00010059-00000001 14: waiting for ‘rdbms ipc message’
Enq RO-00010059-00000001 89: 89: is waiting for 14: 89:

The workaround for the problem is either of these two solutions:
– We can try flushing the buffer cache.
Note, though, that flushing the buffer cache causes dirty blocks to be written to disk and will have some performance impact.
– Set the parameter “_db_fast_obj_truncate” to FALSE.
This reverts to the 9i way of invalidating buffers in the buffer cache.

Hope That Helps
Prashant Dixit

Posted in Advanced, troubleshooting | Tagged: performance, troubleshooting | Leave a Comment »

What’s new in coming few days here on FATDBA ….

Posted by FatDBA on October 24, 2017

Coming up: some of the performance troubleshooting that I have done using a few specialized tools/scripts – SYSTEMTAP, CALIPER, STRACE, PTRACE, a few profilers, representation using PGraph and FlameGraphs, and some beautifully written scripts/tools from legends like Brendan Gregg (www.brendangregg.com/), Luca Canali (cern.ch/canali) etc.

Posted in Advanced | Tagged: performance | Leave a Comment »

DB Sick/Hung/Slow :( …. How to troubleshoot using ‘Real Time ADDM’ ??

Posted by FatDBA on October 24, 2017

The new emergency monitoring feature “Real Time ADDM”, or RT ADDM (under the Performance menu of the EM console in Enterprise Manager Cloud Control), allows DBAs to connect to a non-responsive or hung/stalled database. Now someone might point out that this was possible earlier using the ‘-prelim’ option, which allows the execution of ORADEBUG commands, but if you remember it was not that flexible and allowed only a limited set of commands like hanganalyze, ashdump etc.

This new 12c feature goes one step further: it connects directly to the SGA (using a lightweight, lock- and latch-free connection), runs hang analysis and reads the Active Session History data, and also helps to view blockers and other in-memory performance statistics.

So this is quite helpful at times when we have …
– Sick Systems
– DB is very slow
– DB Hung due to any contention for resources etc.
– DBAs are unable to login to Database.

So before we reach the point of bouncing the database, this option lets us take a look inside it to understand what the system is doing.

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: performance | Leave a Comment »

12c all new Parallel Upgrade utility.

Posted by FatDBA on October 24, 2017

With Oracle Database 12c comes the all new Parallel Upgrade Utility, catctl.pl. This utility replaces the catupgrd.sql script that was used in earlier releases.
Although you can still use the catupgrd.sql script, it is deprecated starting with Oracle Database 12c and will be removed in future releases.
Oracle recommends performing database upgrades with the new Parallel Upgrade Utility, catctl.pl.

If you choose to run the catupgrd.sql script instead of running catctl.pl, doing so now requires an additional input argument as follows:

SQL> @catupgrd.sql PARALLEL=NO

If you run catupgrd.sql without the parameter, then Oracle displays the following error message:

NOTE:

The catupgrd.sql script is being deprecated in the 12.1 release of Oracle Database. Customers are encouraged to use catctl.pl as the replacement for catupgrd.sql when upgrading the database dictionary.

cd $ORACLE_HOME/rdbms/admin
$ORACLE_HOME/perl/bin/perl catctl.pl -n 4 catupgrd.sql

Refer to the Oracle Database Upgrade Guide for more information.

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: 12c | Leave a Comment »

Are the Cardinality Estimates Correct in my Execution Plan ?

Posted by FatDBA on September 26, 2017

Stuck in a difficult performance issue related to a SQL statement, you have to verify whether the cardinality estimates made by the MIGHTY CBO are correct, and you have no idea how to do that 😦 😦

Let’s make things a little easier for ourselves!
Let me take an example and explain how to do this.

SQL Statement (From my Personal Test Environment):
SELECT COUNT (DISTINCT SB_NO) FROM OPS$EXP.C_AL_SB WHERE SB_NO IN (SELECT DISTINCT SB_NO FROM OPS$EXP.C_AL_AWB WHERE EGM_DT BETWEEN :1 AND :1 ) AND ERR_MESG ='S'

Below is the execution plan for the SQL (let’s forget about the behemoth elapsed time, Cost and Rows Processed in the plan for a minute 🙂 ) ….

So the above plan doesn’t show the estimates or cardinality details the optimizer considered while creating the plan. But starting from 10g we have the GATHER_PLAN_STATISTICS hint, which tells Oracle to collect execution statistics for a SQL statement.

These execution statistics are then shown next to the original Optimizer estimates in the execution plan if you use the function DBMS_XPLAN.DISPLAY_CURSOR to display the plan. You also have to set the FORMAT parameter to 'ALLSTATS LAST' (DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST')).

SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT (DISTINCT SB_NO) FROM OPS$EXP.C_AL_SB WHERE SB_NO IN (SELECT DISTINCT SB_NO FROM OPS$EXP.C_AL_AWB WHERE EGM_DT BETWEEN :1 AND :1 ) AND ERR_MESG ='S';

The execution plan for the query is as follows:

The original Optimizer estimates are shown in the E-Rows column while the actual statistics gathered during execution are shown in the A-Rows column.
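
Once you have the Starts, E-Rows and A-Rows columns in front of you, spotting the bad estimates is simple arithmetic. The sketch below shows one way to do it, with a made-up function name and an arbitrary threshold of 10x. It assumes you have copied the columns out of the plan by hand as (operation id, Starts, E-Rows, A-Rows) tuples. E-Rows is an estimate per execution of the operation while A-Rows is the total over all Starts, so the estimate is multiplied by Starts before comparing.

```python
def misestimates(plan_rows, threshold=10.0):
    """Flag plan operations whose estimated and actual row counts diverge.

    plan_rows: iterable of (op_id, starts, e_rows, a_rows).
    Returns (op_id, estimated_total, a_rows, ratio) for every operation where
    the two figures differ by at least `threshold` times in either direction.
    """
    flagged = []
    for op_id, starts, e_rows, a_rows in plan_rows:
        estimated = max(e_rows * starts, 1)  # avoid dividing by zero
        actual = max(a_rows, 1)
        ratio = max(estimated / actual, actual / estimated)
        if ratio >= threshold:
            flagged.append((op_id, estimated, a_rows, ratio))
    return flagged
```

For instance, with the invented rows [(2, 1, 100, 50000), (3, 50, 10, 480)], only operation 2 is flagged (off by 500x), while operation 3 is fine because 50 starts of 10 estimated rows is close to the 480 actual rows.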

Posted in Advanced | Tagged: Tuning | 2 Comments »

How to tune the IO contentions related with the Compaction in Cassandra ?

Posted by FatDBA on August 20, 2017

Hi Fellas,
Back, and this time with some performance tuning scope for Cassandra during the ‘Compaction’ process.
Before I proceed, I would like to explain a bit about compaction in Cassandra: what exactly it is and why it is a necessary evil …

Compaction in Cassandra refers to the operation of merging multiple SSTables into a single new one. Typically, compaction is done in a database for two primary reasons:

– To reduce the storage usage.
– To improve read performance by merging keys and obtaining a consolidated index.

For example, in Apache Cassandra, data files are merged periodically to form compacted SSTables.

There is a good chance of contention in the database due to compaction activity, because compaction increases I/O contention on SSTable data reads. Writing data to Cassandra is generally fast and the impact on writes may not be visible, but reads from SSTables become slow when I/O contention increases due to compaction, and that degrades the performance of the database.

First I would like to discuss how to identify compaction-related contention on the database.
– We can use “nodetool tablestats”, or the old “nodetool cfstats” command, to keep watch on the SSTables.
Below is a sample result from one of the Cassandra database servers. Here we need to check:
– Whether the SSTable count keeps growing, because that indicates there may be contention between SSTable reads and the compaction process.
– Reads generally slow down for the obvious reason that data is distributed or fragmented across many SSTables while compaction runs continuously in the background.

%nodetool tablestats -H dixit.playlist
Keyspace: dixit
Read Count: 182849
Read Latency: 0.11363755339104945 ms.
Write Count: 435355
Write Latency: 0.01956930550929701 ms.
Pending Flushes: 0
Table: standard1
SSTable count: 2
Space used (live): 51.62 MB
Space used (total): 51.62 MB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 302.36 KB
SSTable Compression Ratio: 0.0
Number of keys (estimate): 376390
Memtable cell count: 200120
Memtable data size: 45.16 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 2
Local read count: 182849
Local read latency: 0.125 ms
Local write count: 435355
Local write latency: 0.022 ms
Pending flushes: 0
Bloom filter false positives: 11
Bloom filter false ratio: 0.00000
Bloom filter space used: 265.81 KB
Bloom filter off heap memory used: 265.8 KB
Index summary off heap memory used: 36.57 KB
Compression metadata off heap memory used: 0 bytes
Compacted partition minimum bytes: 216 bytes
Compacted partition maximum bytes: 258 bytes
Compacted partition mean bytes: 258 bytes
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
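
To see whether the SSTable count keeps growing, it helps to capture the number at regular intervals instead of eyeballing the full report each time. Here is a small sketch that pulls the count per table out of the text printed by nodetool tablestats. The function name sstable_counts is made up, and it assumes the "Table:" and "SSTable count:" labels shown in the sample above; other Cassandra versions may label the lines differently.

```python
def sstable_counts(tablestats_output):
    """Map table name -> SSTable count from 'nodetool tablestats' text."""
    counts = {}
    table = None
    for raw in tablestats_output.splitlines():
        line = raw.strip()
        if line.startswith("Table:"):
            table = line.split(":", 1)[1].strip()
        elif line.startswith("SSTable count:") and table is not None:
            counts[table] = int(line.split(":", 1)[1])
    return counts
```

Running it against the sample output gives {'standard1': 2}; storing that with a timestamp on every run is enough to plot the trend.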

Below is the command that can be used to check for compaction statistics, here you need to look at the ‘pending tasks’, and ‘bytes total in progress’.

$ nodetool compactionstats
pending tasks: 5
compaction type keyspace table completed total unit progress
Compaction Keyspace1 Standard1 282310680 302170540 bytes 93.43%
Compaction Keyspace1 Standard1 58457931 307520780 bytes 19.01%
Active compaction remaining time : 0h00m16s
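
The same idea works for the compaction backlog. The sketch below reads the pending task count and adds up the bytes still to be compacted across the active compactions. The name parse_compactionstats is made up, and the column positions are taken from the sample output above, so treat the layout as an assumption; it differs between Cassandra versions.

```python
def parse_compactionstats(output):
    """Return (pending_tasks, remaining_bytes) from 'nodetool compactionstats' text."""
    pending = None
    remaining = 0
    for raw in output.splitlines():
        parts = raw.split()
        if raw.strip().startswith("pending tasks:"):
            pending = int(parts[-1])
        elif len(parts) >= 7 and parts[-2] == "bytes":
            # row layout: type keyspace table completed total unit progress
            completed, total = int(parts[-4]), int(parts[-3])
            remaining += total - completed
    return pending, remaining
```

On the sample above this yields 5 pending tasks and 268922709 bytes left to compact.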

Solution to the problem
1. First one is quite simple – Avoid merging of update/delete requests.
2. Reduce how often in-memory objects (memtables) are flushed.

This can be done by increasing the size of the memtables, so that the database does not flush as frequently.
– Fewer flushes lead to fewer SSTables to compact.
– Less compaction reduces I/O contention, and this in turn improves reads.
– There are a couple of parameters that you can adjust in your cassandra.yaml file to control flushing,
i.e. memtable_flush_after_mins, memtable_throughput_in_mb, memtable_operations_in_millions (these names come from older Cassandra releases; check the cassandra.yaml shipped with your version for the current memtable settings).

3. One more solution, which only applies to systems where the I/O stress is not very frequent: we can reduce
the compaction “thread priority”, which reduces its I/O.
Lowering the priority slows down the compaction writes, so this only helps if the contention doesn’t happen frequently.

Add the line below to the cassandra-env.sh file (under the conf folder) to lower the compaction priority.

JVM_OPTS="$JVM_OPTS -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1"

One last line I would like to add.
If IO is a genuine bottleneck, you will need to add more nodes or replace the disks with better performing, higher-IO ones.

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: cassandra, performance, Tuning | Leave a Comment »
