Tales From A Lazy Fat DBA

Opatch Failed error code 73: OUI-67073: UtilSession failed: Prerequisite check “CheckActiveFilesAndExecutables”

Posted by FatDBA on March 2, 2015

Issue:
Upgrade error from 11.2.0.2 to 11.2.0.4

Error Description:
Oracle SPU / CPU patch deployment using OPatch failed with the following error message.
Following executables are active :
/u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1
UtilSession failed: Prerequisite check “CheckActiveFilesAndExecutables” failed.
Log file location: /u01/app/oracle/product/11.2.0.2/home/cfgtoollogs/opatch/opatch2014-9-14_12-10-00PM.log

OPatch failed with error code 73

Cause:
Some files in the Oracle home are locked, or some processes are still running from it, while the patch is being applied. These must be stopped before patching.

Full Error in log:

[Mar 1, 2015 4:19:20 PM] Finish fuser command /sbin/fuser /u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1 at Fri Nov 22 14:10:20 CET 2014
[Mar 1, 2015 4:19:20 PM] Following executables are active:
/u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1
[Mar 1, 2015 4:19:20 PM] Prerequisite check “CheckActiveFilesAndExecutables” failed.
The details are:
Following executables are active:
/u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1
[Mar 1, 2015 4:19:20 PM] OUI-67073:UtilSession failed: Prerequisite check “CheckActiveFilesAndExecutables” failed.
[Mar 1, 2015 4:19:20 PM] Finishing UtilSession at Fri Nov 22 14:10:20 CET 2014

Solution Description
==================================
This error is simple to fix. First, make sure the database and the listener are down.

Solution 1:
Some processes are still running. To find them, try
ps -ef|grep db_name
then kill each remaining process by its PID, for example kill -9 1196

Solution 2:
Check which process is locking the library file using the command below, then kill it:

$ /sbin/fuser /u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1
/u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1: 1196m 2215m

$ kill -9 1196

Now run opatch apply again. With no process holding the library, the “CheckActiveFilesAndExecutables” check should pass this time.
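
As a rough sketch of Solution 2, the PIDs can be pulled out of the fuser line and inspected before anything is killed. The helper names below (pids_from_fuser_line, show_holders) are made up for this illustration and are not part of OPatch or fuser; the sketch assumes the Linux fuser output format shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Turn a fuser-style line such as
#   /path/to/libclntsh.so.11.1: 1196m 2215m
# into a plain list of PIDs: "1196 2215".
pids_from_fuser_line() {
    sed 's/^[^:]*://' |                               # drop the "file:" prefix, if present
        tr -s ' \t' '\n' |                            # one token per line
        sed -n 's/^\([0-9][0-9]*\)[a-zA-Z]*$/\1/p' |  # strip access letters (m, e, c ...)
        tr '\n' ' ' |                                 # join the PIDs back onto one line
        sed 's/ $//'                                  # trim the trailing space
}

# Show the owner and command line of every process that holds the file,
# so that you know what you are about to stop. Nothing is killed here.
show_holders() {
    lib=$1
    for pid in $(fuser "$lib" 2>/dev/null | pids_from_fuser_line); do
        ps -o pid=,user=,args= -p "$pid"
    done
}
```

For example, show_holders /u01/app/oracle/product/11.2.0.2/home/lib/libclntsh.so.11.1 lists the holders. A plain kill (SIGTERM) on each PID is the gentler first attempt; kill -9 is the fallback if a process does not go away.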

Hope That Helps
Prashant Dixit


Using _ALLOW_RESETLOGS_CORRUPTION in case of corruption: How to recover & open the database ?

Posted by FatDBA on March 2, 2015

Recently, while opening a TEST RAC database, we found the database down and unavailable. We tried to start it but received a communication error that left us clueless. Because an urgent POC activity was scheduled on the database, we started investigating the root cause of the error right away.
ORA-03113: end-of-file on communication channel

[oracle@testdbdixit ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Feb 26 01:04:23 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to an idle instance.
SQL> startup
ORA-03113: end-of-file on communication channel

We then tried to bring the database up in MOUNT mode, and it reached that phase without any error.

SQL> STARTUP MOUNT;
ORACLE instance started.

Total System Global Area 1.6034E+10 bytes
Fixed Size 2269072 bytes
Variable Size 2449473648 bytes
Database Buffers 1.3556E+10 bytes
Redo Buffers 26480640 bytes
Database mounted.

But recovery of any kind failed because the database was running in NOARCHIVELOG mode and, being a test instance, it had no RMAN backups configured either … #TotalDisaster 😦 😦

SQL> ALTER DATABASE RECOVER DATABASE UNTIL CANCEL;
ALTER DATABASE RECOVER DATABASE UNTIL CANCEL
*
ERROR at line 1:
ORA-00279: change 7311130 generated at 02/25/2015 22:00:18 needed for thread 2
ORA-00289: suggestion : +FRA
ORA-15173: entry ‘ARCHIVELOG’ does not exist in directory ‘DIXITDB’
ORA-00280: change 7311130 for thread 2 is in sequence #207

SQL> archive log list
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 233
Current log sequence 234

When we cancelled the cancel-based incomplete recovery, the RECOVER itself succeeded, but Oracle warned that opening the database in RESETLOGS mode would fail because the SYSTEM datafile was still inconsistent.

SQL> ALTER DATABASE RECOVER CANCEL;
ALTER DATABASE RECOVER CANCEL
*
ERROR at line 1:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: ‘+DATA/DIXITDB/datafile/system.256.871197447’

As expected, opening the database with RESETLOGS failed too.

SQL> ALTER DATABASE OPEN RESETLOGS;
ALTER DATABASE OPEN RESETLOGS
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: ‘+DATA/DIXITDB/datafile/system.256.871197447’

Resolution:
====================
*Note: Underscore (hidden/undocumented) parameters should only be used with the consent of Oracle Support, and should always be tried and tested in a sandbox environment before being applied in production.

There is a hidden parameter, _ALLOW_RESETLOGS_CORRUPTION=TRUE, which allows us to open the database even though it has not been properly recovered.
ALTER SYSTEM SET "_allow_resetlogs_corruption"= TRUE SCOPE = SPFILE;
Tip: Also change undo_management to "Manual".

After the two changes in the spfile you can open the database with:

sqlplus "/ as sysdba"
startup force

Note: There is no 100% guarantee that setting _ALLOW_RESETLOGS_CORRUPTION=TRUE will open the database. If it does open, we must rebuild the database immediately. A rebuild means: (1) perform a full-database export, (2) create a brand-new, separate database, and (3) import the export dump into it. This can be tedious and time-consuming, but once the new database is open we expect minimal or perhaps no data loss at all. Before you try this option, ensure that you have a good and valid backup of the current database.

Previous Settings:
SQL> show parameter undo

NAME TYPE VALUE
———————————— ———– ——————————
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1

SQL> alter system set undo_management=manual scope=spfile;
System altered.

SQL> ALTER SYSTEM SET "_allow_resetlogs_corruption"= TRUE SCOPE = SPFILE;
System altered.

SQL> shut immediate;

And after setting all the requisite parameters to defined values, we finally saw that ‘Database Opened’ message on the SQL prompt … 🙂 🙂

SQL> startup force;
ORACLE instance started.

Total System Global Area 1.6034E+10 bytes
Fixed Size 2269072 bytes
Variable Size 2449473648 bytes
Database Buffers 1.3556E+10 bytes
Redo Buffers 26480640 bytes
Database mounted.
Database opened.

SQL> ALTER DATABASE OPEN RESETLOGS;
Database altered.

SQL> archive log list
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 1
Current log sequence 2

SQL> alter system switch logfile;
System altered.

Opening the database in RESETLOGS mode created a new incarnation of the database.

SQL> select INCARNATION#,RESETLOGS_TIME,STATUS, RESETLOGS_ID from v$database_incarnation;

INCARNATION# RESETLOGS STATUS RESETLOGS_ID
———— ——— ——- ————
1 24-AUG-13 PARENT 824297850
2 09-FEB-15 PARENT 871197521
3 26-FEB-15 CURRENT 872646322

In Short:
=====================
1) Set _ALLOW_RESETLOGS_CORRUPTION=TRUE in the init.ora file (or spfile).
2) Startup mount.
3) Recover the database.
4) Alter database open resetlogs.
5) Set undo_management to "manual" in the init.ora file.
6) Start up the database.
7) Create a new undo tablespace (production databases must go back to UNDO_MANAGEMENT=AUTO).
8) Change undo_management to "AUTO" and undo_tablespace to "NewTablespace".
9) Bounce the database.
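
The last three steps can be sketched as a small generator that only prints the SQL for review; it does not connect to any database. The function name, the tablespace name UNDOTBS2, the +DATA destination and the 2G size are assumptions made for illustration. This is a single-instance sketch: on RAC, each instance needs its own undo tablespace, set with a SID clause.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the SQL for moving back to
# automatic undo once the database has been opened.
# $1 = new undo tablespace name, $2 = datafile destination, $3 = size
# All three defaults are illustrative assumptions.
print_undo_switch_sql() {
    ts=${1:-UNDOTBS2}
    dest=${2:-+DATA}
    size=${3:-2G}
    cat <<EOF
-- 7) create a fresh undo tablespace
CREATE UNDO TABLESPACE ${ts} DATAFILE '${dest}' SIZE ${size};
-- 8) point the instance at it and return to automatic undo
ALTER SYSTEM SET undo_tablespace='${ts}' SCOPE=SPFILE;
ALTER SYSTEM SET undo_management=AUTO SCOPE=SPFILE;
-- 9) bounce the database so the spfile changes take effect
SHUTDOWN IMMEDIATE
STARTUP
EOF
}
```

Running print_undo_switch_sql > undo_switch.sql (the file name is arbitrary) leaves a script that can be read through and adjusted before it is run in SQL*Plus as sysdba.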

Hope That Helps!
Prashant Dixit


How to monitor/troubleshoot RAC Stack … Answer is “ORAchk”

Posted by FatDBA on March 2, 2015

Below are the steps to run the ORAchk tool, followed by a sample health check report it generated for a 2-node RAC system.
link to download: https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=kfoe5ynno_4&_afrLoop=178033262862018
Note ID: 1268927.2

=======================================================================

[oracle@dixitdb12v dixit]$ ./orachk
This version of orachk was released on 09-Oct-2014 and its older than 120 days. No new version of orachk is available in RAT_UPGRADE_LOC. It is highly recommended that you download the latest version of orachk from my oracle support to ensure the highest level of accuracy of the data contained within the report.

Do you want to continue running this version? [y/n][y]y

CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /opt/app/grid/11.2.0/grid?[y/n][y]y

Checking ssh user equivalency settings on all nodes in cluster

Node dixitdb13v is configured for ssh user equivalency for oracle user

Searching for running databases . . . . .

. .
List of running databases registered in OCR
1. TESTRAC
2. None of above

Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1].1
. .

Checking Status of Oracle Software Stack – Clusterware, ASM, RDBMS

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
——————————————————————————————————-
Oracle Stack Status
——————————————————————————————————-
Host Name CRS Installed RDBMS Installed CRS UP ASM UP RDBMS UP DB Instance Name
——————————————————————————————————-
dixitdb12v Yes Yes Yes Yes Yes TESTRAC1
dixitdb13v Yes Yes Yes Yes Yes TESTRAC2
——————————————————————————————————-

Copying plug-ins

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . .

18 of the included audit checks require root privileged data collection . If sudo is not configured or the root password is not available, audit checks which require root privileged data collection can be skipped.

1. Enter 1 if you will enter root password for each host when prompted

2. Enter 2 if you have sudo configured for oracle user to execute root_orachk.sh script

3. Enter 3 to skip the root privileged collections

4. Enter 4 to exit and work with the SA to configure sudo or to arrange for root access and run the tool later.

Please indicate your selection from one of the above options for root access[1-4][1]:- 1

*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***

Collections and audit checks log file is
/home/oracle/dixit/orachk_dixitdb12v_TESTRAC_022715_032812/log/orachk.log

Running orachk in serial mode because expect(/usr/bin/expect) is not available to supply root passwords on remote nodes

NOTICE: Installing the expect utility (/usr/bin/expect) will allow orachk to gather root passwords at the beginning of the process and execute orachk on all nodes in parallel speeding up the entire process. For more info – http://www.nist.gov/el/msid/expect.cfm. Expect is available for all major platforms. See User Guide for more details.

Checking for prompts in /home/oracle/.bash_profile on dixitdb12v for oracle user…

Checking for prompts in /home/oracle/.bash_profile on dixitdb13v for oracle user…

=============================================================
Node name – dixitdb12v
=============================================================

Collecting – ASM Disk Groups
Collecting – ASM Disk I/O stats
Collecting – ASM Diskgroup Attributes
Collecting – ASM disk partnership imbalance
Collecting – ASM diskgroup attributes
Collecting – ASM diskgroup usable free space
Collecting – ASM initialization parameters
Collecting – Active sessions load balance for TESTRAC database
Collecting – Archived Destination Status for TESTRAC database
Collecting – Cluster Interconnect Config for TESTRAC database
Collecting – Database Archive Destinations for TESTRAC database
Collecting – Database Files for TESTRAC database
Collecting – Database Instance Settings for TESTRAC database
Collecting – Database Parameters for TESTRAC database
Collecting – Database Parameters for TESTRAC database
Collecting – Database Properties for TESTRAC database
Collecting – Database Registry for TESTRAC database
Collecting – Database Sequences for TESTRAC database
Collecting – Database Undocumented Parameters for TESTRAC database
Collecting – Database Undocumented Parameters for TESTRAC database
Collecting – Database Workload Services for TESTRAC database
Collecting – Dataguard Status for TESTRAC database
Collecting – Files not opened by ASM
Collecting – Log Sequence Numbers for TESTRAC database
Collecting – Percentage of asm disk Imbalance
Collecting – Process for shipping Redo to standby for TESTRAC database
Collecting – RDBMS Feature Usage for TESTRAC database
Collecting – Redo Log information for TESTRAC database
Collecting – Standby redo log creation status before switchover for TESTRAC database
Collecting – /proc/cmdline
Collecting – /proc/modules
Collecting – CPU Information
Collecting – CRS active version
Collecting – CRS oifcfg
Collecting – CRS software version
Collecting – CSS Reboot time
Collecting – CSS disktimout
Collecting – Cluster interconnect (clusterware)
Collecting – Clusterware OCR healthcheck
Collecting – Clusterware Resource Status
Collecting – DiskFree Information
Collecting – DiskMount Information
Collecting – Huge pages configuration
Collecting – Interconnect network card speed
Collecting – Kernel parameters
Collecting – Maximum number of semaphore sets on system
Collecting – Maximum number of semaphores on system
Collecting – Maximum number of semaphores per semaphore set
Collecting – Memory Information
Collecting – NUMA Configuration
Collecting – Network Interface Configuration
Collecting – Network Performance
Collecting – Network Service Switch
Collecting – OS Packages
Collecting – OS version
Collecting – Operating system release information and kernel version
Collecting – Oracle Executable Attributes
Collecting – Patches for Grid Infrastructure
Collecting – Patches for RDBMS Home
Collecting – Shared memory segments
Collecting – Table of file system defaults
Collecting – Voting disks (clusterware)
Collecting – number of semaphore operations per semop system call
Preparing to run root privileged commands dixitdb12v. Please enter root password when prompted.
root@dixitdb12v’s password:
Collecting – ACFS Volumes status
Collecting – Broadcast Requirements for Networks
Collecting – CRS user time zone check
Collecting – Custom rc init scripts (rc.local)
Collecting – Disk Information
Collecting – Grid Infastructure user shell limits configuration
Collecting – Interconnect interface config
Collecting – Network interface stats
Collecting – Number of RDBMS LMS running in real time
Collecting – OLR Integrity
Collecting – Root user limits
Collecting – Verify no database server kernel out of memory errors
Collecting – root time zone check
Collecting – slabinfo

Data collections completed. Checking best practices on dixitdb12v.
————————————————————————————–

WARNING => Cluster Health Monitor (CHM) repository does not provide recommended level of retention
INFO => Important Automatic Storage Management (ASM) Notes and Technical White Papers
FAIL => Bash is vulnerable to code injection (CVE-2014-6271)
WARNING => ARCHIVELOG mode is disabled for TESTRAC
INFO => $CRS_HOME/log/hostname/client directory has too many older log files.
INFO => ORA-00600 errors found in alert log for TESTRAC
INFO => At some times checkpoints are not being completed for TESTRAC
INFO => Some data or temp files are not autoextensible for TESTRAC
INFO => oracleasm (asmlib) module is NOT loaded
WARNING => Shell limit soft nproc for DB is NOT configured according to recommendation
WARNING => kernel.shmmax parameter is NOT configured according to recommendation
WARNING => Database Parameter memory_target is not set to the recommended value on TESTRAC1 instance
FAIL => Operating system hugepages count does not satisfy total SGA requirements
WARNING => NIC bonding is not configured for interconnect
WARNING => NIC bonding is NOT configured for public network (VIP)
WARNING => OSWatcher is not running as is recommended.
INFO => Jumbo frames (MTU >= 8192) are not configured for interconnect
WARNING => NTP is not running with correct setting
WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for TESTRAC
FAIL => Flashback on PRIMARY is not configured for TESTRAC
INFO => Operational Best Practices
INFO => Database Consolidation Best Practices
INFO => Computer failure prevention best practices
INFO => Data corruption prevention best practices
INFO => Logical corruption prevention best practices
INFO => Database/Cluster/Site failure prevention best practices
INFO => Client failover operational best practices
WARNING => fast_start_mttr_target should be greater than or equal to 300. on TESTRAC1 instance

INFO => Information about hanganalyze and systemstate dump
WARNING => Package unixODBC-2.2.14-11.el6-i686 is recommended but NOT installed
WARNING => Package unixODBC-devel-2.2.14-11.el6-i686 is recommended but NOT installed
FAIL => Table AUD$[FGA_LOG$] should use Automatic Segment Space Management for TESTRAC
INFO => Database failure prevention best practices
WARNING => Database Archivelog Mode should be set to ARCHIVELOG for TESTRAC
FAIL => Primary database is NOT protected with Data Guard (standby database) for real-time data protection and availability for TESTRAC
INFO => Parallel Execution Health-Checks and Diagnostics Reports for TESTRAC
WARNING => Package unixODBC-devel-2.2.14-11.el6-x86_64 is recommended but NOT installed
WARNING => Linux transparent huge pages are enabled
WARNING => vm.min_free_kbytes should be set as recommended.
INFO => Oracle recovery manager(rman) best practices
WARNING => RMAN controlfile autobackup should be set to ON for TESTRAC
INFO => Consider increasing the COREDUMPSIZE size
INFO => Consider investigating changes to the schema objects such as DDLs or new object creation for TESTRAC
INFO => Consider investigating the frequency of SGA resize operations and take corrective action for TESTRAC

Best Practice checking completed.Checking recommended patches on dixitdb12v.
———————————————————————————

Collecting patch inventory on CRS HOME /opt/app/grid/11.2.0/grid
Collecting patch inventory on ORACLE_HOME /opt/app/oracle/product/11.2.0/dbhome_1
———————————————————————————
1 Recommended CRS patches for 112040 from /opt/app/grid/11.2.0/grid on dixitdb12v
———————————————————————————
Patch# CRS ASM RDBMS RDBMS_HOME Patch-Description
———————————————————————————
19769489 no yes /opt/app/oracle/product/11.2.0/dbhome_1Patch description: “Database Patch Set Update : 11.2.0.4.5 (19769489)”
———————————————————————————

———————————————————————————
1 Recommended RDBMS patches for 112040 from /opt/app/oracle/product/11.2.0/dbhome_1 on dixitdb12v
———————————————————————————
Patch# RDBMS ASM type Patch-Description
———————————————————————————
19769489 yes merge Patch description: “Database Patch Set Update : 11.2.0.4.5 (19769489)”
———————————————————————————
———————————————————————————

———————————————————————————
Clusterware patches summary report
———————————————————————————
Total patches Applied on CRS Applied on RDBMS Applied on ASM
———————————————————————————
1 0 1 0
———————————————————————————

———————————————————————————
RDBMS homes patches summary report
———————————————————————————
Total patches Applied on RDBMS Applied on ASM ORACLE_HOME
———————————————————————————
1 1 0 /opt/app/oracle/product/11.2.0/dbhome_1
———————————————————————————

=============================================================
Node name – dixitdb13v
=============================================================

Collecting – /proc/cmdline
Collecting – /proc/modules
Collecting – CPU Information
Collecting – CRS active version
Collecting – CRS oifcfg
Collecting – CRS software version
Collecting – Cluster interconnect (clusterware)
Collecting – DiskFree Information
Collecting – DiskMount Information
Collecting – Huge pages configuration
Collecting – Interconnect network card speed
Collecting – Kernel parameters
Collecting – Maximum number of semaphore sets on system
Collecting – Maximum number of semaphores on system
Collecting – Maximum number of semaphores per semaphore set
Collecting – Memory Information
Collecting – NUMA Configuration
Collecting – Network Interface Configuration
Collecting – Network Performance
Collecting – Network Service Switch
Collecting – OS Packages
Collecting – OS version
Collecting – Operating system release information and kernel version
Collecting – Oracle Executable Attributes
Collecting – Patches for Grid Infrastructure
Collecting – Patches for RDBMS Home
Collecting – Shared memory segments
Collecting – Table of file system defaults
Collecting – number of semaphore operations per semop system call
Preparing to run root privileged commands dixitdb13v. Please enter root password when prompted.
root@dixitdb13v’s password:

Data collections completed. Checking best practices on dixitdb13v.
————————————————————————————–

FAIL => Bash is vulnerable to code injection (CVE-2014-6271)
INFO => $CRS_HOME/log/hostname/client directory has too many older log files.
INFO => ORA-00600 errors found in alert log for TESTRAC
INFO => At some times checkpoints are not being completed for TESTRAC
INFO => oracleasm (asmlib) module is NOT loaded
WARNING => Shell limit soft nproc for DB is NOT configured according to recommendation
WARNING => kernel.shmmax parameter is NOT configured according to recommendation
WARNING => Database Parameter memory_target is not set to the recommended value on TESTRAC2 instance
FAIL => Operating system hugepages count does not satisfy total SGA requirements
WARNING => NIC bonding is not configured for interconnect
WARNING => NIC bonding is NOT configured for public network (VIP)
WARNING => OSWatcher is not running as is recommended.
INFO => Jumbo frames (MTU >= 8192) are not configured for interconnect
WARNING => NTP is not running with correct setting
WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for TESTRAC
WARNING => fast_start_mttr_target should be greater than or equal to 300. on TESTRAC2 instance

INFO => IMPORTANT: Oracle Database Patch 17478514 PSU is NOT applied to RDBMS Home /opt/app/oracle/product/11.2.0/dbhome_1
WARNING => Package unixODBC-2.2.14-11.el6-i686 is recommended but NOT installed
WARNING => Package unixODBC-devel-2.2.14-11.el6-i686 is recommended but NOT installed
WARNING => Package unixODBC-devel-2.2.14-11.el6-x86_64 is recommended but NOT installed
WARNING => Linux transparent huge pages are enabled
WARNING => vm.min_free_kbytes should be set as recommended.
INFO => Consider increasing the COREDUMPSIZE size

Best Practice checking completed.Checking recommended patches on dixitdb13v.
———————————————————————————

Collecting patch inventory on CRS HOME /opt/app/grid/11.2.0/grid
Collecting patch inventory on ORACLE_HOME /opt/app/oracle/product/11.2.0/dbhome_1
———————————————————————————
1 Recommended CRS patches for 112040 from /opt/app/grid/11.2.0/grid on dixitdb13v
———————————————————————————
Patch# CRS ASM RDBMS RDBMS_HOME Patch-Description
———————————————————————————
18706472 no no /opt/app/oracle/product/11.2.0/dbhome_1GRID INFRASTRUCTURE SYSTEM PATCH 11.2.0.4.3
———————————————————————————

———————————————————————————
1 Recommended RDBMS patches for 112040 from /opt/app/oracle/product/11.2.0/dbhome_1 on dixitdb13v
———————————————————————————
Patch# RDBMS ASM type Patch-Description
———————————————————————————
18706472 no merge GRID INFRASTRUCTURE SYSTEM PATCH 11.2.0.4.3
———————————————————————————
———————————————————————————

———————————————————————————
Clusterware patches summary report
———————————————————————————
Total patches Applied on CRS Applied on RDBMS Applied on ASM
———————————————————————————
1 0 0 0
———————————————————————————

———————————————————————————
RDBMS homes patches summary report
———————————————————————————
Total patches Applied on RDBMS Applied on ASM ORACLE_HOME
———————————————————————————
1 0 0 /opt/app/oracle/product/11.2.0/dbhome_1
———————————————————————————

———————————————————————————
CLUSTERWIDE CHECKS
———————————————————————————
———————————————————————————

Detailed report (html) – /home/oracle/dixit/orachk_dixitdb12v_TESTRAC_022715_032812/orachk_dixitdb12v_TESTRAC_022715_032812.html

UPLOAD(if required) – /home/oracle/dixit/orachk_dixitdb12v_TESTRAC_022715_032812.zip
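
A run like this prints a lot of findings, so a quick tally by severity helps to decide where to look first. A minimal sketch, assuming the console output was saved to a text file; the function name is hypothetical and it relies only on the "FAIL =>", "WARNING =>" and "INFO =>" prefixes visible in the run above.

```shell
#!/bin/sh
# Hypothetical helper: count ORAchk findings by severity.
# Reads saved console output on stdin; lines that do not start with
# "FAIL =>", "WARNING =>" or "INFO =>" are ignored.
summarize_findings() {
    awk '
        $1 == "FAIL"    && $2 == "=>" { fail++ }
        $1 == "WARNING" && $2 == "=>" { warn++ }
        $1 == "INFO"    && $2 == "=>" { info++ }
        END { printf "FAIL=%d WARNING=%d INFO=%d\n", fail, warn, info }
    '
}
```

For example, summarize_findings < orachk_console.txt (a file name chosen here for illustration) prints one line with the three counts. The html report remains the place for the details of each finding.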

Thanks
Prashant Dixit


Online Patching & Bug Fixing — > : Bug 16342845 : EXCESSIVE CPU IN DBW PROCESSES FOR FAST OBJECT CHECKPOINTSDB

Posted by FatDBA on March 2, 2015

We have one of our most critical production systems running on Oracle 11.2.0.3 on top of RHEL.
Recently we started experiencing a bizarre issue where application timeouts happened intermittently, and during those periods I/O operations at the database level were extremely slow.

While investigating, we found that during the issue the database threw heavy and frequent waits on the ‘enq: KO – fast object checkpoint’ and ‘db file async I/O submit’ events, and every query in the database crawled during that period.
At the same time, all 8 available DBWR processes consumed high resources and waited on the ‘db file async I/O submit’ event. They appeared to take a long time to write blocks to disk, which delayed checkpointing and in turn slowed down the database, while all the other database sessions waited on ‘enq: KO – fast object checkpoint’.

From AWR
—————
foreground wait event
enq: KO – fast object checkpoint — 70% of the DB time.

background wait event
db file async I/O submit — 90% of bg time

Wait Event Histogram
———————-
db file async I/O submit — 100% >1s (% of waits)
enq: CR – block range reuse ckpt — 100% >1s (% of waits)
enq: KO – fast object checkpoint — 100% >1s (% of waits)
enq: RO – fast object reuse — 100% >1s (% of waits)

After weeks of probing and hours of analyzing system state dumps, hang analyze output and writer process traces, we concluded that we had been struck by Bug 16342845 : EXCESSIVE CPU IN DBW PROCESSES FOR FAST OBJECT CHECKPOINTSDB

Below are the steps performed to apply the recommended patch.
PRIMARY: disable log shipping
STANDBY: stop MRP

1. Disable log shipping on the Primary Database.
2. Stop the MRP process on the Physical Standby Database.
3. Apply the patch first on the Physical Standby (online).
4. Monitor performance on the Physical Standby database.
5. Apply the patch on the Logical Standby Database.
6. Monitor the Logical Standby database performance.
7. Apply the patch on the Primary Database.
8. Monitor the system performance.
9. Start log shipping on the Primary DB end.
10. Start the MRP process on the Standby database.

Step 1.
Disable log shipping on Primary DB.
SQL> show parameter LOG_ARCHIVE_DEST_

NAME TYPE VALUE
———————————— ——————————– ——————————
log_archive_dest_1 string location=”/vol5/oracle/dixit_db_
arc/dixitdb/arch”, valid_for=(A
LL_LOGFILES,ALL_ROLES)

log_archive_dest_2 string service=”dixitdb_al”, LGWR ASYNC
NOAFFIRM delay=0 optional com
pression=disable max_failure=0
max_connections=1 reopen=300
db_unique_name=”dixitdb_al” net_
timeout=30, valid_for=(all_log
files,primary_role)

log_archive_dest_3 string service=”dixitdb_r”, NOAFFIRM AS
YNC VALID_FOR=(ALL_LOGFILES,PR
IMARY_ROLE) DB_UNIQUE_NAME=one
tm_r

alter system set log_archive_dest_state_2=defer scope=both;

Step 2:
Stop MRP process on Physical Standby Database.

Before
PROCESS STATUS DELAY_MINS
——— ———— ———-
MRP0 WAIT_FOR_LOG 0

SQL> alter database recover managed standby database cancel;

After cancelling managed recovery, the MRP process stops automatically and no longer appears in the output.

PROCESS STATUS DELAY_MINS
——— ———— ———-

Step 3:
Opatch online apply – common to all environments.

Recommendations
1. Take binaries backup (ORACLE_HOME).
2. Take Full DB Backup.
3. Check Opatch Prerequisites in order to identify and resolve any patch conflicts.

-bash-3.2$ export PATH=$PATH:$ORACLE_HOME/OPatch

-bash-3.2$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./
Oracle Interim Patch Installer version 11.2.0.3.0
Copyright (c) 2012, Oracle Corporation. All rights reserved.

PREREQ session

Oracle Home : /opt/app/oracle/product/11gR2
Central Inventory : /opt/app/oracle/oraInventory
from : /opt/app/oracle/product/11gR2/oraInst.loc
OPatch version : 11.2.0.3.0
OUI version : 11.2.0.3.0
Log file location : /opt/app/oracle/product/11gR2/cfgtoollogs/opatch/opatch2015-03-01_05-34-42AM_1.log

Invoking prereq “checkconflictagainstohwithdetail”

Prereq “checkConflictAgainstOHWithDetail” passed.

-bash-3.2$ opatch apply online -connectString dixitdb_al:sys:xxxxxxx
Oracle Interim Patch Installer version 11.2.0.3.0
Copyright (c) 2012, Oracle Corporation. All rights reserved.

The patch should be applied/rolled back in ‘-all_nodes’ mode only.
Converting the RAC mode to ‘-all_nodes’ mode.
Applying interim patch ‘20505138’ to OH ‘/opt/app/oracle/product/11gR2’
Verifying environment and performing prerequisite checks…
All checks passed.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name:

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]: Y
Backing up files…

Patching component oracle.rdbms, 11.2.0.3.0…
Installing and enabling the online patch ‘bug20505138.pch’, on database ‘dixitdb_al’.

Verifying the update…
Patch 20505138 successfully applied
Log file location: /opt/app/oracle/product/11gR2/cfgtoollogs/opatch/20505138_Mar_01_2015_05_35_29/apply2015-03-01_05-35-29AM_1.log

OPatch succeeded.

From Alert log during the patch apply:

Sun Mar 01 06:17:13 2015
Patch file bug20505138.pch is out of sync with oracle binary; performing fixup
Patch file bug20505138.pch has been synced with oracle binary
Patch bug20505138.pch Installed – Update #1
Patch bug20505138.pch Enabled – Update #2
Sun Mar 01 06:17:15 2015
Online patch bug20505138.pch has been installed
Online patch bug20505138.pch has been enabled

-bash-3.2$ opatch lsinventory
Oracle Interim Patch Installer version 11.2.0.3.0
Copyright (c) 2012, Oracle Corporation. All rights reserved.

Lsinventory Output file location : /opt/app/oracle/product/11gR2/cfgtoollogs/opatch/lsinv/lsinventory2015-03-01_05-35-53AM.txt

——————————————————————————–
Installed Top-level Products (1):

Oracle Database 11g 11.2.0.3.0
There are 1 products installed in this Oracle Home.

Interim patches (5) :

Patch (online) 20505138: applied on Sun Mar 01 05:35:42 CET 2015
Unique Patch ID: 18568953
Created on 9 Feb 2015, 19:34:29 hrs PST8PDT
Bugs fixed:
16367081, 16463153, 16342845
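
When the same patch has to go onto the physical standby, the logical standby and the primary, it helps to confirm on each home that the inventory really lists it as an online patch. A minimal sketch, assuming the "Patch (online) <number>:" line format shown in the lsinventory output above; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given patch number
# is listed as an online patch in `opatch lsinventory` output read from stdin.
has_online_patch() {
    patch=$1
    grep -q "^Patch[[:space:]]*(online)[[:space:]]*${patch}:"
}
```

Typical use would be opatch lsinventory | has_online_patch 20505138 && echo "online patch present".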

‘Hope That Helps’
Thanks
Prashant Dixit


How to check/Identify archival gaps in Standby Environment.

Posted by FatDBA on March 2, 2015

ON PRIMARY DATABASE
=====================

SQL> SELECT THREAD# "Thread", SEQUENCE# "Last Sequence Generated"
  2  FROM V$ARCHIVED_LOG
  3  WHERE (THREAD#, FIRST_TIME) IN (SELECT THREAD#, MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)
  4  ORDER BY 1;

Thread Last Sequence Generated
———- ———————–
1                  105334
1                  105334
1                  105334

SQL> set time on
06:26:03 SQL>

06:28:04 SQL> alter system switch logfile;

System altered.

06:28:15 SQL> /

System altered.

06:28:17 SQL> /

System altered.

06:28:17 SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /vol5/oracle/dixit_db_arc/dixitdb/arch
Oldest online log sequence     105334
Next log sequence to archive   105338
Current log sequence           105338

ON STANDBY DATABASE
======================

SQL> select process, status, THREAD#,SEQUENCE#,DELAY_MINS from v$managed_standby;

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105298          0
ARCH      CLOSING               1     105299          0
ARCH      CLOSING               1     105326          0
ARCH      CLOSING               1     105327          0
ARCH      CLOSING               1     105328          0
ARCH      CLOSING               1     105329          0
ARCH      CLOSING               1     105330          0
ARCH      CLOSING               1     105334          0
ARCH      CLOSING               1     105306          0
ARCH      CLOSING               1     105307          0
ARCH      CLOSING               1     105308          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105309          0
ARCH      CLOSING               1     105310          0
ARCH      CLOSING               1     105311          0
ARCH      CLOSING               1     105312          0
ARCH      CLOSING               1     105285          0
ARCH      CLOSING               1     105313          0
ARCH      CLOSING               1     105314          0
ARCH      CLOSING               1     105315          0
ARCH      CLOSING               1     105316          0
ARCH      CLOSING               1     105317          0
ARCH      CLOSING               1     105318          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105319          0
ARCH      CLOSING               1     105320          0
ARCH      CLOSING               1     105321          0
ARCH      CLOSING               1     105322          0
ARCH      CLOSING               1     105323          0
ARCH      CLOSING               1     105324          0
ARCH      CONNECTED             0          0          0
ARCH      CLOSING               1     105325          0
RFS       IDLE                  1     105335          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
MRP0      WAIT_FOR_LOG          1     105335          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0

40 rows selected.

SQL> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
FROM
(SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH,
(SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL
WHERE
ARCH.THREAD# = APPL.THREAD#
ORDER BY 1;

    Thread Last Sequence Received Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
         1                 105334                105334          0

SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

no rows selected
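
The manual comparison above (last sequence generated on the primary versus last sequence applied on the standby) can be scripted once both values have been spooled to text. This is a minimal sketch: `thread_lag` and the two-column `thread# sequence#` file layout are assumptions made for illustration, not something the queries above produce directly.

```shell
#!/bin/sh
# thread_lag PRIMARY_FILE STANDBY_FILE
# Each file holds "thread# sequence#" pairs, one per line (assumed layout).
# Prints "thread# lag" for every thread present in both files, where
# lag = primary sequence - standby applied sequence.
thread_lag() {
  awk 'NR == FNR { primary[$1] = $2; next }
       ($1 in primary) { print $1, primary[$1] - $2 }' "$1" "$2"
}

# Example with the values from this post (thread 1, sequence 105334 on both sides):
tmpdir=$(mktemp -d)
echo '1 105334' > "$tmpdir/primary.txt"
echo '1 105334' > "$tmpdir/standby.txt"
thread_lag "$tmpdir/primary.txt" "$tmpdir/standby.txt"   # prints: 1 0
rm -rf "$tmpdir"
```

A non-zero lag on its own is not a gap; V$ARCHIVE_GAP, as queried above, is what reports sequences that are actually missing on the standby.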

SQL> set time on
06:26:06 SQL> select name, open_mode, database_role from v$database;

NAME      OPEN_MODE            DATABASE_ROLE
--------- -------------------- ----------------
dixitdb   READ ONLY WITH APPLY PHYSICAL STANDBY

06:28:20 SQL> SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
FROM
(SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH,
(SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL
WHERE
ARCH.THREAD# = APPL.THREAD#
ORDER BY 1;

    Thread Last Sequence Received Last Sequence Applied Difference
---------- ---------------------- --------------------- ----------
         1                 105337                105337          0

06:28:32 SQL>

06:29:35 SQL> select process, status, THREAD#,SEQUENCE#,DELAY_MINS from v$managed_standby;

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105298          0
ARCH      CLOSING               1     105299          0
ARCH      CLOSING               1     105326          0
ARCH      CLOSING               1     105327          0
ARCH      CLOSING               1     105328          0
ARCH      CLOSING               1     105329          0
ARCH      CLOSING               1     105330          0
ARCH      CLOSING               1     105334          0
ARCH      CLOSING               1     105306          0
ARCH      CLOSING               1     105335          0
ARCH      CLOSING               1     105336          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105337          0
ARCH      CLOSING               1     105310          0
ARCH      CLOSING               1     105311          0
ARCH      CLOSING               1     105312          0
ARCH      CLOSING               1     105285          0
ARCH      CLOSING               1     105313          0
ARCH      CLOSING               1     105314          0
ARCH      CLOSING               1     105315          0
ARCH      CLOSING               1     105316          0
ARCH      CLOSING               1     105317          0
ARCH      CLOSING               1     105318          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               1     105319          0
ARCH      CLOSING               1     105320          0
ARCH      CLOSING               1     105321          0
ARCH      CLOSING               1     105322          0
ARCH      CLOSING               1     105323          0
ARCH      CLOSING               1     105324          0
ARCH      CONNECTED             0          0          0
ARCH      CLOSING               1     105325          0
RFS       IDLE                  1     105338          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0

PROCESS   STATUS          THREAD# SEQUENCE# DELAY_MINS
--------- ------------ ---------- ---------- ----------
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
MRP0      WAIT_FOR_LOG          1     105338          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
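
In the `v$managed_standby` listings, the MRP0 row is the quickest one to read: it shows the sequence the apply process is waiting for (105335 before the log switches, 105338 after). A small sketch for pulling that row out of spooled output follows; `mrp_sequence` is a hypothetical helper and assumes the five-column layout shown above.

```shell
#!/bin/sh
# mrp_sequence: read v$managed_standby output on stdin and print
# "status sequence#" for the MRP0 row. Columns are assumed to be
# PROCESS STATUS THREAD# SEQUENCE# DELAY_MINS, as in the listing above.
mrp_sequence() {
  awk '$1 == "MRP0" { print $2, $4 }'
}

# Example with two rows copied from the listing above:
mrp_sequence <<'EOF'
RFS       IDLE                  1     105338          0
MRP0      WAIT_FOR_LOG          1     105338          0
EOF
# prints: WAIT_FOR_LOG 105338
```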

Thanks
Prashant Dixit

Posted in Uncategorized | Tagged: standby

Large Pages & use_large_pages parameter in Oracle — BMC Patrol Alerts global parameter ”MEMUsedMemPerc” triggered on ”MEMORY.MEMORY”

Posted by FatDBA on December 31, 2014

All of a sudden one morning we started receiving constant alerts from one of our production systems, used for SAP and other BI/BO applications, reporting that the Memory Used Percentage threshold had been breached. As this is one of the least busy and most rarely used production systems we have, this was never expected from the server. I have previously written about the benefits and usage of large pages in Linux.

Summary: prashantdb053 global parameter ”MEMUsedMemPerc” triggered on ”MEMORY.MEMORY”. 95 <= 95.03 <= 100
Notes: prashantdb053 global parameter ”MEMUsedMemPerc” triggered on ”MEMORY.MEMORY”. 95 <= 95.03 <= 100 Object=MEMORY Object Class=MEMORY Parameter=MEMUsedMemPerc
BMC Impact Manager Initiated Incident

The database I have running doesn’t use large pages, as shown in the alert.log:
****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = 0 KB (0%)
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 16 MB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB

RECOMMENDATION:
Total Shared Global Region size is 2514 MB. For optimal performance,
prior to the next instance restart increase the number
of unused Large Pages by atleast 1257 2048 KB Large Pages (2514 MB)
system wide to get 100% of the Shared
Global Region allocated with Large pages
***********************************************************
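
The figure in the RECOMMENDATION above is simply the SGA size divided by the large page size, rounded up. A minimal sketch of that arithmetic, where `hugepages_needed` is a hypothetical helper name:

```shell
#!/bin/sh
# hugepages_needed SGA_MB PAGE_KB
# Number of large pages needed to hold SGA_MB megabytes, rounded up.
hugepages_needed() {
  sga_kb=$(( $1 * 1024 ))
  echo $(( (sga_kb + $2 - 1) / $2 ))
}

hugepages_needed 2514 2048   # prints 1257
```

On Linux this count is normally configured through the vm.nr_hugepages kernel parameter. The post does not show that step, so treat it as an assumption about the setup.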

Load statistics on the server at the time when we got the error message.

top - 07:26:24 up 30 days, 34 min, 3 users, load average: 0.07, 0.05, 0.01
Tasks: 183 total,   1 running, 182 sleeping,   0 stopped,   0 zombie
Cpu(s): 0.1%us, 0.1%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:   3913512k total, 3886112k used,    27400k free,   149432k buffers
Swap: 4947960k total,   284392k used, 4663568k free, 3036064k cached

Memory Statistics on the server.

[oracle@prashantdb053 /backup/dixit/scripts]# free -m
             total       used       free     shared    buffers     cached
Mem:          3821       3795         26          0        146       2965
-/+ buffers/cache:        684       3137
Swap:         4831        277       4554

[oracle@prashantdb053 /backup/dixit/scripts]# cat /proc/meminfo
MemTotal:      3913512 kB
MemFree:         26504 kB
Buffers:        149548 kB
Cached:        3036664 kB
SwapCached:       4620 kB
Active:        2250272 kB
Inactive:      1377312 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3913512 kB
LowFree:         26504 kB
SwapTotal:     4947960 kB
SwapFree:      4663568 kB
Dirty:             132 kB
Writeback:           0 kB
AnonPages:      437440 kB
Mapped:        1668000 kB
Slab:            86592 kB
PageTables:     143008 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   6904716 kB
Committed_AS: 4125228 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    265480 kB
VmallocChunk: 34359472519 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
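
The `-/+ buffers/cache` row of `free -m` above subtracts buffers and page cache from the used figure. The same arithmetic on the /proc/meminfo fields can be sketched as below; `mem_used_pct` is a hypothetical helper. Note that on Linux the Cached figure also includes shared memory, which can include an SGA that is not in large pages, so the second number is not all reclaimable.

```shell
#!/bin/sh
# mem_used_pct: read /proc/meminfo-style text on stdin and print two figures:
# used percentage counting buffers/cache as used, then excluding them.
mem_used_pct() {
  awk '$1 == "MemTotal:" { total = $2 }
       $1 == "MemFree:"  { memfree = $2 }
       $1 == "Buffers:"  { buffers = $2 }
       $1 == "Cached:"   { cached = $2 }
       END {
         used = total - memfree
         printf "%.2f %.2f\n", 100 * used / total, 100 * (used - buffers - cached) / total
       }'
}

# Example with the four values from the /proc/meminfo listing above:
mem_used_pct <<'EOF'
MemTotal:      3913512 kB
MemFree:         26504 kB
Buffers:        149548 kB
Cached:        3036664 kB
EOF
# prints: 99.32 17.91
```

On a live system the input would be the file itself: `mem_used_pct < /proc/meminfo`.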

The stats below show that large pages are not being used.
[oracle@prashantdb053 /backup/dixit/scripts]# grep Huge /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

[oracle@prashantdb053 /backup/dixit/scripts]# vmstat -s
3913512 total memory
3884496 used memory
2248032 active memory
1376892 inactive memory
29016 free memory
149560 buffer memory
3036676 swap cache
4947960 total swap
284392 used swap
4663568 free swap
4399893 non-nice user cpu ticks
123135 nice user cpu ticks
1173193 system cpu ticks
497863320 idle cpu ticks
9827196 IO-wait cpu ticks
156731 IRQ cpu ticks
337956 softirq cpu ticks
0 stolen cpu ticks
14560315 pages paged in
286499680 pages paged out
1086328 pages swapped in
1030486 pages swapped out
2808993403 interrupts
2839798366 CPU context switches
1417413111 boot time
333424 forks

The use_large_pages parameter is static and requires an instance restart, so this is what I did next. Here is an interesting side effect of setting the parameter to “auto”: it has no effect if you have not prepared the system for the use of large pages in /etc/security/limits.conf.

07:33:48 SQL> select value,isdefault from V$PARAMETER_VALID_VALUES where name = 'use_large_pages';

VALUE                ISDEFAULT
-------------------- ---------------
TRUE                 TRUE
AUTO                 FALSE
ONLY                 FALSE
FALSE                FALSE

07:44:05 SQL> show parameter use_large_pages

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
use_large_pages                      string      TRUE

So let’s change that with ALTER SYSTEM (scope=spfile) rather than editing the parameter file by hand.
07:57:32 SQL> alter system set use_large_pages=auto scope=spfile;
System altered.

HugePages_Total is still 0, which means the system is still not using large pages. To allow Oracle to lock memory you need to grant it the privilege, so I had to edit /etc/security/limits.conf and set the memlock parameters.

[oracle@prashantdb053 /backup/dixit/scripts]# grep Huge /proc/meminfo
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
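
The memlock entries themselves are not shown in this post. As an illustration only, the sketch below prints a soft and a hard memlock line (the value is in KB) sized to a given SGA. The helper name `memlock_lines`, the `oracle` user name, and sizing the limit to exactly the 2514 MB SGA are all assumptions; the limit only needs to be at least as large as the memory the instance should be able to lock.

```shell
#!/bin/sh
# memlock_lines USER SGA_MB
# Print /etc/security/limits.conf entries allowing USER to lock SGA_MB
# megabytes. limits.conf expresses memlock in KB.
memlock_lines() {
  kb=$(( $2 * 1024 ))
  printf '%s soft memlock %s\n' "$1" "$kb"
  printf '%s hard memlock %s\n' "$1" "$kb"
}

memlock_lines oracle 2514
# prints:
# oracle soft memlock 2574336
# oracle hard memlock 2574336
```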

Because this is a static parameter, we need to bounce the database for the change to take effect.
07:58:00 SQL> shut immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

07:58:18 SQL> startup
ORACLE instance started.

07:58:54 SQL> show parameter use_large_pages

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
use_large_pages                      string      AUTO

Entries from the alert log show that the parameter has been changed to ‘AUTO’ and that the instance has started using large pages.

****************** Large Pages Information *****************
Parameter use_large_pages = AUTO
Total Shared Global Region in Large Pages = 2048 KB (0%)
Large Pages used by this instance: 1 (2048 KB)
Large Pages unused system wide = 3 (6144 KB) (alloc incr 16 MB)
Large Pages configured system wide = 4 (8192 KB)
Large Page size = 2048 KB
Time taken to allocate Large Pages = 0.033721 sec

RECOMMENDATION:
Total Shared Global Region size is 2514 MB. For optimal performance,
prior to the next instance restart increase the number
of unused Large Pages by atleast 1253 2048 KB Large Pages (2506 MB)
system wide to get 100% of the Shared
Global Region allocated with Large pages
***********************************************************

After changing the parameter and hard-coding the limits, the system finally started using large pages and the alerts coming from the Patrol agents stopped.

[oracle@prashantdb053 /opt/oracle/diag/rdbms/prashantdb/prashantdb/trace]# grep Huge /proc/meminfo
HugePages_Total:     4
HugePages_Free:      3
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

Thanks
Prashant Dixit

Posted in Advanced | Tagged: Errors, linux

Want to monitor your Exa Health — Answer is ‘Exachk’ …

Posted by FatDBA on December 30, 2014

Recently, while working on one of our test Exadata machines, I came across a very useful tool for monitoring Exadata system health. Monitoring the health of your Exadata is one of the most popular topics that customers talk about.

The best tool for this is Exachk (see MOS Doc ID 1070954.1).
That document contains the current Exachk release.

The recommendations for Exachk are to:

1) Run the exachk (at a minimum) quarterly, and after any changes are made to the configuration
2) ALWAYS run the current exachk. This script is periodically updated/improved upon so it is very important to be current
3) Keep track of any failures to ensure that you can identify any new items that appear in the report
4) A score of 80 or above is a good score for production. It is very rare to have a score that is 99+.

There is also a great white paper on this topic, released in September 2013 (exadata-health-resource-usage-2021227.pdf).

Thanks
Prashant Dixit

Posted in Advanced | Tagged: exadata, performance

EXECUTION PLAN: “automatic DOP: skipped because of IO calibrate statistics are missing”

Posted by FatDBA on September 23, 2014

Recently, while working on a performance problem, I attached a better SQL profile to a statement. The profile adds a DOP (Degree Of Parallelism) to reduce the impact of an unavoidable FTS (Full Table Scan), but I found a note in the generated execution plan which reads:
“automatic DOP: skipped because of IO calibrate statistics are missing”

SQL> execute dbms_sqltune.accept_sql_profile(task_name => 'TASK_33935',task_owner => 'SYS', replace => TRUE, profile_type =>DBMS_SQLTUNE.PX_PROFILE);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.86
SQL> explain plan for select count(*) from DIXIT_EVW_ETAILQ;

Explained.

Elapsed: 00:00:00.03
SQL> @xplan

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
Plan hash value: 584586630

----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     1 | 36776   (1)| 00:07:22 |        |      |            |
|   1 |  SORT AGGREGATE        |                  |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |                  |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000         |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |                  |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |                  |   177M| 36776   (1)| 00:07:22 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| DIXIT_EVW_ETAILQ |   177M| 36776   (1)| 00:07:22 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

Note
-----
   - automatic DOP: skipped because of IO calibrate statistics are missing
   - SQL profile "SYS_SQLPROF_0148a0b0821b0005" used for this statement

Automatic DOP is skipped because I/O calibration has not been run to gather the required statistics. These statistics can be collected using the DBMS_RESOURCE_MANAGER.CALIBRATE_IO procedure.

SQL> select status from V$IO_CALIBRATION_STATUS;

STATUS
-------------
NOT AVAILABLE

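SET SERVEROUTPUT ON
DECLARE
  lat  INTEGER;  -- actual latency observed, in milliseconds
  iops INTEGER;  -- maximum sustainable I/O requests per second
  mbps INTEGER;  -- maximum sustainable throughput, in MB per second
BEGIN
  -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO(<num_physical_disks>, <max_latency>, iops, mbps, lat);
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO (28, 10, iops, mbps, lat);
  DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
  DBMS_OUTPUT.PUT_LINE ('latency = ' || lat);
  DBMS_OUTPUT.PUT_LINE ('max_mbps = ' || mbps);
END;
/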

max_iops = 5944
latency = 9
max_mbps = 75

18 rows selected.

Elapsed: 00:00:00.11

Issue:
When using DBMS_RESOURCE_MANAGER.CALIBRATE_IO you might sometimes receive the error message below:
ORA-56708: Could not find any datafiles with asynchronous i/o capability

Resolution:
In that case we need to enable asynchronous I/O by setting the two parameters below in the init.ora file. Both are static, so the instance has to be restarted for them to take effect.

disk_asynch_io = true
filesystemio_options = asynch

SQL> select status from V$IO_CALIBRATION_STATUS;

STATUS
-------------
AVAILABLE

Now you can apply the DOP to the query and in this way reduce the impact of the FTS.

Hope That Helps
Prashant Dixit

Posted in Uncategorized | Tagged: Errors, performance, Tuning

Active Session History (ASH) performed an emergency flush. ASH Undersized ?

Posted by FatDBA on August 13, 2014

One day I suddenly got the following warning on my warehouse system:
Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized.

If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 16777216 bytes. Both ASH size and the total number of emergency flushes since instance start-up can be monitored by running the following query:

SQL> select total_size,awr_flush_emergency_count from v$ash_info;

TOTAL_SIZE AWR_FLUSH_EMERGENCY_COUNT
---------- -------------------------
  33554432                         8

Reasons:
Typically some activity on the system causes more active sessions than usual, which fills the ASH buffers faster and causes this message to be displayed.
It is not a problem per se; it just indicates that the buffers might need to be increased to support peak activity on the database.

Fix:
Increase the size of ASH by around 50%, i.e. new value = current_value + 50% of current_value:

SQL> alter system set "_ASH_SIZE"=25165824 scope=both;
System altered.
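
The value used in the ALTER SYSTEM above matches the 16777216 bytes quoted in the warning text plus 50% (not the 33554432 shown in TOTAL_SIZE). A minimal sketch of that arithmetic, where `ash_new_size` is a hypothetical helper name:

```shell
#!/bin/sh
# ash_new_size CURRENT_BYTES: current ASH size plus 50%, in bytes.
ash_new_size() {
  echo $(( $1 + $1 / 2 ))
}

ash_new_size 16777216   # prints 25165824
```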

Posted in Advanced | Tagged: Errors

Where are my DBA_BLOCKERS and DBA_WAITERS ??

Posted by FatDBA on July 12, 2014

While doing performance checks and tuning for one of our databases, I found that two very well-known views were not working and were failing with a “does not exist” error. We are using Oracle Database 11g Release 11.2.0.3.0, 64bit Production.
I was querying the views as the SYS user.

The purpose of these two views:
DBA_BLOCKERS displays a session if it is not waiting for a locked object but is holding a lock on an object for which another session is waiting.
DBA_WAITERS shows all the sessions that are waiting for a lock, but do not hold locks for which another session is waiting.

SQL> conn / as sysdba
Connected.

SQL> desc dba_blockers;
ERROR:
ORA-04043: object dba_blockers does not exist

SQL> desc dba_waiters
ERROR:
ORA-04043: object dba_waiters does not exist

Solution:
——————-
Run the catblock.sql script from the ORACLE_HOME/rdbms/admin folder. It creates the required views and synonyms and provides the required grants, after which the views are available again.

SQL> @?/rdbms/admin/catblock.sql

View created.

Synonym created.

Grant succeeded.

—–

SQL>
SQL>
SQL>
SQL> desc dba_blockers
Name                                     Null?    Type
---------------------------------------- -------- ---------------------------
HOLDING_SESSION                                   NUMBER

SQL> desc dba * waiters;
Usage: DESCRIBE [schema.]object[@db_link]
SQL> desc dba_waiters
Name                                     Null?    Type
---------------------------------------- -------- ---------------------------
WAITING_SESSION                                   NUMBER
HOLDING_SESSION                                   NUMBER
LOCK_TYPE                                         VARCHAR2(26)
MODE_HELD                                         VARCHAR2(40)
MODE_REQUESTED                                    VARCHAR2(40)
LOCK_ID1                                          NUMBER
LOCK_ID2                                          NUMBER

Posted in Advanced | Tagged: performance


Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯
