Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Posts Tagged ‘ASM’

Disk Goes Offline after rebalance! – Is this due to a BUG ?

Posted by FatDBA on January 30, 2020

HI Everyone,

Today during one of the activity where we migrated the ASM Storage for one of our 2 Node RAC cluster (running on 11gR2), where we had to perform the disk rebalancing to copy/mirror the contents from older/existing storage to the new storage before we go and drop the older storage partitions, we faced some some weirdness. The disks goes offline in this multi-node ASM and we left stranded with initially no idea behind this behavior, but finally we were able to locate a metalink page for the same issue.

Yes, this was due to a known Bug with number 13476583
Oracle Server (Rdbms) Version
This problem is introduced in the
11.2.0.2.3 Patch Set Update
11.2.0.2.5 Patch Set Update
11.2.0.2.4 Patch Set Update
11.2.0.2.3 Patch Set Update
and in 11.2.0.3, by the fix for bug 10040921.

Problem:
When disks are dropped, a forcible diskgroup dismount is performed on other ASM instance/s.

Workaround or Fix:
1. The problem does not cause diskgroup corruption. So mostly diskgroup can be mounted again.
2. Apply fix
Interim patches here: Patch:13476583
11.2.0.2.6 Patch Set Update
11.2.0.2 Patch 17 on Windows Platforms

Oracle Notes: 245840.1

Hope That Helps
Prashant Dixit

Advertisement

Posted in Advanced, troubleshooting | Tagged: , | Leave a Comment »

Convert Standard ASM to Flex ASM.

Posted by FatDBA on March 15, 2018

Hi Mates,

Okay, you have an ASM instance crashed and at the same time the db instance failed on the instance … Expected behavior and many of us have faced this scenario in production RAC setups.
Answer to the question is ‘Flex ASM’ which provides us with something that was previously unattainable: the ability to run multiple, independent in cardinality, ASM instances. You can think of it what SCAN is to Database 11gR2.

Its been a while we have the Flex ASM available for 12c users, now the question – How to convert a Non-Flex ASM setup to Flex enabled ASM.
Below is the method to it, i performed a POC for one of the customer some time back and here are the steps.

Configuration:
RHEL 6, 64 Bit
2 Node 12cR1 RAC setup.
Hostname: rac1, rac2.
DB Instances: dixitdb1, dixitdb2

Let’s first check the network information of the cluster, the network interfaces and their IPv4 addresses, you can collect this info using oifcfg tool.

[oracle@rac1 ~]$ oifcfg getif
eth0  192.168.56.0  global  public
eth1  192.168.10.0  global  cluster_interconnect
[oracle@rac1 ~]$

Next lets check the ASM information and current mode.

[oracle@rac1 ~]$ srvctl status asm
ASM is running on rac1,rac2

[oracle@rac1 ~]$  srvctl config asm
ASM home: 
Password file: +DATA/orapwASM
ASM listener: LISTENER

[oracle@rac1 ~]$ asmcmd showclustermode
ASM cluster : Flex mode disabled
[oracle@rac1 ~]$

Okay, now lets do the conversion, we will be doing the silent conversion. You can use the ASMCA GUIas well to do the same.
Here used 192.168.10.0 as the IP and a free port for ASM LISTENER, we will use 1526 port here for listening all requests.

[oracle@rac1 ~]$ asmca -silent -convertToFlexASM -asmNetworks eth1/192.168.10.0 -asmListenerPort 1526

To complete ASM conversion, run the following script as privileged user in local node.
/u01/app/oracle/cfgtoollogs/asmca/scripts/converttoFlexASM.sh

Okay, so the last step generated an auto script which needs to be executed from root to do the real work. This will bounce all RAC components one by one on each node. By the end of the step we will have a new LISTENER exclusively created for the ASM instance and both of the two instances (ASM1, ASM2) will be registered with it.

[oracle@rac1 ~]$ su - root
Password:
[root@rac1 ~]# /u01/app/oracle/cfgtoollogs/asmca/scripts/converttoFlexASM.sh
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2673: Attempting to stop 'ora.mgmtdb' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.dixitdb.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac2'
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'rac1' succeeded
CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.scan2.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'rac2'
CRS-2677: Stop of 'ora.dixitdb.db' on 'rac1' succeeded
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.mgmtdb' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.MGMTLSNR' on 'rac1'
CRS-2677: Stop of 'ora.scan3.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'rac2'
CRS-2676: Start of 'ora.scan2.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.MGMTLSNR' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'rac2'
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.scan3.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'rac2'
CRS-2676: Start of 'ora.MGMTLSNR' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.mgmtdb' on 'rac2'
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'rac2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
CRS-2676: Start of 'ora.mgmtdb' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.storage' on 'rac1'
CRS-2677: Stop of 'ora.storage' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac1'
CRS-2676: Start of 'ora.storage' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
Oracle Grid Infrastructure restarted in node rac1
PRCC-1014 : ASMNET1LSNR_ASM was already running
PRCR-1004 : Resource ora.ASMNET1LSNR_ASM.lsnr is already running
PRCR-1079 : Failed to start resource ora.ASMNET1LSNR_ASM.lsnr
CRS-5702: Resource 'ora.ASMNET1LSNR_ASM.lsnr' is already running on 'rac1'
CRS-5702: Resource 'ora.ASMNET1LSNR_ASM.lsnr' is already running on 'rac2'
ASM listener ASMNET1LSNR_ASM running already
CRS-2673: Attempting to stop 'ora.crsd' on 'rac2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.mgmtdb' on 'rac2'
CRS-2673: Attempting to stop 'ora.dixitdb.db' on 'rac2'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac2'
CRS-2677: Stop of 'ora.cvu' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac1'
CRS-2676: Start of 'ora.cvu' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'rac2'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rac2'
CRS-2677: Stop of 'ora.scan3.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'rac1'
CRS-2677: Stop of 'ora.scan2.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.dixitdb.db' on 'rac2' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.rac2.vip' on 'rac2'
CRS-2677: Stop of 'ora.mgmtdb' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.MGMTLSNR' on 'rac2'
CRS-2676: Start of 'ora.scan3.vip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.rac2.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.rac2.vip' on 'rac1'
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'rac1'
CRS-2677: Stop of 'ora.MGMTLSNR' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'rac1'
CRS-2676: Start of 'ora.scan2.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'rac1'
CRS-2676: Start of 'ora.rac2.vip' on 'rac1' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'rac1' succeeded
CRS-2676: Start of 'ora.MGMTLSNR' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.mgmtdb' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac1'
CRS-2676: Start of 'ora.mgmtdb' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rac2'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac2'
CRS-2677: Stop of 'ora.FRA.dg' on 'rac2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac2' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac2'
CRS-2677: Stop of 'ora.ons' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac2'
CRS-2677: Stop of 'ora.net1.network' on 'rac2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac2' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.storage' on 'rac2'
CRS-2677: Stop of 'ora.storage' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac2'
CRS-2676: Start of 'ora.storage' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
Oracle Grid Infrastructure restarted in node rac2
[root@rac1 ~]#

Okay, so it is done with the reboot of clusterware components and back to the prompt.
Let’s verify if it has been done or not …

[root@rac1 ~]# srvctl status asm
ASM is running on rac1,rac2

[root@rac1 ~]# asmcmd showclustermode
ASM cluster : Flex mode enabled           >>>> Flex Mode is ON now.

[root@rac1 ~]#  srvctl config asm
ASM home: 
Password file: +DATA/orapwASM
ASM listener: LISTENER
ASM instance count: ALL
Cluster ASM listener: ASMNET1LSNR_ASM

And we have a new LISTENER named ‘ASMNET1LSNR_ASM’ created for ASM.

[root@rac2 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE


[root@rac1 ~]# ps -ef|grep tns
root        10     2  0 15:59 ?        00:00:00 [netns]
oracle   22167     1  0 18:46 ?        00:00:00 /u01/app/12.1.0/grid_1/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
oracle   22291     1  0 18:46 ?        00:00:00 /u01/app/12.1.0/grid_1/bin/tnslsnr LISTENER -no_crs_notify -inherit
oracle   22532     1  0 18:46 ?        00:00:00 /u01/app/12.1.0/grid_1/bin/tnslsnr LISTENER_SCAN3 -no_crs_notify -inherit
oracle   22535     1  0 18:46 ?        00:00:00 /u01/app/12.1.0/grid_1/bin/tnslsnr MGMTLSNR -no_crs_notify -inherit
oracle   22544     1  0 18:46 ?        00:00:00 /u01/app/12.1.0/grid_1/bin/tnslsnr LISTENER_SCAN2 -no_crs_notify -inherit
root     30044 19089  0 18:52 pts/1    00:00:00 grep tns

[root@rac1 ~]# ps -ef|grep pmon
oracle   21494     1  0 18:46 ?        00:00:00 asm_pmon_+ASM1
oracle   22439     1  0 18:46 ?        00:00:00 ora_pmon_dixitdb1
oracle   22839     1  0 18:46 ?        00:00:00 mdb_pmon_-MGMTDB
root     30089 19089  0 18:52 pts/1    00:00:00 grep pmon
[root@rac1 ~]#


[root@rac1 ~]# lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 13-MAR-2018 18:54:09

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                13-MAR-2018 18:46:25
Uptime                    0 days 0 hr. 7 min. 43 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0/grid_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/rac1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.10.1)(PORT=1526)))
Services Summary...
Service "+ASM" has 2 instance(s).
  Instance "+ASM1", status READY, has 2 handler(s) for this service...
  Instance "+ASM2", status READY, has 1 handler(s) for this service...
The command completed successfully
[root@rac1 ~]#


[root@rac1 ~]# srvctl config listener -l ASMNET1LSNR_ASM
Name: ASMNET1LSNR_ASM
Type: ASM Listener
Owner: oracle
Subnet: 192.168.10.0
Home: 
End points: TCP:1526
Listener is enabled.
Listener is individually enabled on nodes:
Listener is individually disabled on nodes:

[root@rac1 ~]# srvctl status listener -l ASMNET1LSNR_ASM
Listener ASMNET1LSNR_ASM is enabled
Listener ASMNET1LSNR_ASM is running on node(s): rac1,rac2

Let’s do some testing, i will here try to stop one of the ASM instance (+ASM1) on Node1 and will see if the DB Instance still alive and listens to requests.

[root@rac1 ~]# srvctl status database -db dixitdb -f -v
Instance dixitdb1 is running on node rac1. Instance status: Open.
Instance dixitdb2 is running on node rac2. Instance status: Open.

[root@rac1 ~]# srvctl modify asm -count 1
PRCA-1123 : The specified ASM cardinality 1 is less than the minimum cardinality of 2.

Well, this is an expected error because we are running on a 2 Node RAC and Flex ASM (Same as SCAN Listeners) needs at-least 2 Instance up and running which is not possible here in my case. But i will now kill the asm instance manually (Killing the PMON)

[root@rac1 ~]# ps -ef|grep pmon
root      4167  4142  0 19:41 pts/1    00:00:00 grep pmon
oracle   21494     1  0 18:46 ?        00:00:00 asm_pmon_+ASM1
oracle   22439     1  0 18:46 ?        00:00:00 ora_pmon_dixitdb1
oracle   22839     1  0 18:46 ?        00:00:00 mdb_pmon_-MGMTDB

[root@rac1 ~]# kill -9 21494
[root@rac1 ~]#  ps -ef|grep pmon
root      4200  4142  0 19:42 pts/1    00:00:00 grep pmon
oracle   22439     1  0 18:46 ?        00:00:00 ora_pmon_dixitdb1
oracle   22839     1  0 18:46 ?        00:00:00 mdb_pmon_-MGMTDB

Next, let’s see the ASM client connections info on avaialble instance (+ASM2)

SQL> select GROUP_NUMBER, DB_NAME, STATUS, INSTANCE_NAME from  v$asm_client;

GROUP_NUMBER DB_NAME  STATUS       INSTANCE_NAME
------------ -------- ------------ ----------------------------------------------------------------
           1 +ASM     CONNECTED    +ASM2
           2 +ASM     CONNECTED    +ASM2
           1 dixitdb  CONNECTED    dixitdb1
           2 dixitdb  CONNECTED    dixitdb1
           1 dixitdb  CONNECTED    dixitdb2
           2 dixitdb  CONNECTED    dixitdb2
           1 _mgmtdb  CONNECTED    -MGMTDB

7 rows selected.

And we have the Instance 1 (dixitdb1) connected with the +ASM2 instance, as +ASM1 is crashed/dead.
It’s listening all requests via ASM LISTENER, same can be verified or checked in asm listener logs.

13-MAR-2018 19:47:10 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=rac2.localdomain)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=ASMNET1LSNR_ASM)(VERSION=202375680)) * status * 0
13-MAR-2018 19:47:14 * (CONNECT_DATA=(SERVICE_NAME=+ASM)(INSTANCE_NAME=+ASM2)(CID=(PROGRAM=oracle)(HOST=rac1.localdomain)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.10.1)(PORT=39062)) * establish * +ASM * 0
 

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: , | Leave a Comment »

Why my ASM Command Line (ASMCMD) is so slow, How to make ASMCMD run faster ?

Posted by FatDBA on November 1, 2017

ASMCMD is a command-line utility that you can use to easily view and manipulate files and directories within Automatic Storage Management (ASM) disk groups. It can list the contents of disk groups, perform searches, create and remove directories and aliases, display space utilization, and more.

But some of the times i have noticed some errors or slowness in command executions with ASMCMD and i believe you guys have too faced the same in the past. And the problem with ASMCMD errors are that they are not much detailed and are obscure which makes the troubleshooting more complicated and direction less.

There are few of the methods or the ways that i follow to handle performance issues with the asmcmd command line are given below.

1. Use ORADEBUG
What happens when you connect with ASMCMD ?
It actually connects with the ASM instance with SYSASM privilege and the same moment a background local process spawns with name BEQ.
Now once you recognize the process using ps -ef commands you can bind it to the ORADEBUG with errostack flag.

2. Truss or STRACE of ASMCMD and its processes.

example:

$ strace -aeft -o /dixit/labtest/asmcmdtrbsst.log asmcmd
ASMCMD>

3. Set the DBI_TRACE for ASMCMD perl tracing
Asmcmd is a wrapper for asmcmdcore script which is a shell script that starts a Perl program. If you are a Perl programmer, you can easily extend this script to add additional commands and security checks. We can use the DBI_TRACE argument to collect more diagnostic information on asm command line.

$ export DBI_TRACE=1
ASMCMD>

Hope That Helps
Prashant Dixit

Posted in Advanced, troubleshooting | Tagged: , | Leave a Comment »

CELL_OFFLOAD_PROCESSING value in Non-Exadata Surrounds! — “High ‘ksv master wait’ And ‘ASM File Metadata Operation’ Wait Events

Posted by FatDBA on January 27, 2016

It was a case some time back while performing the performance analysis for one of the database, i encountered a curious wait event ‘ASM file metadata operation’. After further analysis i discovered that it’s happening specially when doing operations such as a DROP TABLESPACE, or in this case when using Data Pump. Though the server was basically not doing anything and much of the CPU cycles were IDLE but there was this strange wait event on ‘ASM file metadata operation’ with high waits on ‘ksv master wait’.

Below is the graphical depiction of the situation.
asm1

Image to highlight one of the major spike during the period.
asm2

Reason:
In an Exadata surrounds this is the suitable setting, nevertheless, in a non-Exadata environment, which it was in this case, this causes performance issues to arise as processes on the RDBMS awaits on a reply from the ASM which is trying to delivery smart-scan results.

Solution:
The following solutions are available for non-Exadata databases:
1. For the quickest solution, use the workaround. The workaround does not negatively impact non-Exadata databases.
This parameter is to be set on the database instance.
alter system set cell_offload_processing = false;
2. Upgrade to 12.1, when available. OR
3. Apply the 11.2.0.3 patch set OR
4. Apply one-off Patch 11800170, if available for your RDBMS and Grid Homes


Hope That Helps
Prashant Dixit

Posted in Advanced | Tagged: , | Leave a Comment »

Part 1: ASM Installation on 11gR2 (VMWare)

Posted by FatDBA on January 10, 2016

Hello Everyone,
Today i would like to start series/chapters describing Oracle Automatic Storage Management (Oracle ASM) concepts and provides an overview of Oracle ASM features. Followed posts will covers subjects like Installation, Configuration, Administration/Management, Monitoring. Troubleshooting and Optimization etc.

In this maiden post (Part 1) i would like to discuss and elaborate about ASM installation and related areas.

Prerequisites:
Considering that you already have the OS ready with all packages per-installed before we begin our ASM installation on the top. I will start with right from the scratch.

Step 1:
Preparing Disks or Partitions which will be used while creating the ASM diskgroups.
I’ve created 3 Persistent Disks each of 4GB in size from the VM Disk (I will perform all steps in VM environment).

This is how the VM Setting will look like once you are done with the Disk creation.
*Forgot about the Fifth Hard Disk of 10GB for now. Will explain the usage later on the series.

1

Once you have the disks created, Next you’ll have to format the newly created disks to make them usable: Using fdisk command. Command displays the status of available newly created partitions/disks as:
/dev/sdb, /dev/sdc, /dev/sdd – Each of 4GB (4294 MBs) in size.

[root@localhost ~]# fdisk -l

Disk /dev/sda: 91.3 GB, 91268055040 bytes, 178257920 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000aab6c

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1026047 512000 83 Linux
/dev/sda2 1026048 178257919 88615936 8e Linux LVM

Disk /dev/sdb: 4294 MB, 4294967296 bytes, 8388608 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Click Here to Read More!!

Posted in Advanced, Basics | Tagged: | Leave a Comment »

ORA-03113: end-of-file on communication channel **During ASM Instance startup**

Posted by FatDBA on January 8, 2013

Faced one more end-of-file communication error with one of my Test ASM Environment. Below are the steps i’ve tried to fix the problem.
Error:
[oracle@localhost bin]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Tue Jan 8 20:00:25 2013
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.

SQL> ORA-03113: end-of-file on communication channel
1st Step:

I’ve tried to check the CRS (Cluster Restart Service or Oracle Restart Service) of instance. Using crs_stats you can get a list of all services which are linked with CRS with their respective state and target details.
(Always look for STATE & TARGET out of other stats provided and try to fix ‘Types’ with ‘OFFLINE’ state).

Example:
[oracle@localhost bin]$  crs_stat -t

Name           Type           Target    State     Host
————————————————————
ora.cssd       ora.cssd.type  OFFLINE   OFFLINE
ora.diskmon    ora….on.type OFFLINE   OFFLINE
ora.asm        ora….on      ONLINE    OFFLINE

* Clearly reflects broken ora.asm service is broken and is OFFLINE right now.

2nd Step:
Forcibly de-configure CRS Service via Root user.
(You’ll find the Perl script in GRID Home/crs/install directory)

Example:
[root@localhost install]# ./rootcrs.pl -deconfig -force
2013-01-08 19:53:59: Parsing the host name
2013-01-08 19:53:59: Checking for super user privileges
2013-01-08 19:53:59: User has super user privileges
Using configuration parameter file: ./crsconfig_params
Usage: srvctl <command> <object> [<options>]
commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config
objects: database|service|asm|diskgroup|listener|home|ons|eons
For detailed help on each command and object and its options use:
srvctl <command> -h or
srvctl <command> <object> -h
PRKO-2012 : nodeapps object is not supported in Oracle Restart
ACFS-9200: Supported
CRS-2613: Could not find resource ‘ora.registry.acfs’.
CRS-4000: Command Stop failed, or completed with errors.
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Stop failed, or completed with errors.
You must kill crs processes or reboot the system to properly
cleanup the processes started by Oracle clusterware
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node

Third Step:
Re-run root.sh script using Root account. This will configure OCR (Oracle Cluster Registry) keys for oracle user and restart ohasd (Oracle High Availability Service Deamon) again.

* Oracle Clusterware is run by Cluster Ready Services (CRS) which consisting of  Oracle Cluster Registry (OCR), which records and maintains the cluster and node membership information. Records Disk membership information in case of ASM configured on Non-RAC systems.

Example:
[root@localhost dbhome_1]# ./root.sh
Running Oracle 11g root.sh script…

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME=  /u01/app/oracle/product/11.2.0/dbhome_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file “dbhome” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file “oraenv” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file “coraenv” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2013-01-08 19:55:58: Checking for super user privileges
2013-01-08 19:55:58: User has super user privileges
2013-01-08 19:55:58: Parsing the host name
Using configuration parameter file: /u01/app/oracle/product/11.2.0/dbhome_1/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user ‘oracle’, privgrp ‘oinstall’..
Operation successful.
CRS-4664: Node localhost successfully pinned.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
acfsroot: ACFS-9301: ADVM/ACFS installation can not proceed:

acfsroot: ACFS-9302: No installation files found at /u01/app/oracle/product/11.2.0/dbhome_1/install/usm/EL5/i386/2.6.18-8/2.6.18-8.el5uek-i686/bin.

localhost     2013/01/08 19:56:20     /u01/app/oracle/product/11.2.0/dbhome_1/cdata/localhost/backup_20130108_195620.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server
Updating inventory properties for clusterware
Starting Oracle Universal Installer…

Checking swap space: must be greater than 500 MB.   Actual 2047 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
‘UpdateNodeList’ was successful.
You have new mail in /var/spool/mail/root

Step 4th:
Start cssd and diskmon services.

* Oracle Cluster Synchronization Services daemon (OCSSd) provides basic ‘group services’ support. This is also required in a single instance configuration if Automatic Storage Management (ASM) is use.
* diskmon (Disk Monitor daemon): Responsibel for performing Monitoring activities. The diskmon daemon is always started when ocssd starts.

[oracle@localhost ~]$ cd /u01/app/oracle/product/11.2.0/dbhome_1/bin
[oracle@localhost bin]$ ./crsctl start resource ora.cssd
CRS-2672: Attempting to start ‘ora.cssd’ on ‘localhost’
CRS-2672: Attempting to start ‘ora.diskmon’ on ‘localhost’
CRS-2676: Start of ‘ora.diskmon’ on ‘localhost’ succeeded
CRS-2676: Start of ‘ora.cssd’ on ‘localhost’ succeeded

Step 5th:
Let’s try to Mount the ASM Diskgroup.
ASMCMD> mount DATA

(If not mounted, you’ll probably receive error codes: ORA-15110: ‘no diskgroups mounted’ and will be fixed once you mount Diskgroup).

Step 6th:
Finally start your ASM instance.

[oracle@localhost bin]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Tue Jan 8 20:00:25 2013
Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
With the Automatic Storage Management option

SQL>

Done…

Posted in Advanced | Tagged: , | Leave a Comment »

ORA-03113: end-of-file on communication channel **Error due to insufficient Disk Space in ASM DGroup**

Posted by FatDBA on January 7, 2013

Steps to fix an ASM related issue/error when you failed to start an Instance configured on ASM Disk Groups.

Error Message:
ORA-03113: end-of-file on communication channel

When (Live/Real-Time Situation):
[oracle@localhost trace]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Mon Jan 7 20:43:52 2013
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1337720 bytes
Variable Size             239077000 bytes
Database Buffers          289406976 bytes
Redo Buffers                5840896 bytes
Database mounted.

ORA-03113: end-of-file on communication channel
Process ID: 10910
Session ID: 21 Serial number: 3

How to track the error message:
As always Alert log for that instance.

Snippetsfrom AlertLog:
Errors in file /u01/app/oracle/diag/rdbms/etr/etr/trace/etr_ora_6969.trc:
ORA-19816: WARNING: Files may exist in db_recovery_file_dest that are not known to database.
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15041: diskgroup “DATA” space exhausted
*************************************************************
WARNING: A file of type ARCHIVED LOG may exist in
db_recovery_file_dest that is not known to the database.
Use the RMAN command CATALOG RECOVERY AREA to re-catalog
any such files. If files cannot be cataloged, then manually
delete them using OS command. This is most likely the
result of a crash during file creation.
*************************************************************
Errors in file /u01/app/oracle/diag/rdbms/etr/etr/trace/etr_ora_6969.trc:
ORA-19504: failed to create file “+DATA”
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15041: diskgroup “DATA” space exhausted
ARCH: Error 19504 Creating archive log file to ‘+DATA’

Results: It clearly says about Space Exhaustion in Line Number 15 with error message ‘ORA-15041: diskgroup “DATA” space exhausted’
Let’s understand this error message using OERR Utlity:

[oracle@localhost ~]$ oerr ORA 15041
15041, 00000, “diskgroup \”%s\” space exhausted”
// *Cause:  The diskgroup ran out of space.
// *Action: Add more disks to the diskgroup, or delete some existing files.

Let’s Verify space of ASM Disk Group:
SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup;

NAME                                     STATE          TYPE     TOTAL_MB    FREE_MB
—————————— ———–        ——         ———-    ————————————
DATA                                  MOUNTED     NORMAL       4000         17

Only 17 MB of Free Space is left in Disk Group.

Resolution:
Create a new RAW Disk and alter Diskgroup to add that newly created rdisk and create some extra space.

SQL> alter diskgroup DATA add disk ‘/dev/raw/raw6’;
Diskgroup altered.

SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup;

NAME                                       STATE          TYPE     TOTAL_MB    FREE_MB
—————————— ———–        ——           ———- ——————————-
DATA                                     MOUNTED     NORMAL       5000       1015

Let’s try to start the same instance again:

SQL> startup
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1337720 bytes
Variable Size             239077000 bytes
Database Buffers          289406976 bytes
Redo Buffers                5840896 bytes
Database mounted.

Database opened.

Fixed.

Posted in Advanced | Tagged: , | Leave a Comment »

ASM: File Extensions

Posted by FatDBA on December 27, 2012

If you have the ASM configured in your environment, it’s always hard to remember the ASM file names unless you have aliased the entires. But this is what i discovered while working on one of my Test machine which has the ASM Configured. I found filenames and extensions fully qualified but not just another techical opaques.

These are some of the datafiles which are part of my ASM and available on the diskgroup.

SQL> select name from v$datafile;

NAME
——————————————————————————–
+TESTDB_DATA1/orcl/datafile/tesla.261.795016577
+TESTDB_DATA1/orcl/datafile/example12.256.794584325
+TESTDB_DATA1/orcl/datafile/dixy.257.794593881
+TESTDB_DATA1/orcl/datafile/users1111.258.794825249
+TESTDB_DATA1/orcl/datafile/text.259.794825753
+TESTDB_DATA1/orcl/datafile/test12345.260.794840557

eg:
Let’s dissect one of the file
+TESTDB_DATA1/orcl/datafile/dixy.257.794593881
Here if watch carefully:
+TESTDB_DATA1: DiskGroup Name (+ indicates the root of the ASM)
orcl: Name of the client database installed.
datafile: The File Type
dixy: Name of the Data File.
257: Unique File Number
794593881: Database Incarnation Number.

To conform you can also crosscheck and validate using v$database_incarnation.

Posted in Advanced | Tagged: | Leave a Comment »

 
%d bloggers like this: