Tales From A Lazy Fat DBA

Fan of Oracle DB & Performance, PostgreSQL & Cassandra … \,,/


Posts Tagged ‘troubleshooting’

Cassandra node refused to start – MismatchedInputException: No content to map due to end-of-input

Posted by FatDBA on August 5, 2020

Hi All,

This will be a quick one: an error and how I fixed it.
This time it’s about a Cassandra startup error that I hit while testing on a POC box, and how I resolved it.

I was using DataStax Enterprise (DSE) 6.8.1 on RHEL6. The instance failed to start with the error “MismatchedInputException: No content to map due to end-of-input”.
I tried starting it a few more times but got the same message every time. The full error output is below.


WARN  [main] 2020-08-05 13:43:46,277  DatabaseDescriptor.java:1517 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO  [main] 2020-08-05 13:43:46,281  DseDelegateSnitch.java:39 - Setting my workloads to [Cassandra]
INFO  [main] 2020-08-05 13:43:46,358  YamlConfigurationLoader.java:77 - Configuration location: file:/home/cassandra/dse-6.8.1_node1/resources/cassandra/conf/cassandra.yaml
INFO  [main] 2020-08-05 13:43:46,360  DseDelegateSnitch.java:41 - Initialized DseDelegateSnitch with workloads [Cassandra], delegating to com.datastax.bdp.snitch.DseSimpleSnitch
INFO  [main] 2020-08-05 13:43:46,424  YamlConfigurationLoader.java:77 - Configuration location: file:/home/cassandra/dse-6.8.1_node1/resources/cassandra/conf/cassandra.yaml
ERROR [main] 2020-08-05 13:43:47,904  DseModule.java:126 - Unable to start server. Exiting...
org.apache.cassandra.io.FSReadError: com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (BufferedInputStream); line: -1, column: 0]
        at com.datastax.bdp.db.nodes.Nodes.transactionalRead(Nodes.java:195)
        at com.datastax.bdp.db.nodes.Nodes.access$100(Nodes.java:76)
        at com.datastax.bdp.db.nodes.Nodes$Local.(Nodes.java:477)
        at com.datastax.bdp.db.nodes.Nodes.(Nodes.java:97)
        at com.datastax.bdp.db.nodes.Nodes$Instance.setup(Nodes.java:911)
        at org.apache.cassandra.config.DatabaseDescriptor.applyMetadata(DatabaseDescriptor.java:587)
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:234)
        at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:249)
        at com.datastax.bdp.DseCoreModule.(DseCoreModule.java:84)
        at com.datastax.bdp.DseModule.getRequiredModules(DseModule.java:163)
        at com.datastax.bdp.server.AbstractDseModule.configure(AbstractDseModule.java:27)
        at com.datastax.bdp.DseModule.configure(DseModule.java:89)
        at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
        at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
        at com.google.inject.spi.Elements.getElements(Elements.java:110)
        at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
        at com.google.inject.Guice.createInjector(Guice.java:96)
        at com.google.inject.Guice.createInjector(Guice.java:73)
        at com.google.inject.Guice.createInjector(Guice.java:62)
        at com.datastax.bdp.ioc.DseInjector.get(DseInjector.java:36)
        at com.datastax.bdp.DseModule.main(DseModule.java:103)
Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (BufferedInputStream); line: -1, column: 0]
 


Before getting to the fix, I recalled that the database had been shut down abruptly the last time because the server/host crashed. The stack trace fails while reading node metadata (Nodes$Local), which pointed me to the ‘local’ entry under /var/lib/cassandra/metadata/nodes/. It holds local startup information such as data center, release info, schema version, native transport address, ports (JMX, storage etc.) and broadcast address. The fix I applied was to remove it, as shown below.


[cassandra@fatdba-doccass bin]$ cd /var/lib/cassandra/metadata/nodes/
[cassandra@fatdba-doccass nodes]$ ls
local  peers  snapshots
[cassandra@fatdba-doccass nodes]$ rm -rf local
[cassandra@fatdba-doccass nodes]$
[cassandra@fatdba-doccass nodes]$
[cassandra@fatdba-doccass nodes]$ ls -ltrh
total 4.0K
drwxrwxr-x. 2 cassandra cassandra 6 Jul 27 16:07 snapshots
-rw-rw-r--. 1 cassandra cassandra 1 Aug  4 18:29 peers
 

Now let’s try to start the node once again.


[cassandra@fatdba-doccass bin]$
[cassandra@fatdba-doccass bin]$ ./dse cassandra
 


And the node started after clearing the local details/information that was left behind when the node crashed.
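If you would rather not delete anything outright, here is a minimal sketch of a more cautious variant. It assumes the damaged metadata entry is a zero-byte regular file, which is what “No content to map due to end-of-input” usually suggests but which I did not verify on this box. The helper name backup_if_empty is hypothetical and not part of DSE; it only moves the file aside with a timestamp suffix.

```shell
#!/bin/sh
# Hypothetical helper (not a DSE tool): move a zero-byte file aside
# instead of removing it, so it can still be inspected later.
backup_if_empty() {
    target=$1
    # -f: regular file exists; ! -s: its size is zero bytes
    if [ -f "$target" ] && [ ! -s "$target" ]; then
        mv "$target" "$target.bak.$(date +%Y%m%d%H%M%S)"
        echo "moved"
    else
        # Non-empty or missing: leave it alone and investigate by hand
        echo "kept"
    fi
}
```

With the node stopped, it could be called as backup_if_empty /var/lib/cassandra/metadata/nodes/local, using the path from the listing above.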

Hope It Helps
Prashant Dixit

Posted in Advanced, troubleshooting

High Latch Free waits due to Result Cache: RC Latch contention

Posted by FatDBA on June 16, 2020

Hi Everyone,

This is one of the more recent issues I worked on. On one of our 12c (12.1.0.2.0) non-RAC production boxes we suddenly started seeing a huge spike in database workload (AAS). As usual I started with some live monitoring using scripts, the OEM Performance page and other tools, and saw a large number of sessions waiting on ‘latch free’ (wait class Other). That event accounted for more than 87% of total DB time, with an exceptionally high average wait of 54 ms (yes, that’s far too much for latch gets). What also caught my eye was ‘enq: RC – Result Cache: Contention’ with an average wait of 19.31 ms.

But let’s not jump straight to conclusions. Next I checked the SQL with excessive buffer gets (SQL ordered by Gets) and found the statement behind the mess, with an exceptionally high number of buffer gets. As you must be aware, every buffer you visit in the buffer cache requires a latch get. This one SQL was responsible for around 85% of total buffer gets and, interestingly, not a single execution had completed within the three hours covered by the AWR report.

The next task was to identify which latch was behind the ‘latch free’ waits. For that I checked the ‘Latch Sleep Breakdown’ subsection under ‘Latch Statistics’ to see which latch had the most misses and sleeps, and found the culprit: ‘Result Cache: RC Latch’ showed a huge number of misses and sleeps. The latch was sleeping mainly at ‘Result Cache: Serialization12’ and a little at ‘Result Cache: Serialization01’.

Just to be doubly sure, I checked the P2 value of the ‘latch free’ event, which was 559. For ‘latch free’, P2 is the latch number, so I queried V$LATCHNAME to confirm it matched what we had identified so far. And yes, it was the same RC latch!


> SELECT latch#,name FROM gv$latchname WHERE latch#=559;

    LATCH# NAME
---------- ----------------------------------------------------------------
       559 Result Cache: RC Latch
 

You can go much deeper in analyzing and troubleshooting such issues. Tanel Poder has a great presentation for anyone who wants to dig in and pinpoint addresses, code locations etc.
The document is titled Oracle_Latch_And_Mutex_Contention_Troubleshooting.

Now that the latch name and the SQL were identified, I looked at the result cache itself. The parameter result_cache_mode was set to MANUAL, which made the contention look strange, since with MANUAL only statements that explicitly ask for it use the result cache. Next we saw that the SQL using the result cache contained DS_SVC hints and was part of dynamic statistics activity.
I then looked for official documentation and luckily found an Oracle note, Doc ID 2002089.1, which matched my problem almost exactly.

As per the note, the workaround is to stop Adaptive Dynamic Statistics from using the result cache.


alter system set "_optimizer_ads_use_result_cache" = FALSE;
 

Alternatively, one-off Patch 26436717 can be requested and applied as a permanent fix.

Hope It Helps
Prashant Dixit

Posted in Advanced

Golden Gate 19.1.0 – VIEW REPORT & VIEW PARAMS command failed

Posted by FatDBA on March 24, 2020

Hi Folks,

I would like to share a recent problem I faced with Golden Gate release 19.1.0, where two of the most frequently used commands, VIEW REPORT and VIEW PARAMS, simply fail in the administration client console. This was a Windows server environment where I was doing some testing with the new release.
The commands produce no additional logs or information; the viewer tries to come up and crashes within a few seconds.

I had no idea what the problem was, and with this being a new release of Oracle GG (19.1) there was no Metalink note or public documentation available, so I contacted Oracle Support. They gave a very simple but logical solution: set the default text viewer program for parameter and report files once again with the following command. It worked.


SET PAGER notepad 


They later declared this issue a bug, number 30427030, but the bug details are yet to be published on Metalink.


Stay inside, learn something new during this self isolation & stop the spread of Covid-19.
Stay Healthy and Safe Everyone!

Prashant Dixit
 

Posted in troubleshooting

How to enable RAT feature and some Troubleshooting

Posted by FatDBA on February 28, 2020

Hi Everyone,

This post is all about preparing your database before you start using RAT (Real Application Testing) on it, as it’s a little tricky. You have to enable the feature at the binary level before you start using RAT replay, and you also have to apply a few mandatory RAT-related patches to your database.

Here I cover enabling the feature at the binary level and one issue I hit immediately after doing so.

My database, an 11.2.0.4.0 2-node RAC, has the RAT option set to FALSE, which means it is not yet enabled.


SQL> select * from v$option where parameter like 'Real Application Testing';

PARAMETER                                                        VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
Real Application Testing                                         FALSE
 

The next step is to enable the feature at the binary level. To do that you run the ‘make’ command with the rat_on target to relink the oracle binary in your Oracle Home. I have a 2-node RAC, so I am starting with Node 1 and will repeat the steps on Node 2 once done.


[oracle@Testdb lib]$ make -f ins_rdbms.mk rat_on ioracle
/usr/bin/ar d /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/libknlopt.a kecnr.o
/usr/bin/ar cr /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/libknlopt.a /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/kecwr.o
chmod 755 /u01/app/oracle/product/11.2.0/dbhome_1/bin

 - Linking Oracle
rm -f /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/oracle
gcc  -o /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/oracle -m64 -z noexecstack -L/u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/ -L/u01/app/oracle/product/11.2.0/dbhome_1/lib/ 
-L/u01/app/oracle/product/11.2.0/dbhome_1/lib/stubs/   -Wl,-E /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/opimai.o /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/ssoraed.o /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/ttcsoi.o  -Wl,--whole-archive -lperfsrv11 -Wl,--no-whole-archive /u01/app/oracle/product/11.2.0/dbhome_1/lib/nautab.o /u01/app/oracle/product/11.2.0/dbhome_1/lib/naeet.o /u01/app/oracle/product/11.2.0/dbhome_1/lib/naect.o /u01/app/oracle/product/11.2.0/dbhome_1/lib/naedhs.o /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/config.o  -lserver11 -lodm11 -lcell11 -lnnet11 -lskgxp11 -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lclient11  -lvsn11 -lcommon11 -lgeneric11 -lknlopt `if /usr/bin/ar tv /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/libknlopt.a | grep xsyeolap.o > /dev/null 2>&1 ; then echo "-loraolap11" ; fi` -lslax11 -lpls11  -lrt -lplp11 -lserver11 -lclient11  -lvsn11 -lcommon11 -lgeneric11 `if [ -f /u01/app/oracle/product/11.2.0/dbhome_1/lib/libavserver11.a ] ; then echo "-lavserver11" ; else echo "-lavstub11"; fi` `if [ -f /u01/app/oracle/product/11.2.0/dbhome_1/lib/libavclient11.a ] ; then echo "-lavclient11" ; fi` -lknlopt -lslax11 -lpls11  -lrt -lplp11 -ljavavm11 -lserver11  -lwwg  `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnnz11 -lzt11 -lmm -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lztkg11 `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnro11 `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/ldflags`    -lncrypt11 -lnsgr11 -lnzjs11 -ln11 -lnl11 -lnnz11 -lzt11   -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 
-lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 `if /usr/bin/ar tv /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/libknlopt.a | grep "kxmnsd.o" > /dev/null 2>&1 ; then echo " " ; else echo "-lordsdo11"; fi` -L/u01/app/oracle/product/11.2.0/dbhome_1/ctx/lib/ -lctxc11 -lctx11 -lzx11 -lgx11 -lctx11 -lzx11 -lgx11 -lordimt11 -lclsra11 -ldbcfg11 -lhasgen11 -lskgxn2 -lnnz11 -lzt11 -lxml11 -locr11 -locrb11 -locrutl11 -lhasgen11 -lskgxn2 -lnnz11 -lzt11 -lxml11  -loraz -llzopro -lorabz2 -lipp_z -lipp_bz2 -lippdcemerged -lippsemerged -lippdcmerged  -lippsmerged -lippcore  -lippcpemerged -lippcpmerged  -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lsnls11 -lunls11  -lsnls11 -lnls11  -lcore11 -lsnls11 -lnls11 -lcore11 -lsnls11 -lnls11 -lxml11 -lcore11 -lunls11 -lsnls11 -lnls11 -lcore11 -lnls11 -lasmclnt11 -lcommon11 -lcore11 -laio    `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/sysliblist` -Wl,-rpath,/u01/app/oracle/product/11.2.0/dbhome_1/lib -lm    `cat /u01/app/oracle/product/11.2.0/dbhome_1/lib/sysliblist` -ldl -lm   -L/u01/app/oracle/product/11.2.0/dbhome_1/lib

test ! -f /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle ||\
           mv -f /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracleO
mv /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/lib/oracle /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
chmod 6751 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
[oracle@Testdb lib]$
 

Alright, so it’s done, with no errors during the enablement process. Let’s connect to the database and check the status of the feature.


[oracle@Testdb ~]$ !sql
sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Mon Sep 16 22:14:59 2019

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected.
SQL> alter system switch logfile;

ERROR at line 1:
ORA-01012: not logged on
Process ID: 0
Session ID: 0 Serial number: 0
 

Oops, why is that? I have set the environment variables and everything is in place, so why can’t I run anything against the database, and why does it say ‘not logged on’? This is strange!

Let’s check whether anything was captured in the alert log and other traces.


Mon Jan 16 22:19:48 2019
WARNING: Oracle executable binary mismatch detected.
 Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/app/oracle/diag/rdbms/Testdb/Testdb1/trace/Testdb1_cjq0_176885.trc:
Mon Jan 16 22:19:49 2019
WARNING: Oracle executable binary mismatch detected.
 Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages
Mon Jan 16 22:19:49 2019
Errors in file /u01/app/oracle/diag/rdbms/Testdb/Testdb1/trace/Testdb1_j000_1389.trc:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 54322 (dba), current egid = 54321 (oinstall)
Mon Jan 16 22:19:49 2019
WARNING: Oracle executable binary mismatch detected.
 Binary of new process does not match binary which started instance
issue alter system set "_disable_image_check" = true to disable these messages
 

Most of this is the general “_disable_image_check” warning, which appears because we made a live change: the binary on disk no longer matches the one that was used to start the instance. It can be suppressed by setting the parameter “_disable_image_check” to true. The ORA-27303 line is the real hint, though: the instance was started with egid 54322 (dba), while new processes now run with egid 54321 (oinstall).

So the group is the thing to chase.
Next I checked the ownership and permissions of the ‘oracle’ executable under the ORACLE_HOME/bin folder.


[oracle@Testdb bin]$ ls -ll oracle
-rwsr-s--x 1 oracle oinstall 228395182 Sep 16 23:02 oracle
 


Why has the group changed from ‘dba’ to ‘oinstall’? This is strange!
Let me assign the file back to its original group.


[oracle@Testdb bin]$ chown oracle:dba oracle
 


Second, I noticed a change in the file permissions. After the chown, the setuid and setgid bits are missing (Linux clears them when a file’s ownership changes); they are present on Node 2, where we didn’t make any change. So I restored the permissions as well.


[oracle@Testdb bin]$ ls -ll oracle
-rwxr-x--x 1 oracle dba 228395182 Sep 16 23:02 oracle

[oracle@Testdb bin]$ chmod 6751 oracle

[oracle@Testdb bin]$ ls -ll oracle
-rwsr-s--x 1 oracle dba 228395182 Sep 16 23:02 oracle
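The permission check above can be sketched as a small reusable function. This is only an illustration: the function name check_oracle_binary is hypothetical, and it relies on the standard -u (setuid) and -g (setgid) operators of the shell test command rather than on parsing ls output.

```shell
#!/bin/sh
# Hypothetical helper: report whether a binary carries both the
# setuid and setgid bits (mode 6751 shows up as -rwsr-s--x in ls -l).
check_oracle_binary() {
    bin=$1
    # -u: setuid bit is set; -g: setgid bit is set
    if [ -u "$bin" ] && [ -g "$bin" ]; then
        echo "setuid+setgid present"
    else
        echo "setuid/setgid missing"
    fi
}
```

Running check_oracle_binary "$ORACLE_HOME/bin/oracle" on each node after a relink or chown gives a quick way to compare them.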
 

And it worked: no errors this time, and I was able to connect to the database using SQL*Plus.


SQL> select * from v$option where parameter like 'Real Application Testing';

PARAMETER                                                        VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
Real Application Testing                                         TRUE

SQL> select name, open_mode from v$database;

NAME      OPEN_MODE
--------- --------------------
Testdb  READ WRITE

SQL> select name, open_mode from gv$database;

NAME      OPEN_MODE
--------- --------------------
Testdb  READ WRITE
Testdb  READ WRITE
 

Hope It Helps
Prashant Dixit

Posted in Advanced

Disk Goes Offline after rebalance! – Is this due to a BUG ?

Posted by FatDBA on January 30, 2020

HI Everyone,

Today, during an activity where we migrated the ASM storage for one of our 2-node RAC clusters (running 11gR2), we ran into some weirdness. We had to rebalance the disks to copy/mirror the contents from the existing storage to the new storage before dropping the old storage partitions. The disks went offline in this multi-node ASM setup and we were left stranded, initially with no idea what was behind the behavior, but we finally located a Metalink page for the same issue.

Yes, this was due to a known bug, number 13476583.
Oracle Server (RDBMS) versions affected
The problem was introduced in:
11.2.0.2.3 Patch Set Update
11.2.0.2.4 Patch Set Update
11.2.0.2.5 Patch Set Update
and in 11.2.0.3, by the fix for bug 10040921.

Problem:
When disks are dropped, a forcible diskgroup dismount is performed on the other ASM instance(s).

Workaround or Fix:
1. The problem does not cause diskgroup corruption, so in most cases the diskgroup can simply be mounted again.
2. Apply the fix, which is available as:
Interim patch 13476583
11.2.0.2.6 Patch Set Update
11.2.0.2 Patch 17 on Windows Platforms

Oracle Notes: 245840.1

Hope That Helps
Prashant Dixit

Posted in Advanced, troubleshooting

CLSRSC-188: Failed to create keys in Oracle Local Registry

Posted by FatDBA on January 3, 2020

Hi Everyone,

Happy New Year!

So here goes my first post of 2020. This time I will discuss an error we encountered some time back while executing the all-important ‘root.sh’ script for a new 12cR2 Oracle Restart setup on RHEL7. The script ran smoothly until the point where it tries to add keys to the OLR for HASD, and then died with the error “CLSRSC-188: Failed to create keys in Oracle Local Registry”.

Below is the exact output from the root.sh run.
You will see that it throws an error saying “Site name (1819181-monkeydb) is invalid.”


[root@1819181-monkeydb gridhome]# ./root.sh

Performing root user operation.
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.2.0.1/gridhome/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/12.2.0.1/crsdata/1819181-monkeydb/crsconfig/roothas_2019-12-31_11-03-49AM.log
Site name (1819181-monkeydb) is invalid.clscfg -localadd -z  [-avlookup]
                 -p property1:value1,property2:value2...

  -avlookup       - Specify if the operation is during clusterware upgrade
  -z   - Specify the site GUID for this node
  -p propertylist - list of cluster properties and its value pairs

 Adds keys in OLR for the HASD.
WARNING: Using this tool may corrupt your cluster configuration. Do not
         use unless you positively know what you are doing.

Failed to create keys in the OLR, rc = 100, Message:

2019/12/31 11:03:56 CLSRSC-188: Failed to create keys in Oracle Local Registry
Died at /u01/app/12.2.0.1/gridhome/crs/install/oraolr.pm line 552.
The command '/u01/app/12.2.0.1/gridhome/perl/bin/perl -I/u01/app/12.2.0.1/gridhome/perl/lib -I/u01/app/12.2.0.1/gridhome/crs/install /u01/app/12.2.0.1/gridhome/crs/install/roothas.pl ' execution failed
 

It all happened because our hostname starts with a number (1819181-monkeydb). It’s a known bug that root.sh treats such a hostname as invalid, which leads to the error above.
There is another condition as well. Oracle considers only the first 15 characters of the hostname, so even a hostname that starts with a letter, such as AHOST-TEXTIBOX-09, is a problem, because its 15th character is a hyphen (-).
In such a case root.sh fails even though the hostname starts with a non-numeric character, because the 15th character is a special character.
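The two conditions above can be checked before running root.sh. Below is a minimal sketch; the function name check_gi_hostname is hypothetical, and it encodes only the two rules described in this post (a leading digit, and a non-alphanumeric 15th character), not Oracle’s full hostname validation.

```shell
#!/bin/sh
# Hypothetical pre-check for the two hostname conditions described above.
check_gi_hostname() {
    host=$1
    # Rule 1: the hostname must not start with a digit
    case $host in
        [0-9]*) echo "starts with a digit"; return 0 ;;
    esac
    # Rule 2: the 15th character (empty if the name is shorter)
    # must not be a special character such as a hyphen
    c15=$(printf '%s\n' "$host" | cut -c15)
    case $c15 in
        ''|[A-Za-z0-9]) echo "ok" ;;
        *) echo "15th character is '$c15'" ;;
    esac
}
```

For example, check_gi_hostname "$(hostname -s)" reports on the current host; the names from this post give “starts with a digit” for 1819181-monkeydb and flag the hyphen for AHOST-TEXTIBOX-09.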

Now let’s discuss the solutions.
First, you can apply merge patch 26751067 (a merge of Bug 25499276 and Bug 26581118) and re-run the root.sh script.
Second, you can change the hostname right after the failure and re-run the script; this time it goes through with no error. Below is an example.

Let’s first change the hostname quickly before we re-run root.sh.


[root@1819181-monkeydb gridhome]# cat /etc/hostname
1819181-monkeydb
[root@1819181-monkeydb gridhome]# echo A1819181-monkeydb > /etc/hostname
[root@1819181-monkeydb gridhome]# cat /etc/hostname
A1819181-monkeydb
 

To update your command prompt simply log in again, and to apply the change system-wide execute the command below.


[root@1819181-monkeydb gridhome]# systemctl restart systemd-hostnamed
[root@A1819181-monkeydb gridhome]# 

[root@1819181-monkeydb gridhome]# ./root.sh
Performing root user operation.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.2.0.1/gridhome/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/12.2.0.1/crsdata/a1819181-monkeydb/crsconfig/roothas_2019-12-31_11-17-33AM.log
LOCAL ADD MODE
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
PROT-53: The file name [/u01/app/12.2.0.1/gridhome/cdata/localhost/local.ocr] specified for the 'ocrconfig -repair', 'ocrconfig -add' or 'ocrconfig -replace' command designates an invalid storage type for the Oracle Cluster Registry.
2019/12/31 11:17:43 CLSRSC-155: Replace of older local-only OCR failed
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node a1819181-monkeydb successfully pinned.
2019/12/31 11:17:47 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'a1819181-monkeydb'
CRS-2673: Attempting to stop 'ora.evmd' on 'a1819181-monkeydb'
CRS-2677: Stop of 'ora.evmd' on 'a1819181-monkeydb' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'a1819181-monkeydb' has completed
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.

a1819181-monkeydb     2019/12/31 11:18:41     /u01/app/12.2.0.1/gridhome/cdata/a1819181-monkeydb/backup_20191231_111841.olr     0
2019/12/31 11:18:42 CLSRSC-327: Successfully configured Oracle Restart for a standalone server
 


Hope It Helps
Prashant Dixit

Posted in Advanced

FGA Error ORA-28138: Error in Policy Predicate

Posted by FatDBA on December 26, 2019

Hi Folks,

Today I am going to discuss an eerie issue we faced recently during a database switch-over (from 10gR2 to 12cR2), where the application team changed their connection strings and started pointing to the new 12c database.
Before I proceed, a quick background on this activity: this was a test (staging) database that was migrated to new infrastructure on version 12c. We used Data Pump to move data from the source to the new target, and everything went well during those steps.

Everything looked fine until the first test customer tried to log in to the application and reported that he could not connect with his credentials. The application server logs (this was a three-tier platform) captured this error message:

"java.sql.SQLException: ORA-28138: Error in Policy Predicate". 

This error prevented all users from connecting to the application after the switch-over. Apart from the regular login procedure, everything else was working fine.
The error immediately pointed us to FGA (Fine-Grained Auditing), which we had tested on a few tables some time back, including one of the base tables that records login details before access is authenticated. So we verified the FGA settings that had been migrated to the new database and found they were configured with some strange and complex audit conditions using a custom function, where someone had tried to define a subquery in the audit_condition and hadn’t tested the result.


i.e. sys.check_audited_user > 0  & sys.check_audited_user = 'XYS'. 

This is an invalid policy predicate, so all operations on the table failed, which in turn stopped users from logging in.
So this all happened because of complex predicates in audit policies, which should be avoided. The database will let you create such a policy, but statements against the table then fail with FGA predicate errors like this one. You cannot define a subquery in the audit_condition; it must be a simple predicate.

So we have two ways out of this situation.
One, you can simply drop the policy on the object to resume operations.
Two, you can write a function that evaluates the complex criteria and returns a value that can be used in a simple predicate.


Hope It Helps
Prashant Dixit

Posted in Advanced

root.sh failed on RHEL >7.3 – CLSRSC-400: A system reboot is required to continue installing

Posted by FatDBA on November 14, 2019

Hi Everyone,

I was a little occupied with a few database migrations at my end, so I wasn’t able to post regularly. The good thing is that I now have a good list of issues we faced during this end-to-end migration, and starting today I will try to share all of them.

The one I am going to discuss here is an issue we encountered while running root.sh on an ‘Oracle Restart’ setup, where the script failed with the errors below.



Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.2.0.1/grid_home/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/12.2.0.1/crsdata/testserver-monkey/crsconfig/roothas_2019-11-12_10-56-56PM.log
LOCAL ADD MODE
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node testserver-monkey successfully pinned.
2019/11/12 22:57:02 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
 2019/11/12 22:59:06 CLSRSC-400: A system reboot is required to continue installing.
The command '/u01/app/12.2.0.1/grid_home/perl/bin/perl -I/u01/app/12.2.0.1/grid_home/perl/lib -I/u01/app/12.2.0.1/grid_home/crs/install /u01/app/12.2.0.1/grid_home/crs/install/roothas.pl ' execution failed 


To understand what caused the failure, we checked the log file that the output above points to.
Below are the main lines from around the failure. They show that root.sh failed to load the ADVM/ACFS drivers on the system.



>  ACFS-9504: Copying file '/u01/app/12.2.0.1/grid_home/lib/libacfs12.so' to the path '/opt/oracle/extapi/64/acfs/orcl/1/'
>  ACFS-9308: Loading installed ADVM/ACFS drivers.
>  ACFS-9321: Creating udev for ADVM/ACFS.
>  ACFS-9323: Creating module dependencies - this may take some time.
>  ACFS-9176: Entering 'ld usm drvs'
>  ACFS-9154: Loading 'oracleoks.ko' driver.
>  modprobe: FATAL: Module oracleoks not found.
>  ACFS-9109: oracleoks.ko driver failed to load.
>  ACFS-9178: Return code = USM_FAIL
>  ACFS-9177: Return from 'ld usm drvs'
 >  ACFS-9428: Failed to load ADVM/ACFS drivers. A system reboot is recommended.
>  ACFS-9310: ADVM/ACFS installation failed.


The solution to this problem is to apply one-off patch 25078431, which fixes the issue with ACFS/ADVM drivers on RHEL > 7.3. Yes, there is a Metalink note available for it too.
But in our setup even the patch failed to fix the issue: ./gridSetup.sh -applyOneOffs returned within 1-2 seconds of running the command. In short, it did nothing and pretended to have applied the patch, while ‘opatch lspatches’ showed nothing.

We raised this with Oracle, and they passed it on to their development team as there were a lot of other things running on this DB.
And as you know, their DEV team doesn’t have any fixed SLA. There is a reason for that too: the development team does a lot of testing and regression runs, so it is somewhat acceptable.

Still, we had to fix this problem somehow, as we had an important test planned on this system.
So it was time to apply a temporary fix, of course a crude/raw one 🙂

On this system we don’t need ACFS, so we can disable the feature right at the binary level.
Below are the two files that, when renamed, disable the feature and let you bypass this root.sh check.



acfsdriverstate
acfsroot



You simply have to rename them and re-run the root.sh script. It will pass this time, and you are done with your GI installation.
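The rename step above can be sketched as a small shell function. This is only a sketch under assumptions: the directory is passed in explicitly because the exact location of the two files is not stated here, and the `.disabled` suffix and the function name `disable_acfs_tools` are hypothetical choices of mine.

```shell
#!/bin/sh
# Sketch only: rename the two ACFS helper files so that root.sh can no
# longer call them. The originals are kept under a new name so the
# change is easy to undo later.
disable_acfs_tools() {
    tools_dir=$1
    suffix=${2:-.disabled}    # hypothetical suffix, any name will do

    for tool in acfsdriverstate acfsroot; do
        if [ -f "$tools_dir/$tool" ]; then
            mv "$tools_dir/$tool" "$tools_dir/$tool$suffix" || return 1
            echo "renamed $tool -> $tool$suffix"
        else
            echo "skipped $tool (not found in $tools_dir)"
        fi
    done
}
```

For example, `disable_acfs_tools "$GRID_HOME/bin"` followed by a re-run of root.sh, where `$GRID_HOME/bin` is my assumption about where the files live; verify the location on your own system first. Renaming the files back restores the original behaviour.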


Hope It Helps
Prashant Dixit

Posted in Advanced

OPatch – Error occurred during initialization of VM, Could not reserve enough space for XXXXXXKB object heap

Posted by FatDBA on February 19, 2019

Hi Guys,

Discussing a random issue that I encountered a few hours back: a problem with the new version of OPatch which, once unzipped, generated the weird error shown below.



[oracle@gunna:~/app/oracle/product/12.2.0/dbhome_1/OPatch/28822515]$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./

Error occurred during initialization of VM
Could not reserve enough space for 39957221KB object heap


On the OCS download page for OPatch, the platform is automatically set to 32-bit (Linux x86), and a 32-bit JVM cannot reserve a heap of that size.
Check whether the name of the downloaded file is something like ‘p6880880_122010_LINUX.zip’. If yes, then you have downloaded the 32-bit version. Choose ‘Linux x86-64’ as the right version and try again.
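The file-name check above can be scripted as a quick sanity test before unzipping. This is a rough sketch: the 32-bit pattern comes from the file name quoted above, while the `Linux-x86-64.zip` ending for the 64-bit download is my assumption, and `opatch_zip_platform` is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: guess the platform of a downloaded OPatch zip from its name.
# Assumption: the 64-bit download ends in "Linux-x86-64.zip"; the 32-bit
# pattern is taken from the file name quoted in the post.
opatch_zip_platform() {
    # ${1##*/} strips any leading directory part of the path.
    case ${1##*/} in
        *[Ll]inux-x86-64.zip) echo "64-bit" ;;
        *LINUX.zip)           echo "32-bit" ;;
        *)                    echo "unknown" ;;
    esac
}
```

Calling `opatch_zip_platform p6880880_122010_LINUX.zip` reports the 32-bit build, which is the one that triggered the heap error on this 64-bit box.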

Let’s try again.



[oracle@gunna:~/app/oracle/product/12.2.0/dbhome_1/OPatch/28822515]$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./
Oracle Interim Patch Installer version 12.2.0.1.16
Copyright (c) 2018, Oracle Corporation.  All rights reserved.

PREREQ session

Oracle Home       : /home/oracle/app/oracle/product/12.2.0/dbhome_1
Central Inventory : /home/oracle/app/oraInventory
   from           : /home/oracle/app/oracle/product/12.2.0/dbhome_1/oraInst.loc
OPatch version    : 12.2.0.1.16
OUI version       : 12.2.0.1.4
Log file location : /home/oracle/app/oracle/product/12.2.0/dbhome_1/cfgtoollogs/opatch/opatch2018-12-24_00-46-02AM_1.log

Invoking prereq "checkconflictagainstohwithdetail"

Prereq "checkConflictAgainstOHWithDetail" passed.

OPatch succeeded.


All good now!

Hope It Helps
Prashant ‘Fatdba’ Dixit

Posted in troubleshooting

DBMS_XPLAN and its different options/arguments available

Posted by FatDBA on May 10, 2018

Hi Mates,

I am back after a long time to write about a vital piece of information for any troubleshooting task: the EXECUTION PLAN. There are many ways to generate CBO plans (e.g. AUTOTRACE, extended/debug traces, utlxpls.sql, V$SQL_PLAN, SQL-specific AWR reports like awrsqrpt.sql, STS, etc.), but the most common and best way of doing this is to use DBMS_XPLAN and its function DISPLAY_CURSOR.

So today’s topic is to understand what options we have to generate more detailed and elaborate plans.

Okay, let’s start with a very rudimentary (BASIC) plan and then gradually work through the rest of the available arguments/options. The BASIC plan includes only the operation, its options, and the object name (table, index, MV, etc.).


SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('0h79fq6vx4p99',0,'BASIC'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

------------------------------------
| Id  | Operation         | Name   |
------------------------------------
|   0 | SELECT STATEMENT  |        |
|   1 |  TABLE ACCESS FULL| DIXEMP |
------------------------------------

Next is the most common way of generating plans: the DISPLAY function, which displays the execution plan stored in the plan table. First we explain a SQL statement.
Then we use the DBMS_XPLAN.DISPLAY function to display the execution plan.


SQL> explain plan for select * from dixemp;
Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 3980174453

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |    14 |   532 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DIXEMP |    14 |   532 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Next is the ‘ALLSTATS LAST’ option for the FORMAT parameter. Let’s see how to use it.
ALLSTATS is a shortcut for IOSTATS and MEMSTATS (e.g. BUFFER information), and the additional keyword LAST can be specified to see only the statistics for the last execution.


SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format=>'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

--------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      1 |        |     14 |00:00:00.01 |      10 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      1 |     14 |     14 |00:00:00.01 |      10 |
--------------------------------------------------------------------------------------

But if you take a look at the above plan, you’ll see that it doesn’t contain a few of the vital columns such as COST and bytes processed (BYTES), as they are not included by default with ALLSTATS. You have to add a few more modifiers to your FORMAT parameter to get that info: +cost and +bytes.

Below is how you can get that missing info from the plan.


SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(sql_id=>'0h79fq6vx4p99',format=>'ALLSTATS LAST +cost +bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

-----------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      1 |        |       |     3 (100)|     14 |00:00:00.01 |      10 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      1 |     14 |   532 |     3   (0)|     14 |00:00:00.01 |      10 |
-----------------------------------------------------------------------------------------------------------

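The ALL format gives you the general plan plus the remaining details, such as Query Block Name / Object Alias, Predicate Information, and Column Projection Information. The Predicate Information section is missing from the output below simply because this query has no predicates.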


SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('0h79fq6vx4p99',0,'ALL'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS FULL| DIXEMP |    14 |   532 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / DIXEMP@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "DIXEMP"."EMPNO"[NUMBER,22], "DIXEMP"."ENAME"[VARCHAR2,10],
       "DIXEMP"."JOB"[VARCHAR2,9], "DIXEMP"."MGR"[NUMBER,22],
       "DIXEMP"."HIREDATE"[DATE,7], "DIXEMP"."SAL"[NUMBER,22],
       "DIXEMP"."COMM"[NUMBER,22], "DIXEMP"."DEPTNO"[NUMBER,22]

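ALLSTATS on its own is a shortcut for IOSTATS and MEMSTATS (e.g. BUFFER information) with your plan. Without the LAST keyword the statistics are cumulative over all executions of the cursor, which is why Starts shows 7 and A-Rows shows 98 (7 x 14) below.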


SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('0h79fq6vx4p99',0,'ALLSTATS'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

--------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      7 |        |     98 |00:00:00.01 |      70 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      7 |     14 |     98 |00:00:00.01 |      70 |
--------------------------------------------------------------------------------------

Now, if you want to view additional details of your plan, for example the set of hints needed to reproduce the plan (the OUTLINE), you can add +outline to your format parameter.


SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(sql_id=>'0h79fq6vx4p99',format=>'ALLSTATS LAST +outline'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

--------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      1 |        |     14 |00:00:00.01 |      10 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      1 |     14 |     14 |00:00:00.01 |      10 |
--------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "DIXEMP"@"SEL$1")
      END_OUTLINE_DATA
  */

You can add or remove any data/stats from the plan using a + or - sign followed by the argument.
Example: if you want to view cost and bytes information, use +cost and +bytes in your format; if you want to remove the same info from your run of dbms_xplan, use -cost and -bytes.


SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(sql_id=>'0h79fq6vx4p99',format=>'ALLSTATS LAST +cost +bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

-----------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      1 |        |       |     3 (100)|     14 |00:00:00.01 |      10 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      1 |     14 |   532 |     3   (0)|     14 |00:00:00.01 |      10 |
-----------------------------------------------------------------------------------------------------------

You can also mix both in a single format string, as below.


SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '0h79fq6vx4p99', cursor_child_no => 0, FORMAT => 'TYPICAL -ROWS -BYTES +COST +PARALLEL +PARTITION +IOSTATS +MEMSTATS +ALIAS +PEEKED_BINDS +OUTLINE +PREDICATE -PROJECTION +REMOTE +NOTE'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------
SQL_ID  0h79fq6vx4p99, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from dixemp

Plan hash value: 3980174453

--------------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name   | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |      7 |        |     3 (100)|          |     98 |00:00:00.01 |      70 |
|   1 |  TABLE ACCESS FULL| DIXEMP |      7 |     14 |     3   (0)| 00:00:01 |     98 |00:00:00.01 |      70 |
--------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / DIXEMP@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      DB_VERSION('12.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "DIXEMP"@"SEL$1")
      END_OUTLINE_DATA
  */
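If you find yourself typing these format strings over and over, the + and - modifiers shown above can be assembled by a small shell helper that only prints the query text. This is just a convenience sketch: `xplan_query` is a name I made up, it does not validate the modifiers, and it does not connect to the database.

```shell
#!/bin/sh
# Sketch only: print a DBMS_XPLAN.DISPLAY_CURSOR query for a SQL_ID,
# starting from a base format and appending any +/- modifiers.
# It builds text only; nothing is executed against the database.
xplan_query() {
    if [ "$#" -lt 2 ]; then
        echo "usage: xplan_query SQL_ID BASE_FORMAT [MODIFIER...]" >&2
        return 1
    fi
    sql_id=$1
    fmt=$2                     # base format, e.g. TYPICAL or 'ALLSTATS LAST'
    shift 2
    for modifier in "$@"; do
        fmt="$fmt $modifier"   # e.g. +cost, -bytes, +outline
    done
    printf "SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(sql_id => '%s', format => '%s'));\n" "$sql_id" "$fmt"
}
```

For instance, `xplan_query 0h79fq6vx4p99 'ALLSTATS LAST' +cost +bytes` prints the same kind of statement used earlier in this post, which you can then paste into (or pipe to) your SQL client.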

Hope It Helps
Prashant Dixit

Posted in Advanced
