Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Archive for January, 2020

Oracle Real Application Testing (RAT) – Part 1: What it is ?

Posted by FatDBA on January 31, 2020

Hi Guys,

As committed I am back with the first edition or the post on Oracle RAT (Real Application Testing) and there be couple more follow up chapters on the same in next few days or weeks.

Alright, recently during one of our mission-critical production database migration we reached a point where we had to perform the Load Test before pushing the real-time workload on to this new system. I was asked to prepare the strategy and to pick the best possible tool to access the performance of the performance of this new system and how it will respond to the current traffic.

Received lot’s of suggestions from rest of the team, i.e. Swingbench, Loadrunner, Orion etc. but most of them are with a predefined set of Supplied Benchmarks though few are customized but are more related to the server performance and bechmarking but not at the Database or SQL level. And considering the notorious behavior of many of the custom code and legacy application modules I was more leaned towards picking a tool which covers both Database and SQL, and we finally agreed on Oracle RAT.

Oracle Real Application Testing, an option that comes with Oracle Enterprise Edition. Oracle Real Application Testing helps you to test the real-life workload after changes on the database such as database upgrades, OS upgrades, parameter changes, hardware replacement, etc. So, in short the Oracle RAT will be system stress test tool to simulate production load. Introduced in Oracle 11g Release 1. But yes, it’s not free and comes with additional cost and licenses.

There are two features “Database Replay” and “SQL Performance Analyzer” will help fine-tuning on the database before passing production.
I will cover more about the ‘Database Replay’ feature here and might cover the ‘SQL Performance Analyzer’ feature later.

When can you use RAT – “Database Replay” feature?
System Changes
– Hardware replacement such as CPU, RAM, etc.
– Database and OS upgrades
– Storage changes (OCFS2 – ASM)
– OS changes (Windows – Linux)
Configuration Changes
– Single Instance – RAC– Patch installation– Database parameter change

Which database versions are supported?
The workload capture process is supported on the Oracle Database 10g R2 (10.2.0.4) and above versions. The worload replay process is supported on the Oracle Database 11g R1 and above versions.

How to do it, where to start and all ?
Well there are two different ways you can perform the RAT (DB Replay) testing
– Using Oracle Enterprise Manager (OEM) : This option is entirely GUI based where you select your source and target systems and by doing all those clicks performs this stress/load testing on the system.
– Using command line way (My preferred way of doing this, yes I am ‘old school’) using DBMS_WORKLOAD_CAPTURE & DBMS_WORKLOAD_REPLAY procedures.

Some High Level Steps:
– Capture workload into capture files (In the form of .rec files, are flat files)
– Copy files to test system and preprocess them (to make them machine understandable)
– Replay files on test system (play the recorded files)
– Perform detailed analysis of workload capture and replay using reports generated by Database Replay. (Reporting for bench markings)

ON SOURCE System:
dbms_workload_capture.start_capture 
dbms_workload_capture.finish_capture; 

Copy the workload files to the client system. For example: – /home/oracle/rat/test1

On TARGET System:
1. dbms_workload_replay.process_capture 
2. dbms_workload_replay.initialize_replay 
3. dbms_workload_replay.prepare_replay 
4. Run the workload client to calibrate the replay. The calibration process (mode=CALIBRATE) recommends the number of client processes required to perform the replay
5. Replay the workload using below command.
dbms_workload_replay.start_replay; 

Components: The ARCHITECTURE (Simplified)

DB REPLAY “The Big Picture”

What is a Workload Client ?
The REPLAY uses wrc clients – Which are multi-threaded JAVA clients and can be started on the same machine/host or on separate hosts.
Will cover about them more in depth in future posts.


[oracle@PDIXIT:RAT]$ wrc system/XXXX mode=calibrate replaydir=/DBCapture/RAT/RAT_13DEC15_19_17
 Workload Replay Client: Release 11.2.0.4.0 - Production on Sat Dec 16 05:50:39 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
 
 
Report for Workload in: /DBCapture/RAT/RAT_13DEC16_19_17
-----------------------
Recommendation:
Consider using at least 13 clients divided among 4 CPU(s)
You will need at least 168 MB of memory per client process.
If your machine(s) cannot match that number, consider using more clients.
 
Workload Characteristics:
- max concurrency: 575 sessions
- total number of sessions: 1729
 
Assumptions:
- 1 client process per 50 concurrent sessions
- 4 client process per CPU
- 256 KB of memory cache per concurrent session
- think time scale = 100
- connect time scale = 100
- synchronization = TRUE
 

Now how to compare/benchmark ?
At the end of both CAPTURE & REPLAY methods you need to generate few process specific report.
Few of the important files that help in benchmarking are:
AWR Reports: Generate the AWR reports for the same time interval when we have any of the two process were in progress. The BEGIN AND END Snaps can be collected from DBA_WORKLOAD_CAPTURES & DBA_WORKLOAD_REPLAYS
CAPTURE/REPLAY Reports: These reports are specific to workload capture and playing on target.
Capture Vs Replay reports.

Hope It Helps
Prashant Dixit

Advertisement

Posted in Advanced, troubleshooting | Tagged: , | Leave a Comment »

Disk Goes Offline after rebalance! – Is this due to a BUG ?

Posted by FatDBA on January 30, 2020

HI Everyone,

Today during one of the activity where we migrated the ASM Storage for one of our 2 Node RAC cluster (running on 11gR2), where we had to perform the disk rebalancing to copy/mirror the contents from older/existing storage to the new storage before we go and drop the older storage partitions, we faced some some weirdness. The disks goes offline in this multi-node ASM and we left stranded with initially no idea behind this behavior, but finally we were able to locate a metalink page for the same issue.

Yes, this was due to a known Bug with number 13476583
Oracle Server (Rdbms) Version
This problem is introduced in the
11.2.0.2.3 Patch Set Update
11.2.0.2.5 Patch Set Update
11.2.0.2.4 Patch Set Update
11.2.0.2.3 Patch Set Update
and in 11.2.0.3, by the fix for bug 10040921.

Problem:
When disks are dropped, a forcible diskgroup dismount is performed on other ASM instance/s.

Workaround or Fix:
1. The problem does not cause diskgroup corruption. So mostly diskgroup can be mounted again.
2. Apply fix
Interim patches here: Patch:13476583
11.2.0.2.6 Patch Set Update
11.2.0.2 Patch 17 on Windows Platforms

Oracle Notes: 245840.1

Hope That Helps
Prashant Dixit

Posted in Advanced, troubleshooting | Tagged: , | Leave a Comment »

CLSRSC-188: Failed to create keys in Oracle Local Registry

Posted by FatDBA on January 3, 2020

Hi Everyone,

Happy New Year!

So here goes my first post for Year 2020. This time I will be discussing an error that we encountered some time back while executing the important ‘root.sh’ script for a new 12cR2 Oracle Restart setup on RHEL7. The script was going smooth till the point where it tries to add keys in OLR for HASD and died with error “CLSRSC-188: Failed to create keys in Oracle Local Registry”.

Below is the exact error what we get during the root.sh run.
Here you will that it was throwing an error which says “Site name (1819181-monkeydb) is invalid.clscfg”.


[root@1819181-monkeydb gridhome]# ./root.sh

Performing root user operation.
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.2.0.1/gridhome/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/12.2.0.1/crsdata/1819181-monkeydb/crsconfig/roothas_2019-12-31_11-03-49AM.log
Site name (1819181-monkeydb) is invalid.clscfg -localadd -z  [-avlookup]
                 -p property1:value1,property2:value2...

  -avlookup       - Specify if the operation is during clusterware upgrade
  -z   - Specify the site GUID for this node
  -p propertylist - list of cluster properties and its value pairs

 Adds keys in OLR for the HASD.
WARNING: Using this tool may corrupt your cluster configuration. Do not
         use unless you positively know what you are doing.

Failed to create keys in the OLR, rc = 100, Message:

2019/12/31 11:03:56 CLSRSC-188: Failed to create keys in Oracle Local Registry
Died at /u01/app/12.2.0.1/gridhome/crs/install/oraolr.pm line 552.
The command '/u01/app/12.2.0.1/gridhome/perl/bin/perl -I/u01/app/12.2.0.1/gridhome/perl/lib -I/u01/app/12.2.0.1/gridhome/crs/install /u01/app/12.2.0.1/gridhome/crs/install/roothas.pl ' execution failed
 

It all happened because our hostname started with a number (1819181-monkeydb) and it’s a known bug that makes the hostname as invalid for root.sh and therefore the above error comes up.
There is a another condition as well, suppose your hostname starts with a alphabet (AHOST-TEXTIBOX-09) but as there is a limit of 15 characters which oracle considers for the hostname, and here in our example the 15th character is a hyphen (-).
So, even in such a case the root.sh will fail even when the hostname starts with a non-numeric character but it’s 15th character is a special character.

Now let’s discuss the solutions.
First, you can apply a merge patch 26751067 (which is merge of Bugs: Bug 25499276 Bug 26581118) and re-run the root.sh script.
Second, change the hostname right after the failure and re-run the script, this time it will go through with no error. Below is an example.

Let’s first change the hostname quickly before we and re-run root.sh


[root@1819181-monkeydb gridhome]# cat /etc/hostname
1819181-monkeydb
[root@1819181-monkeydb gridhome]# echo A1819181-monkeydb > /etc/hostname
[root@1819181-monkeydb gridhome]# cat /etc/hostname
A1819181-monkeydb
 

To update your command prompt simply re-login and to apply this change system wide execute below.


[root@1819181-monkeydb gridhome]# systemctl restart systemd-hostnamed
[root@A1819181-monkeydb gridhome]# 

[root@1819181-monkeydb gridhome]# ./root.sh
Performing root user operation.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.2.0.1/gridhome/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/12.2.0.1/crsdata/a1819181-monkeydb/crsconfig/roothas_2019-12-31_11-17-33AM.log
LOCAL ADD MODE
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
PROT-53: The file name [/u01/app/12.2.0.1/gridhome/cdata/localhost/local.ocr] specified for the 'ocrconfig -repair', 'ocrconfig -add' or 'ocrconfig -replace' command designates an invalid storage type for the Oracle Cluster Registry.
2019/12/31 11:17:43 CLSRSC-155: Replace of older local-only OCR failed
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node a1819181-monkeydb successfully pinned.
2019/12/31 11:17:47 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'a1819181-monkeydb'
CRS-2673: Attempting to stop 'ora.evmd' on 'a1819181-monkeydb'
CRS-2677: Stop of 'ora.evmd' on 'a1819181-monkeydb' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'a1819181-monkeydb' has completed
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.

a1819181-monkeydb     2019/12/31 11:18:41     /u01/app/12.2.0.1/gridhome/cdata/a1819181-monkeydb/backup_20191231_111841.olr     0
2019/12/31 11:18:42 CLSRSC-327: Successfully configured Oracle Restart for a standalone server
 


Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: | Leave a Comment »

 
%d bloggers like this: