Tales From A Lazy Fat DBA

It's all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯


Real Application Testing for Capture and Replay in a PDB, a great addition was made in 19c

Posted by FatDBA on November 20, 2022

I have used RAT many times in the past, and it is without doubt the best tool to simulate Oracle database workloads. It empowers us, the DBAs, to make proactive changes while knowing exactly what to expect when a change is implemented, and it minimizes the likelihood of any negative repercussions.

Once again I got a chance to use it during a migration project where we moved to a Multitenant 19.15 environment. Starting with Database Release 19c, you can capture and replay the workload from within an individual pluggable database (PDB). This enhancement enables you to capture and replay workloads at the PDB level, which leads to better testing, less downtime, and more effective and efficient change control. Earlier this was not possible: capture and replay of multitenant databases was limited to the root multitenant container database (CDB) level.

Today's post shows how to do that, using a short demo on one of my test builds.

-- For testing purposes, a small shell script to generate some workload against the PDB.
[oracle@oracleontario ~]$cat pdbtest_fatdba.sh
while sleep 1; do \
sqlplus <USERNAME1>/<PASSWORD1>@<PDB1_CONNECTION_STRING> &>/dev/null <<EOF
set head off
set pages 0
select count(*) from dixit.bigtab;
exit
EOF
done




-- Create a capture directory on the source database server. This is where all your capture specific files will reside. 
[oracle@oracleontario ~]$sqlplus <USERNAME1>/<PASSWORD1>@<PDB1_CONNECTION_STRING>

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Nov 19 09:05:12 2022
Version 19.15.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Sat Nov 19 2022 09:03:41 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0

SQL> CREATE or replace DIRECTORY mycapture_dir AS '/ontadomain/home/capture_pdb_fatdbatest';

Directory created.

SQL> quit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0


-- Let's start the test: call the shell script we prepared at the beginning.
[oracle@oracleontario ~]$sh ./pdbtest_fatdba.sh&
[1] 2189


Now capture the workload: use DBMS_WORKLOAD_CAPTURE.start_capture to collect the workload data. Once the capture is complete, copy all the files from the source to the target server, where they will be replayed.

[oracle@oracleontario ~]$sqlplus <USERNAME1>/<PASSWORD1>@<PDB1_CONNECTION_STRING>

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Nov 19 09:05:41 2022
Version 19.15.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Sat Nov 19 2022 09:05:39 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0

SQL> BEGIN
DBMS_WORKLOAD_CAPTURE.start_capture (name => 'PDB_666', dir => 'mycapture_dir', duration => 900);
END;
/

PL/SQL procedure successfully completed.

SQL> col name for a10
SQL> col status for a22
SQL> SELECT id, name, STATUS FROM dba_workload_captures WHERE name='PDB_666';

        ID NAME       STATUS
---------- ---------- --------------------
        31 PDB_666      IN PROGRESS

SQL> BEGIN
DBMS_WORKLOAD_CAPTURE.finish_capture;
END;
/  

PL/SQL procedure successfully completed.

SQL> quit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0
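
Before copying the capture files across to the target server, it is also worth exporting the AWR data for the capture so that it travels along with the capture directory and can be imported on the replay side. A minimal sketch, assuming capture ID 31 as returned by the DBA_WORKLOAD_CAPTURES query above:

-- Export the AWR data belonging to capture ID 31 into the capture directory
SQL> BEGIN
DBMS_WORKLOAD_CAPTURE.export_awr (capture_id => 31);
END;
/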

Next, we have to create the replay directory structure, process the captured files and initialize the replay.

[oracle@oracleontario ~]$sqlplus <USERNAME1>/<PASSWORD1>@<PDB1_CONNECTION_STRING>

SQL*Plus: Release 19.0.0.0.0 - Production on Sun Mar 1 01:20:18 2022
Version 19.15.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Sat Nov 19 2022 09:06:37 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.15.0.0.0

SQL> CREATE or REPLACE DIRECTORY replaydir_fatdbatest1 AS '/ontadomain/home/replay_pdb_fatdbatest';

Directory created.

SQL>




-- Process all files that we have captured during the capture step. 
SQL> BEGIN
DBMS_WORKLOAD_REPLAY.process_capture('replaydir_fatdbatest1');
END;
/  

PL/SQL procedure successfully completed.

SQL>





-- Initialize the replay once the processing step above has successfully completed. 
SQL> BEGIN
DBMS_WORKLOAD_REPLAY.initialize_replay (replay_name => 'REPLAY_FATDBA1', replay_dir => 'replaydir_fatdbatest1');
END;
/  

PL/SQL procedure successfully completed.

SQL>

Next we have to remap connections, prepare the replay process and start the replay clients.

SQL> select conn_id,schedule_cap_id,capture_conn,replay_conn  from dba_workload_connection_map;

   CONN_ID SCHEDULE_CAP_ID CAPTURE_CONN                                       REPLAY_CONN
---------- --------------- -------------------------------------------------- --------------------------------------------------
         1                 (DESCRIPTION=(CONNECT_DATA=(SERVER=DEDICATED)(SERV
                           ICE_NAME=<PDB_SERVICE_NAME>)(CID=(PROGRAM=sqlplus)
                           (HOST=<HOSTNAME>)(USER=oracle)))(A
                           DDRESS=(PROTOCOL=TCP)(HOST=<IP_ADDRESS>)(PORT=<port>
                           )(HOSTNAME=<HOSTNAME>)))


SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION (connection_id => 1,
                           replay_connection => '(DESCRIPTION=(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=xxxx)(CID=(PROGRAM=sqlplus)(HOST=<HOSTNAME>)(USER=xxxxx)))(ADDRESS=(PROTOCOL=TCP)(HOST=xxxx)(PORT=xxxx)(HOSTNAME=xxxx)))');
END;
/

PL/SQL procedure successfully completed.




-- Prepare your replay. 
SQL> BEGIN
  DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY;
END;
/

PL/SQL procedure successfully completed.

SQL>



-- start the replay client. 
[oracle@oracleontario ~]$wrc <USERNAME1>/<PASSWORD1>@<PDB1_CONNECTION_STRING> replaydir=/ontadomain/home/replay_pdb_fatdbatest mode=replay CONNECTION_OVERRIDE=TRUE

Workload Replay Client: Release 19.15.0.0.0 - Production on Sun Mar 1 01:29:11 2022

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.


Wait for the replay to start (01:29:11)

Start the replay process on the target to fire the captured workload, and once it completes, generate the replay report or any comparison reports.

SQL> BEGIN
DBMS_WORKLOAD_REPLAY.start_replay;
END;
/  

PL/SQL procedure successfully completed.

SQL>

-- Verify its status
SQL> COL NAME FOR A10
SQL> SELECT id, name, STATUS FROM dba_workload_replays where name = 'REPLAY_FATDBA1';

        ID NAME            STATUS
---------- --------------- ----------------------------------------
         1 REPLAY_FATDBA1  IN PROGRESS

SQL>
SQL> SELECT id, name, STATUS FROM dba_workload_replays where name = 'REPLAY_FATDBA1';

        ID NAME            STATUS
---------- --------------  ----------------------------------------
         1 REPLAY_FATDBA1  COMPLETED

SQL>




-- Now that it's completed, we can generate the REPLAY report or other reports such as the compare report. 
SQL> set long 50000
SQL> set pages 50000
SQL> set lines 500
SQL> select dbms_workload_replay.report (replay_id => 1, format=> 'TEXT') from dual;

DBMS_WORKLOAD_REPLAY.REPORT(REPLAY_ID=>1,FORMAT=>'TEXT')
--------------------------------------------------------------------------------
DB Replay Report for REPLAY_FATDBA1
--------------------------------------------------------------------------------
-------------------

--------------------------------------------------------------------------
| DB Name  | DB Id      | Release    | RAC | Replay Name | Replay Status |
--------------------------------------------------------------------------
| FATDBA12 | 1218318131 | 19.15.0.0.0 | NO  | REPLAY_FATDBA1    | COMPLETED     |
--------------------------------------------------------------------------
... ...
..........
.............
......................
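
Since this replay ran on a test build, a plain replay report was enough, but for real migrations the capture-vs-replay comparison is usually more interesting. A hedged sketch using DBMS_WORKLOAD_REPLAY.COMPARE_PERIOD_REPORT (replay ID 1 from the query above; passing NULL as the second replay ID is assumed here to compare the replay against its own capture period):

SQL> VARIABLE cmp_report CLOB
SQL> BEGIN
DBMS_WORKLOAD_REPLAY.compare_period_report (
   replay_id1 => 1,
   replay_id2 => NULL,   -- NULL: compare the replay against the capture period (assumption)
   format     => DBMS_WORKLOAD_REPLAY.TYPE_HTML,
   result     => :cmp_report);
END;
/
SQL> PRINT cmp_report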

Hope It Helped!
Prashant Dixit



Some of the new exotic wait events in Oracle 21c …

Posted by FatDBA on November 12, 2022

Hi Guys,


Lately I was doing tests on Oracle 21c and came across a few of the exotic new wait events added in this release. I was using 21.3.0.0.0 and noticed several new Exadata, ASM, Data Guard, AI/ML, Exascale, Cluster, Exadata persistent memory, CTWR etc. specific waits added to the database. I also noticed some strange looking waits, i.e. 'spare 2', 'spare 3', 'spare 4' …

Being an innovation release, there is no document available or published for customers on Metalink yet for any of these new waits.
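
That said, the events themselves are visible in the instance, so you can pull the list straight from V$EVENT_NAME. A quick sketch I used to group some of the new names by their wait class (the LIKE patterns are just examples):

-- List some of the new 21c wait events and their wait classes
SQL> select name, wait_class
  from v$event_name
 where name like 'cell smart%'
    or name like 'spare%'
    or name like 'enq: AI -%'
 order by wait_class, name;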

A couple of new 'Bloom Filter' related waits added in 21c:
Bloom Filter EOF
Bloom Filter Fetch

A bunch of new Exadata 'smart index/table scan' related, User I/O class wait events added in 21c:
cell smart index scan request
cell smart index scan: db timezone upgrade
cell smart index scan: disabled by user
cell smart index scan: pass through
cell smart table scan request
cell smart table scan: db timezone upgrade
cell smart table scan: disabled by user
cell smart table scan: pass through
cell worker online completion
cell worker retry

A few of the new machine learning & artificial intelligence related wait events added in 21c:
enq: AI - Seek operation
enq: AI - dequeue operation
enq: AI - lwm update operation
enq: AI - remove subscriber
enq: AI - start cross operation
enq: AI - stop cross operation
enq: AI - truncate operation

Some of the new RAC (Cluster class), global cache (cache fusion) block related waits added in 21c:
gc cr block direct read
gc current block direct read

Some of the new EDSLib/EGSLib latch waits, which are used to glue the code between the RDBMS and Exascale storage, added in 21c are given below.
latch: EDSLib File
latch: EDSLib Message
latch: EDSLib Root
latch: EGSLib Root
latch: EGSLibDS Root
latch: EGSLibDataStore Object
latch: EGSLibDataStoreShard Object
latch: EGSLibForum Object
latch: EGSLibNet Root
latch: EGSLibSP Root
latch: EGSLibStoragePool Object
latch: EGSLibStoragePoolRing Object
latch: FOB s.o list parent latch

This is by far the most interesting set: these have no real names and seem to be some sort of spare or placeholder wait event slots. I am familiar with a few of the other 'spare' events, i.e. "enq: BS – krb spare", "enq: SP..xx", "enq: SX.xx.xxx", but these are different:
spare 10
spare 2
spare 3
spare 4
spare 5
spare 6
spare 7
spare 8
spare 9

A couple of new ASM related wait events were added too:
enq: AM - ASM Scrubbing
enq: AM - ASM used quota calculation

Some of the new enqueue waits on the change tracking (CTWR) file:
enq: CT - CTWR DBA buffer resize
enq: CT - CTWR datafile sync
enq: CT - CTWR thread checkpoint

Some of the new enqueue related waits on Exadata Persistent Memory (PMEM):
enq: FF - PMEMfs - ODM
enq: FF - PMEMfs - Param
enq: FF - PMEMfs - Resize
enq: FF - PMEMfs - aXtnd pool
enq: FF - PMEMfs - reSze pool

Hope It Helped!
Prashant Dixit


Kafka Java Client for Transactional Event Queues (TEQ), Another great feature added to Oracle 21c …

Posted by FatDBA on November 6, 2022

Hi Guys,

Recently I was testing some advanced queuing functions to set up asynchronous communication (using event queues) between a few applications and users on an Oracle 21c database, to build a messaging platform. I was looking for a way to hook up my client-side libraries so that Kafka APIs could connect to the Oracle database.

I had no clue how to do this integration, and then I came across the official documentation, which talks about a new functionality in Oracle Database Version 21c – Kafka Java Client for Transactional Event Queues (TEQ) – that enables Kafka application compatibility with Oracle Database. This provides easy migration of Kafka applications to TEQ.

Oracle Transactional Event Queue (TEQ) makes it easy to implement event-based applications. It is also highly integrated with Apache Kafka. Apart from enabling applications that use Kafka APIs to transparently operate on Oracle TEQ, Oracle TEQ also supports bi-directional information flow between TEQ and Kafka, so that changes are available in TEQ or Kafka in near real time.
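
If you want to play with TEQ on the database side before wiring up any Kafka clients, creating and starting a queue is a one-block job. A minimal sketch, assuming the 21c DBMS_AQADM.CREATE_TRANSACTIONAL_EVENT_QUEUE procedure and a user that already has AQ administrator privileges (the queue name is just an example):

SQL> BEGIN
-- Create a Transactional Event Queue and start it for enqueue/dequeue
DBMS_AQADM.create_transactional_event_queue (
   queue_name         => 'FATDBA_TEQ',
   multiple_consumers => TRUE);
DBMS_AQADM.start_queue (queue_name => 'FATDBA_TEQ');
END;
/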

Apache Kafka Connect is a framework included in Apache Kafka that integrates Kafka with other systems. Oracle TEQ provides the standard JMS package and the related JDBC and transaction packages to establish the connection and complete the transactional data flow, and it configures standard Kafka JMS connectors to establish interoperability and complete the data flow between the two messaging systems.

The official documentation on the topic: https://docs.oracle.com/en/database/oracle/oracle-database/21/adque/Kafka_cient_interface_TEQ.html#GUID-94589C97-F323-4607-8C3A-10A0EDF9DA0D

Hope It Helps!
Prashant Dixit


A cool perl script to generate AWR report time series in any given timeframe …

Posted by FatDBA on October 2, 2022

Hi All,

Recently I was doing an analysis on a slothful database where I had to generate multiple AWR reports to cover almost 12 hours of the problem period. I usually go with a few SQL scripts or AWR generator tools for my trend analysis, but this time I was a little lucky and made an unexpected and fortunate discovery on Metalink: Doc ID 2857522.1, which explains an Oracle-provided perl script to generate an AWR report time series for any given timeframe. The script works with RDBMS 12.1 and later.

The script generates all the AWR reports with [begin snap ID : end snap ID] equal to [n, n+1], with n falling into the given time interval. The script is very handy and interactive; it gives you the option to choose between standard text or HTML format and the report type (RAC or non-RAC reports). You can call it in either command-line or interactive mode.

The script generates a time series of AWR reports for a given database in an AWR repository. It connects to the database either via the ORACLE_SID on the database server or via a TNS alias; to run it, just execute perl awrdmp.pl. To connect via the ORACLE_SID, do not specify a username, password or TNS alias. For sixty-minute time-series reports, leave the frequency blank if AWR snapshots are taken every 60 minutes (the default AWR setting); set the frequency to 2 if snapshots are taken every 30 minutes, to 4 if they are taken every 15 minutes, and so on. Leaving the frequency blank generates a series based on the AWR snapshot frequency. All reports generated during the run are saved in the dumpdir directory, which is created automatically if it does not exist.

By default, AWR reports are generated in text format. To generate pluggable database AWR reports, connect to the database by specifying the username, password and TNS alias of the pluggable database.
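
Because the frequency argument depends on your AWR snapshot interval, it is worth checking the interval first; for example, snapshots every 30 minutes would mean frequency 2:

-- Check the current AWR snapshot interval and retention
SQL> select snap_interval, retention from dba_hist_wr_control;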

-- Call perl script awrdmp.pl to run the AWR extraction.
[oracle@fatdba ~]$ perl ./awrdmp.pl 
Enter usrname: - 
Enter password: - 
Enter tnsalias: - 
Enter frequency - 
Enter mode
(text/html) - text
CONNECTED AS SYSDBA  
RDBMS VERSION: 19.0.0.0.0

---- ---------- ------- ---------- ---------- -------------------- ------
NUM       DBID  INSTID     DBNAME   INSTNAME              MACHINE CONTID
---- ---------- ------- ---------- ---------- -------------------- ------
0 2511273110       2      DIXITD       fat2             racnode2      0
1 2511273110       1      DIXITD       fat1             racnode1      0

Enter database num: [0,1] -: 0
 0 2511273110       2      DIXITD       fat2             racnode2
RANGE AVAILABLE IN REPOSITORY FOR DBID 2511273110 INST 2:
------------------------------------------------------
[191 04-SEP-22 07.58.34.180 AM : 420 05-SEP-22 06.28.18.307 AM] 
Enter the minimum date interval (DD/MM/YYYY) -: 04/09/2022
Enter the maximum date interval (DD/MM/YYYY) -: 05/09/2022
GENERATING FILES
[  12 %] writing file : report_2_DIXITD_191_192.text 



--
--
--
-- In case you want to execute it in command-line mode.
perl awrdmp.pl --batch --freq 1 --instid 1 --dbid 2511273110  --dbn DIXITD --begin 04/09/2022 --end 05/09/2022 --rac --mode html


--
--
--
-- Output under dumpdir directory.
ls -ltr ./dumpdir
[...]
-rw-r--r-- 1 oracle oinstall 145147 Sep 05 14:16 report_1_DIXITD_196_197.text
-rw-r--r-- 1 oracle oinstall 159775 Sep 05 14:16 report_1_DIXITD_197_198.text
-rw-r--r-- 1 oracle oinstall 157100 Sep 05 14:16 report_1_DIXITD_198_199.text
-rw-r--r-- 1 oracle oinstall 148216 Sep 05 14:16 report_1_DIXITD_199_200.text
-rw-r--r-- 1 oracle oinstall 144003 Sep 05 14:16 report_1_DIXITD_200_201.text
-rw-r--r-- 1 oracle oinstall 146216 Sep 05 14:16 report_1_DIXITD_201_202.text

Hope It Helped
Prashant Dixit


Are you looking for a method to stop automatic SQL quarantine without disabling the entire SQL Quarantine feature ? Welcome to Oracle 21c …

Posted by FatDBA on September 23, 2022

Hi All,

Recently I was working on a 21c database for a POC where, at one point, I wanted to disable the automatic creation of SQL Quarantine configurations, but without disabling the entire statement quarantine feature. This 21.3.0.0.0 database had the Resource Manager enabled, and one of the SQLs was taking a long time to complete, longer than the allowed directive limits on IO and CPU time. As expected, the SQL was killed with the message "ORA-00040: active time limit exceeded – call aborted" and the SQL plan was quarantined. I wanted to stop the automatic creation of SQL Quarantines for the SQL in question after RM terminates it.

If you want to read more about SQL Quarantine, please see my earlier post with a demo about it.

I remember that in 19c there wasn't any way to achieve that; you could only regulate the behavior using two underscore parameters, _quarantine_enabled and _optimizer_quarantine_sql. Oracle 21c has introduced two new parameters to control the behavior of SQL Quarantine, and they solve exactly this issue.

The first one is optimizer_capture_sql_quarantine: when set to FALSE, it disables the automatic creation of SQL Quarantine configurations after Resource Manager terminates a SQL query execution. It is FALSE by default.

The second parameter is optimizer_use_sql_quarantine: when set to FALSE, it disables the use of existing SQL Quarantine configurations in the database. This parameter determines whether the optimizer considers SQL Quarantine configurations when choosing an execution plan for a SQL statement. It is TRUE by default, thereby allowing users to manually create and use SQL Quarantine configurations.
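
So, stopping the automatic creation of quarantine configurations while keeping the existing ones usable comes down to the defaults shown below; if the capture had been switched on earlier, something like this would turn it back off (a sketch from my 21.3 sandbox):

-- Stop auto-creation of SQL Quarantine configurations, keep existing ones in force
SQL> alter system set optimizer_capture_sql_quarantine = FALSE scope=both;
SQL> alter system set optimizer_use_sql_quarantine = TRUE scope=both;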

Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0

-- Default Setting
SQL> show parameter OPTIMIZER_CAPTURE_SQL_QUARANTINE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_capture_sql_quarantine     boolean     FALSE
SQL>
SQL>

-- Default Setting
SQL> sho parameter OPTIMIZER_USE_SQL_QUARANTINE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_use_sql_quarantine         boolean     TRUE
SQL>
SQL>

Hope It Helped!
Prashant Dixit


FRA full with archives ORA-38706 ORA-38708 ? Not always think of a BUG … And when that proud little DBA in me missed common sense

Posted by FatDBA on September 11, 2022

Hi All,
Some time back I was working on a 19.16 database where I had to enable FLASHBACK on the database, but I was immediately kicked out with the error "ORA-38706: Cannot turn on FLASHBACK DATABASE logging. ORA-38708: not enough space for first flashback database log file". So I was about to expand the FRA size, but first thought to check what was inside the FRA: it was 99.9% full, with 270 archive log files occupying 99.39% of the total allocated space. So far, everything looked consistent.

[oracle@fatdba ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Sep 5 19:07:28 2022
Version 19.16.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.16.0.0.0

SQL>
SQL> show parameter db_recovery_file_dest_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest_size           big integer 251G
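
-- (Query added for clarity: this is the same V$RECOVERY_FILE_DEST query shown again
--  towards the end of this post; its output follows below.)
SQL> SELECT NAME,
       (SPACE_LIMIT / 1024 / 1024 / 1024) SPACE_LIMIT_GB,
         ((SPACE_LIMIT - SPACE_USED + SPACE_RECLAIMABLE) / 1024 / 1024 / 1024) AS SPACE_AVAILABLE_GB,
       ROUND((SPACE_USED - SPACE_RECLAIMABLE) / SPACE_LIMIT * 100, 1) AS PERCENT_FULL
  FROM V$RECOVERY_FILE_DEST;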


NAME         SPACE_LIMIT_GB SPACE_AVAILABLE_GB PERCENT_FULL
------------ -------------- ------------------ ------------
+RECO                   251         .240234375         99.9
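
-- (Query added for clarity: the same V$FLASH_RECOVERY_AREA_USAGE view is queried again
--  at the end of this post; its output follows below.)
SQL> select * from v$flash_recovery_area_usage;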


FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE                             0                         0               1          0
REDO LOG                                 0                         0               0          0
ARCHIVED LOG                         99.39                         0             270          0
BACKUP PIECE                           .01                         0               2          0
IMAGE COPY                               0                         0               0          0
FLASHBACK LOG                            0                         0               0          0
FOREIGN ARCHIVED LOG                     0                         0               0          0
AUXILIARY DATAFILE COPY                  0                         0               0          0

But then I thought to check the space at the ASM level. I queried v$asm_diskgroup and the results were totally at odds with what I saw in V$RECOVERY_FILE_DEST and v$flash_recovery_area_usage: the RECO disk group (the FRA location) was almost 100% free and only 0.54% was consumed. The results were the same when I queried the RECO disk group via asmcmd.

-- results from v$asm_diskgroup view
Disk Group            Sector   Block   Allocation
Name                    Size    Size    Unit Size State       Type   Total Size (MB) Used Size (MB) Pct. Used
-------------------- ------- ------- ------------ ----------- ------ --------------- -------------- ---------
DATA                     512   4,096    1,048,576 CONNECTED   EXTERN         691,197        356,322     51.55
OCRVFDG                  512   4,096    4,194,304 MOUNTED     EXTERN          25,596            100       .39
RECO                     512   4,096    1,048,576 CONNECTED   EXTERN       1,048,575          5,645       .54
                                                                     --------------- --------------
Grand Total:                                                               1,765,368        362,067



ASMCMD> lsdg
State    Type    Rebal  Sector  Logical_Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512             512   4096  1048576    691197   334875                0          334875              0             N  DATA/
MOUNTED  EXTERN  N         512             512   4096  4194304     25596    25496                0           25496              0             N  OCRVFDG/
MOUNTED  EXTERN  N         512             512   4096  1048576   1048575  1043233                0         1043233              0             N  RECO/
ASMCMD>

At this point I started thinking about some kind of a BUG in the database; I know of a few FRA-related bugs in earlier Oracle versions. Before trying anything else, I executed dbms_backup_restore.refreshagedfiles, which refreshes the view, and at the same time tried the kra_options event, which resets v$recovery_file_dest, but no luck 😦

SQL>
SQL> alter session set events 'immediate trace name kra_options level 1';

Session altered.

SQL>  execute dbms_backup_restore.refreshagedfiles;

PL/SQL procedure successfully completed.

SQL>

And right at that point I was quite sure it was all happening due to a bug and was thinking of contacting Oracle support. But just before that, the proud little DBA in me thought: am I missing something? Are there any other areas I should explore before connecting with OCS?

I checked RMAN to see if it still had any expired archivelog entries and ran 'crosscheck archivelog all', and it identified exactly 270 older archivelogs which were non-existent. So, that proud little DBA was wrong 🙂 .. As the count matched exactly with v$flash_recovery_area_usage, I deleted all of those expired archivelogs from the catalog.

-- To Crosscheck all archivelog files present in the RMAN catalog.
RMAN> crosscheck archivelog all;

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=258 device type=DISK
validation failed for archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_1.261.1109109619 RECID=1 STAMP=1109109620
validation failed for archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_2.264.1109401205 RECID=2 STAMP=1109401206
validation failed for archived log
.....
........
..........
..............
validation failed for archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_268.463.1114203831 RECID=268 STAMP=1114203829
validation failed for archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_269.469.1114207455 RECID=269 STAMP=1114207454
validation succeeded for archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_321.531.1114624193 RECID=270 STAMP=1114624193
Crosschecked 270 objects


-- Delete expired archivelog files
RMAN>
RMAN>
RMAN> delete expired archivelog all;
...
.......
270     1    270     X 31-AUG-22
        Name: +RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_268.463.1114203831 RECID=268 STAMP=1114203829
..
.......
Do you really want to delete the above objects (enter YES or NO)? YES
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_1.261.1109109619 RECID=1 STAMP=1109109620
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_2.264.1109401205 RECID=2 STAMP=1109401206
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_3.265.1109671797 RECID=3 STAMP=1109671799
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_4.266.1109955617 RECID=4 STAMP=1109955618
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_5.267.1110233333 RECID=5 STAMP=1110233334
deleted archived log
archived log file name=+RECO/FATDBA/ARCHIVELOG/2022_07_02/thread_1_seq_6.268.1110485231 RECID=6 STAMP=1110485232
deleted archived log
Deleted 270 EXPIRED objects

RMAN>
RMAN>

And immediately I saw the change in the FRA-related dynamic views (V$RECOVERY_FILE_DEST and v$flash_recovery_area_usage), which now matched the genuine utilization at the ASM level.

[oracle@fatdba ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Sep 5 19:07:28 2022
Version 19.16.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.16.0.0.0

SQL>
SQL>
SQL> SELECT NAME,
       (SPACE_LIMIT / 1024 / 1024 / 1024) SPACE_LIMIT_GB,
         ((SPACE_LIMIT - SPACE_USED + SPACE_RECLAIMABLE) / 1024 / 1024 / 1024) AS SPACE_AVAILABLE_GB,
       ROUND((SPACE_USED - SPACE_RECLAIMABLE) / SPACE_LIMIT * 100, 1) AS PERCENT_FULL
  FROM V$RECOVERY_FILE_DEST;  

NAME         SPACE_LIMIT_GB SPACE_AVAILABLE_GB PERCENT_FULL
------------ -------------- ------------------ ------------
+RECO                   200          197.59082          1.2

SQL> select * from v$flash_recovery_area_usage;

FILE_TYPE               PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES     CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE                           .01                         0               1          0
REDO LOG                                 0                         0               0          0
ARCHIVED LOG                           .57                         0               2          0
BACKUP PIECE                           .01                       .01               2          0
IMAGE COPY                               0                         0               0          0
FLASHBACK LOG                            0                         0               0          0
FOREIGN ARCHIVED LOG                     0                         0               0          0
AUXILIARY DATAFILE COPY                  0                         0               0          0

8 rows selected.

Hope It Helped!
Prashant Dixit


A new 21c dynamic view securefile_shrink, and possibly a new BUG in 21.3. I guess I have identified a BUG

Posted by FatDBA on September 1, 2022

Recently I was doing a POC on Oracle 21c where I had to shrink a securefile LOB column, and I was aware of a new dynamic view added in Oracle 21c called V$SECUREFILE_SHRINK, which is very useful for monitoring securefile shrink operations. It gives you some great details like the start time, end time, status of the operation etc. I shrank the required securefile LOB column of a table and was keen to see what was captured in V$SECUREFILE_SHRINK … but it was empty!! I got 'no rows selected'. Why is that ??

Let me try to replicate the scenario and explain what I was doing that day.

-- Let me create a test scenario
-- Will create a brand new playground to test all things
[oracle@witnessalberta ~]$ !sq
sqlplus / as sysdba

SQL*Plus: Release 21.0.0.0.0 - Production on Sun Aug 28 10:50:26 2022
Version 21.3.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.


Connected to:
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0

SQL>
SQL>
SQL>
SQL> CREATE PLUGGABLE DATABASE pdb2 ADMIN USER pdb_adm IDENTIFIED BY oracle90 CREATE_FILE_DEST='/opt/oracle/oradata';

Pluggable database created.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 ORCLPDB1                       MOUNTED
         4 PDB2                           MOUNTED
SQL> ALTER PLUGGABLE DATABASE pdb2 OPEN READ WRITE;

Pluggable database altered.

SQL>  ALTER SESSION SET CONTAINER = PDB2;

Session altered.

SQL> SHOW CON_NAME;

CON_NAME
------------------------------
PDB2

SQL>
SQL> conn fatdba/oracle90@PDB2
Connected.
SQL>
SQL>
SQL>
SQL> show user
USER is "FATDBA"
SQL>
SQL>
SQL>
SQL>

-- Created a Table with a SECUREFILE LOB 
SQL> CREATE TABLE secure_file_tab (rid  NUMBER(5), bcol BLOB) LOB (bcol) STORE AS SECUREFILE bcol_lob 
(TABLESPACE users DISABLE  STORAGE IN ROW CHUNK 8192 RETENTION MIN 3600 KEEP_DUPLICATES NOCOMPRESS DECRYPT CACHE READS); 

Table created.

SQL>

SQL>
-- Inserted some 100000 random rows into the table
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
insert into secure_file_tab VALUES(101, utl_raw.cast_to_raw('hello, this is the first review'));
...
......
........

SQL> select count(*) from secure_file_tab;

  COUNT(*)
----------
    100001

SQL>
SQL>
-- Let's check the size of the table.
SQL> select segment_name, bytes/1024 from dba_segments where segment_name='SECURE_FILE_TAB';

SEGMENT_NAME         BYTES/1024
-------------------- ----------
SECURE_FILE_TAB      4096

-- Have deleted all rows from the table
SQL> delete from secure_file_tab where RID=101;

100000 rows deleted.

SQL> COMMIT;

SQL> select count(*) from secure_file_tab;

COUNT(*)
----------
1

-- Gathered table stats
SQL> exec dbms_stats.gather_table_stats(null, 'secure_file_tab');

PL/SQL procedure successfully completed.


SQL> select segment_name, bytes/1024 from dba_segments where segment_name='SECURE_FILE_TAB';

SEGMENT_NAME         BYTES/1024
-------------------- ----------
SECURE_FILE_TAB      113



SQL> alter table secure_file_tab modify lob(bcol) (shrink space);

Table altered.
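
Just to be sure the shrink itself did its job (the DBA_SEGMENTS figure above only covers the table segment, not the LOB segment), a quick sanity check that was not part of the original run is to size the LOB segment through DBA_LOBS:

-- Check the size of the securefile LOB segment behind BCOL
SQL> select l.column_name, s.segment_name, s.bytes/1024 as kb
  from dba_lobs l
  join dba_segments s
    on s.owner = l.owner and s.segment_name = l.segment_name
 where l.table_name = 'SECURE_FILE_TAB';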

Now it's time to check whether anything about the securefile shrink operation was captured in the v$securefile_shrink or gv$securefile_shrink dynamic views.

-- Let's check if anything was captured by the view
SQL> desc v$securefile_shrink
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 LOB_OBJD                                           NUMBER
 SHRINK_STATUS                                      VARCHAR2(40)
 START_TIME                                         TIMESTAMP(3) WITH TIME ZONE
 END_TIME                                           TIMESTAMP(3) WITH TIME ZONE
 BLOCKS_MOVED                                       NUMBER
 BLOCKS_FREED                                       NUMBER
 BLOCKS_ALLOCATED                                   NUMBER
 EXTENTS_ALLOCATED                                  NUMBER
 EXTENTS_FREED                                      NUMBER
 EXTENTS_SEALED                                     NUMBER
 CON_ID                                             NUMBER


SQL> select * from v$securefile_shrink;

no rows selected

SQL> select * from gv$securefile_shrink;

no rows selected

SQL>

Nothing!!! Why ? Why ? Let me try another way (MOVE LOB) and see if that populates anything in v$securefile_shrink or gv$securefile_shrink. I repopulated the same table with 100000 rows, deleted all rows from the table, committed the changes, regathered stats and verified the table size, same as in Test 1.

-- Changed the LOB SHRINK technique this time.
SQL> ALTER TABLE secure_file_tab MOVE LOB(BCOL) STORE AS (TABLESPACE test);

Table altered.


-- Let's check if anything was captured by the view
SQL> select * from v$securefile_shrink;

no rows selected

SQL> select * from gv$securefile_shrink;

no rows selected

SQL>

Nothing!!! Why ? Why ? Let me try one more way (SHRINK SPACE CASCADE) and see if that populates anything in v$securefile_shrink or gv$securefile_shrink. Again I repopulated the same table with 100000 rows, deleted all rows, committed the changes, regathered stats and verified the table size, same as in Test 1.

--Let me try the other way
SQL> alter table secure_file_tab MODIFY LOB(BCOL) (SHRINK SPACE CASCADE);

Table altered.

-- Let's check if anything was captured by the view
SQL> select * from v$securefile_shrink;

no rows selected

SQL> select * from gv$securefile_shrink;

no rows selected

SQL>

Nothing!!! Why ? Why ? This is super strange now. Finally, I decided to check with Oracle support and asked them to try it at their end on 21c, and they saw the same behavior at their end as well. I guess I have identified a BUG in 21.3.0.0.0!! 🙂 🙂 Currently the case is with the development team and they are taking a look at this buggy behavior on Oracle 21.5 on Red Hat Linux 7 x86_64 … Let's see how it goes from here; I will post the solution or the cause behind this ill behavior soon.

Hope It Helped!
Prashant Dixit


Prerequisite check CheckActiveFilesAndExecutables failed while applying July 2022 GI Release Update 19.16.0.0.220719, and the magic of opatchauto resume

Posted by FatDBA on August 8, 2022

Hi All,

Recently, while applying the latest (July 2022) GI Release Update 19.16.0.0.220719 on GI and DB homes, I encountered an issue where the GI patching failed with an (expected) error 'oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed' and ended with 'OPATCHAUTO-68061: The orchestration engine failed'. Below is exactly what happened …

[root@monkeybox patches]# /test/patch/dir/grid/OPatch/opatchauto apply /patchdir/july2022/34130714

OPatchauto session is initiated  
.....
..........

Performing prepatch operations on CRS - bringing down CRS service on home /test/patch/dir/grid
Prepatch operation log file location: /testdir/app/grid/crsdata/monkeyboxcrsconfig/hapatch_xxxxxx.log
CRS service brought down successfully on home /test/dir/grid

Start applying binary patch on home /oracledir/app/oracle/product/19c/dbhome
Binary patch applied successfully on home /oracledir/app/oracle/product/19c/dbhome

Start applying binary patch on home /test/patch/dir/grid
Failed while applying binary patches on home /test/patch/dir/grid   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Execution of [OPatchAutoBinaryAction] patch action failed, check log for more details. Failures:
Patch Target : monkeybox->/test/patch/dir/grid Type[siha]
Details: [
---------------------------Patching Failed---------------------------------
Command execution failed during patching in home: /test/patch/dir/grid, host: monkeybox.
Command failed:  /test/patch/dir/grid/OPatch/opatchauto  apply /patchdir/july2022/34130714 -oh /test/patch/dir/grid -target_type has -binary -invPtrLoc /test/patch/dir/grid/oraInst.loc -jre /test/patch/dir/grid/OPatch/jre -persistresult /test/patch/dir/grid/opatchautocfg/db/sessioninfo/sessionresult_monkeybox_siha_1.ser -analyzedresult /test/patch/dir/grid/opatchautocfg/db/sessioninfo/sessionresult_analyze_monkeybox_siha_1.ser
Command failure output:
==Following patches FAILED in apply:

Patch: /patchdir/july2022/34130714/33575402
Log: /test/patch/dir/grid/cfgtoollogs/opatchauto/core/opatch/xxxx.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed.

Patch: /patchdir/july2022/34130714/34133642
Log: /test/patch/dir/grid/cfgtoollogs/opatchauto/core/opatch/xxxxxxxxx.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed. >>>>>>>>>>>>>

Patch: /patchdir/july2022/34130714/34139601
Log: /test/patch/dir/grid/cfgtoollogs/opatchauto/core/opatch/xxxxxxxxx.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed. >>>>>>>>>>>>>

Patch: /patchdir/july2022/34130714/34160635
Log: /test/patch/dir/grid/cfgtoollogs/opatchauto/core/opatch/xxxxxxxxx.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed. >>>>>>>>>>>>>

Patch: /patchdir/july2022/34130714/34318175
Log: /test/patch/dir/grid/cfgtoollogs/opatchauto/core/opatch/xxxxxxxxx.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed. >>>>>>>>>>>>>

After fixing the cause of failure Run opatchauto resume  >>>>>>>>>>>>>

]
OPATCHAUTO-68061: The orchestration engine failed. >>>>>>>>>>>>>
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.

OPatchauto session completed at xxxxxxxxx
Time taken to complete the session 8 minutes, 50 seconds

This is a classic case where the patching failed because a few executables/files from the HOME were still active. You can verify the same in the standard opatchauto logging directory (cfgtoollogs) for the failed patch.

[INFO]    Prerequisite check "CheckActiveFilesAndExecutables" failed.
The details are:

Following active files/executables/libs are used by ORACLE_HOME :/test/dir/grid

/test/dir/grid/lib/libclntsh.so.19.1
/test/dir/grid/lib/libasmclntsh19.so

The easiest way to fix this issue is to find which running process is using a file, a directory or a socket, and you can do that using the fuser command. The fuser command lists the process numbers of local processes that use the local or remote files specified by the File parameter. Let's do it!

[grid@monkeybox ~]$
[grid@monkeybox ~]$ /sbin/fuser /test/dir/grid/lib/libclntsh.so.19.1
/test/dir/grid/lib/libclntsh.so.19.1: 18199m
[grid@monkeybox ~]$
[grid@monkeybox ~]$ /sbin/fuser /test/dir/grid/lib/libasmclntsh19.so
/test/dir/grid/lib/libasmclntsh19.so: 18199m
[grid@monkeybox ~]$
[grid@monkeybox ~]$
[grid@monkeybox ~]$
[grid@monkeybox ~]$ ps -ef|grep 18199
grid     18199 13587  0 09:34 pts/2    00:00:00 /test/dir/grid/perl/bin/perl -w -I /test/dir/grid/perl/lib/5.32.0 -I /test/dir/grid/perl/lib/site_perl/5.32.0 -I /test/dir/grid/lib -I /test/dir/grid/lib/asmcmd -I /test/dir/grid/rdbms/lib/asmcmd /test/dir/grid/bin/asmcmdcore
grid     29647 16974  0 10:11 pts/3    00:00:00 grep --color=auto 13610
[grid@monkeybox ~]$
[grid@monkeybox ~]$
[grid@monkeybox ~]$ kill -9 18199
[grid@monkeybox ~]$

Now that we have killed the process holding those two open files (libclntsh.so.19.1 and libasmclntsh19.so), let's resume the patch from the same spot where it left off before failing. opatchauto was able to patch the DB home before it failed while applying the patch on the GI home, so it will resume from that spot and ignore the previously applied patches. We will use the 'opatchauto resume' command, as this operation resumes a previous patching session.

opatchauto is a really powerful tool that even lets you resume a patching session when it has crashed in between for any reason, like a server crash, a reboot, or even a manual CTRL+C. The other two regular options are rollback and version.

[root@monkeybox patches]# /test/dir/grid/OPatch/opatchauto resume

OPatchauto session is initiated at xxxxxxxxx
Session log file is .....
Resuming existing session with id xxxxxx
....
.......
...............
OPatchAuto successful.

Patching is completed successfully. Please find the summary as follows:

OPatchauto session completed at xxxxxx
Time taken to complete the session 9 minutes, 12 seconds

[root@monkeybox patches]#

Hope It Helped!
Prashant Dixit


Writing custom messages to Alert Log and remembering DBMS_SYSTEM and that sweet little ksdwrt

Posted by FatDBA on August 1, 2022

Hi Guys,

Today's post is a quick one about an old package (I guess it has been there since the 8i days), a lesser known and underutilized feature at the same time.
Recently I was working on an OEM task where I had to edit the metric collection regular expressions to meet the requirements. The next question asked was – 'Let's wait until that specific incident actually happens in the database to see if it triggers the alert or not ….' Really ? Do we have to wait that long ?

No. In order to test that change, you don't have to adopt any crude method like editing the alert log file manually to write the error message there, or using UTL_FILE. There is an in-built package called dbms_system that you can use to handle such cases and write your custom messages to the alert log file. There are a few other options/routines available in the package, but this post is about a special subprogram called 'ksdwrt'.

dbms_system.ksdwrt(dest IN BINARY_INTEGER,tst IN VARCHAR2);

Here is the syntax:
execute sys.dbms_system.ksdwrt(<dest>, to_char(sysdate) || ' -- ' || '<your message>');
where the <dest> argument value can be
1 to write to the trace file,
2 to write to the alert log file, or
3 to write to both the trace file and the alert log file.

Example:
exec dbms_system.ksdwrt(3,'ORA-04031: This is a test error message, please ignore');
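
To confirm the message actually landed, you can tail the alert log, or (assuming V$DIAG_ALERT_EXT is available, as in recent releases) check it without leaving SQL*Plus:

-- Look for the test message in the alert log contents exposed through the ADR
SQL> select originating_timestamp, message_text
  from v$diag_alert_ext
 where message_text like '%This is a test error message%'
 order by originating_timestamp desc;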

Hope It Helped!
Prashant Dixit


High stats collection time for partitioned tables after upgrade to 19c

Posted by FatDBA on July 2, 2022

Recently, while working on a database upgrade from 12c to 19c (19.15), one of my friends encountered a strange issue on the newly upgraded 19c database: gathering statistics for the full database started taking a huge amount of time. It used to take ~3 hours to collect full database statistics, but the same stats collection job started taking close to 10 hours after the upgrade to 19c. The stats collection script they were using was quite simple, with minimal parameters.

EXEC DBMS_STATS.GATHER_DATABASE_STATS(ESTIMATE_PERCENT=>DBMS_STATS.AUTO_SAMPLE_SIZE,degree => 8);

During the analysis he observed that the top 3-4 in-flight operations during stats collection were related to index statistics and were doing 'Index Fast Full Scan', all of them on a few of the large partitioned tables in the database. He discussed the case with me and together we tried a few things, i.e. recollected dictionary and fixed object statistics and did a comparative study of parameters between 12c and 19c, but none of them worked. At last we set debugging levels on DBMS_STATS to see what was happening under the hood, and that gave us some hints when set with level/flag 8 (trace index stats) and with level 32768 to trace approximate NDV (number of distinct values) gathering. The traces gave us the idea that it was surely the index statistics, and the NDV or number of distinct keys calculation, that was taking the time.
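
For reference, the DBMS_STATS debugging we used can be switched on through the (undocumented) TRACE global preference; a sketch, where 8 traces index statistics and 32768 traces the approximate NDV gathering:

-- Enable DBMS_STATS tracing: 8 (index stats) + 32768 (approximate NDV)
SQL> exec dbms_stats.set_global_prefs('TRACE', to_char(8 + 32768));

-- ... rerun the gather and review the trace files, then switch tracing off again
SQL> exec dbms_stats.set_global_prefs('TRACE', '0');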

But even after that we were both totally clueless, as these tables and their dependent objects had been in the system for a very long time. So the big question was – what's new in 19c that has slowed down stats collection?

Finally we decided to contact OCS! And they quickly responded, as it's a known problem with 19c. As per them, there was an enhancement in 19c related to index stats gathering, and that led to the longer stats times. It was all due to an unpublished Bug 33427856, which is an enhancement to improve the calculation of index NDK (Number of Distinct Keys). This new feature uses the approx_count_distinct function and fully scans indexes to calculate NDK. This has a significant benefit because NDK is now accurate, but it also means that gathering statistics can take longer (for example, when updating global index statistics if incremental stats are used). So, in general, this is expected behavior, since DBMS_STATS is doing more work in 19c than it did in previous, unenhanced versions.

And the solution is to turn this new 19c index-stats feature off by setting fix control 27268249 to disable 'Enhance Index NDK Statistics':

alter system set "_fix_control"='27268249:0';

And as soon as we deleted the existing statistics and regathered them, the time dropped drastically and the job completed in under 3 hours.
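
For completeness, the delete-and-regather step looked roughly like this (the schema and table names below are placeholders, not the real ones):

-- Delete stale statistics on an affected partitioned table (placeholder names), then regather
SQL> exec dbms_stats.delete_table_stats(ownname => 'APP_OWNER', tabname => 'BIG_PART_TAB');
SQL> exec dbms_stats.gather_database_stats(estimate_percent => dbms_stats.auto_sample_size, degree => 8);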

Hope It Helped!
Prashant Dixit


 