12c OEM Error: LongOpManager$ZombieDetection:1017
Posted by FatDBA on August 31, 2018
Hey Pals,
I am sure a few of us who have 12c EM configured on our systems have received one of those annoying alerts/incidents where the EM Agent on the server repeatedly reports
something about ‘Zombie’ processes.
I recently encountered one such issue on a production system where the 12c agent was frequently sending zombie-related alerts, like the one below. So, this post is all about handling such issues
and how to fix or avoid them.
Host=dixitlab.asi.dixson.corp
Target type=Agent
Target name=dixitlab.asi.dixson.corp:3873
Message=Problem: java.lang.Throwable:oracle.sysman.gcagent.tmmain.execution.LongOpManager$ZombieDetection:1017
Severity=Critical
Problem creation time=May 5, 2018 10:22:48 AM ADST
Last updated time=Aug 29, 2018 1:00:47 AM ADST
Problem ID=113
Problem key=java.lang.Throwable:oracle.sysman.gcagent.tmmain.execution.LongOpManager$ZombieDetection:1017
Incident count=5
Problem Status=New
Priority=None
Escalated=No
Problem acknowledged By Owner=No
Rule Name=OEM12c-Problems,Enterprise Problems
Rule Owner=SYSMAN
The next quick step in such cases is always to check the agent (gcagent) logs to understand the exact error or issue with the EM or the agent.
And as expected, the logs had something interesting in them: I saw DEBUG messages captured where Zombie Detection was initiated by the agent after quite a few attempts.
This happens when an EM Agent task, such as collecting metrics, runs longer than the expected time. The process is then marked as a zombie, and this is one of the leading causes of an agent crash or halt.
X-AGENT_PERSISTENCE_WAIT_TIME: 60
X-AGENT_PERSISTENCE_ID: https://dixitlab1.asi.dixson.corp:1830/emd/main/
2018-08-29 17:10:26,050 [31:858161EB] DEBUG - Submitting task ZombieDetector for execution
2018-08-29 17:10:26,050 [216:1AE716D8] DEBUG - Begin task ZombieDetector on Thread: GC.SysExecutor.8
2018-08-29 17:10:26,050 [216:69BEAC9D:GC.SysExecutor.8 (ZombieDetector)] DEBUG - Scheduling next ZombieDetector.Task after delay 60000 including periodShift of 0 milliseconds
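If you want a quick feel for how often the agent log mentions zombies, a small helper like the one below may do. It is only a sketch: count_zombie_lines is a hypothetical function name of mine, and the gcagent.log file name in the example is an assumption (the directory is the Agent Log Directory reported by emctl on this host).

```shell
#!/bin/sh
# Hypothetical helper (not part of emctl or the agent).
# Prints how many lines of the given log file mention "zombie", ignoring case.
count_zombie_lines() {
    log_file="$1"
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so the status is ignored here.
    grep -ci 'zombie' "$log_file" || true
}

# Example (assumed file name gcagent.log under the Agent Log Directory):
# count_zombie_lines /u01/app/oracle/new_agent12c/agent_inst/sysman/log/gcagent.log
```

A count that keeps growing between checks suggests the detector is firing regularly, which is a hint to look at which collection is running long.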
There are a few ways to avoid such issues.
1. Set _zombieSuspensions=TRUE
2. Set _canceledThreadWait=900
3. Set _zombieThreadPercentThreshold=0
The other way, if you only want to avoid the alerts/incidents, is to set the ‘Hidden’ parameter “_zombieCreateIncident” to FALSE in the agent configuration file (emd.properties) and then restart the agent.
[oracle@dixitlab config]$
[oracle@dixitlab config]$ pwd
/u01/app/oracle/new_agent12c/agent_inst/sysman/config
[oracle@dixitlab config]$ more emd.properties|grep _zombieCreateIncident
_zombieCreateIncident=false
[oracle@dixitlab config]$
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : 12.1.0.5.0
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/new_agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/new_agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/new_agent12c/core/12.1.0.5.0
Agent Process ID : 2978
Parent Process ID : 12860
Agent URL : https://dixitlab.asi.dixson.corp:3873/emd/main/
Local Agent URL in NAT : https://dixitlab.asi.dixson.corp:3873/emd/main/
Repository URL : https://dixitlab1.asi.dixson.corp:4900/empbs/upload
Started at : 2018-08-29 01:01:08
Started by user : oracle
Operating System : Linux version 2.6.32-696.20.1.el6.x86_64 (amd64)
Last Reload : (none)
Last successful upload : 2018-08-29 17:28:20
Last attempted upload : 2018-08-29 17:28:20
Total Megabytes of XML files uploaded so far : 1.03
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 52.74%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2018-08-29 17:29:16
Last successful heartbeat to OMS : 2018-08-29 17:29:16
Next scheduled heartbeat to OMS : 2018-08-29 17:30:16
---------------------------------------------------------------
[oracle@dixitlab bin]$ ./emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Stopping agent ... stopped.
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent is Not Running
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$ ./emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent .............................. started.
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$
[oracle@dixitlab bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : 12.1.0.5.0
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/new_agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/new_agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/new_agent12c/core/12.1.0.5.0
Agent Process ID : 2664
Parent Process ID : 2455
Agent URL : https://dixitlab.asi.dixson.corp:3873/emd/main/
Local Agent URL in NAT : https://dixitlab.asi.dixson.corp:3873/emd/main/
Repository URL : https://dixitlab1.asi.dixson.corp:4900/empbs/upload
Started at : 2018-08-29 17:33:05
Started by user : oracle
Operating System : Linux version 2.6.32-696.20.1.el6.x86_64 (amd64)
Last Reload : (none)
Last successful upload : 2018-08-29 17:34:34
Last attempted upload : 2018-08-29 17:34:34
Total Megabytes of XML files uploaded so far : 0.03
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 52.71%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2018-08-29 17:34:24
Last successful heartbeat to OMS : 2018-08-29 17:34:24
Next scheduled heartbeat to OMS : 2018-08-29 17:35:24
---------------------------------------------------------------
Agent is Running and Ready
[oracle@dixitlab bin]$
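For anyone who prefers to script the edit to emd.properties instead of doing it by hand, here is a rough sketch. set_agent_property is a hypothetical helper name of mine, not an Oracle tool; it simply replaces an existing key=value line or appends a new one, and keeps a .bak copy of the file. Treat it as an illustration under those assumptions and try it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not an Oracle-supplied tool).
# Sets key=value in a properties file: an existing entry for the key is
# replaced, otherwise the entry is appended. The key is used as a basic
# regular expression, so it should contain only plain characters.
set_agent_property() {
    props_file="$1"
    key="$2"
    value="$3"
    tmp_file="${props_file}.tmp.$$"

    # Copy every line except an existing entry for this key
    # (grep -v exits non-zero if nothing is left, which is fine here).
    grep -v "^${key}=" "$props_file" > "$tmp_file" || true
    # Append the new setting.
    printf '%s=%s\n' "$key" "$value" >> "$tmp_file"

    # Keep a backup, then overwrite in place so the original file's
    # ownership and permissions are preserved.
    cp "$props_file" "${props_file}.bak"
    cat "$tmp_file" > "$props_file"
    rm -f "$tmp_file"
}

# Example, using the config path from this host:
# set_agent_property /u01/app/oracle/new_agent12c/agent_inst/sysman/config/emd.properties _zombieCreateIncident false
# Then restart the agent from its bin directory:
# ./emctl stop agent && ./emctl start agent
```

The same helper could be reused for the other parameters listed earlier, assuming they also belong in emd.properties.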
Hope It Helps
Prashant Dixit