Thursday, November 7, 2019

Problem when WebLogic’s NodeManager is using a start script

Many people configure WebLogic’s NodeManager to start Managed Servers through a start script:

StartScriptEnabled=true

The NodeManager starts the Managed Server using the script and monitors it, so that it can restart the server when it fails.
One mistake people make is writing their own start script and starting the Managed Server with ‘nohup &’ inside it: see Metalink [ID 984122.1] and [ID 861098.1]. The script finishes while the server keeps running, so the NodeManager thinks the Managed Server has failed…
It will try to restart the Managed Server (if configured to do so), but the restart fails because the Managed Server is already running:

<ms1> <Server failed during startup so will not be restarted>
 
This is bad: the NodeManager marks the Managed Server ‘FAILED_NOT_RESTARTABLE’, so it will not restart the server when it later sees a real failure.
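
To make the difference concrete, here is a minimal sketch of the two patterns. It is not a real WebLogic start script: `sleep 30` is a hypothetical stand-in for the Managed Server JVM, and the function names `start_detached` and `start_foreground` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for the Managed Server JVM command line.
SERVER_CMD="sleep 30"

# Problematic pattern: background the server with 'nohup &' and return at once.
# Whoever called the script sees it finish while the server is still running.
start_detached() {
    nohup $SERVER_CMD >/dev/null 2>&1 &
    echo $!
}

# Foreground pattern: the script does not finish until the server does.
# Defined for comparison only; it is not called in this demo.
start_foreground() {
    $SERVER_CMD
}

# The detached start returns immediately and hands back the server's PID.
server_pid=$(start_detached)

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$server_pid" 2>/dev/null; then
    detached_state="running"
else
    detached_state="not running"
fi
echo "start_detached returned; server is $detached_state"

# Clean up the stand-in server.
kill "$server_pid" 2>/dev/null
```

In the detached case the caller gets control back while the stand-in server is still alive, which is the situation the NodeManager misreads as a failed start.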

Theoretical


The default startWebLogic.sh script does not finish while the server is running, because it does not use ‘nohup &’. The NodeManager sees that everything is fine:
oracle     17874 16428  0 12:39 ?        00:00:00 /bin/sh /mw_home/user_projects/domains/test_domain/bin/startWebLogic.sh
oracle     17920 17874 12 12:39 ?        00:00:38 /mw_home/jrockit_160_17_R28.0.0-679/bin/java -jrockit -Xms512m -Xmx512m -Dwebl...


Killing the Managed Server (kill -9 17920) will make the NodeManager restart the Managed Server: OK!
Killing the startWebLogic.sh script (kill -9 17874), however, will also make the NodeManager try to restart the Managed Server. The Managed Server is not down, so the restart fails:

<ms1> <Server failed during startup so will not be restarted>

Same situation as described above: the NodeManager marks the Managed Server ‘FAILED_NOT_RESTARTABLE’ and will not restart it if it really fails: not OK.
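
The second kill can be reproduced without WebLogic. The sketch below is only an approximation under stated assumptions: `sleep 30` stands in for the Managed Server, an inner `sh -c` stands in for startWebLogic.sh, and the temporary PID file is a helper invented for this demo.

```shell
#!/bin/sh
# Hypothetical stand-ins: "sleep 30" plays the Managed Server JVM and the
# inner "sh -c" plays the start script that waits on its child.
pidfile=$(mktemp)

sh -c 'sleep 30 >/dev/null 2>&1 & echo $! > "$1"; wait' wrapper "$pidfile" &
wrapper_pid=$!

# Wait until the wrapper has recorded the child's PID.
while [ ! -s "$pidfile" ]; do sleep 1; done
server_pid=$(cat "$pidfile")

# Kill only the wrapper, as with "kill -9 17874" in the listing above.
kill -9 "$wrapper_pid"
wait "$wrapper_pid" 2>/dev/null || true

# The stand-in server was not signalled, so it is still alive.
if kill -0 "$server_pid" 2>/dev/null; then
    orphan_state="still running"
else
    orphan_state="gone"
fi
echo "wrapper killed, server $orphan_state"

# Clean up the stand-in server and the PID file.
kill "$server_pid" 2>/dev/null
rm -f "$pidfile"
```

Because SIGKILL reaches only the wrapper, the child survives as an orphan. A monitor that watches the wrapper alone would conclude the server is down and attempt a start that cannot succeed.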
