Cannot start WebLogic managed node: stuck at

oracle-11g startup weblogic

I would appreciate your expert help on this issue.

Two WebLogic managed nodes running on the same machine went down one day. When I tried to start them again using startManagedWebLogic.sh, I observed that WebLogic gets stuck at the following log statement:

<Nov 17, 2015 12:11:29 AM UTC> <Info> <WorkManager> <BEA-002900> <Initializing self-tuning thread pool>

The behavior is the same for both nodes.

Running kill -3 <PID> produced the thread dump shown further below. I ran it three or four times, and every dump contained the following method:

weblogic/diagnostics/flightrecorder/FlightRecorderManager.isRecordingPossible(FlightRecorderManager.java:181)
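
For anyone who wants to repeat the check, here is a minimal sketch of taking several thread dumps in a row. It assumes a Linux shell and that you already know the PID of the managed server's JVM; the function name sample_thread_dumps and its arguments are made up for this illustration and are not part of WebLogic.

```shell
#!/bin/sh
# Request several thread dumps from a JVM, a few seconds apart.
# Usage: sample_thread_dumps <pid> [count] [interval_seconds]
# (Hypothetical helper; the name and arguments are illustrative only.)
sample_thread_dumps() {
    pid=$1
    count=${2:-3}
    interval=${3:-5}
    i=1
    while [ "$i" -le "$count" ]; do
        # SIGQUIT (signal 3) asks the JVM to print a thread dump; the JVM keeps running
        kill -QUIT "$pid" || return 1
        # Pause between dumps, but not after the last one
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}
```

The dumps are written to the JVM's standard output (typically the server's console or .out file), not to the terminal where the signal was sent. If the same frame shows up in every dump, as it did here, the thread is most likely stuck there rather than just passing through.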

Any pointers on resolving this issue and getting the WebLogic nodes started would be greatly appreciated.

The thread dump is given below for reference.

===== FULL THREAD DUMP ===============
Mon Nov 16 19:19:12 2015
Oracle JRockit(R) R28.2.7-7-155314-1.6.0_45-20130329-0641-linux-x86_64

    "Main Thread" id=1 idx=0x4 tid=26997 prio=5 alive, native_blocked
    at java/lang/System.currentTimeMillis()J(Native Method)
    at java/io/ExpiringCache.put(ExpiringCache.java:74)[inlined]
    at java/io/UnixFileSystem.canonicalize(UnixFileSystem.java:158)[optimized]
    ^-- Holding lock: java/io/ExpiringCache@0x1416ba3a8[biased lock]
    at java/io/File.getCanonicalPath(File.java:559)[inlined]
    at java/io/File.getCanonicalFile(File.java:583)[inlined]
    at oracle/jrockit/jfr/Repository$1.run(Repository.java:71)[inlined]
    at oracle/jrockit/jfr/Repository$1.run(Repository.java:68)[optimized]
    at jrockit/vm/AccessController.doPrivileged(AccessController.java:232)
    at jrockit/vm/AccessController.doPrivileged(AccessController.java:240)
    at oracle/jrockit/jfr/Repository.tryToUseAsRepository(Repository.java:68)
    at oracle/jrockit/jfr/Repository.createUniqueRepository(Repository.java:48)
    at oracle/jrockit/jfr/Repository.<init>(Repository.java:26)
    at oracle/jrockit/jfr/JFRImpl.<init>(JFRImpl.java:102)
    at oracle/jrockit/jfr/VMJFR.<init>(VMJFR.java:69)
    at oracle/jrockit/jfr/VMJFR.create(VMJFR.java:572)
    at oracle/jrockit/jfr/JFR.get(JFR.java:59)
    ^-- Holding lock: java/lang/Class@0x141e80420[biased lock]
    at com/oracle/jrockit/jfr/FlightRecorder.isNativeImplementation(FlightRecorder.java:25)
    at weblogic/diagnostics/flightrecorder/FlightRecorderManager.isRecordingPossible(FlightRecorderManager.java:181)
    at weblogic/diagnostics/instrumentation/gathering/DataGatheringManager.initialize(DataGatheringManager.java:319)
    ^-- Holding lock: java/lang/Class@0x141637e10[biased lock]
    at weblogic/diagnostics/image/ImageManager.<init>(ImageManager.java:115)
    at weblogic/diagnostics/image/ImageManager.<clinit>(ImageManager.java:57)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    at jrockit/vm/RNI.initializeClass(J)V(Native Method)
    at weblogic/work/ServerWorkManagerFactory.initializeHere(ServerWorkManagerFactory.java:121)
    at weblogic/work/ServerWorkManagerFactory.initialize(ServerWorkManagerFactory.java:59)
    ^-- Holding lock: java/lang/Class@0x1415e2140[biased lock]
    at weblogic/t3/srvr/BootService.start(BootService.java:61)
    at weblogic/t3/srvr/ServerServicesManager.startService(ServerServicesManager.java:461)
    at weblogic/t3/srvr/ServerServicesManager.startInStandbyState(ServerServicesManager.java:166)
    ^-- Holding lock: java/lang/Class@0x1415d52d8[biased lock]
    at weblogic/t3/srvr/T3Srvr.initializeStandby(T3Srvr.java:881)
    at weblogic/t3/srvr/T3Srvr.startup(T3Srvr.java:568)
    at weblogic/t3/srvr/T3Srvr.run(T3Srvr.java:469)
    at weblogic/Server.main(Server.java:71)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"(Signal Handler)" id=2 idx=0x8 tid=26998 prio=5 alive, daemon

"(OC Main Thread)" id=3 idx=0xc tid=26999 prio=5 alive, native_waiting, daemon

"(GC Worker Thread 1)" id=? idx=0x10 tid=27000 prio=5 alive, daemon

"(GC Worker Thread 2)" id=? idx=0x14 tid=27001 prio=5 alive, daemon

"(GC Worker Thread 3)" id=? idx=0x18 tid=27002 prio=5 alive, daemon

"(GC Worker Thread 4)" id=? idx=0x1c tid=27003 prio=5 alive, daemon

"(GC Worker Thread 5)" id=? idx=0x20 tid=27004 prio=5 alive, daemon

"(GC Worker Thread 6)" id=? idx=0x24 tid=27005 prio=5 alive, daemon

"(GC Worker Thread 7)" id=? idx=0x28 tid=27006 prio=5 alive, daemon

"(GC Worker Thread 8)" id=? idx=0x2c tid=27007 prio=5 alive, daemon

"(GC Worker Thread 9)" id=? idx=0x30 tid=27008 prio=5 alive, daemon

"(GC Worker Thread 10)" id=? idx=0x34 tid=27009 prio=5 alive, daemon

"(GC Worker Thread 11)" id=? idx=0x38 tid=27010 prio=5 alive, daemon

"(GC Worker Thread 12)" id=? idx=0x3c tid=27011 prio=5 alive, daemon

"(GC Worker Thread 13)" id=? idx=0x40 tid=27012 prio=5 alive, daemon

"(Code Generation Thread 1)" id=4 idx=0x44 tid=27013 prio=5 alive, native_waiting, daemon

"(Code Optimization Thread 1)" id=5 idx=0x48 tid=27014 prio=5 alive, native_waiting, daemon

"(VM Periodic Task)" id=6 idx=0x4c tid=27015 prio=10 alive, native_blocked, daemon

"Finalizer" id=7 idx=0x50 tid=27016 prio=8 alive, native_waiting, daemon
    at jrockit/memory/Finalizer.waitForFinalizees(J[Ljava/lang/Object;)I(Native Method)
    at jrockit/memory/Finalizer.access$700(Finalizer.java:12)
    at jrockit/memory/Finalizer$4.run(Finalizer.java:201)
    at java/lang/Thread.run(Thread.java:662)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"Reference Handler" id=8 idx=0x54 tid=27017 prio=10 alive, native_waiting, daemon
    at java/lang/ref/Reference.waitForActivatedQueue(J)Ljava/lang/ref/Reference;(Native Method)
    at java/lang/ref/Reference.access$100(Reference.java:11)
    at java/lang/ref/Reference$ReferenceHandler.run(Reference.java:82)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"(Sensor Event Thread)" id=9 idx=0x58 tid=27018 prio=5 alive, native_blocked, daemon

"VM JFR Buffer Thread" id=10 idx=0x5c tid=27019 prio=5 alive, in native, daemon

"Timer-0" id=13 idx=0x60 tid=27020 prio=5 alive, waiting, native_blocked, daemon
    -- Waiting for notification on: java/util/TaskQueue@0x1415dfa70[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/lang/Object.wait(Object.java:485)
    at java/util/TimerThread.mainLoop(Timer.java:483)
    ^-- Lock released while waiting: java/util/TaskQueue@0x1415dfa70[fat lock]
    at java/util/TimerThread.run(Timer.java:462)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"Timer-1" id=14 idx=0x64 tid=27021 prio=5 alive, waiting, native_blocked, daemon
    -- Waiting for notification on: java/util/TaskQueue@0x1415dfad8[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/util/TimerThread.mainLoop(Timer.java:509)
    ^-- Lock released while waiting: java/util/TaskQueue@0x1415dfad8[fat lock]
    at java/util/TimerThread.run(Timer.java:462)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'" id=15 idx=0x68 tid=27022 prio=5 alive, waiting, native_blocked, daemon
    -- Waiting for notification on: weblogic/work/ExecuteThread@0x1415e0518[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/lang/Object.wait(Object.java:485)
    at weblogic/work/ExecuteThread.waitForRequest(ExecuteThread.java:205)
    ^-- Lock released while waiting: weblogic/work/ExecuteThread@0x1415e0518[fat lock]
    at weblogic/work/ExecuteThread.run(ExecuteThread.java:226)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"JFR request timer" id=16 idx=0x6c tid=27023 prio=5 alive, waiting, native_blocked, daemon
    -- Waiting for notification on: java/util/TaskQueue@0x1415dfb58[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/lang/Object.wait(Object.java:485)
    at java/util/TimerThread.mainLoop(Timer.java:483)
    ^-- Lock released while waiting: java/util/TaskQueue@0x1415dfb58[fat lock]
    at java/util/TimerThread.run(Timer.java:462)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

===== END OF THREAD DUMP ===============

Thanks and Regards
Jimmi

Best Answer

We finally found a way to start the WebLogic nodes. Here are some details that might be helpful to others as well.

Root cause

The root cause of the issue described in the question turned out to be the permissions of the /tmp folder.

On the machine hosting these WebLogic nodes, the permissions of the /tmp folder had been reset as shown below:

drw-r--r-- 10 root root 4096 Nov 20 00:00 tmp

When the nodes were working earlier, the permissions were:

drwxrwxrwt 36 root root 20480 Nov 20 00:24 tmp

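The WebLogic process is started by a user other than root. With drw-r--r--, a non-root user can neither create files in /tmp nor traverse it, which is consistent with the main thread in the dump being stuck in oracle/jrockit/jfr/Repository.createUniqueRepository.

It looks like this permission change caused the nodes to go down and then prevented them from starting up again.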
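
A quick way to spot this condition is to compare the directory's octal mode against 1777 (drwxrwxrwt). The sketch below assumes GNU stat as found on Linux; check_tmp_mode is a hypothetical helper name used only for this example.

```shell
#!/bin/sh
# Report whether a directory has the usual shared-temp mode 1777 (drwxrwxrwt).
# Returns 0 if the mode is 1777, 1 if it differs, 2 if stat fails.
# Note: "stat -c" is the GNU coreutils form (Linux); other platforms differ.
check_tmp_mode() {
    dir=$1
    mode=$(stat -c '%a' "$dir") || return 2
    if [ "$mode" = "1777" ]; then
        echo "$dir: mode $mode, OK"
    else
        echo "$dir: mode $mode, expected 1777 (drwxrwxrwt)"
        return 1
    fi
}
```

Calling check_tmp_mode /tmp on the affected machine would have reported mode 644. Someone with root access would normally restore the directory with chmod 1777 /tmp.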

Solution

Since I did not have root access, the permissions could not be reset.

For now, the nodes were brought up by pointing the temp folder required by the JVM to another folder that the WebLogic user can access. This is done by setting the JVM property java.io.tmpdir in the file $DOMAIN_HOME/bin/setDomainEnv.sh.

Example:

-Djava.io.tmpdir=/home/weblogic/tmp
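
As a rough sketch of how this could look inside setDomainEnv.sh, assuming the start scripts pass the JAVA_OPTIONS variable on to the JVM (check this against your own domain's scripts): the helper name use_private_tmpdir is hypothetical, and /home/weblogic/tmp is just the example path from above.

```shell
#!/bin/sh
# Create a private temp directory and point the JVM at it.
# (Hypothetical helper; the name is illustrative, not part of WebLogic.)
use_private_tmpdir() {
    tmp_dir=$1
    # Make sure the directory exists and is accessible only to the current user
    mkdir -p "$tmp_dir" || return 1
    chmod 700 "$tmp_dir" || return 1
    # Append the property to the options handed to the JVM
    JAVA_OPTIONS="${JAVA_OPTIONS} -Djava.io.tmpdir=${tmp_dir}"
    export JAVA_OPTIONS
}

# In setDomainEnv.sh this would be called once, for example:
# use_private_tmpdir /home/weblogic/tmp
```

The directory has to exist and be writable by the user that starts WebLogic before the server boots, which is why the sketch creates it first. After a restart, the property should be visible on the java command line of the server process.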