Azure – Connection issue with Jenkins slave on Windows Azure

azureJenkins

I've set up a Jenkins slave node on a Windows Azure VM. When building on that node the project runs smoothly for about 20-30 minutes after which the connection gets dropped. I've been on the node VM as the connection was dropped and it appears it is losing/resetting the connection to the Jenkins Master( also an Azure VM). Has anyone had similar issues and been able to resolve it? The stack trace is as follows. Any help would be appreciated.

Progress: |=====================FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
at hudson.remoting.Request.call(Request.java:174)
at hudson.remoting.Channel.call(Channel.java:739)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:168)
at com.sun.proxy.$Proxy49.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:951)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at
hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:745)
at hudson.model.Build$BuildExecution.build(Build.java:198)
at hudson.model.Build$BuildExecution.doRun(Build.java:159)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:518)
at hudson.model.Run.execute(Run.java:1709)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:231)

Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Failed to abort
at hudson.remoting.Request.abort(Request.java:299)
at hudson.remoting.Channel.terminate(Channel.java:802)
at hudson.remoting.Channel$2.terminate(Channel.java:483)
at hudson.remoting.AbstractByteArrayCommandTransport$1.terminate(AbstractByteArrayCommandTransport.java:72)
at org.jenkinsci.remoting.nio.NioChannelHub$NioTransport.abort(NioChannelHub.java:184)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:563)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Caused by: java.io.IOException: Failed to abort
… 9 more

Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.jenkinsci.remoting.nio.FifoBuffer$Pointer.receive(FifoBuffer.java:136)
at org.jenkinsci.remoting.nio.FifoBuffer.receive(FifoBuffer.java:306)
at org.jenkinsci.remoting.nio.NioChannelHub.run(NioChannelHub.java:496)
… 7 more

Best Answer

I am also setting up Jenkins CI in Azure and was getting this same problem. Specifically, I would see this error in the Jenkins slave error log:

SEVERE: I/O error in channel channel
java.net.SocketException: Connection reset

To work around it, you need to increase the frequency with which the slave pings the master. You can do this by adding the Dhudson.slaves.ChannelPinger.pingInterval system property to your master jenkins.xml file.

I modified it to ping every 2 minutes and the channel was able to mantain the connection without dropping it. XML looks like this:

<arguments>
  -Xrs -Xmx256m -Dhudson.slaves.ChannelPinger.pingInterval=2 
  -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle 
  -jar "%BASE%\jenkins.war" 
  --httpPort=8080
</arguments>

For more information you can see the related ticket.

Related Topic