Powershell remoting suddenly stops working


I'm having trouble executing a remote PowerShell script which is supposed to update the installation
of an application for automated testing. The script runs from a scheduled task once a day.

The rather simple script (details below) ran successfully for about one year. Suddenly it started failing because the
remote PowerShell script could not be executed. I have no idea what the root cause is. Local IT assures me they did not
change anything.

(I should note that I can probably replace the PowerShell script with something else, but I do not intend to give up that easily. Apart from that, I'd like to understand what's wrong here.)

Here is the general setup:

A Windows Server 2008 R2 virtual machine not attached to a domain, called the target.
A local user u_target assigned to the Administrator group on target.

A Windows Server 2008 R2 virtual machine in a domain (let's call the domain D), called the source.
A domain D user u_source assigned to the Administrator group on source.

Powershell has version 2.0 on both VMs.

All commands on target are executed by u_target with admin rights, all commands on source are executed by u_source with admin rights.
I triple checked that powershell has been started as administrator in all cases.

About one year ago, I enabled psremoting on both VMs as follows:

On target, u_target executed the following in an admin PowerShell session:
Enable-PSRemoting -Force
Set-Item WSMan:\localhost\Client\TrustedHosts -Value 'source'
Afterwards the machine was rebooted.

Both commands were executed without any errors. Later on, when I ran into trouble, I replaced 'source' with * to ensure that the problem is
not due to a typo.

On source, u_source executed in admin powershell
enable-psremoting -force.
This machine was also rebooted.

Later on, when things failed, target was added to the TrustedHosts here, too.
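
For reference, a minimal sketch of how the TrustedHosts value can be inspected and set on the machine that initiates the connection (source in this setup). TrustedHosts is a client-side WinRM setting, so it is the list on source that matters for this script. The host name 'target' is the placeholder used in this question, and the service restart simply mirrors the step mentioned further down:

```powershell
# Run in an elevated PowerShell session on the machine that starts the
# remote connection (source). TrustedHosts is a client-side WinRM setting.
(Get-Item WSMan:\localhost\Client\TrustedHosts).Value

# Set the list explicitly; 'target' is the placeholder host name from this question.
Set-Item WSMan:\localhost\Client\TrustedHosts -Value 'target' -Force

# Restart the WinRM service, as was done during the original troubleshooting.
Restart-Service WinRM
```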

The script which is supposed to be executed, looks in principle as follows:

$server = 'target' #(using the FQHN)
$username = 'u_target'
$password = 'u_targetpwd' #(the correct one, of course)

$pass = ConvertTo-SecureString -AsPlainText $password -Force
$Cred = New-Object System.Management.Automation.PSCredential -Argumentlist $username,$pass

$scbScriptBlock = {
    # a valid script. For simplicity assume it's
    Get-ChildItem C:\
}

Invoke-Command -ComputerName $server -Credential $Cred -ScriptBlock $scbScriptBlock

For about one week now, this has resulted in the following error message:

[target] Connecting to remote server failed with following error messages  :    
The connection to the specified remote host was refused. Verify that the
WS-Management service is running on the remote host and configured to 
listen for requests on the correct port and HTTP URL. For more information,     
see the about_Remote_Troubleshooting Help topic.
+ CategoryInfo      : OpenError: (:) [], PSRemotingTransportException
+ FullyQualifiedErrorID : PSSessionStateBroken
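
A "connection refused" message usually points at the transport level rather than at credentials, so one way to narrow it down from source is a plain TCP connect followed by an unauthenticated WS-Management identify request. This is only a sketch: port 5985 is the WinRM 2.0 default for HTTP and is an assumption about this setup, and 'target' is the placeholder name from above.

```powershell
$server = 'target'   # placeholder host name from this question
$port   = 5985       # WinRM 2.0 default HTTP port (assumed, not confirmed for this setup)

# Plain TCP connect: separates "nothing listening or blocked" from WinRM-level failures.
$client = New-Object System.Net.Sockets.TcpClient
try {
    $client.Connect($server, $port)
    "TCP port $port on $server is reachable"
}
catch {
    "TCP connect to ${server}:$port failed: $($_.Exception.Message)"
}
finally {
    $client.Close()
}

# WS-Management identify request; by default this is sent without credentials.
Test-WSMan -ComputerName $server
```

If the TCP connect already fails, the listener, the service, or a firewall on the path is the more likely culprit than the script or the account.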

What I tried to fix this or figure out what's wrong:

  • Read the about_Remote_Troubleshooting Help topic. With few exceptions (see below) I'm sure I followed the instructions in there, without success.
  • Read the documentation
  • reconfigured WSMan (see above)
  • verified the trusted host settings with Get-Item, created them with Set-Item (and did not forget to restart wsman).
  • logged in to both computers with the users in question and verified they still belong to the Admin group and their
    passwords are still valid
  • replaced the original scriptblock with a trivial one to ensure it's not the scriptblock which is broken.
  • verified that the computers know of each other (ping, test-connection from source to target)
  • verified that u_target is able to execute the commands in the scriptblock when logged on to target using powershell in a remote desktop session
  • searched the internet. One hit was suggesting that the user profile of u_target on target might be corrupted, which it was not
  • verified WSMan is running in service manager
  • replaced 'u_target' with 'target\u_target' in the script and tried from scratch
  • rebooted the VMs in question several times and repeated all of the above
  • created an alternative source and target VM with a similar setup (these have PowerShell version 3.0 installed). Remoting fails there too. Remoting seemed to work fine yesterday with a pair of
    domain-joined VMs, but today I could not get that to work either; a domain-joined pair is not what I want anyway.
  • checked the event log of target in the sections Windows Logs -> Application and Windows Logs -> Security. Earlier, when the script
    was still working, it generated entries there (e.g., logon events); now nothing is logged at all, not even errors.
  • checked the firewall settings on source and target. I think they are OK, but maybe I just want to believe that Enable-PSRemoting did its job correctly
  • (and, of course, checked that the script also fails with the trivial scriptblock I used in this question instead of the original one which I should not publish here).

So here are my questions:

  • of course, this one ;-): do you have any idea what might have caused the script to suddenly fail???

  • The thing I'm unsure about is the port and HTTP URL WS-Management is listening on. How would I check that? I executed 'winrm enumerate winrm/config/listener'
    because some old message in the event log told me to do so. What I noticed is that the hostname seems to be undefined. There also seems to be no value
    for the CertificateThumbprint, but Transport is HTTP.

  • are there any recent updates or patches which are known to cause trouble with PS remoting?
  • is there any typical setting an admin would want to apply network wide which could result in this kind of failure?
  • where else could I look (particular event log, event ids)?
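
For the listener and event log questions, the checks could be sketched as follows, run on target in an elevated session. Port 5985 is again assumed as the WinRM 2.0 default, and the log name is the standard WinRM operational channel. As far as I can tell, an empty Hostname and an empty CertificateThumbprint are what a default HTTP listener normally shows, so those two values alone need not indicate a problem.

```powershell
# Run on target in an elevated PowerShell session.

# Is the WinRM service running?
Get-Service WinRM

# List the configured listeners (address, transport, port, URL prefix).
winrm enumerate winrm/config/listener
Get-ChildItem WSMan:\localhost\Listener

# Is anything actually bound to the WinRM HTTP port? 5985 is the assumed default.
netstat -ano | findstr ":5985"

# Most recent entries from the WinRM operational log.
Get-WinEvent -LogName 'Microsoft-Windows-WinRM/Operational' -MaxEvents 20 |
    Format-List TimeCreated, Id, LevelDisplayName, Message
```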

Some more absurd ideas which came to my mind, maybe not too absurd(?):

  • the VMs were set up almost exactly one year before the script started to fail: are there any settings which expire after a year, which I would not notice
    otherwise but which would cause this kind of effect?
  • installation of .net Framework 4.5.2 — I'm rather sure the script was in working condition after that had happened, though
  • is there any certificate which may have expired?

Best Answer

You may be able to get more information about this kind of problem if you enable the trace option for the WSMan provider (on the source server at least, possibly also on the remote one). Without this, not very much is logged. To enable trace logging:

Import-Module PSDiagnostics
<run your script>

Review the PowerShell event log (Applications and Services Logs/Microsoft/Windows/PowerShell/Operational).
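
A slightly fuller sketch of these steps, assuming the Enable-PSWSManCombinedTrace and Disable-PSWSManCombinedTrace cmdlets from the PSDiagnostics module are what switches the trace on and off (importing the module by itself does not start a trace). The Get-WinEvent call is just one way to read the log named above from the console:

```powershell
Import-Module PSDiagnostics

# Start the combined PowerShell + WS-Management trace (elevated session required).
Enable-PSWSManCombinedTrace

# ... run the failing Invoke-Command here ...

# Stop the trace again once the failure has been reproduced.
Disable-PSWSManCombinedTrace

# Show the most recent entries from the PowerShell operational log.
Get-WinEvent -LogName 'Microsoft-Windows-PowerShell/Operational' -MaxEvents 20 |
    Format-List TimeCreated, Id, LevelDisplayName, Message
```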
If WinRM were using SSL, the cert should be in the computer account's personal store.
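
If an HTTPS listener were involved, the certificates in that store and their expiry dates could be listed roughly like this. Since the listener in the question reports Transport HTTP, this is probably not the cause here and is shown only as a sketch:

```powershell
# Certificates in the computer account's Personal store, soonest expiry first.
Get-ChildItem Cert:\LocalMachine\My |
    Sort-Object NotAfter |
    Format-Table Subject, NotAfter, Thumbprint -AutoSize

# Only the certificates that have already expired.
Get-ChildItem Cert:\LocalMachine\My |
    Where-Object { $_.NotAfter -lt (Get-Date) }
```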