PowerShell to reset local Administrator account password. 5% failures

powershell wmi

I am responsible for changing the local administrator account password in my environment of 16,000 servers. I wrote a PowerShell script, but it took too long, so I added multi-threading using the PowerShell runspace factory to break the 16,000 servers into manageable pieces.

There's a ~5% error rate (~800 servers); of these, 75-100 are clear errors that can be troubleshot (username not found, access denied, etc.), and 700-725 get the error message "The network path was not found".

However, the servers respond to ping, and the server engineers tell me that they are operational, that I have access, and that both PowerShell and WMI are running and functioning.

I have no idea where to begin troubleshooting. Here's the logic and code I'm using:

I use FQDNs. However, my company tends to list servers in DNS under a name that differs from their Active Directory FQDN, and the two names will not resolve to one another, so servera.production.active.directory will not resolve to servera.mycompany.com. This function determines a valid FQDN for connecting and setting the password, returning either a valid FQDN or an empty string:

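function get-validfqdn([string]$server, [string]$domain){

    # get_FQDN is a helper that is not shown here
    $fqdn = $server + "." + (get_FQDN $domain)
    $altdn = $server + ".mycompany.com"

    # Try the Active Directory FQDN first, then the alternate DNS name
    if(Test-Connection -count 1 -computer $fqdn -quiet -TimeToLive 80){
        $valid = $fqdn
    }
    elseif(Test-Connection -count 1 -computer $altdn -quiet -TimeToLive 80){
        $valid = $altdn
    }else{
        # Neither name answered a ping
        $valid = ""
    }

    return $valid
}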
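Because the two candidate names do not resolve to one another, it may help to record what each name actually resolves to at the moment the script runs. The following is a minimal sketch, not part of the original script: `Get-NameResolution` is a hypothetical helper name, and it only uses the .NET `System.Net.Dns` class, so it should also work on PowerShell 2.0.

```powershell
# Hypothetical helper (not in the original script): resolve each candidate
# name and report the addresses, or the error text if resolution fails.
function Get-NameResolution {
    param([string[]]$Names)

    foreach ($name in $Names) {
        $addresses = @()
        $errorText = ''
        try {
            # GetHostAddresses throws if the name cannot be resolved
            $addresses = @([System.Net.Dns]::GetHostAddresses($name) |
                ForEach-Object { $_.IPAddressToString })
        }
        catch {
            $errorText = $_.Exception.Message
        }

        # Emit one record per name so the output can be piped to Export-Csv
        New-Object PSObject -Property @{
            Name      = $name
            Resolved  = ($addresses.Count -gt 0)
            Addresses = ($addresses -join ', ')
            Error_Msg = $errorText
        }
    }
}

# Example (names taken from the question):
# Get-NameResolution -Names 'servera.production.active.directory', 'servera.mycompany.com'
```

Logging this next to each password-change result would show whether the failing servers share a resolution pattern, for example a name that resolves to a different address than the one that answers ping.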

I attempt to execute the password change using the following code, embedded in a module and performed for each server in the list we're processing (this is a long function due to the PowerShell runspace factory code).

function Set-ServerPass([string]$filepath){

    $servers = Import-CSV $filepath
    $results = @()

    foreach($server in $servers){

        $svr = $server.Server
        $password = $server.Password
        $domain = $server.domain
        $fqdn = get-validfqdn $svr $domain

        # One result record per server, created up front so every branch can fill it in
        $result = New-Object PSObject -Property @{Server = $svr; Error_Code = ""; Error_Msg = ""}

        if ($fqdn -ne ""){
            Try{
                $admin = [adsi]("WinNT://$fqdn/Administrator, user")
                $admin.psbase.invoke("SetPassword", "$password")

                $result.Error_Code = "0"
                $result.Error_Msg = "The operation was successful"

            }Catch{
                # Trim_ExceptionMessage is a helper that is not shown here
                $error_msg = Trim_ExceptionMessage $_.exception.Message

                $result.Error_Code = "1"
                $result.Error_Msg = $error_msg
            }
        }else{

            $result.Error_Code = "51"
            $result.Error_Msg = "The remote computer is not available"
        }

        # Record the outcome for every server, not only the failures
        $results += $result
    }
    return $results
}

Notes: Test-Connection is there to filter out servers that are unavailable; without it, each unavailable server takes the default timeout of ~180 seconds (3 mins x 16,000 servers = too long).

This code works on 95% of the servers, and it has reported accurately for the year I have been running it. However, the server engineers are starting to question whether this script works, because when I report problems they don't see how or why I would get "The network path was not found" when all of their tests say the server is working fine.

Troubleshooting steps so far:

  • Run on different computer
  • Run as different admin
  • Run at different times of day – this was to prevent possible server activity from interrupting the script (application patching, reboots, etc)

Over the last 2 months I have manually troubleshot each of the 800 servers and run the script some 15 times on just the failed servers. Each re-run resets anywhere from 10 to 300 more passwords, but it never catches all of them, and the results are very inconsistent.
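Since re-running the script picks up more servers each time, one option is to retry inside the script and record how many attempts each server needed. The following is a minimal sketch under that assumption: `Invoke-WithRetry` is a hypothetical wrapper, not part of the original module, and it only catches terminating errors (such as the exception the ADSI call throws).

```powershell
# Hypothetical wrapper: run a script block up to $MaxAttempts times and
# report whether it succeeded, how many attempts it took, and the last error.
function Invoke-WithRetry {
    param(
        [scriptblock]$Action,
        [int]$MaxAttempts = 3,
        [int]$DelaySeconds = 5
    )

    $lastError = ''
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            $value = & $Action
            return New-Object PSObject -Property @{
                Succeeded = $true
                Attempts  = $attempt
                Value     = $value
                Error_Msg = ''
            }
        }
        catch {
            # Remember the failure and wait before the next attempt
            $lastError = $_.Exception.Message
            if ($attempt -lt $MaxAttempts) {
                Start-Sleep -Seconds $DelaySeconds
            }
        }
    }

    # Every attempt failed
    return New-Object PSObject -Property @{
        Succeeded = $false
        Attempts  = $MaxAttempts
        Value     = $null
        Error_Msg = $lastError
    }
}

# Example, wrapping the password change from Set-ServerPass:
# $outcome = Invoke-WithRetry -MaxAttempts 3 -DelaySeconds 30 -Action {
#     $admin = [adsi]("WinNT://$fqdn/Administrator, user")
#     $admin.psbase.invoke("SetPassword", "$password")
# }
```

The `Attempts` value is the useful part for diagnosis: servers that succeed on the second or third attempt point to a transient cause, while servers that never succeed point to something persistent.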

On 3 occasions when the server engineers reported no problems, I re-ran the script and it reset all of them with no errors.

So my questions are: what could be causing the error, and what should I look at to determine the root cause? Settings on the server? Settings on my workstation?

Setup is as follows: my workstation runs Windows XP Pro SP3. The servers run Windows Server 2003 or Windows Server 2008 R2. These errors occur on both server operating systems.

Best Answer

Run Wireshark on the client computer and log all the network traffic as the script runs.

That's going to be a lot of data, but given that you're talking about manually running it on 800 servers, it's not going to be much worse than that.

Look for failed DNS resolution, and compare the servers that worked with the servers that failed.
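One comparison that can be scripted before digging through a capture is whether the failing servers accept a TCP connection at all. Ping only exercises ICMP, whereas the WinNT ADSI provider, as far as I know, connects over SMB/RPC (typically TCP 445, or 139 on older setups), so a server can answer ping and still refuse the connection the script needs. The sketch below is an assumption-laden helper, not something from the question: `Test-TcpPort` is a hypothetical name, and the default port of 445 is my guess at the relevant port.

```powershell
# Hypothetical helper: return $true if a TCP connection to the given port
# succeeds within the timeout, otherwise $false.
function Test-TcpPort {
    param(
        [string]$ComputerName,
        [int]$Port = 445,          # assumed SMB port; adjust if your setup differs
        [int]$TimeoutMs = 3000
    )

    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $async = $client.BeginConnect($ComputerName, $Port, $null, $null)

        # Give up if the connection is not established within the timeout
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs, $false)) {
            return $false
        }

        # EndConnect throws if the connection was refused or the name failed to resolve
        $client.EndConnect($async)
        return $true
    }
    catch {
        return $false
    }
    finally {
        $client.Close()
    }
}

# Example: compare one working and one failing server (names are placeholders)
# 'goodserver.mycompany.com', 'badserver.mycompany.com' | ForEach-Object {
#     New-Object PSObject -Property @{ Server = $_; Port445 = (Test-TcpPort $_) }
# }
```

If the failing servers consistently return `$false` here while the working ones return `$true`, that narrows the search to firewalls, the Server service, or name resolution rather than the script itself.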

WMI supports logging of errors and debug information. Enabling it on the client or on some of the failing servers might be useful: http://blogs.technet.com/b/askperf/archive/2008/03/04/wmi-debug-logging.aspx