PowerShell – How to run PowerShell scripts in parallel without using Jobs

automation performance powershell

If I have a script that I need to run against multiple computers, or with multiple different arguments, how can I execute it in parallel, without having to incur the overhead of spawning a new PSJob with Start-Job?

As an example, I want to re-sync the time on all domain members, like so:

$computers = Get-ADComputer -filter * |Select-Object -ExpandProperty dnsHostName
$creds = Get-Credential domain\user
foreach($computer in $computers)
{
    $session = New-PSSession -ComputerName $computer -Credential $creds
    Invoke-Command -Session $session -ScriptBlock { w32tm /resync /nowait /rediscover }
}

But I don't want to wait for each PSSession to connect and invoke the command. How can this be done in parallel, without Jobs?

Best Answer

Update: While this answer explains the process and mechanics of PowerShell runspaces and how they can help you multi-thread non-sequential workloads, fellow PowerShell aficionado Warren 'Cookie Monster' F has gone the extra mile and incorporated these same concepts into a single tool called Invoke-Parallel. It does what I describe below, and he has since expanded it with optional switches for logging and prepared session state, including imported modules. Really cool stuff. I strongly recommend you check it out before building your own shiny solution!


With Parallel Runspace execution:

Reducing inescapable waiting time

In the original specific case, the executable invoked has a /nowait option which prevents blocking the invoking thread while the job (in this case, time re-synchronization) finishes on its own.

This greatly reduces the overall execution time from the issuer's perspective, but connecting to each machine is still done in sequential order. Connecting to thousands of clients in sequence may take a long time, because timeout waits accumulate for every machine that is inaccessible for one reason or another.

To get around having to queue up all subsequent connections in case of a single or a few consecutive timeouts, we can dispatch the job of connecting and invoking commands to separate PowerShell Runspaces, executing in parallel.

What is a Runspace?

A Runspace is the virtual container in which your PowerShell code executes, and represents/holds the Environment from the perspective of a PowerShell statement/command.

In broad terms, 1 Runspace = 1 thread of execution, so all we need to "multi-thread" our PowerShell script is a collection of Runspaces that can then in turn execute in parallel.
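To make the "container" idea concrete, here is a minimal sketch showing that a new runspace starts with its own session state: a variable defined in the calling session is simply not there. The names `$outer` and `$isMissing` are made up for this illustration.

```powershell
# Defined in the caller's runspace only (hypothetical variable for the demo).
$outer = 'defined in the calling session'

# [powershell]::Create() gets a fresh runspace of its own by default.
$ps = [powershell]::Create()
try {
    # Inside the new runspace $outer was never defined, so the comparison is true.
    $isMissing = $ps.AddScript('$null -eq $outer').Invoke()
}
finally {
    $ps.Dispose()
}

$isMissing
```

This is also why the script blocks further down receive everything they need through `param()` rather than relying on variables from the calling script.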

As with the original problem, the job of invoking commands in multiple runspaces can be broken down into:

  1. Creating a RunspacePool
  2. Assigning a PowerShell script or an equivalent piece of executable code to the RunspacePool
  3. Invoking the code asynchronously (i.e. not having to wait for the code to return)

RunspacePool template

PowerShell has a type accelerator called [RunspaceFactory] that will assist us in the creation of runspace components. Let's put it to work:

1. Create a RunspacePool and Open() it:

$RunspacePool = [runspacefactory]::CreateRunspacePool(1,8)
$RunspacePool.Open()

The two arguments passed to CreateRunspacePool(), 1 and 8, are the minimum and maximum number of runspaces allowed to execute at any given time, giving us an effective maximum degree of parallelism of 8.
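As a quick sanity check, the pool can be asked for its own limits and state. The following is a small sketch, not part of the original walkthrough; the variable names are illustrative.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 8)
$pool.Open()

$min   = $pool.GetMinRunspaces()            # lower bound passed to the factory
$max   = $pool.GetMaxRunspaces()            # upper bound, i.e. the degree of parallelism
$state = $pool.RunspacePoolStateInfo.State  # Opened once Open() has returned

"min=$min max=$max state=$state"

# Release the pool's runspaces when all work is done.
$pool.Close()
$pool.Dispose()
```

Closing and disposing the pool is worth doing in long-lived sessions, since each runspace holds on to memory until it is released.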

2. Create an instance of PowerShell, attach some executable code to it and assign it to our RunspacePool:

An instance of PowerShell is not the same as the powershell.exe process (which is really a Host application), but an internal runtime object representing the PowerShell code to execute. We can use the [powershell] type accelerator to create a new PowerShell instance within PowerShell:

$Code = {
    param($Credentials,$ComputerName)
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials
    Invoke-Command -Session $session -ScriptBlock {w32tm /resync /nowait /rediscover}
}
$PSinstance = [powershell]::Create().AddScript($Code).AddArgument($creds).AddArgument("computer1.domain.tld")
$PSinstance.RunspacePool = $RunspacePool
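AddArgument() binds values positionally, so the order of the calls has to match the param() block. A possible alternative is AddParameter(), which binds by name. The sketch below uses a stand-in script block (`$Demo`, hypothetical) that only echoes its input, so it can be tried without any remoting or real credentials.

```powershell
# Stand-in for the remoting script block: it only echoes what it received.
$Demo = {
    param($Credentials, $ComputerName)
    "would connect to $ComputerName as $Credentials"
}

$ps = [powershell]::Create().AddScript($Demo)

# Bind by parameter name, so the order of these calls does not matter.
[void]$ps.AddParameter('ComputerName', 'computer1.domain.tld')
[void]$ps.AddParameter('Credentials', 'domain\user')

$output = $ps.Invoke()
$ps.Dispose()
$output
```

In the real script the 'Credentials' value would be the PSCredential object from Get-Credential rather than a plain string.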

3. Invoke the PowerShell instance asynchronously using APM:

Using what is known in .NET development terminology as the Asynchronous Programming Model, we can split the invocation of a command into a Begin method, which gives a "green light" to execute the code, and an End method, which collects the results. Since in this case we are not really interested in any feedback (we don't wait for the output from w32tm anyway), we can make do by simply calling the first method:

$PSinstance.BeginInvoke()
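If the output does matter, the handle returned by BeginInvoke() can be kept and passed to EndInvoke() later. A minimal sketch, using a short sleep as a placeholder for real work:

```powershell
# Placeholder script: pretend to work for a moment, then emit a value.
$ps = [powershell]::Create().AddScript('Start-Sleep -Milliseconds 200; "done"')

# BeginInvoke() returns immediately with a handle (an IAsyncResult).
$handle = $ps.BeginInvoke()

# EndInvoke() blocks until the script has finished, then returns its output.
$result = $ps.EndInvoke($handle)
$ps.Dispose()
$result
```

Between the two calls the calling script is free to do other things, such as starting more instances.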

Wrapping it up in a RunspacePool

Using the above technique, we can wrap the sequential iterations of creating new connections and invoking the remote command in a parallel execution flow:

$ComputerNames = Get-ADComputer -filter * -Properties dnsHostName |select -Expand dnsHostName

$Code = {
    param($Credentials,$ComputerName)
    $session = New-PSSession -ComputerName $ComputerName -Credential $Credentials
    Invoke-Command -Session $session -ScriptBlock {w32tm /resync /nowait /rediscover}
}

$creds = Get-Credential domain\user

$rsPool = [runspacefactory]::CreateRunspacePool(1,8)
$rsPool.Open()

foreach($ComputerName in $ComputerNames)
{
    $PSinstance = [powershell]::Create().AddScript($Code).AddArgument($creds).AddArgument($ComputerName)
    $PSinstance.RunspacePool = $rsPool
    $PSinstance.BeginInvoke()
}
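The loop above is fire-and-forget. When the results or errors are of interest, one common pattern is to remember each instance together with its handle and collect afterwards. The sketch below assumes a placeholder work item (`$Work`, with made-up host names) instead of the remoting call, so it runs locally.

```powershell
$pool = [runspacefactory]::CreateRunspacePool(1, 4)
$pool.Open()

# Placeholder work item: simulates a slow remote call.
$Work = {
    param($Name)
    Start-Sleep -Milliseconds 100
    "finished $Name"
}

# Start everything, remembering each instance together with its handle.
$pending = foreach ($name in 'host1', 'host2', 'host3') {
    $ps = [powershell]::Create().AddScript($Work).AddArgument($name)
    $ps.RunspacePool = $pool
    [pscustomobject]@{ Instance = $ps; Handle = $ps.BeginInvoke() }
}

# Collect output, surface non-terminating errors as warnings, then clean up.
$results = foreach ($p in $pending) {
    $p.Instance.EndInvoke($p.Handle)
    $p.Instance.Streams.Error | ForEach-Object { Write-Warning $_.ToString() }
    $p.Instance.Dispose()
}

$pool.Close()
$pool.Dispose()
$results
```

Results come back in the order the instances were started, because EndInvoke() is called in that order, even though the work itself runs concurrently.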

Assuming that the CPU has the capacity to execute all 8 runspaces at once, the execution time should drop considerably, at the cost of some readability due to the rather "advanced" methods used.


Determining the optimum degree of parallelism:

We could easily create a RunspacePool that allows for the execution of 100 runspaces at the same time:

[runspacefactory]::CreateRunspacePool(1,100)

But at the end of the day, it all comes down to how many units of execution our local CPU can handle. In other words, as long as your code is executing, it does not make sense to allow more runspaces than you have logical processors to dispatch execution of code to.

Thanks to WMI, this threshold is fairly easy to determine:

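# Win32_Processor returns one object per physical processor, so sum across sockets
$NumberOfLogicalProcessors = (Get-WmiObject Win32_Processor | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum
[runspacefactory]::CreateRunspacePool(1,$NumberOfLogicalProcessors)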
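A possible alternative that avoids the WMI query is the .NET property [Environment]::ProcessorCount. This is a sketch under the assumption that the count visible to the current process is what you want; on machines with processor groups or affinity restrictions it may differ from what WMI reports.

```powershell
# Logical processors visible to the current process, as reported by .NET.
$logical = [Environment]::ProcessorCount

$pool = [runspacefactory]::CreateRunspacePool(1, $logical)
$pool.Open()
$max = $pool.GetMaxRunspaces()   # should echo the value we passed in
$pool.Close()
$pool.Dispose()

"pool allows up to $max concurrent runspaces"
```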

If, on the other hand, the code you are executing itself incurs a lot of wait time due to external factors like network latency, you can still benefit from running more simultaneous runspaces than you have logical processors, so you'd probably want to test a range of possible maximum runspace counts to find the break-even point:

foreach($n in ($NumberOfLogicalProcessors..($NumberOfLogicalProcessors*3)))
{
    Write-Host "${n}: " -NoNewLine   # braces keep the parser from reading "$n:" as a scope-qualified variable
    (Measure-Command {
        $Computers = Get-ADComputer -filter * -Properties dnsHostName |select -Expand dnsHostName -First 100
        ...
        [runspacefactory]::CreateRunspacePool(1,$n)
        ...
    }).TotalSeconds
}
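The measurement loop can be tried out without Active Directory or remoting by substituting a simulated wait for the real work. The following sketch makes that substitution explicit: the 200 ms sleep, the 8 work items and the pool sizes 1, 4 and 8 are arbitrary assumptions chosen to keep the run short, not values from the original script.

```powershell
# Simulated I/O-bound work item; the 200 ms sleep is an arbitrary assumption.
$Work  = { Start-Sleep -Milliseconds 200 }
$items = 1..8

$timings = foreach ($n in 1, 4, 8) {
    $elapsed = Measure-Command {
        $pool = [runspacefactory]::CreateRunspacePool(1, $n)
        $pool.Open()
        $pending = foreach ($item in $items) {
            $ps = [powershell]::Create().AddScript($Work)
            $ps.RunspacePool = $pool
            [pscustomobject]@{ Instance = $ps; Handle = $ps.BeginInvoke() }
        }
        # Wait for every work item before stopping the clock.
        foreach ($p in $pending) {
            [void]$p.Instance.EndInvoke($p.Handle)
            $p.Instance.Dispose()
        }
        $pool.Close()
        $pool.Dispose()
    }
    [pscustomobject]@{ MaxRunspaces = $n; Seconds = $elapsed.TotalSeconds }
}
$timings
```

Because the work item only waits, the timings should keep improving well past the number of logical processors, which is the effect described above for latency-bound workloads. Note that the clock is only stopped after EndInvoke() has returned for every item; timing BeginInvoke() alone would measure little more than the dispatch overhead.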