AWS PowerShell – Significantly Slower Than AWS CLI

amazon-s3 amazon-web-services powershell

My PowerShell script takes about 10 times as long as a batch file to move the same files in AWS S3.

I have an existing batch file that moves files from one S3 bucket to another. It takes about 30 seconds to move 1000 files.

The script looks like this:
aws s3 mv s3://bucket/folder/ s3://bucket/%destKey%/%subDestKey%/ --recursive --include "*.json" --profile etl

I'd rather do this in PowerShell as I'd like to apply a lot more logic and I'm more comfortable in PowerShell.

My PowerShell script to do the same thing looks like this:

$files = Get-S3Object -BucketName bucket | where {$_.Key -like "*.json" -and $_.Key -notlike "inprogress*"}
foreach ($file in $files) {
    Copy-S3Object -BucketName bucket -Key $file.Key -DestinationKey "$date/$($file.Key)" -DestinationBucket newbucket
    Remove-S3Object -BucketName bucket -Key $file.Key -Force
}

However, in PowerShell this script takes about 300 seconds to move 1000 files. Has anyone else had the same experience? Hopefully the answer is that I'm taking the wrong approach, as I'd love to be able to use PowerShell for this task!

Best Answer

There are two reasons for the performance difference here:

  • PowerShell runs the script on a single thread
  • You are copying each file in series

The AWS CLI is much faster because it uses multiple threads (up to 10 by default), so it performs several operations simultaneously.
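The limit of 10 comes from the CLI's max_concurrent_requests setting for S3 transfers. As a small sketch, assuming you want to inspect or raise it for the default profile (the value 20 is only an illustration, and the commands run the same way from a batch file or a PowerShell prompt):

```powershell
# Show the current value; prints nothing if the setting has never been changed
# (the CLI then falls back to its built-in default of 10).
aws configure get default.s3.max_concurrent_requests

# Raise the number of concurrent S3 requests for the default profile.
# 20 is an arbitrary example value, not a recommendation.
aws configure set default.s3.max_concurrent_requests 20
```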

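You can speed things up by changing your script to use foreach -parallel, with -throttlelimit limiting the number of concurrent operations. Note that foreach -parallel is only valid inside a PowerShell workflow (Windows PowerShell 3.0 to 5.1), and -throttlelimit requires version 4.0 or later.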

The foreach would then look like this:

foreach -parallel -throttlelimit 10 ($file in $files) {
    Copy-S3Object -BucketName bucket -Key $file.Key -DestinationKey "$date/$($file.Key)" -DestinationBucket newbucket
    Remove-S3Object -BucketName bucket -Key $file.Key -Force
}

Depending on your system, Windows may limit you to only 5 parallel processes, but this should still give you a reasonable speedup.
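If you are on PowerShell 7 or later, where workflows (and therefore foreach -parallel) are not available, ForEach-Object -Parallel is a different way to get the same kind of concurrency. The following is a minimal sketch rather than a tested drop-in: it assumes the AWS Tools for PowerShell S3 cmdlets are installed, it keeps the placeholder bucket names from the question, and the $date format is an assumption because the question does not show how $date is set.

```powershell
# Requires PowerShell 7.0 or later.
# Assumption: the question never defines $date, so this format is illustrative.
$date = Get-Date -Format 'yyyy-MM-dd'

# Same filter as in the question: JSON files that are not in progress.
$files = Get-S3Object -BucketName bucket |
    Where-Object { $_.Key -like '*.json' -and $_.Key -notlike 'inprogress*' }

# Run up to 10 copy/delete pairs at a time.
$files | ForEach-Object -ThrottleLimit 10 -Parallel {
    # Each iteration runs in its own runspace, so variables from the
    # calling script have to be referenced with the $using: modifier.
    $key = $_.Key
    Copy-S3Object -BucketName bucket -Key $key -DestinationKey "$($using:date)/$key" -DestinationBucket newbucket
    Remove-S3Object -BucketName bucket -Key $key -Force
}
```

A throttle limit of 10 mirrors the AWS CLI default mentioned above. Whether a higher value helps depends on your network and on S3 request rates, so it is worth measuring with Measure-Command before settling on a number.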