PowerShell: deleting files with regular expressions

powershell, regular expressions

I have several thousand files of different revisions in a folder.

ACZ002-0.p
ACZ002-1.p
ACZ002-2.p
ACZ051-0.p
ACZ051-1.p
...

The revision is always the number just before the dot. For each base name I need to keep only the file with the latest revision and delete the rest, but I don't know how to proceed with my code.

$path = "E:\Export\"
$filter = [regex] "[A-Z]{3}[0-9]{3}\-[0-9]{1,2}\.(p)"
$files = Get-ChildItem -Path $path -Recurse | Where-Object {($_.Name -match $filter)}

Best Answer

There is probably a better way to do this, but here is the first approach that came to mind. It uses named capture groups in the regex so that PowerShell can sort and group by the base filename and the revision. The final variable, $filesToKeep, is an array of FileInfo objects that you can exclude from your delete command. I recommend testing thoroughly before actually deleting anything.

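# Named groups make the base name and the revision available through $matches.
# The ^ and $ anchors keep names such as XACZ002-1.p or ACZ002-1.pdf from matching.
$filter = [regex] '^(?<baseName>[A-Za-z]{3}[0-9]{3})-(?<revision>[0-9]+)\.p$'
$results = Get-ChildItem c:\temp -Recurse | Where-Object {$_.Name -match $filter} | ForEach-Object {
    New-Object PSObject -Property @{
        BaseName = $matches.baseName
        Revision = [int]$matches.revision   # cast to int so that 10 sorts above 9
        File = $_
    }
}
# Highest revision first within each base name, then keep the first file of each group.
$filesToKeep = $results | Sort-Object BaseName, Revision -Descending | Group-Object BaseName | ForEach-Object { $_.Group | Select-Object -First 1 -ExpandProperty File }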
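To go from "files to keep" to "files to delete", one option is to compute the stale revisions directly. The following is only a rough sketch, assuming PowerShell 3.0 or later; the function name Get-StaleRevision and the sample names (including the made-up ACZ051-10.p) are hypothetical and not part of the original question. It works on plain file names so the selection logic can be checked without touching the disk.

```powershell
# Hypothetical helper: given file names, return every name that is NOT the
# highest revision of its base name.
function Get-StaleRevision {
    param([string[]]$Name)

    $pattern = '^(?<baseName>[A-Za-z]{3}[0-9]{3})-(?<revision>[0-9]+)\.p$'

    $parsed = foreach ($n in $Name) {
        if ($n -match $pattern) {
            [pscustomobject]@{
                Name     = $n
                BaseName = $Matches.baseName
                Revision = [int]$Matches.revision   # numeric, so 10 sorts above 9
            }
        }
    }

    $parsed | Group-Object BaseName | ForEach-Object {
        # Newest revision first; skip it and emit the older ones.
        $_.Group | Sort-Object Revision -Descending |
            Select-Object -Skip 1 -ExpandProperty Name
    }
}

# Sample names (assumed for illustration only).
$sample = 'ACZ002-0.p', 'ACZ002-1.p', 'ACZ002-2.p', 'ACZ051-0.p', 'ACZ051-1.p', 'ACZ051-10.p'
$stale = Get-StaleRevision -Name $sample

# Dry run against real files; -WhatIf only reports what would be removed.
# $files = Get-ChildItem -Path 'E:\Export' -Recurse -File
# $staleNames = Get-StaleRevision -Name $files.Name
# $files | Where-Object { $staleNames -contains $_.Name } | Remove-Item -WhatIf
```

Like the answer above, this groups by base name only, so with -Recurse two files of the same name in different subfolders are treated as one group. Drop -WhatIf only after the dry-run output looks right.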