Volume Deduplication yields 0 savings and has 0 InPolicyFiles


I am trying to get some sensible results from the Server 2012 R2 dedup feature and repeatedly failing. I have two large-ish volumes (4 + 2 TB) exposed as D: and E: respectively. The volumes are assigned as Cluster Storage to two different File Server cluster resources (I don't know whether this changes anything), and the disks are online on the machine where I am trying to enable dedup.

Enable-DedupVolume D:
Enable-DedupVolume E:
Set-DedupVolume -Volume D: -MinimumFileAgeDays 0
Set-DedupVolume -Volume E: -MinimumFileAgeDays 0
Start-DedupJob D: -Type Optimization
Start-DedupJob E: -Type Optimization

After a couple of minutes, both optimization jobs have finished. In the meantime, there is some read load on both disks. At the end, the events in the Deduplication log indicate that nothing has been deduplicated:

Log Name:      Microsoft-Windows-Deduplication/Operational
Source:        Microsoft-Windows-Deduplication
Date:          12/2/2015 11:36:02 AM
Event ID:      6153
Task Category: None
Level:         Information
User:          SYSTEM
Computer:      wss-01.example.com
Optimization job has completed.

Volume: D: (\\?\Volume{73180747-4bf5-4292-86fd-8e8fc4d076c4}\)
Error code: 0x0
Error message: 
Savings rate: 0
Saved space: 0
Volume used space: 2867461320704
Volume free space: 1530452017152
Optimized file count: 0
In-policy file count: 0
Job processed space (bytes): 0
Job elapsed time (seconds): 37
Job throughput (MB/second): 0

Log Name:      Microsoft-Windows-Deduplication/Operational
Source:        Microsoft-Windows-Deduplication
Date:          12/2/2015 11:38:26 AM
Event ID:      6153
Task Category: None
Level:         Information
User:          SYSTEM
Computer:      wss-01.example.com
Optimization job has completed.

Volume: E: (\\?\Volume{a3f85da5-283e-4ed4-81c0-2c0fd163b1c3}\)
Error code: 0x0
Error message: 
Savings rate: 0
Saved space: 0
Volume used space: 2068610711552
Volume free space: 130142007296
Optimized file count: 0
In-policy file count: 0
Job processed space (bytes): 0
Job elapsed time (seconds): 686
Job throughput (MB/second): 0

The data volumes are rather well populated: D: holds mainly ISO images and installers of different kinds, while E: holds typical user home data, so I would expect some savings (at least more than 0) to show. Invoking Update-DedupStatus for either volume does not change anything. The Get-DedupStatus output indicates that no files are considered to be "in policy" for deduplication:

PS C:\> get-dedupstatus | select-object -Property *

ObjectId                           : \\?\Volume{a3f85da5-283e-4ed4-81c0-2c0fd163b1c3}\
Capacity                           : 2198752718848
FreeSpace                          : 130142007296
InPolicyFilesCount                 : 0
InPolicyFilesSize                  : 0
LastGarbageCollectionResult        :
LastGarbageCollectionResultMessage :
LastGarbageCollectionTime          :
LastOptimizationResult             : 0
LastOptimizationResultMessage      : The operation completed successfully.
LastOptimizationTime               : 12/2/2015 11:45:10 AM
LastScrubbingResult                :
LastScrubbingResultMessage         :
LastScrubbingTime                  :
OptimizedFilesCount                : 0
OptimizedFilesSavingsRate          : 0
OptimizedFilesSize                 : 0
SavedSpace                         : 0
SavingsRate                        : 0
UnoptimizedSize                    : 2068610711552
UsedSpace                          : 2068610711552
Volume                             : E:
VolumeId                           : \\?\Volume{a3f85da5-283e-4ed4-81c0-2c0fd163b1c3}\
PSComputerName                     :
CimClass                           : ROOT/Microsoft/Windows/Deduplication:MSFT_DedupVolumeStatus
CimInstanceProperties              : {Capacity, FreeSpace, InPolicyFilesCount, InPolicyFilesSize...}
CimSystemProperties                : Microsoft.Management.Infrastructure.CimSystemProperties

ObjectId                           : \\?\Volume{73180747-4bf5-4292-86fd-8e8fc4d076c4}\
Capacity                           : 4397913337856
FreeSpace                          : 1530452013056
InPolicyFilesCount                 : 0
InPolicyFilesSize                  : 0
LastGarbageCollectionResult        : 5657346
LastGarbageCollectionResultMessage : There are no actions associated with this job.
LastGarbageCollectionTime          : 12/2/2015 11:58:12 AM
LastOptimizationResult             : 0
LastOptimizationResultMessage      : The operation completed successfully.
LastOptimizationTime               : 12/2/2015 11:45:10 AM
LastScrubbingResult                : 0
LastScrubbingResultMessage         : The operation completed successfully.
LastScrubbingTime                  : 11/28/2015 3:45:07 AM
OptimizedFilesCount                : 0
OptimizedFilesSavingsRate          : 0
OptimizedFilesSize                 : 0
SavedSpace                         : 0
SavingsRate                        : 0
UnoptimizedSize                    : 2867461324800
UsedSpace                          : 2867461324800
Volume                             : D:
VolumeId                           : \\?\Volume{73180747-4bf5-4292-86fd-8e8fc4d076c4}\
PSComputerName                     :
CimClass                           : ROOT/Microsoft/Windows/Deduplication:MSFT_DedupVolumeStatus
CimInstanceProperties              : {Capacity, FreeSpace, InPolicyFilesCount, InPolicyFilesSize...}
CimSystemProperties                : Microsoft.Management.Infrastructure.CimSystemProperties

and the configuration is pretty much at its default settings:

PS C:\> get-dedupvolume | select-object -Property *

ObjectId                 : \\?\Volume{a3f85da5-283e-4ed4-81c0-2c0fd163b1c3}\
UsageType                : Default
Capacity                 : 2198752718848
ChunkRedundancyThreshold : 100
DataAccessEnabled        : True
Enabled                  : True
ExcludeFileType          :
ExcludeFileTypeDefault   : {edb, jrs}
ExcludeFolder            :
FreeSpace                : 130142007296
MinimumFileAgeDays       : 0
MinimumFileSize          : 32768
NoCompress               : False
NoCompressionFileType    : {asf, mov, wma, wmv...}
OptimizeInUseFiles       : False
OptimizePartialFiles     : False
SavedSpace               : 0
SavingsRate              : 0
UnoptimizedSize          : 2068610711552
UsedSpace                : 2068610711552
Verify                   : False
Volume                   : E:
VolumeId                 : \\?\Volume{a3f85da5-283e-4ed4-81c0-2c0fd163b1c3}\
PSComputerName           :
CimClass                 : ROOT/Microsoft/Windows/Deduplication:MSFT_DedupVolume
CimInstanceProperties    : {Capacity, ChunkRedundancyThreshold, DataAccessEnabled, Enabled...}
CimSystemProperties      : Microsoft.Management.Infrastructure.CimSystemProperties

ObjectId                 : \\?\Volume{73180747-4bf5-4292-86fd-8e8fc4d076c4}\
UsageType                : Default
Capacity                 : 4397913337856
ChunkRedundancyThreshold : 100
DataAccessEnabled        : True
Enabled                  : True
ExcludeFileType          :
ExcludeFileTypeDefault   : {edb, jrs}
ExcludeFolder            :
FreeSpace                : 1530452013056
MinimumFileAgeDays       : 0
MinimumFileSize          : 32768
NoCompress               : False
NoCompressionFileType    : {asf, mov, wma, wmv...}
OptimizeInUseFiles       : False
OptimizePartialFiles     : False
SavedSpace               : 0
SavingsRate              : 0
UnoptimizedSize          : 2867461324800
UsedSpace                : 2867461324800
Verify                   : False
Volume                   : D:
VolumeId                 : \\?\Volume{73180747-4bf5-4292-86fd-8e8fc4d076c4}\
PSComputerName           :
CimClass                 : ROOT/Microsoft/Windows/Deduplication:MSFT_DedupVolume
CimInstanceProperties    : {Capacity, ChunkRedundancyThreshold, DataAccessEnabled, Enabled...}
CimSystemProperties      : Microsoft.Management.Infrastructure.CimSystemProperties

I have already tried detaching the respective disks from the Cluster Service role (i.e. making them stand-alone disks with "simple" volumes and NTFS file systems), disabling and re-enabling deduplication, and running optimization again, without any significant change to the overall result.

So why is it broken and how do I fix it?

Best Answer

My trouble seems to be that the data I am trying to deduplicate was copied from a NetApp filer exposing SMB storage. All of the files copied from there (via robocopy with /COPYALL) seem to have an extended attribute (EA) named ".NETAPP" attached, and according to the documentation, deduplication ignores files with extended attributes:

Files with extended attributes, encrypted files, files smaller than 32 KB, and reparse point files are not processed by deduplication.
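
As a rough way to see which of the other documented exclusions applies to a particular file, the sketch below checks size, encryption, and reparse-point status from PowerShell. The function name Get-DedupExclusionReason is made up for illustration, and the 32768-byte default simply mirrors the MinimumFileSize reported by Get-DedupVolume. It cannot see extended attributes, because the .NET file objects do not expose them; for those, `fsutil file queryEA <path>` should list any EAs, assuming your build of fsutil includes that subcommand.

```powershell
# Hypothetical helper: lists which documented exclusions (other than EAs) apply to a file.
function Get-DedupExclusionReason {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [long]$MinimumFileSize = 32768   # mirrors MinimumFileSize from Get-DedupVolume
    )

    $item = Get-Item -LiteralPath $Path -Force
    $reasons = @()

    if ($item.Length -lt $MinimumFileSize) {
        $reasons += "smaller than $MinimumFileSize bytes"
    }
    if ($item.Attributes -band [System.IO.FileAttributes]::Encrypted) {
        $reasons += 'encrypted'
    }
    if ($item.Attributes -band [System.IO.FileAttributes]::ReparsePoint) {
        $reasons += 'reparse point'
    }

    # Extended attributes are not visible here; check them separately, e.g.
    #   fsutil file queryEA <path>
    return ,$reasons   # the leading comma keeps an empty result an array
}
```

Calling `Get-DedupExclusionReason -Path D:\file1.txt` returns an empty array when none of these three exclusions applies, which narrows the remaining suspects down to extended attributes.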

Double-checking the hypothesis was easy. Just create two files with (nearly) identical content:

type C:\Windows\WindowsUpdate.log > d:\file1.txt
type C:\Windows\WindowsUpdate.log > d:\file2.txt

Then run the Optimization job and watch InPolicyFilesCount increase to 2. Others have reported similar problems in the NetApp user forums.
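
The same check can be scripted end to end. This is only a sketch: it assumes volume D: already has deduplication enabled with MinimumFileAgeDays set to 0, that WindowsUpdate.log is at least 32 KB, and that the file names file1.txt and file2.txt are free to use.

```powershell
# Record the in-policy count before adding the test files.
$before = (Get-DedupStatus -Volume D:).InPolicyFilesCount

# Create two fresh files with (nearly) identical content and no extended attributes.
cmd /c "type C:\Windows\WindowsUpdate.log > D:\file1.txt"
cmd /c "type C:\Windows\WindowsUpdate.log > D:\file2.txt"

# -Wait blocks until the optimization job has finished.
Start-DedupJob -Volume D: -Type Optimization -Wait | Out-Null
Update-DedupStatus -Volume D: | Out-Null

$after = (Get-DedupStatus -Volume D:).InPolicyFilesCount
"In-policy files: $before -> $after"
```

If the count rises for the freshly created files but stays at 0 for the copied data, the copied files themselves are being excluded by policy rather than the volume being misconfigured.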

I then just needed to find a way to remove the EAs from ~10 million files in decent time. Luckily, Veritas has published the EVEARemovalUtility to accomplish this very task, as their archival solution suffers from the incompatibility between EAs and the junction points it creates as pointers to archived data. The tool is a free download, and usage is straightforward:

  • EVEARemovalUtility.exe \\server\Share -d -s to create a list of files with their respective extended attributes
  • EVEARemovalUtility.exe \\server\Share -r -s to strip all files of EAs

Probably due to its age, the way it is installed, and the lack of updated documentation, it does not run on Server 2012 R2 out of the box and complains about missing DLLs. As a workaround, I ran it from a Server 2008 R2 machine.

After the EAs were removed, deduplication started working as expected.