I just installed Windows Server 2012. I have read about the new deduplication feature, but Server Manager keeps showing empty fields in the columns for deduplication rate and savings. How do I get the server to start looking for deduplication opportunities?
How to activate deduplication in Windows Server 2012
deduplication ntfs windows-server-2012
Related Solutions
If it uses the Microsoft defragmentation APIs it should be able to, as the deduplication chunks and metadata are stored as plain files on the disk. If you're paranoid about data loss, just disable the dedup jobs on the volume before running it. I asked Ran Kalach, part of the dedup team at Microsoft, about this, and he stated that there were no known data integrity issues with third-party defragmentation programs which use the Microsoft defragmentation APIs, although there could be performance issues due to the large sparse files that dedup uses.
I've been using MyDefrag because it is highly configurable and allows you to write scripts to determine file placement and other actions. The deduplication chunks and metadata are stored in ?:\System Volume Information\Dedup. Security permissions on this directory are set to only allow NT AUTHORITY\SYSTEM access, so if you want to be able to defragment these files you will need to run your defragmentation program under the NT AUTHORITY\SYSTEM account. This can be accomplished with Microsoft/SysInternals' psexec program. Just run psexec.exe -i -s -d C:\YourDefrag.exe
I disagree with the comments on your question claiming that defragmenting a deduplicated volume is of little use. First, not all files and directories are deduplicated. In a default configuration several file types are excluded; see the ExcludeFolder, ExcludeFileType and ExcludeFileTypeDefault properties returned by the Get-DedupVolume cmdlet. The administrator can configure this further; for instance, I exclude .MKV video files because of the low duplication rates in my environment. Files in excess of 1TB will not be deduplicated, even in Server 2016, and neither will files of 32KB or smaller. Second, free space fragmentation can decrease write performance and increase the chance that future files are fragmented. Third, even if a deduplicated file is inherently fragmented, a fragmented deduplication chunk will decrease performance further. Finally, by grouping dedup chunks together with a program like MyDefrag you can shorten garbage collection and scrubbing jobs, because the disks spend less time seeking.
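To see which exclusions apply to a volume, something like the following should work. This is only a sketch: it assumes the Deduplication module is installed and that E: (a hypothetical drive letter) already has dedup enabled.

```powershell
# Assumption: E: is a dedup-enabled data volume; change the drive letter to suit.
Import-Module Deduplication

# Show the exclusion-related properties mentioned above, plus the minimum file size.
Get-DedupVolume -Volume "E:" |
    Format-List Volume, ExcludeFolder, ExcludeFileType, ExcludeFileTypeDefault, MinimumFileSize

# Exclude .mkv files from optimization.
# Extensions are given without the leading period, and this sets the
# ExcludeFileType list as a whole, so include any extensions you already exclude.
Set-DedupVolume -Volume "E:" -ExcludeFileType "mkv"
```

Files that are already optimized stay that way after an exclusion is added; the setting only affects what later optimization jobs pick up.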
Also, the data itself will not be rehydrated if defragmentation is run, as the user-visible deduplicated files are stored as reparse points on disk - a special type of file similar to a junction or directory mount point.
You are right in your assumption that WSB has created shadow copies. It uses these copies to maintain a backup history.
If you still have backup versions (and thus shadow copies) of points in time before your dedup optimization job has run, you would not see any savings at all since the deduplicated blocks have not been freed - they are needed for an older, non-deduped version of the data which is still referenced by one of the shadow copies.
So the bottom line is that if you need deduplication savings to show, you need to remove all older shadow copies.
The increase you are seeing is probably not due to deduplication activity but simply due to the fact that additional backup jobs were run in the meantime, and older shadow copies are not deleted unless necessary (i.e. when the volume would otherwise not have enough space for the new backup).
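To check which shadow copies are holding on to the old blocks, vssadmin can list them from an elevated prompt. A rough sketch, assuming the backup target is E: (a hypothetical drive letter):

```powershell
# Assumption: E: is the volume holding the Windows Server Backup data.

# List the shadow copies that exist for the volume, with their creation times.
vssadmin list shadows /for=E:

# Show how much space the shadow copy storage area is using.
vssadmin list shadowstorage /for=E:

# Destructive: deletes the oldest shadow copy, and with it the oldest backup version.
# Left commented out on purpose; only run it if you can afford to lose that version.
# vssadmin delete shadows /for=E: /oldest
```

Keep in mind that every shadow copy you delete is a backup version you can no longer restore from.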
Best Answer
You need to install, enable, and configure the dedupe features.
Since there is overhead associated with deduplication, and there are scrubbing and other jobs that need to be scheduled, it's not automagically enabled on a default install.
You can do this from the GUI, or using PowerShell:
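For reference, here is a minimal sketch of the PowerShell route. It assumes a non-system data volume E: (a hypothetical drive letter; dedup cannot be enabled on the boot or system volume) and an elevated PowerShell session.

```powershell
# Install the Data Deduplication role service.
Import-Module ServerManager
Add-WindowsFeature -Name FS-Data-Deduplication

# Enable dedup on the data volume (assumption: E:).
Import-Module Deduplication
Enable-DedupVolume -Volume "E:"

# Optional: start an optimization job now instead of waiting for the schedule.
# Only files older than the volume's minimum file age are processed, so
# freshly copied files may not be touched on the first run.
Start-DedupJob -Volume "E:" -Type Optimization

# Check progress and the resulting savings.
Get-DedupJob
Get-DedupStatus -Volume "E:"
```

The rate and savings columns in Server Manager only fill in once an optimization job has actually processed files on the volume.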
Please note that the commands listed above are the bare minimum necessary to turn dedupe on. If you want to use this in production, read the entire link and understand all of the other components involved and configure them as necessary.