Linux – How to delete millions of files without disturbing the server

ext4, filesystems, linux, Ubuntu

I'd like to delete an nginx cache directory, which I quickly purged by:

mv cache cache.bak
mkdir cache
service nginx restart

Now I have a cache.bak folder which has 2 million files. I'd like to delete it, without disturbing the server.

A simple rm -rf cache.bak trashes the server: even the simplest HTTP response takes 16 seconds while rm is running, so I cannot do that.

I tried ionice -c3 rm -rf cache.bak, but it didn't help. The server has an HDD, not an SSD; on an SSD this would probably not be a problem.

I believe the best solution would be some kind of throttling, like nginx's built-in cache manager does.

How would you solve this? Is there any tool which can do exactly this?

ext4 on Ubuntu 16.04

Best Answer

Make a bash script like this:

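#!/bin/bash
# Delete every file passed as an argument, then pause to let other I/O through.
# "$@" keeps each filename as a separate argument; "$*" would join them into one.
rm -- "$@"
sleep 0.5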

Save it as deleter.sh, for example, and run chmod u+x deleter.sh to make it executable.

This script deletes all files passed to it as arguments, and then sleeps 0.5 seconds.

Then, you can run

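find cache.bak -type f -print0 | xargs -0 -n 5 ./deleter.sh

This command lists every regular file in cache.bak and passes the filenames to the delete script five at a time. The -type f test keeps directories out of the list, because rm without -r cannot remove them; once the files are gone, the remaining empty directory tree can be removed with rm -r cache.bak. The ./ prefix assumes deleter.sh is in the current directory rather than on your PATH.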
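Before deleting anything, it may be worth checking how xargs will group the filenames. The sketch below is a dry run that assumes GNU find and xargs (as shipped with Ubuntu); preview_batches is a hypothetical helper name, not part of the original answer, and it only prints the rm commands instead of running them.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the rm command for each batch, delete nothing.
# Usage: preview_batches DIR [BATCH_SIZE]
preview_batches() {
    local dir="$1" batch="${2:-5}"
    # -print0 / -0 keep filenames with spaces intact; -r skips the run if no files exist.
    find "$dir" -type f -print0 | xargs -0 -r -n "$batch" echo rm --
}
```

Running preview_batches cache.bak 5 | head prints the first few batches, one rm command per line, so the batch size can be confirmed before the real run.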

You can adjust how many files are deleted per batch (the -n value passed to xargs) and how long the pause between batches is (the sleep value in the script).
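The batch size and the delay could also be folded into a single function, as a rough sketch. The name throttled_delete and its arguments are hypothetical, and the code assumes GNU find, xargs and sleep (fractional sleep values and the -empty and -delete tests are not guaranteed elsewhere). It follows the same batch-then-sleep idea as the script above, then removes the emptied directories.

```shell
#!/bin/bash
# Hypothetical helper: delete DIR in small batches, pausing after each batch.
# Usage: throttled_delete DIR [BATCH_SIZE] [DELAY_SECONDS]
throttled_delete() {
    local dir="$1" batch="${2:-5}" delay="${3:-0.5}"
    # Pass everything that is not a directory to rm, $batch names at a time.
    # Inside sh -c, $0 is the delay and "$@" holds the filenames of the batch.
    find "$dir" ! -type d -print0 |
        xargs -0 -r -n "$batch" sh -c 'rm -f -- "$@"; sleep "$0"' "$delay"
    # Remove the now-empty directories, deepest first, ending with DIR itself.
    find "$dir" -depth -type d -empty -delete
}
```

For example, throttled_delete cache.bak 5 0.5 matches the settings used above. At that rate 2 million files need roughly 400,000 batches, so more than two days of sleeping alone; a larger batch or a shorter delay may be a better trade-off once you have checked how the server responds.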