Linux – How to remove a file where the file name has utf-8 character issues

bash linux rm

I want to remove a file from a server using the rm command in bash.

A sample file name is Test_ Mürz.jgp.

How would one go about removing files with such character issues in their names on a large scale, especially when you don't know the position of the affected characters?

Best Answer

For single files, or small sets of files, if wildcard globbing doesn't allow you the precision you feel you need, you can combine ls -i (or stat, if available) and find -inum.
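For comparison, the wildcard route might look like the sketch below. It is only an illustration: it works in a scratch directory created with mktemp, the file name is the sample from the question, and the -b option of ls is assumed to be available (it is in GNU coreutils).

```shell
# Sketch of the wildcard approach, run in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'Test_ Mürz.jgp'          # stand-in for the problem file

# Preview what the pattern matches before deleting anything.
# The * covers the part of the name that is hard to type.
# -b (GNU ls) prints non-graphic bytes as escapes so they are visible.
ls -b -- Test_*rz.jgp

# Delete the match; -- stops option parsing in case a name starts with "-".
rm -- Test_*rz.jgp
```

If the pattern also matches files you want to keep, this is where the inode-based approach becomes the more precise tool.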

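For safety, when using find's -inum, always make sure to also use -xdev to constrain the search to a single file system. Inode numbers are only unique within one file system, so without -xdev find may match an unrelated file with the same inode number on another mounted file system.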

For example:

~$ ls -i myweirdfile
183435818 myweirdfile
~$ find . -xdev -inum 183435818 -exec rm -i '{}' ';'
rm: remove regular file `./myweirdfile'? y
~$
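The example above assumes the name can be typed on the command line. When it cannot, listing the whole directory with ls -i shows the inode number next to every entry, and the number can be copied from there. A minimal sketch, assuming GNU ls (for -b) and using a scratch directory with made-up file names:

```shell
# Sketch: read inode numbers from a directory listing instead of typing the name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'Test_ Mürz.jgp' plainfile   # placeholder files for the demonstration

# -i prints the inode number, -1 gives one entry per line,
# -b (GNU ls) escapes non-graphic bytes so odd names stay readable.
listing=$(ls -i1b)
printf '%s\n' "$listing"
```

The number in the first column of the relevant line is what goes after -inum.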

Alternatively, in a single invocation (this relies on the -c option of GNU coreutils stat, which should be a fairly safe assumption on Linux, and uses sh-style command substitution):

~$ find . -xdev -inum $(stat -c '%i' 'myweirdfile') -exec rm -i '{}' ';'
rm: remove regular file `./myweirdfile'? y
~$

You can also use find's -delete action rather than -exec'ing rm. For really weird file names, this may be safer. Use -print or -ls first to verify which file will be deleted. Something like the following:

~$ ls -i myweirdfile
183435818 myweirdfile
~$ find . -xdev -inum 183435818 -print
./myweirdfile
~$ find . -xdev -inum 183435818 -delete
~$ find . -xdev -inum 183435818 -print
~$

Do keep in mind that hard links share one inode number across multiple names, so find -inum matches, and deletes, every name for that inode under the search path. Make sure there is no additional name you want to keep (unless deleting all of them is what you intend).
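One way to check for extra names beforehand is to look at the link count. The sketch below assumes GNU coreutils stat (the %h format prints the number of hard links) and runs in a scratch directory; myweirdfile and other-name are placeholders.

```shell
# Sketch: check the hard link count before deleting by inode number.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch myweirdfile
ln myweirdfile other-name            # second name for the same inode

inum=$(stat -c '%i' -- myweirdfile)  # inode number
links=$(stat -c '%h' -- myweirdfile) # number of hard links (names)
echo "inode $inum has $links name(s)"

# Show every name for that inode below the current directory,
# staying on one file system.
find . -xdev -inum "$inum" -print
```

A link count above 1 means other names exist somewhere on the same file system, possibly outside the directory being searched.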