I have a string that I want to use as a filename, so I want to remove all characters that wouldn't be allowed in filenames, using Python.
I'd rather be strict than otherwise, so let's say I want to retain only letters, digits, and a small set of other characters like "_-.() ". What's the most elegant solution?
The filename needs to be valid on multiple operating systems (Windows, Linux and Mac OS) – it's an MP3 file in my library with the song title as the filename, and is shared and backed up between 3 machines.
Best Answer
You can look at how the Django framework creates a "slug" from arbitrary text. A slug is URL- and filename-friendly.
The Django text utils define a function, slugify(), that's probably the gold standard for this kind of thing. Essentially, their code is the following.
And the older version:
There's more, but I left it out, since it doesn't address slugification, but escaping.
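To illustrate the slug approach described above, here is a minimal sketch using only the standard library. It is modeled loosely on the general idea (normalize, strip disallowed characters, collapse separators) and is not a copy of Django's source. The names slugify_sketch and keep_allowed are hypothetical, and keep_allowed is an assumed alternative that applies the whitelist from the question directly.

```python
import re
import string
import unicodedata


def slugify_sketch(value: str) -> str:
    """Hypothetical helper: reduce arbitrary text to an ASCII slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    value = unicodedata.normalize("NFKD", str(value))
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    value = re.sub(r"[-\s]+", "-", value)
    return value.strip("-_")


# Whitelist taken from the question: letters, digits, and "_-.() ".
ALLOWED = set(string.ascii_letters + string.digits + "_-.() ")


def keep_allowed(value: str) -> str:
    """Hypothetical helper: keep only the characters in ALLOWED."""
    return "".join(ch for ch in value if ch in ALLOWED)


print(slugify_sketch("Héllo, World! (Live)"))      # hello-world-live
print(keep_allowed('AC/DC: "Back in Black".mp3'))  # ACDC Back in Black.mp3
```

Note that slugify_sketch also strips the dot, so for an MP3 library you would probably apply it to the song title only and append ".mp3" afterwards. Neither helper guards against an empty result or against names Windows reserves (such as CON or NUL), so those cases would need a separate check.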