Linux – Disk / Filesystem as an LRU Cache (with transparent expiry)

cache, disk-cache, filesystems, linux

I have a use case where I want to use local disk as an LRU cache for (hot) files from a separate web service (something like S3). If a file doesn't exist on disk, it is read over the internet and written to the local disk, so that future requests can use the local copy instead of reading it from the original source.

Since the amount of data stored in the web service will exceed local storage, I want local files to be expunged automagically and transparently when a new file is written and the store is already full. If possible, I'd like to avoid a cron task that checks atime and expires files after a certain time, as there is no particular reason to expire cache items based on time if no new files are being written.
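
To make the desired behaviour concrete, here is a minimal sketch in Python of a size-bounded, read-through cache that evicts the least recently used files only when a new file has to be written. It is not transparent to the application, which is what I'm actually after, and it makes several assumptions: `DiskLRUCache`, `fetch` and `max_bytes` are hypothetical names, `fetch` stands in for whatever reads from the remote service, and recency is tracked in memory by a single process (after a restart the order is only approximated from file modification times).

```python
import hashlib
import os
from collections import OrderedDict
from pathlib import Path
from typing import Callable


class DiskLRUCache:
    """Read-through disk cache that evicts least recently used files on write."""

    def __init__(self, root: str, max_bytes: int, fetch: Callable[[str], bytes]):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.max_bytes = max_bytes
        self.fetch = fetch  # hypothetical: reads one object from the remote service
        # Maps file name -> size in bytes, ordered from least to most recently used.
        self.index: "OrderedDict[str, int]" = OrderedDict()
        # Rebuild the index from disk; modification time approximates recency.
        existing = [p for p in self.root.iterdir()
                    if p.is_file() and not p.name.endswith(".tmp")]
        for p in sorted(existing, key=lambda p: p.stat().st_mtime_ns):
            self.index[p.name] = p.stat().st_size
        self.used = sum(self.index.values())

    def _name(self, key: str) -> str:
        # Hash the key so that any string maps to a safe file name.
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def get(self, key: str) -> bytes:
        name = self._name(key)
        path = self.root / name
        if name in self.index:
            self.index.move_to_end(name)  # cache hit: mark as most recently used
            return path.read_bytes()
        data = self.fetch(key)  # cache miss: read from the original source
        if len(data) > self.max_bytes:
            return data  # too large to ever fit; serve it without caching
        # Evict only when the new file would not fit, oldest entries first.
        while self.used + len(data) > self.max_bytes:
            victim, size = self.index.popitem(last=False)
            (self.root / victim).unlink(missing_ok=True)
            self.used -= size
        # Write to a temporary name and rename, so readers never see a partial file.
        tmp = self.root / (name + ".tmp")
        tmp.write_bytes(data)
        os.replace(tmp, path)
        self.index[name] = len(data)
        self.used += len(data)
        return data
```

With something like `cache = DiskLRUCache("/var/cache/objects", 50 * 2**30, fetch_from_service)` (path, size and function are placeholders), nothing is ever expired by time: files are removed only when a write needs the space.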

I've looked for something like tmpfs that would let me implement this as a purely disk-backed cache (on SSDs), as transparently as possible to the application that uses the cache, but I've been unable to find anything that provides this functionality (similar to what CacheFS does for NFS, but in a more general way).

Best Answer

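You could try nginx's file caching for that, if you are OK with an HTTP interface. The `proxy_cache_path` directive accepts a `max_size` parameter, and once the cache grows past that size nginx removes the least recently used data. See the nginx content-caching documentation.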
