FTP – Best way to share storage between multiple computers

ftp, network-attached-storage, nfs, storage

I need to provide shared storage for multiple application server nodes in a cluster in a Linux environment. In other words, each of the nodes in my cluster needs to be able to read/write files to a single centralized place (a NAS device) on the same network segment. Note that the files are binary, do not need to be versioned (a VCS would be overkill), and are not intended to be stored in a database, only in a file system.

I've narrowed my options to these:

  1. Transfer the files to/from the NAS using FTP/FTPS/FTPFS. Each node would have access to an FTP client, and an FTP server must be set up (see the sketch after this list)
  2. Share a directory on the NAS, mount it on each node, and use a protocol such as NFS/SMB to perform the actual transfer of files
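To illustrate option 1, here is a minimal sketch of mounting the NAS over FTP with curlftpfs (FUSE); the host name, user, and paths are placeholders, not anything from my actual setup:

    # Mount an FTP server as a local directory via FUSE (curlftpfs).
    # nas.example.com, appuser, and the paths are hypothetical.
    sudo apt-get install curlftpfs
    mkdir -p /mnt/nas-ftp
    curlftpfs ftp://appuser:secret@nas.example.com/ /mnt/nas-ftp

    # Unmount when done:
    fusermount -u /mnt/nas-ftp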

Which is the best practice for this kind of situation? What is the fastest option? Any caveats or problems (w.r.t. availability, concurrency, security, performance, required configuration, etc.) you can think of? Any advice you can provide will be greatly appreciated.

Best Answer

You want one central place to save files so multiple workstations can work with the files from that system.

You'd set up a server to hold the files, and then share them out.

If you're using Linux on the file server, you can use NFS, or you can set up Samba, create a file share, and mount it from each node.
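For example, a minimal NFS sketch, assuming the file server exports /srv/share to the 192.168.1.0/24 segment and is reachable as nas.example.com (all names hypothetical):

    # On the file server: export the directory (entry in /etc/exports).
    /srv/share  192.168.1.0/24(rw,sync,no_subtree_check)

    # Reload the export table:
    sudo exportfs -ra

    # On each node: mount the export.
    sudo mount -t nfs nas.example.com:/srv/share /mnt/share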

Or you can set up sshfs and use FUSE to mount the remote directory.
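For instance, a hedged sshfs sketch (user, host, and paths are placeholders); all it needs is SSH access to the file server:

    # Mount a remote directory over SSH with sshfs (FUSE).
    sudo apt-get install sshfs
    mkdir -p /mnt/share
    sshfs appuser@nas.example.com:/srv/share /mnt/share

    # Unmount with:
    fusermount -u /mnt/share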

There is no best practice for this other than choosing a storage connection option that suits your environment; if it's all Linux, then NFS would work fine. Or run NFS and Samba. Nothing says you can't run both. And you have to factor in how you're authenticating and what kind of security you need for the files.

Concurrency issues? Well, how are you USING the files? Will an application be trying to read and write at the same time as an application on another server? That's not a file-sharing protocol's problem so much as an issue with your software's architecture.
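If two nodes may write the same file, the application has to coordinate that itself. A minimal sketch using an advisory lock with flock(1) follows; the paths are hypothetical, and note that advisory locks over NFS depend on the lock manager working (NFSv4 handles this better than older setups):

    # Take an exclusive advisory lock before writing a shared file.
    # Both nodes must use the same lock file for this to coordinate them.
    (
      flock -x 200                          # block until we hold the lock
      cp /tmp/new-data.bin /mnt/share/data.bin
    ) 200>/mnt/share/data.bin.lock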

Security? What kind of security do you need? NTFS-style fine-grained ACLs? End-to-end encryption on the wire? Standard POSIX permissions? You don't say what you need.

Performance? They should all work well enough for... well, you don't say. What kind of performance do you need? What kind of disk subsystem will you be running? Gigabit Ethernet? A dedicated RAID controller with a large cache? Are you teaming the NICs? What kind of NICs? Even a crappy disk subsystem will work if you're not pushing it, and a fast disk subsystem can choke when delivering data simultaneously to 25 servers pushing at full throttle.

And what do you mean by availability? This is an architecture problem again, and it depends on your application. If you can cluster your storage, you get more availability. If you're just thinking of RAID, that's fine until your power supply dies. Unless you have dual supplies. Then it dies because you have them both on the same UPS. Or the motherboard fails. Or your one RAID controller dies. How much redundancy do you need? And on what budget?

To answer the question as given: set up Samba and NFS, see which one mounts most easily on your nodes, and go with that.
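Once you've picked one, a sketch of making the mount persistent in /etc/fstab (server name, share path, and credentials file are placeholders; use one line or the other, not both):

    # NFS variant:
    nas.example.com:/srv/share  /mnt/share  nfs   defaults,_netdev  0  0

    # Samba/CIFS variant:
    //nas.example.com/share     /mnt/share  cifs  credentials=/etc/smbcred,_netdev  0  0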
