How does OpenStack’s object storage (Swift) know where to find objects?

I've been reading the architectural docs for swift (http://docs.openstack.org/developer/swift/overview_architecture.html). I half understand what 'rings' are for but I think I'm missing some details.

It seems to me that when an object is PUT into the storage, a client connects to a server which somehow finds a ring. It then writes the data to a device and somehow updates a ring (probably) to say what has been written and where (the docs say that storage servers don't modify rings themselves, so I'm guessing there's a central pool of servers that take care of updating rings and pushing them out to storage servers). It seems that sqlite databases are used to store the mapping between object IDs and locations. The sqlite databases are then replicated along with objects around the cluster.

For a GET operation, a client makes a request to a server which somehow knows where to find the database mapping that particular object ID to a physical location. It then connects using a proxy server to retrieve the object and returns it to the client.

If I've got this right, it seems to me that the sqlite 'ring' mapping object IDs to physical locations is replicated to at least 3 nodes, so not to the entire cluster. So how does the system know, when retrieving an object, where to find the 'ring' database that contains a mapping of object ID to location? Perhaps this is stored in an 'account' ring, but then the same question applies – what subsystem does the public-facing server connect to to find out which nodes contain the actual objects that need to be retrieved/deleted?

Best Answer

Here is a really good page describing where data is stored: https://julien.danjou.info/blog/2012/openstack-swift-consistency-analysis

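Basically, the ring divides the hash space into a fixed number of partitions, and each partition is assigned to devices. The ring itself is a static file that is built offline and copied to every proxy and storage node; it maps partitions to devices, not individual objects to locations. When an object is stored, a hash of its path (account, container, and object name) is computed and used to look up the partition it belongs to. The object is then placed on the devices assigned to that partition, in directories named after the partition and the object's hash.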
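To make the placement step concrete, here is a minimal sketch of how a path hash can be turned into a partition number and an on-disk directory. It is an illustration under assumptions, not Swift's actual code: `PART_POWER`, the `/srv/node` mount point, the `AUTH_demo` account and the helper names are all made up for the example, and real Swift also mixes a cluster-wide secret prefix/suffix from `swift.conf` into the hash, which is left out here.

```python
import hashlib
import struct

# Assumption for this sketch: the ring was built with 2**10 = 1024 partitions.
PART_POWER = 10
PART_SHIFT = 32 - PART_POWER


def object_hash(account, container, obj):
    """Hash the object's full path (simplified: no cluster-wide hash salt)."""
    path = f"/{account}/{container}/{obj}"
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def partition_for(hex_hash):
    """Use the top PART_POWER bits of the hash as the partition number."""
    top32 = struct.unpack(">I", bytes.fromhex(hex_hash)[:4])[0]
    return top32 >> PART_SHIFT


def storage_dir(device, account, container, obj):
    """Directory derived only from the partition and the hash (hypothetical layout)."""
    h = object_hash(account, container, obj)
    part = partition_for(h)
    # Last three hex characters of the hash are used as an intermediate directory.
    return f"/srv/node/{device}/objects/{part}/{h[-3:]}/{h}"


if __name__ == "__main__":
    print(storage_dir("sdb1", "AUTH_demo", "photos", "cat.jpg"))
```

Because the directory is a pure function of the object's path and the partition count, nothing needs to be recorded anywhere when the object is written.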

So when you need to retrieve the object, the proxy server hashes the path again, looks up the partition in its own copy of the ring, and goes directly to one of the storage devices that holds a replica, usually the nearest. No per-object location database is consulted at any point.
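The lookup can be sketched as a toy ring: a static table from (replica, partition) to device, which every proxy holds locally. Everything in this example is hypothetical (the four devices, their addresses, the tiny partition count and the round-robin assignment); a real ring is generated by Swift's ring-builder tooling and is far larger, but the shape of the lookup is the same.

```python
import hashlib

PART_POWER = 4   # assumption: 2**4 = 16 partitions, tiny on purpose
REPLICAS = 3

# Hypothetical devices in the cluster.
DEVICES = [
    {"id": 0, "ip": "10.0.0.1", "device": "sdb1"},
    {"id": 1, "ip": "10.0.0.2", "device": "sdb1"},
    {"id": 2, "ip": "10.0.0.3", "device": "sdb1"},
    {"id": 3, "ip": "10.0.0.4", "device": "sdb1"},
]

# Static table: REPLICA2PART2DEV[replica][partition] -> device id.
# Round-robin assignment is an illustration, not how Swift balances a ring.
REPLICA2PART2DEV = [
    [(part + replica) % len(DEVICES) for part in range(2 ** PART_POWER)]
    for replica in range(REPLICAS)
]


def get_nodes(account, container, obj):
    """Return (partition, devices holding the replicas) for an object path."""
    path = f"/{account}/{container}/{obj}"
    digest = hashlib.md5(path.encode("utf-8")).digest()
    part = int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)
    nodes = [DEVICES[row[part]] for row in REPLICA2PART2DEV]
    return part, nodes


if __name__ == "__main__":
    # A PUT and a later GET run the same computation and reach the same nodes.
    part, nodes = get_nodes("AUTH_demo", "photos", "cat.jpg")
    print(part, [n["ip"] for n in nodes])
```

Only the small partition-to-device table has to be present on every node; the mapping from an object to its partition is recomputed on each request.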

The page above explains in more detail.
