Downsides of storing binary data in Riak

database nosql riak scalability

What are the problems, if any, of storing binary data in Riak?

Does it affect the maintainability and performance of the cluster?

How would performance compare between using Riak for this and using a distributed file system?

Best Answer

Adding to @Oscar-Godson's excellent answer: you're likely to run into problems with values much larger than 50 MB. Bitcask is best suited to values of up to a few KB. If you're storing large values, you may want to consider an alternative storage backend, such as innostore.

I don't have experience with storing binary values, but we have a medium-sized cluster in production (5 nodes, on the order of 100M values, tens of TB) and we're seeing frequent errors when inserting and retrieving values that are hundreds of KB in size. Performance in this case is inconsistent: sometimes it works, sometimes it doesn't. So if you're going to test, test at scale.

We're also seeing problems with large values when running map-reduce queries: they simply time out. However, that may be less relevant to binary values (as @Matt-Ranney mentioned).

Also see @Stephen-C's answer here
