Aggregation queries in Cassandra CQL

We are currently evaluating Cassandra as the data store for an analytical application. The plan is to load raw data into Cassandra and then run mainly aggregation queries over it. Looking at CQL, it does not seem to support some traditional SQL features, such as:

Typical aggregation functions such as average, sum, count distinct, etc.
GROUP BY and HAVING clauses

I did not find anything in the documentation that helps achieve the above. I also checked whether there are any hooks for providing such functions as extensions, like in-database map-reduce in MongoDB or user-defined functions in relational databases.

People do talk about the paid DataStax Enterprise edition, but even that achieves this not via plain Cassandra but through separate components such as Hadoop, Hive, Pig, etc. There are also suggestions to do the needed pre-aggregations before loading data into the DB, since Cassandra writes are fast.
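To make concrete what I understand by the pre-aggregation suggestion, here is a minimal sketch in plain Python. It only rolls raw events up into one summary row per group before anything is written; the Cassandra write itself is left out. The function name `pre_aggregate`, the field names (`day`, `user`, `amount`) and the sample events are all made up for illustration and are not from any Cassandra API.

```python
from collections import defaultdict


def pre_aggregate(events, key_field, value_field, distinct_field):
    """Roll raw events up into one summary row per value of key_field."""
    # Running totals per group; the set tracks values for a distinct count.
    groups = defaultdict(lambda: {"count": 0, "sum": 0.0, "distinct": set()})
    for event in events:
        group = groups[event[key_field]]
        group["count"] += 1
        group["sum"] += event[value_field]
        group["distinct"].add(event[distinct_field])

    # Derive the final aggregates (average, count distinct) for each group.
    return {
        key: {
            "count": group["count"],
            "sum": group["sum"],
            "avg": group["sum"] / group["count"],
            "count_distinct": len(group["distinct"]),
        }
        for key, group in groups.items()
    }


# Hypothetical raw events, for illustration only.
events = [
    {"day": "2013-05-01", "user": "a", "amount": 10.0},
    {"day": "2013-05-01", "user": "b", "amount": 30.0},
    {"day": "2013-05-01", "user": "a", "amount": 20.0},
    {"day": "2013-05-02", "user": "c", "amount": 5.0},
]

# One summary row per day; each row would then be stored as its own record.
summary = pre_aggregate(events, "day", "amount", "user")
```

Every new grouping or metric we want later would mean another rollup like this, plus its own table.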

This looks like too much overhead, at least for the basic functionality we need. Am I missing something fundamental here?

I would highly appreciate any help on this.
