Hadoop – HBase distributed scanner

hadoop, hbase

In the "API usage example" on the "Getting started" page of the HBase documentation, there is an example of scanner usage:

Scanner scanner = table.getScanner(
    new String[]{"myColumnFamily:columnQualifier1"});

RowResult rowResult = scanner.next();
while (rowResult != null) {
    //...
    rowResult = scanner.next();
}

As I understand it, this code will be executed on a single machine (the name node), so none of the scanning and filtering work will be distributed; only data storage and data loading will be. How can I use a distributed scanner that works separately on each node?

What is the best practice for fast data filtering?
Thanks.

Best Answer

This question is old, but anyway: the scanner is just a cursor-like API for retrieving results on the client. To distribute the computation itself, use MapReduce jobs (hbase.mapred).
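To make the difference concrete, here is a minimal sketch of the idea in plain Java. It deliberately uses no HBase or Hadoop classes: each "region" is just an in-memory map, and the class name RegionFilterSketch and the method filterInParallel are hypothetical names made up for illustration. It only shows the shape of the approach (one filtering task per data partition, results merged afterwards); a real job would go through the MapReduce integration instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical illustration only: simulates "one filter task per region"
// with local threads. Nothing here is part of the HBase API.
class RegionFilterSketch {

    // Filters each region (row key -> cell value) in its own task and
    // returns the matching row keys, concatenated in region order.
    static List<String> filterInParallel(List<Map<String, String>> regions,
                                         Predicate<String> valueFilter) throws Exception {
        // newFixedThreadPool rejects 0 threads, so guard the empty case.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (Map<String, String> region : regions) {
                // "Map" phase: each task sees only its own region's rows.
                pending.add(pool.submit(() -> {
                    List<String> hits = new ArrayList<>();
                    for (Map.Entry<String, String> row : region.entrySet()) {
                        if (valueFilter.test(row.getValue())) {
                            hits.add(row.getKey());
                        }
                    }
                    return hits;
                }));
            }
            // Collect phase: merge the per-region results on the caller.
            List<String> result = new ArrayList<>();
            for (Future<List<String>> task : pending) {
                result.addAll(task.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> regionA = new TreeMap<>(Map.of("row1", "apple", "row2", "banana"));
        Map<String, String> regionB = new TreeMap<>(Map.of("row3", "blueberry", "row4", "cherry"));
        System.out.println(
            filterInParallel(List.of(regionA, regionB), v -> v.startsWith("b")));
        // prints [row2, row3]
    }
}
```

With a plain scanner, the loop body from the question runs in one client process and every row travels to it. In the sketch above, the filter runs next to each partition and only the matches are sent back, which is what a MapReduce job over the table gives you.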
