SHDP provides basic configuration for HBase
through the hbase-configuration
namespace element (or its backing
HbaseConfigurationFactoryBean).
<!-- default bean id is 'hbaseConfiguration' that uses the existing 'hadoopCconfiguration' object --> <hdp:hbase-configuration configuration-ref="hadoopCconfiguration" />
The above declaration does more than easily create an HBase
configuration object; it will also manage the backing HBase connections:
when the application context shuts down, so will any HBase connections
opened - this behavior can be adjusted through the stop-proxy
and
delete-connection
attributes:
<!-- delete associated connections but do not stop the proxies --> <hdp:hbase-configuration stop-proxy="false" delete-connection="true"> foo=bar property=value </hdp:hbase-configuration>
Additionally, one can specify the ZooKeeper port used by the HBase server - this is especially useful when connecting to a remote instance (note one can fully configure HBase including the ZooKeeper host and port through properties; the attributes here act as shortcuts for easier declaration):
<!-- specify ZooKeeper host/port --> <hdp:hbase-configuration zk-quorum="${hbase.host}" zk-port="${hbase.port}">
Notice that like with the other elements, one can specify additional
properties specific to this configuration. In fact hbase-configuration
provides the same properties configuration knobs
as
#hadoop:config:properties[hadoop configuration]:
<hdp:hbase-configuration properties-ref="some-props-bean" properties-location="classpath:/conf/testing/hbase.properties"/>
One of the most popular and powerful feature in Spring Framework is the Data Access Object (or DAO) support. It makes dealing with data access technologies easy and consistent allowing easy switch or interconnection of the aforementioned persistent stores with minimal friction (no worrying about catching exceptions, writing boiler-plate code or handling resource acquisition and disposal). Rather than reiterating here the value proposal of the DAO support, we recommend the DAO section in the Spring Framework reference documentation
SHDP provides the same functionality for Apache HBase through its
org.springframework.data.hadoop.hbase
package: an HbaseTemplate
along with several callbacks such as TableCallback
, RowMapper
and
ResultsExtractor
that remove the low-level, tedious details for
finding the HBase table, run the query, prepare the scanner, analyze the
results then clean everything up, letting the developer focus on her
actual job (users familiar with Spring should find the class/method
names quite familiar).
At the core of the DAO support lies HbaseTemplate
- a high-level
abstraction for interacting with HBase. The template requires an HBase
configuration, once it’s set, the template is thread-safe
and can be reused across multiple instances at the same time:
// default HBase configuration <hdp:hbase-configuration/> // wire hbase configuration (using default name 'hbaseConfiguration') into the template <bean id="htemplate" class="org.springframework.data.hadoop.hbase.HbaseTemplate" p:configuration-ref="hbaseConfiguration"/>
The template provides generic callbacks, for executing logic against the tables or doing result or row extraction, but also utility methods (the so-called _one-liner_s) for common operations. Below are some examples of how the template usage looks like:
// writing to 'MyTable' template.execute("MyTable", new TableCallback<Object>() { @Override public Object doInTable(HTable table) throws Throwable { Put p = new Put(Bytes.toBytes("SomeRow")); p.add(Bytes.toBytes("SomeColumn"), Bytes.toBytes("SomeQualifier"), Bytes.toBytes("AValue")); table.put(p); return null; } });
// read each row from 'MyTable' List<String> rows = template.find("MyTable", "SomeColumn", new RowMapper<String>() { @Override public String mapRow(Result result, int rowNum) throws Exception { return result.toString(); } }));
The first snippet showcases the generic TableCallback
- the most
generic of the callbacks, it does the table lookup and resource cleanup
so that the user code does not have to. Notice the callback signature -
any exception thrown by the HBase API is automatically caught, converted
to Spring’s
DAO
exceptions and resource clean-up applied transparently. The second
example, displays the dedicated lookup methods - in this case find
which, as the name implies, finds all the rows matching the given
criteria and allows user code to be executed against each of them
(typically for doing some sort of type conversion or mapping). If the
entire result is required, then one can use ResultsExtractor
instead
of RowMapper
.
Besides the template, the package offers support for automatically
binding HBase table to the current thread through HbaseInterceptor
and
HbaseSynchronizationManager
. That is, each class that performs DAO
operations on HBase can be
wrapped
by HbaseInterceptor
so that each table in use, once found, is bound to
the thread so any subsequent call to it avoids the lookup. Once the call
ends, the table is automatically closed so there is no leakage between
requests. Please refer to the Javadocs for more information.