Quick reference for a standalone HBase installation on a Mac. These are quick working notes for reference, not full documentation.

Apache HBase Reference Guide: https://hbase.apache.org/book.html

Prerequisite: JDK needs to be installed

Download Apache HBase: https://hbase.apache.org/downloads.html (download the bin archive for the stable version), for example https://www.apache.org/dyn/closer.lua/hbase/2.4.9/hbase-2.4.9-bin.tar.gz

Extract and install: tar -xvzf hbase-2.4.9-bin.tar.gz

Standalone mode using local filesystem

A standalone instance of HBase has all the HBase daemons included:

  • Master
  • RegionServers
  • ZooKeeper

All three daemons run in a single JVM and persist to the local filesystem.

Configure:

Check JAVA_HOME: echo $JAVA_HOME

`/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre`

Update conf/hbase-env.sh in HBase installation directory with:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre
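
Before writing a path into hbase-env.sh it can help to confirm that it really contains a java binary. A minimal sketch, assuming a POSIX-style shell; check_java_home is a hypothetical helper name, not something shipped with HBase:

```shell
# Sketch: sanity-check a candidate JAVA_HOME before using it in conf/hbase-env.sh.
# check_java_home is a hypothetical helper, not part of HBase.
check_java_home() {
    dir=${1:-}
    # Succeeds only if the directory contains an executable bin/java
    if [ -n "$dir" ] && [ -x "$dir/bin/java" ]; then
        echo "ok: $dir"
    else
        echo "not a usable JAVA_HOME: $dir" >&2
        return 1
    fi
}

# Example (not run here): check_java_home "$JAVA_HOME"
# On macOS, /usr/libexec/java_home prints the home of the default installed JDK.
```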

Update ~/.zshrc file:

export PATH="$PATH:<install path>/hbase-2.4.9/bin"

Start HBase: start-hbase.sh

Received the following warnings:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/...../hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/...../hbase-2.4.9/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /..../hbase-2.4.9/bin/../logs/hbase-.....local.out

Verify that exactly one HMaster process is running. In standalone mode HBase runs all three daemons (HMaster, HRegionServer and ZooKeeper) within a single JVM, so jps lists only HMaster.

jps

80356 Jps
80156 HMaster
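
The same check can be scripted instead of read by eye. A small sketch, assuming jps prints one "pid name" pair per line as in the output above; has_daemon is a hypothetical helper name:

```shell
# Sketch: succeed if a given daemon name appears in jps output read from stdin.
# has_daemon is a hypothetical helper name.
has_daemon() {
    # jps prints "<pid> <main class>" per line; match the second field exactly
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example (not run here): jps | has_daemon HMaster && echo "HMaster is up"
```

Matching the whole second field avoids false positives from process names that merely contain the string.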

HBase Web UI at http://localhost:16010 was not accessible

Fix for the warnings and the inaccessible Web UI:

In ~/.zshrc, comment out the Hadoop entry in PATH: #export PATH="$PATH:/Users/shouvik/opt/hadoop-3.3.1/bin"
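
To see whether a Hadoop directory is still on PATH, the entries can be listed one per line. This is only a sketch; hadoop_path_entries is a hypothetical helper. A plausible explanation for the warning (an assumption, not confirmed in these notes) is that the hbase launcher picks up Hadoop's jars when it finds hadoop on PATH, which adds the second SLF4J binding shown earlier.

```shell
# Sketch: list PATH entries that mention "hadoop", one per line.
# hadoop_path_entries is a hypothetical helper; it takes the PATH string as $1.
# Returns non-zero when no entry matches.
hadoop_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -i 'hadoop'
}

# Example (not run here): hadoop_path_entries "$PATH"
```

Changes to ~/.zshrc only apply to new shells, or after running source ~/.zshrc in the current one.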

Start HBase: bin/start-hbase.sh

The warning is gone!

...
running master, logging to /Users/shouvik/opt/hbase-2.4.9/bin/../logs/hbase-.......local.out
...

jps

81571 HMaster
81732 Jps

HBase Web UI at http://localhost:16010 is now accessible (it opens http://localhost:16010/master-status).

Connect to HBase

Run HBase shell: hbase shell

Version 2.4.9, ........
Took 0.0031 seconds                                                                                                          
hbase:001:0> 

Works!!

Exploring the HBase shell commands:

Quick commands to get started, taken from the reference guide.

Create a table: the table name and at least one column family must be provided.

hbase:003:0> create 'test', 'cf'

Created table test
Took 1.2568 seconds                                                                                                          
=> Hbase::Table - test
hbase:004:0> list 'test'
TABLE                                                                                                                        
test                                                                                                                         
1 row(s)
Took 0.0358 seconds                                                                                                          
=> ["test"]

hbase:005:0> describe 'test'

Table test is ENABLED                                                                                                        
test                                                                                                                         
COLUMN FAMILIES DESCRIPTION                                                                                                  
{NAME => 'cf', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

1 row(s)
Quota is disabled
Took 0.1286 seconds

hbase:006:0> put 'test', 'row1', 'cf:a', 'value1'

Took 0.1017 seconds

hbase:007:0> put 'test', 'row2', 'cf:b', 'value2'

Took 0.0069 seconds

hbase:008:0> put 'test', 'row3', 'cf:c', 'value3'

Took 0.0150 seconds                                                                                                          

hbase:009:0> scan 'test'

ROW                              COLUMN+CELL                                                                                 
row1                            column=cf:a, timestamp=2022-02-09T15:53:39.281, value=value1                                
row2                            column=cf:b, timestamp=2022-02-09T15:54:02.022, value=value2                                
row3                            column=cf:c, timestamp=2022-02-09T15:54:14.169, value=value3                                
3 row(s)
Took 0.0546 seconds                                                                                                          
hbase:010:0> get 'test', 'row1'
COLUMN                           CELL                                                                                        
cf:a                            timestamp=2022-02-09T15:53:39.281, value=value1                                             
1 row(s)
Took 0.0136 seconds                                                                                                          

hbase:011:0> disable 'test'

Took 0.3587 seconds                                                                                                          

hbase:012:0> list 'test'

TABLE                                                                                                                        
test                                                                                                                         
1 row(s)
Took 0.0049 seconds                                                                                                          
=> ["test"]

hbase:013:0> put 'test', 'row4', 'cf:d', 'value4'

ERROR: Table test is disabled!

For usage try 'help "put"'

Took 0.4446 seconds                                                                                                          

hbase:014:0> enable 'test'

Took 0.6702 seconds                                                                                                          
hbase:015:0> put 'test', 'row4', 'cf:d', 'value4'
Took 0.0094 seconds                                                                                                          
hbase:016:0> scan 'test'
ROW                              COLUMN+CELL                                                                                 
row1                            column=cf:a, timestamp=2022-02-09T15:53:39.281, value=value1                                
row2                            column=cf:b, timestamp=2022-02-09T15:54:02.022, value=value2                                
row3                            column=cf:c, timestamp=2022-02-09T15:54:14.169, value=value3                                
row4                            column=cf:d, timestamp=2022-02-09T15:56:33.403, value=value4                                
4 row(s)
Took 0.0160 seconds                                                                                                          

hbase:017:0> disable 'test'

Took 0.3560 seconds                                                                                                          
hbase:018:0> drop test
Traceback (most recent call last):
ArgumentError (wrong number of arguments (given 0, expected 2..3))

Note: the table name must be quoted. Unquoted, test is evaluated as Ruby's built-in Kernel#test method, which expects 2 to 3 arguments, hence the ArgumentError.

hbase:019:0> drop 'test'

Took 0.1380 seconds                                                                                                          

hbase:020:0> list

TABLE                                                                                                                        
0 row(s)
Took 0.0116 seconds                                                                                                          
=> []
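
The session above can also be replayed without typing each command. A rough sketch, assuming hbase is on PATH, HBase is running, and the shell's non-interactive mode (hbase shell -n) is used; smoke_test_commands is a hypothetical helper that only prints the commands:

```shell
# Sketch: print the HBase shell commands from the session above for a given
# table and column family. smoke_test_commands is a hypothetical helper.
smoke_test_commands() {
    table=$1
    cf=$2
    cat <<EOF
create '${table}', '${cf}'
put '${table}', 'row1', '${cf}:a', 'value1'
get '${table}', 'row1'
scan '${table}'
disable '${table}'
drop '${table}'
EOF
}

# Example (not run here; assumes hbase is on PATH and HBase is running):
#   smoke_test_commands test cf | hbase shell -n
```

Printing the commands first makes it easy to review them before piping them into the shell, since the last two lines delete the table.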

Standalone mode with HBase over HDFS

To run in standalone mode but persist to HDFS instead of the local filesystem:

Run jps to check that nothing is running. Stop HBase if it is running.

stop-hbase.sh

stopping hbase..............

jps

86473 Jps

To configure this standalone variant, edit hbase-site.xml: set hbase.rootdir to point at a directory in your HDFS instance, but keep hbase.cluster.distributed set to false.

Update conf/hbase-site.xml

<property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
</property>

<property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
</property>

Remove the existing configuration for hbase.tmp.dir and hbase.unsafe.stream.capability.enforce, i.e. delete these two properties:

<property>
    <name>hbase.tmp.dir</name>
    <value>./tmp</value>
</property>

<property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
</property>
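
Putting the settings above together, a minimal hbase-site.xml with just the two remaining properties could be generated as follows. This is a sketch under the assumption that no other properties are needed; write_hbase_site is a hypothetical helper that prints the file to stdout:

```shell
# Sketch: print a minimal hbase-site.xml. write_hbase_site is a hypothetical helper.
# $1: value for hbase.cluster.distributed (true or false)
# $2: value for hbase.rootdir
write_hbase_site() {
    cat <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>$1</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}

# Example (not run here; back up the existing file first):
#   write_hbase_site false hdfs://localhost:9000/hbase > conf/hbase-site.xml
```

The host and port in hbase.rootdir should match the fs.defaultFS value of the local Hadoop installation; hdfs://localhost:9000 is the value used in these notes.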

Start Hadoop

sbin/start-dfs.sh

Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [Shouviks-MacBook-Pro.local]

jps

87649 NameNode
87761 DataNode
87991 Jps
87898 SecondaryNameNode

Verify the NameNode Web UI: http://localhost:9870/dfshealth.html#tab-overview

Start HBase

bin/start-hbase.sh

The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:.....
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes
127.0.0.1: Warning: Permanently added '127.0.0.1' (ECDSA) to the list of known hosts.
127.0.0.1: running zookeeper, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-zookeeper-....local.out
running master, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-master-....local.out
: running regionserver, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-....local.out

Issue faced:

jps

3075 DataNode
2964 NameNode
4134 Jps
3214 SecondaryNameNode

jps should list the HMaster process, but HMaster was not running.

Solution:

  • Cleaned up the Hadoop state: stopped DFS, deleted /tmp/hadoop-dir, formatted the NameNode, and started Hadoop again
  • If a Hadoop installation exists and its path is set in the .zshrc file, comment it out to prevent the SLF4J/Log4j binding warning
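
The cleanup in the first bullet can be written down as a dry run that only prints the commands, to be reviewed before anything is executed. A sketch, assuming the commands are run from the Hadoop installation directory; print_hadoop_reset is a hypothetical helper and the temp directory is passed in rather than guessed:

```shell
# Sketch: print (not run) the cleanup steps, relative to the Hadoop install dir.
# print_hadoop_reset is a hypothetical helper; $1 is the Hadoop temp directory.
print_hadoop_reset() {
    echo "sbin/stop-dfs.sh"
    echo "rm -rf $1"
    echo "bin/hdfs namenode -format"
    echo "sbin/start-dfs.sh"
}

# Example (not run here): print_hadoop_reset /tmp/hadoop-dir
# Formatting the NameNode erases all HDFS metadata, so review before running.
```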

bin/start-hbase.sh

127.0.0.1: running zookeeper, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-zookeeper-.....local.out
running master, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-master-.....local.out
: running regionserver, logging to /..../opt/hbase-2.4.9/bin/../logs/hbase-shouvik-regionserver-......local.out

jps

3075 DataNode
2964 NameNode
4084 HMaster
4134 Jps
3214 SecondaryNameNode

The HMaster process is running after the cleanup.

Exploring HBase and data directories

bin/hdfs dfs -ls /

drwxr-xr-x   - shouvik supergroup          0 2022-02-09 21:08 /hbase

hbase shell

hbase:001:0> create 'test', 'cf'

Created table test
Took 1.9204 seconds                                                                                                          
=> Hbase::Table - test

hbase:002:0> put 'test', 'row1', 'cf:a', 'value1'

hbase:003:0> put 'test', 'row2', 'cf:b', 'value2'

hbase:004:0> put 'test', 'row3', 'cf:c', 'value3'

hbase:007:0> scan 'test'

ROW                              COLUMN+CELL                                                                                 
row1                            column=cf:a, timestamp=2022-02-09T21:24:50.673, value=value1                                
row2                            column=cf:b, timestamp=2022-02-09T21:24:58.527, value=value2                                
row3                            column=cf:c, timestamp=2022-02-09T21:25:04.519, value=value3                                
3 row(s)
Took 0.0625 seconds    

The following directories exist under /hbase after table creation, which confirms that HBase is using HDFS:

bin/hadoop fs -ls /hbase

/hbase/.hbck
/hbase/.tmp
/hbase/MasterData
/hbase/WALs
/hbase/archive
/hbase/corrupt
/hbase/data
/hbase/hbase.id
/hbase/hbase.version
/hbase/mobdir
/hbase/oldWALs
/hbase/staging

Pseudo-Distributed mode with HBase over HDFS

In pseudo-distributed mode HBase runs each of the following daemons as a separate process on a single host:

  • HMaster
  • HRegionServer
  • ZooKeeper

Stop HBase if running: bin/stop-hbase.sh

Update conf/hbase-site.xml:

    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>

    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
    </property>

Note: HBase creates /hbase when it starts, so there is no need to create it manually.

Start HBase: bin/start-hbase.sh

jps

37088 DataNode
**38305 HRegionServer**
38434 Jps
36978 NameNode
37226 SecondaryNameNode
**38139 HMaster**
38063 HQuorumPeer
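
In this mode three separate HBase processes should show up, so the check can be extended to report any that are missing. A sketch, assuming the process names seen in the jps output above (HMaster, HRegionServer, HQuorumPeer); missing_daemons is a hypothetical helper name:

```shell
# Sketch: report which of the expected pseudo-distributed daemons are missing
# from jps output on stdin. missing_daemons is a hypothetical helper.
missing_daemons() {
    awk '
        { seen[$2] = 1 }                       # second field is the process name
        END {
            n = split("HMaster HRegionServer HQuorumPeer", want, " ")
            for (i = 1; i <= n; i++) {
                if (!(want[i] in seen)) { print want[i]; missing = 1 }
            }
            exit missing + 0                   # non-zero if anything is missing
        }'
}

# Example (not run here): jps | missing_daemons || echo "some daemons are down"
```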

hdfs dfs -ls /

drwxr-xr-x   - shouvik supergroup          0 2022-02-17 15:50 /hbase

hdfs dfs -ls /hbase shows the list of directories:

/hbase/MasterData
/hbase/WALs
/hbase/archive
/hbase/corrupt
/hbase/data
/hbase/hbase.id
/hbase/hbase.version
/hbase/mobdir
/hbase/oldWALs
/hbase/staging