HBase can be installed in three modes: standalone, pseudo-distributed and distributed. Each mode has its own uses, advantages and disadvantages, and slightly different install steps as a result. This article will guide you through installing HBase in standalone mode on Ubuntu 18.04.
So why use standalone mode? Standalone mode is not recommended for production use, but for development or non-important data it can be ideal. It stores data on the local file system instead of Hadoop’s HDFS, so it is quick and easy to get running and you only need one server. Let’s begin!
The first thing we are going to do is install OpenJDK 8. At the time of writing this is the recommended OpenJDK version for HBase 2.1+; if this changes, you can find the supported versions in the Apache HBase Reference Guide. SSH into your Ubuntu box and install OpenJDK 8:
sudo apt install openjdk-8-jdk
Next we are going to make a directory for our HBase data. By default, HBase in standalone mode stores its data in a temporary directory that is wiped on reboot, which is not what we want.
sudo mkdir -p /var/hbase
Now we can get on with installing HBase. Head over to the Apache download mirrors site and click the recommended mirror. Once there, find the latest HBase version and locate the file ending in -bin.tar.gz. Then use wget to download it:
wget https://www.mirrorservice.org/sites/ftp.apache.org/hbase/2.2.4/hbase-2.2.4-bin.tar.gz
Now extract the downloaded file:
tar xzvf hbase-2.2.4-bin.tar.gz
Then move into the extracted directory:
cd hbase-2.2.4
We now need to do a little bit of configuration. Let’s start by setting the JAVA_HOME variable. You’re going to need to know your Java installation directory; for me this is /usr/lib/jvm/java-8-openjdk-amd64/jre, but it may be different for you. Once you have located it, edit conf/hbase-env.sh:
sudo nano conf/hbase-env.sh
Set JAVA_HOME to your Java installation directory as shown, then save and exit:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre
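If you are not sure where Java is installed, the sketch below is one way to work it out. It assumes that java is on your PATH and that the binary sits in a bin/ directory directly under the installation directory, which is the usual OpenJDK layout. The helper name java_home_from_bin is made up for this example.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path of a java binary.
java_home_from_bin() {
  # Strip the trailing /bin/java component from the path
  printf '%s\n' "${1%/bin/java}"
}

# If java is on the PATH, resolve the real binary behind any symlinks
# (for example /etc/alternatives) and print the directory to use as JAVA_HOME.
if command -v java >/dev/null 2>&1; then
  java_home_from_bin "$(readlink -f "$(command -v java)")"
fi
```

Use the printed directory as the value of JAVA_HOME in conf/hbase-env.sh.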
Now we need to tell HBase about the directory we created earlier, so let’s also edit conf/hbase-site.xml:
sudo nano conf/hbase-site.xml
Set hbase.tmp.dir to the directory we created earlier. Your configuration should look something like this:
<configuration>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/var/hbase</value>
  </property>
</configuration>
Save and exit.
That’s pretty much it when it comes to installing HBase. We should now be able to start HBase by running the start-hbase script. Run it as root, otherwise HBase can’t access /var/hbase:
sudo ./bin/start-hbase.sh
After a few minutes HBase should have started. You can test this by connecting to it using the HBase shell:
./bin/hbase shell
If everything has gone to plan, after a few seconds you will be at the HBase shell prompt.
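The same check can be scripted rather than typed into the prompt. The sketch below pipes the shell’s status command into hbase shell running in non-interactive mode (-n). The helper name hbase_status is made up for this example, and it assumes you pass it the extracted HBase directory, here the current directory.

```shell
# Hypothetical helper: ask a running HBase for its status without opening the prompt.
# $1 is the extracted HBase directory (assumed here to be the current directory).
hbase_status() {
  hbase_dir="$1"
  if [ ! -x "${hbase_dir}/bin/hbase" ]; then
    echo "no hbase launcher found in ${hbase_dir}" >&2
    return 1
  fi
  # -n runs the HBase shell non-interactively, so it exits once stdin is consumed
  echo "status" | "${hbase_dir}/bin/hbase" shell -n
}

hbase_status "$PWD" || echo "HBase does not appear to be reachable from here"
```

If HBase is up, the status command reports the number of active and dead servers.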
Well done so far! The only thing left is to configure HBase to start on system startup. Our current setup requires the start-hbase.sh script to be run manually, which is not ideal. We are going to use Supervisor to run our start script automatically when the system starts. Install Supervisor using the following command:
sudo apt-get install supervisor
Now we need to make a conf file so Supervisor knows about HBase and can start it for us.
sudo nano /etc/supervisor/conf.d/hbase.conf
Enter the following configuration. Be aware that the hbase-x.x.x directory may be in a different location for you, so make sure you enter the correct path.
[program:hbase]
command=bash -c "/home/tom/hbase-2.2.4/bin/start-hbase.sh"
priority=100
stdout_logfile=/var/log/hbase.out.log
stderr_logfile=/var/log/hbase.err.log
autostart=true
autorestart=true
Save and exit as usual, then run the following commands to tell Supervisor that there is a new configuration and to run it:
sudo supervisorctl reread
sudo supervisorctl update
You should now be able to access the HBase web UI! If you aren’t sure what port it’s running on, you can cd into the HBase logs directory and run the following command to show the ports HBase is listening on:
grep 'Jetty' *
In my case the UI can be accessed on port 16030. If you can’t access the web UI, you can tail the HBase error log for hints as to what’s going wrong:
tail /var/log/hbase.err.log
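If you would rather probe the ports directly than read the logs, something like the sketch below can help. It assumes curl is installed and that HBase is using its default info ports, 16010 for the master UI and 16030 for the region server UI; your ports may differ. The helper name probe_ports is made up for this example.

```shell
# Hypothetical helper: report which of the given ports answer HTTP on localhost.
probe_ports() {
  for port in "$@"; do
    # -s silences progress output, -o /dev/null discards the page body,
    # --max-time stops curl from hanging on an unresponsive port
    if curl -s -o /dev/null --max-time 2 "http://localhost:${port}/"; then
      echo "port ${port} is answering"
    else
      echo "port ${port} is not answering"
    fi
  done
}

# 16010 and 16030 are the default master and region server info ports
probe_ports 16010 16030
```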
Remember that standalone is not recommended for production environments and you risk data loss by running it in this configuration. Thanks!