Apache Hive is a data warehousing solution for Hadoop that provides data summarization, querying, and ad-hoc analysis. It is used to process structured and semi-structured data in Hadoop.

Installation with Derby database:

PREREQUISITES:

  • Java 7 or 8 installed
  • Dedicated user for Hadoop (not mandatory)
  • SSH configured

1. Download the tarball file apache-hive-2.1.1-bin.tar.gz
2. Extract the file to the /usr/local/ path
3. Download the Derby file db-derby-10.9.1.0-bin.tar.gz
4. Extract the file to /usr/local/
5. Add the home paths to the .bashrc file and reload the .bashrc file
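
As a rough sketch of step 5, the lines added to .bashrc might look like the following. The directory names are assumptions: they are the default folder names you get when both tarballs are extracted directly under /usr/local/, so adjust them if your layout differs.

```shell
# Assumed install locations: the default folder names produced by
# extracting the two tarballs into /usr/local/.
export HIVE_HOME=/usr/local/apache-hive-2.1.1-bin
export DERBY_HOME=/usr/local/db-derby-10.9.1.0-bin

# Make the Hive and Derby command-line tools available from any directory.
export PATH="$PATH:$HIVE_HOME/bin:$DERBY_HOME/bin"
```

After saving the file, reload it in the current session with source ~/.bashrc.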

1. To configure Hive with Hadoop, edit the hive-env.sh file, which is located in the $HIVE_HOME/conf directory.

Edit the hive-env.sh file by appending the following line:
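
The line itself is not reproduced here. A common choice, assumed rather than taken from this article, is an export of HADOOP_HOME pointing at the Hadoop installation. The helper name configure_hive_env and the path /usr/local/hadoop in the usage note are hypothetical. The sketch also copies hive-env.sh.template to hive-env.sh first if hive-env.sh does not exist yet.

```shell
# Sketch: create hive-env.sh from its template if needed, then append
# an export that tells Hive where Hadoop lives.
# $1 = Hive conf directory (typically $HIVE_HOME/conf)
# $2 = Hadoop installation directory (an assumption; use your own path)
configure_hive_env() {
  if [ ! -f "$1/hive-env.sh" ] && [ -f "$1/hive-env.sh.template" ]; then
    cp "$1/hive-env.sh.template" "$1/hive-env.sh"
  fi
  printf 'export HADOOP_HOME=%s\n' "$2" >> "$1/hive-env.sh"
}
```

For example: configure_hive_env "$HIVE_HOME/conf" /usr/local/hadoop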

2. Go to the conf directory under the apache-hive-2.1.1-bin folder and rename hive-default.xml.template to hive-site.xml

Replace the following values in hive-site.xml

 

With these values

Hive installation is completed successfully.

Now you need an external database server to configure the Metastore. We use the Apache Derby database.

Configuring Hive with Derby database

1. Create a directory to store the Metastore
Create a directory named data in the $DERBY_HOME directory to store Metastore data.
mkdir $DERBY_HOME/data

2. Configuring the Metastore of Hive
Configuring the Metastore means telling Hive where the database is stored. You can do this by editing the hive-site.xml file, which is in the $HIVE_HOME/conf directory. First of all, copy the template file using the following command:
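
The copy command is not shown here. Based on the template name mentioned earlier (hive-default.xml.template), it can be sketched as below. The helper name copy_hive_site_template is hypothetical and is used only so the conf directory can be passed in.

```shell
# Sketch: copy the shipped template to hive-site.xml in the given conf directory.
# $1 = Hive conf directory (typically $HIVE_HOME/conf)
copy_hive_site_template() {
  cp "$1/hive-default.xml.template" "$1/hive-site.xml"
}
```

For example: copy_hive_site_template "$HIVE_HOME/conf"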

Edit hive-site.xml and append the following lines between the <configuration> and </configuration> tags:

 

Here we are creating the metastore_db Derby database.
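
The property lines are not reproduced here. As an assumption, the sketch below prints a javax.jdo.option.ConnectionURL property that uses Hive's embedded-Derby default value, which creates a database named metastore_db. If Derby runs as a network server in your setup, the URL will differ. The variable name METASTORE_PROPERTY is hypothetical.

```shell
# Assumed property: Hive's default embedded-Derby connection string.
# Paste the printed lines between <configuration> and </configuration>.
METASTORE_PROPERTY='<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>'

printf '%s\n' "$METASTORE_PROPERTY"
```

With the embedded form, Derby creates metastore_db relative to the directory from which Hive is started, so start Hive from the same directory each time.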
3. Create a file named jpox.properties and add the following lines to it:

4. Verifying Hive installation

Before running Hive, you need to create the /tmp folder and a separate Hive folder in HDFS. Here, we use the /user/hive/warehouse folder. You need to set write permission for these newly created folders as shown below:
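
The folder setup described above can be sketched as follows, assuming the hadoop command is on your PATH and HDFS is running. The helper name prepare_hive_hdfs_dirs is hypothetical.

```shell
# Sketch: create the HDFS folders Hive expects and make them group-writable.
prepare_hive_hdfs_dirs() {
  hadoop fs -mkdir -p /tmp                  # scratch space
  hadoop fs -mkdir -p /user/hive/warehouse  # Hive warehouse folder
  hadoop fs -chmod g+w /tmp
  hadoop fs -chmod g+w /user/hive/warehouse
}
```

Run prepare_hive_hdfs_dirs once as the user that owns the Hadoop installation.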

5. The following commands are used to verify the Hive installation:

You can check whether the process is running.
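
The commands themselves are not shown here. One plausible sequence for Hive 2.x with Derby, offered as an assumption rather than as this article's exact steps, is to initialize the metastore schema with schematool, start the metastore service, and list the Java processes with jps. The helper name verify_hive_metastore is hypothetical, and the sketch assumes $HIVE_HOME/bin and the JDK tools are on your PATH.

```shell
# Sketch only: initialize the schema, start the metastore, list Java processes.
verify_hive_metastore() {
  # One-time schema creation in Derby; this fails if the schema already
  # exists, so skip it on later runs.
  schematool -dbType derby -initSchema || return 1
  # Start the metastore service in the background.
  hive --service metastore &
  # Give the JVM a moment to start, then list Java processes.
  # Hive services normally appear as RunJar entries.
  sleep 5
  jps
}
```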

6. To enter the Hive shell

$ bin/hive

On successful installation of Hive, you get to see the following response:

Using Beeline to connect to HiveServer2

Prerequisites:
Run either the Hive shell or HiveServer2, not both; we can connect to only one of them at a time.
Kill the Hive metastore process if it is running. To connect to HiveServer2:

Check the process

Connect to Beeline

After a successful connection, connect to hive2 with the following command

Check the databases and tables
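
The Beeline steps above can be sketched non-interactively as follows. This is a sketch under assumptions: HiveServer2 listens on localhost at its default port 10000, and hadoop is the OS user that runs Hadoop (substitute your own). The helper names are hypothetical. The -u option does the same job as typing !connect jdbc:hive2://localhost:10000 at the Beeline prompt.

```shell
# Start HiveServer2 in the background.
start_hiveserver2() {
  hive --service hiveserver2 &
}

# Connect through Beeline and list databases and tables.
# "hadoop" is an assumed user name; replace it with yours.
beeline_show_databases() {
  beeline -u jdbc:hive2://localhost:10000 -n hadoop \
    -e 'show databases;' -e 'show tables;'
}
```

Call start_hiveserver2 first, wait until the server has finished starting, then call beeline_show_databases.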

How to connect to Hive through JDBC

Add the proxy user settings to core-site.xml in Hadoop as follows, then restart Hadoop and Hive:

<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>

 

 
