

NexR RHive 2.0

RHive is an R extension that facilitates distributed computing via Hive queries. It makes it easy to use HQL (Hive SQL) from R, and to use R objects and R functions in Hive.

Before installing RHive, you must have Hadoop and Hive installed.

Install Hadoop

  1. Single Node
  2. Cluster Node
  3. Set HADOOP_HOME on the local machine on which R runs.
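
As a minimal sketch of step 3, the snippet below checks that a `*_HOME` variable is set and points at an existing directory before R is started. The helper name `check_home` is hypothetical and not part of Hadoop or RHive.

```shell
#!/bin/sh
# check_home NAME VALUE: report whether VALUE is a usable directory.
# (check_home is an illustrative helper, not a Hadoop or RHive command.)
check_home() {
    name="$1"
    value="$2"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory"
        return 1
    fi
    echo "$name=$value looks usable"
}

# The path is a placeholder; replace it with your own installation.
# export HADOOP_HOME=/path/to/your/hadoop/directory
check_home HADOOP_HOME "${HADOOP_HOME:-}" || true
```

The same helper can be reused for HIVE_HOME once Hive is installed.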

Install Hive

  1. Install Hive on the local machine and on the remote machine on which the NameNode or the Hive server runs.
  2. Installation Guide
  3. Set HIVE_HOME on the local machine on which R runs.
  4. Launch the Hive server on the remote machine with the following command. It should run as a background process.
    • $HIVE_HOME/bin/hive --service hiveserver
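
One possible way to run the command above as a background process is `nohup` with `&`, sketched below. The function name `start_hiveserver` and the variables `LOG_FILE` and `PID_FILE` are assumptions made for illustration; they are not Hive or RHive settings.

```shell
#!/bin/sh
# Illustrative defaults; neither file name is required by Hive.
LOG_FILE="${LOG_FILE:-hiveserver.log}"
PID_FILE="${PID_FILE:-hiveserver.pid}"

# start_hiveserver: launch the Hive server detached from the terminal.
start_hiveserver() {
    hive_bin="$HIVE_HOME/bin/hive"
    if [ ! -x "$hive_bin" ]; then
        echo "hive launcher not found at $hive_bin" >&2
        return 1
    fi
    # nohup keeps the server alive after logout; & puts it in the background.
    nohup "$hive_bin" --service hiveserver >"$LOG_FILE" 2>&1 &
    # Record the PID so the server can be stopped later with kill.
    echo $! >"$PID_FILE"
    echo "started hiveserver with PID $(cat "$PID_FILE")"
}
```

Call `start_hiveserver` on the remote machine after HIVE_HOME is set. The server output goes to the log file rather than the terminal.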

Install R and Packages

  1. Install R
    • R must be installed on all tasktracker nodes.
  2. Install rJava
    • rJava only needs to be installed on the local machine.
  3. Install Rserve
    • Rserve must be installed on all tasktracker nodes.
    • On every tasktracker node, edit the configuration file /etc/Rserv.conf and add 'remote enable' to allow remote connections.
    • Launch Rserve on every tasktracker node.
      • e.g. R CMD Rserve
  4. Configure the tasktracker nodes
    • Add the R_HOME path to $HADOOP_HOME/conf/hadoop-env.sh
      • e.g. export R_HOME=/usr/lib/R
  5. Install RUnit
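
The two file edits in steps 3 and 4 have to be repeated on every tasktracker node, so it can help to make them idempotent. The sketch below assumes a hypothetical helper named `append_once`, which is not part of Rserve or Hadoop. It appends a line to a file only if that exact line is not already present.

```shell
#!/bin/sh
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# (append_once is an illustrative helper, not an Rserve or Hadoop command.)
append_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -x matches whole lines only, -F treats LINE as a fixed string.
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >>"$file"
    fi
}

# Typical use on each tasktracker node (writing to /etc usually needs root):
#   append_once /etc/Rserv.conf 'remote enable'
#   append_once "$HADOOP_HOME/conf/hadoop-env.sh" 'export R_HOME=/usr/lib/R'
#   R CMD Rserve
```

Running the helper twice leaves the file unchanged, so the same script can be pushed to all nodes without checking their current state first.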

Install RHive

  1. Requirements
    • ant (in order to build java files)
  2. Installing RHive
    1. Download source code: git clone https://github.com/nexr/RHive.git
    2. Change your working directory: cd RHive
    3. Set the environment variables HIVE_HOME and HADOOP_HOME:
      • export HIVE_HOME=/path/to/your/hive/directory
      • export HADOOP_HOME=/path/to/your/hadoop/directory
    4. Build the Java files using ant: ant build
    5. Build RHive: R CMD build RHive
    6. Install RHive: R CMD INSTALL RHive_.tar.gz
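
The build steps above can be collected into a small script. This is a sketch only: the function names `run` and `build_rhive` and the `DRY_RUN` switch are assumptions introduced here, and the final install step is left as a comment because the tarball name depends on the version that was built.

```shell
#!/bin/sh
# run CMD...: print the command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# build_rhive: steps 1 to 5 of the list above, stopping at the first failure.
build_rhive() {
    # Abort with a message if either variable is unset or empty.
    : "${HIVE_HOME:?set HIVE_HOME first}"
    : "${HADOOP_HOME:?set HADOOP_HOME first}"
    export HIVE_HOME HADOOP_HOME
    run git clone https://github.com/nexr/RHive.git &&
    run cd RHive &&
    run ant build &&
    run R CMD build RHive
    # Step 6 (R CMD INSTALL on the generated tarball) is run by hand.
}
```

Setting `DRY_RUN=1` before calling `build_rhive` prints the commands without executing them, which is a cheap way to review the sequence on a machine that does not have ant or R yet.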

Loading RHive and connecting to Hive

  1. Set the environment variables HIVE_HOME, HADOOP_HOME and HADOOP_CONF_DIR in one of two ways:
    • Export them in the shell:
      • export HIVE_HOME=/path/to/your/hive/directory
      • export HADOOP_HOME=/path/to/your/hadoop/directory
      • export HADOOP_CONF_DIR=/path/to/your/hadoop/conf/directory
    • Or add them to Renviron:
      • HIVE_HOME=/path/to/your/hive/directory
      • HADOOP_HOME=/path/to/your/hadoop/directory
      • HADOOP_CONF_DIR=/path/to/your/hadoop/conf/directory
  2. Launch R, then load RHive and connect to Hive:
    library(RHive)
    rhive.connect(host, port, hiveServer2)

Tutorials

Requirements

  • Java 1.6
  • R 2.13.0
  • Rserve 0.6-0
  • rJava 0.9-0
  • Hadoop 0.20.x (x >= 1)
  • Hive 0.8.x (x >= 0)
