Pig Installation
Before starting the installation, switch to the 'hduser' account (the user that was used for the Hadoop configuration).
1) Download the latest stable release of Pig (this tutorial uses version 0.12.1) from any one of the mirror sites listed at
http://pig.apache.org/releases.html
Select the tar.gz file (not the src.tar.gz file) to download.
2) Once the download is complete, navigate to the directory containing the downloaded tar file and move the file to the location where you want to set up Pig. In this case we will move it to /usr/local.
Move to the directory containing the Pig files
cd /usr/local
Extract the contents of the tar file as shown below
sudo tar -xvf pig-0.12.1.tar.gz
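The extraction step can also be wrapped in a small function that checks the tar file is actually present before unpacking it. This is only a sketch: the function name extract_pig is hypothetical, and the paths in the example call are assumptions based on the file name used in this tutorial.

```shell
# Hypothetical helper: extract a Pig release tarball into a target directory.
# Usage: extract_pig TARBALL DESTINATION
extract_pig() {
    local tarball="$1"
    local dest="$2"
    # Stop early with a clear message if the download is missing.
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # -z selects gzip, -C selects the directory to extract into.
    tar -xzf "$tarball" -C "$dest"
}

# Example call (assumed paths; writing to /usr/local usually needs root):
# extract_pig /usr/local/pig-0.12.1.tar.gz /usr/local
```

Passing the destination explicitly means the function can be tried in a scratch directory first, before touching /usr/local.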
3) Modify ~/.bashrc to add the Pig-related environment variables
Open the ~/.bashrc file in a text editor of your choice and add the following lines:
export PIG_HOME=<Installation directory of Pig>
export PATH=$PIG_HOME/bin:$HADOOP_HOME/bin:$PATH
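If you prefer not to edit the file by hand, the exports from step 3 can be appended from the shell. The sketch below is one possible approach, not part of the original procedure: the function name add_pig_env is hypothetical, and the install path in the example call is an assumption derived from the tar file name. It skips the append when a PIG_HOME export already exists, so running it twice does not duplicate the lines.

```shell
# Hypothetical helper: append the Pig variables to an rc file only once.
# Usage: add_pig_env RC_FILE PIG_INSTALL_DIR
add_pig_env() {
    local rc_file="$1"
    local pig_home="$2"
    # Do nothing if a PIG_HOME export is already present in the file.
    if grep -q '^export PIG_HOME=' "$rc_file" 2>/dev/null; then
        return 0
    fi
    {
        echo "export PIG_HOME=$pig_home"
        # Single quotes keep the variables unexpanded until the file is sourced.
        echo 'export PATH=$PIG_HOME/bin:$HADOOP_HOME/bin:$PATH'
    } >> "$rc_file"
}

# Example call (assumed install path):
# add_pig_env "$HOME/.bashrc" /usr/local/pig-0.12.1
```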
4) Now source this environment configuration using the command below
. ~/.bashrc
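After sourcing the file, it can be worth confirming that the variables took effect in the current shell. The following is a minimal sketch with a hypothetical function name, check_pig_env; it only inspects PIG_HOME and PATH and does not run Pig itself.

```shell
# Hypothetical helper: verify PIG_HOME is set and its bin directory is on PATH.
check_pig_env() {
    if [ -z "${PIG_HOME:-}" ]; then
        echo "PIG_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$PIG_HOME" ]; then
        echo "PIG_HOME is not a directory: $PIG_HOME" >&2
        return 1
    fi
    # Wrap PATH in colons so the entry matches at the start or end too.
    case ":$PATH:" in
        *":$PIG_HOME/bin:"*) return 0 ;;
        *) echo "$PIG_HOME/bin is not on PATH" >&2; return 1 ;;
    esac
}

# Example call:
# check_pig_env && echo "Pig environment looks fine"
```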
5) Pig needs to be recompiled to support Hadoop 2.2.0. To do so, follow the steps below.
Go to the Pig home directory
cd $PIG_HOME
Install ant
sudo apt-get install ant
Note:
The download will start, and how long it takes depends on your internet speed.
Recompile Pig
sudo ant clean jar-all -Dhadoopversion=23
Note:
Multiple components are downloaded during this recompilation, so the system must be connected to the internet.
Also, if the process gets stuck and you do not see any activity on the command prompt for more than 20 minutes, press Ctrl + C and rerun the same command.
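The "interrupt and rerun" advice can be roughly automated with the timeout command from GNU coreutils. The sketch below is an approximation under stated assumptions: the function name run_with_retry is hypothetical, and timeout limits the total run time of the command rather than detecting 20 minutes without output, so the limit should be set generously. The 60m value in the example call is an assumption, not a figure from this tutorial.

```shell
# Hypothetical helper: run a command under a time limit, retrying on failure.
# Usage: run_with_retry MAX_ATTEMPTS TIME_LIMIT COMMAND [ARGS...]
run_with_retry() {
    local max_attempts="$1"
    local limit="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "attempt $attempt of $max_attempts: $*" >&2
        # timeout stops the command once the limit is reached (exit status 124).
        if timeout "$limit" "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example call (assumed limit of 60 minutes per attempt, 3 attempts):
# run_with_retry 3 60m sudo ant clean jar-all -Dhadoopversion=23
```

Note that the function retries on any non-zero exit, including a genuine build error, so check the ant output if every attempt fails.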
6) Test the Pig installation using the command
pig -help
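If the command above reports that pig cannot be found, the usual cause is that the PATH change from step 3 is not active in the current shell. A small guard like the one below gives a clearer message in that case. It is a sketch with a hypothetical function name, require_command, and works for any command name.

```shell
# Hypothetical helper: succeed only if the named command resolves on PATH.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 was not found on PATH" >&2
    return 1
}

# Example call:
# require_command pig && pig -help
```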