Wednesday, July 1, 2015

Cloudera Quickstart VM 5.3 Apache Pig configuration

Introduction

Cloudera provides a pseudo-distributed, single-node environment for working with Apache Hadoop, called the Cloudera Quickstart VM. While most tools in the Hadoop ecosystem, such as Apache Sqoop and Apache Hive, work right out of the box, Apache Pig requires some additional configuration to work smoothly. This blog post describes those additional configuration steps.


Solution

1. Open a new Terminal.
2. su - root (Enter cloudera as the password)
3. cd /etc/pig/conf
4. Make a copy of the default log4j.properties file:

a. mv log4j.properties log4j.properties.orig
b. cp -p log4j.properties.orig log4j.properties

5. Make a copy of the default pig.properties file:

a. mv pig.properties pig.properties.orig
b. cp -p pig.properties.orig pig.properties

6. Edit log4j.properties as follows:

a. Replace log4j.logger.org.apache.pig=info, A with
 log4j.logger.org.apache.pig=error, A
b. Then add a new line: log4j.logger.org.apache.hadoop=error, A

7. Edit pig.properties as follows:

a. If the line log4jconf=./conf/log4j.properties is commented out, uncomment it (remove the #) and make sure the line starts with no leading spaces.

b. Replace the line starting with #clustername with quickstart.cloudera:50010

quickstart.cloudera:50010 is the Hadoop cluster name in the Quickstart VM. You can find this information by running the hdfs dfsadmin -report command.

8. chmod -R o+w /etc/pig/conf.dist (This lets a non-root user write the Pig log file to this directory)


9. cp -p /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar /usr/lib/hive/lib (An slf4j-api jar is expected in this directory when Pig starts)
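
For readers who prefer to script the edits in steps 6 and 7a rather than make them by hand, here is a minimal sketch. It assumes the two properties files contain the lines exactly as quoted above. The function name quiet_pig_logging is made up for this example and is not part of Pig or the Quickstart VM. It also keeps a .orig copy if one does not already exist, and it does not attempt step 7b.

```shell
#!/bin/sh
# Sketch only: quiet_pig_logging is a hypothetical helper, not a Pig or CDH command.
# Usage: quiet_pig_logging <directory containing log4j.properties and pig.properties>
quiet_pig_logging() {
    conf_dir=$1
    log4j="$conf_dir/log4j.properties"
    pigprops="$conf_dir/pig.properties"

    # Keep a copy of each default file, unless a .orig copy already exists.
    for f in "$log4j" "$pigprops"; do
        [ -e "$f.orig" ] || cp -p "$f" "$f.orig"
    done

    # Step 6a: lower the Pig logger from info to error.
    # Writing back with cat keeps the original file's owner and permissions.
    sed 's/^log4j\.logger\.org\.apache\.pig=info, A$/log4j.logger.org.apache.pig=error, A/' \
        "$log4j" > "$log4j.tmp" && cat "$log4j.tmp" > "$log4j" && rm -f "$log4j.tmp"

    # Step 6b: add the Hadoop logger line, but only if no such line is present yet.
    grep -q '^log4j\.logger\.org\.apache\.hadoop=' "$log4j" ||
        echo 'log4j.logger.org.apache.hadoop=error, A' >> "$log4j"

    # Step 7a: strip a leading "#" (and any surrounding blanks) from the log4jconf line.
    sed 's/^[[:space:]]*#[[:space:]]*\(log4jconf=\)/\1/' \
        "$pigprops" > "$pigprops.tmp" && cat "$pigprops.tmp" > "$pigprops" && rm -f "$pigprops.tmp"
}
```

On the VM this would be called as root with quiet_pig_logging /etc/pig/conf. It may be safer to try it first on a copy of the directory and compare the result with the .orig files using diff.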


Conclusion


The above steps will help avoid the following errors when Pig is run in interactive mode using the Grunt shell.

ls: cannot access /usr/lib/hive/lib/slf4j-api-*.jar: No such file or directory


WARN pig.Main: Cannot write to log file: /etc/pig/conf.dist/pig_1435724561990.log

ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1435707575650_0004' doesn't exist in RM.
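
The first two messages above come from conditions that are easy to check before starting Grunt. The following is a rough sketch of such a check. The function name check_pig_prereqs is made up for this example, and the check does not cover the third (YARN) error.

```shell
#!/bin/sh
# Sketch only: check_pig_prereqs is a hypothetical helper, not a Pig or CDH command.
# Usage: check_pig_prereqs <hive lib dir> <pig log dir>
check_pig_prereqs() {
    hive_lib=$1   # on the VM: /usr/lib/hive/lib
    log_dir=$2    # on the VM: /etc/pig/conf.dist
    status=0

    # "ls: cannot access .../slf4j-api-*.jar" means no file matches this pattern.
    if ls "$hive_lib"/slf4j-api-*.jar >/dev/null 2>&1; then
        echo "OK: slf4j-api jar found in $hive_lib"
    else
        echo "MISSING: no slf4j-api jar in $hive_lib (see step 9)"
        status=1
    fi

    # "Cannot write to log file" means the current user cannot write to this directory.
    if [ -d "$log_dir" ] && [ -w "$log_dir" ]; then
        echo "OK: $log_dir is writable by $(id -un)"
    else
        echo "NOT WRITABLE: $log_dir (see step 8)"
        status=1
    fi

    return "$status"
}
```

Run it as the user who starts Pig (not as root, for whom the write check always passes), for example check_pig_prereqs /usr/lib/hive/lib /etc/pig/conf.dist.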
