Introduction
Cloudera provides a pseudo-distributed node for working with Apache Hadoop, called the Cloudera Quickstart VM. While most tools in the Hadoop ecosystem, such as Apache Sqoop and Apache Hive, work right out of the box, Apache Pig requires some additional configuration to work smoothly. This blog post describes those additional configuration steps.
Solution
1. Open a new Terminal.
2. su - root (Enter cloudera as the password)
3. cd /etc/pig/conf
4. Back up log4j.properties:
a. mv log4j.properties log4j.properties.orig (Let us make a copy of the default file)
b. cp -p log4j.properties.orig log4j.properties
5. Back up pig.properties:
a. mv pig.properties pig.properties.orig (Let us make a copy of the default file)
b. cp -p pig.properties.orig pig.properties
6. Edit log4j.properties as follows:
a. Replace log4j.logger.org.apache.pig=info, A with the line below:
log4j.logger.org.apache.pig=error, A
b. Then add the following as a new line: log4j.logger.org.apache.hadoop=error, A
7. Edit pig.properties as follows:
a. Uncomment (remove the #) the line log4jconf=./conf/log4j.properties if it is commented out, and make sure the line starts with no leading spaces.
b. Replace the line starting with #clustername with quickstart.cloudera:50010
quickstart.cloudera:50010 is the Hadoop cluster name in the Quickstart VM. You can find this information by running the hdfs dfsadmin -report command.
8. chmod -R o+w /etc/pig/conf.dist
9. cp -p /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar /usr/lib/hive/lib
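The file edits in steps 4 to 7a can also be scripted. The following is a minimal sketch, not a tested installer: the function names apply_pig_conf_edits and edit_in_place are hypothetical, and it assumes the two properties files contain the lines exactly as quoted in the steps above. Step 7b is deliberately left as a manual edit, and steps 8 and 9 are still run as listed.

```shell
#!/bin/sh
# Sketch only: applies steps 4 to 7a to a Pig configuration directory.
# Function names are hypothetical; step 7b is intentionally not automated.

# Rewrite FILE through sed without relying on the non-portable "sed -i".
# Writing back with cat keeps the file's existing owner and permissions.
edit_in_place() {  # usage: edit_in_place SED_EXPRESSION FILE
    tmp="$(mktemp)" && sed "$1" "$2" > "$tmp" && cat "$tmp" > "$2"
    rm -f "$tmp"
}

apply_pig_conf_edits() {  # usage: apply_pig_conf_edits CONF_DIR
    dir="$1"

    # Steps 4 and 5: keep the defaults as *.orig, never overwriting an
    # existing backup on a second run.
    for f in log4j.properties pig.properties; do
        [ -f "$dir/$f.orig" ] || cp -p "$dir/$f" "$dir/$f.orig"
    done

    # Step 6a: lower the Pig logger from info to error.
    edit_in_place 's/^\(log4j\.logger\.org\.apache\.pig=\)info, A/\1error, A/' \
        "$dir/log4j.properties"

    # Step 6b: add the Hadoop logger line only if no such logger is set yet.
    grep -q '^log4j\.logger\.org\.apache\.hadoop=' "$dir/log4j.properties" ||
        echo 'log4j.logger.org.apache.hadoop=error, A' >> "$dir/log4j.properties"

    # Step 7a: remove the leading # and any leading blanks before log4jconf.
    edit_in_place 's/^[[:space:]]*#*[[:space:]]*log4jconf=/log4jconf=/' \
        "$dir/pig.properties"
}
```

To try it, first call apply_pig_conf_edits on a scratch copy of the directory and inspect the result, then run apply_pig_conf_edits /etc/pig/conf as root. The function is written so that running it a second time does not duplicate the added line or overwrite the .orig backups.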
Conclusion
The steps above help avoid the following errors when Pig is run in interactive mode using the Grunt shell:
WARN pig.Main: Cannot write to log file: /etc/pig/conf.dist/pig_1435724561990.log
ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1435707575650_0004' doesn't exist in RM.
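If the log file warning still appears, it can help to confirm that the settings are actually in place before starting Grunt again. The check below is a rough sketch under stated assumptions: check_pig_conf is a hypothetical helper name, and it only looks for the exact lines given in steps 6 and 7a plus write access to the directory named in the warning. It says nothing about whether Pig itself starts cleanly.

```shell
#!/bin/sh
# Sketch only: reports which of the settings from steps 6, 7a and 8 are missing.
# check_pig_conf is a hypothetical helper, not part of Pig or the Quickstart VM.
check_pig_conf() {  # usage: check_pig_conf CONF_DIR LOG_DIR
    conf_dir="$1"   # directory holding log4j.properties and pig.properties
    log_dir="$2"    # directory where Grunt tries to write pig_*.log
    status=0

    # Step 6a: the Pig logger should be at error level.
    grep -qxF 'log4j.logger.org.apache.pig=error, A' "$conf_dir/log4j.properties" ||
        { echo "step 6a: Pig logger is not set to error"; status=1; }

    # Step 6b: the Hadoop logger line should be present.
    grep -qxF 'log4j.logger.org.apache.hadoop=error, A' "$conf_dir/log4j.properties" ||
        { echo "step 6b: Hadoop logger line is missing"; status=1; }

    # Step 7a: log4jconf should be uncommented with no leading blanks.
    grep -qxF 'log4jconf=./conf/log4j.properties' "$conf_dir/pig.properties" ||
        { echo "step 7a: log4jconf is commented out or indented"; status=1; }

    # Step 8: the log directory should be writable by the user running this check.
    [ -w "$log_dir" ] ||
        { echo "step 8: $log_dir is not writable by the current user"; status=1; }

    return "$status"
}
```

For example, running check_pig_conf /etc/pig/conf /etc/pig/conf.dist as the cloudera user (not root, which can write everywhere) prints one line per missing setting and returns a non-zero status if anything is missing.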