The HADOOP procedure enables you to submit HDFS commands, MapReduce
programs, and Pig language code against Hadoop data. To connect to the
Hadoop server, you must supply a Hadoop configuration file that
specifies the NameNode and JobTracker addresses for that server. In
the configuration file, fs.default.name is the URI (protocol
specifier, host name, and port) that describes the NameNode for the
cluster, and mapred.job.tracker is the host name and port of the
JobTracker. Here is an example of a Hadoop configuration file:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://xxx.us.company.com:8020</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>xxx.us.company.com:8021</value>
  </property>
</configuration>
The PROC HADOOP statement supports several options that control access
to the Hadoop server. For example, you can identify the Hadoop
configuration file and specify the user ID and password on the Hadoop
server. You can specify the server options on all PROC HADOOP
statements. For the list of server options, see Hadoop Server Options.
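
As a minimal sketch of how these server options might appear on the
PROC HADOOP statement, the following step assigns a fileref to the
configuration file and passes it, along with credentials, to the
procedure. The file path, the fileref name cfg, the user ID, the
password, and the HDFS directory are all hypothetical placeholders.
Check the option names (OPTIONS=, USERNAME=, PASSWORD=, VERBOSE)
against the Hadoop Server Options list for your SAS release.

```sas
/* Hypothetical path: point the fileref at your own Hadoop configuration file. */
filename cfg "C:\Users\sasabc\hadoop\sample_config.xml";

/* OPTIONS= identifies the configuration file. USERNAME= and PASSWORD=    */
/* supply placeholder credentials for the Hadoop server. VERBOSE requests */
/* additional messages in the SAS log.                                    */
proc hadoop options=cfg username="sasabc" password="sasabc" verbose;
   /* Create a directory in HDFS (hypothetical path). */
   hdfs mkdir="/user/sasabc/new_directory";
run;
```

Assigning the configuration file to a fileref once lets later PROC
HADOOP steps in the same session reuse it without repeating the path.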