HADOOP Procedure

MAPREDUCE Statement

Submits MapReduce programs to a Hadoop cluster.

Example: Submitting a MapReduce Program

Syntax

MAPREDUCE <hadoop-server-options> <mapreduce-options>;

MapReduce Options

COMBINE=class-name: specifies the name of the combiner class in dot notation.

DELETERESULTS: deletes the MapReduce results.

GROUPCOMPARE=class-name: specifies the name of the grouping comparator (GroupComparator) class in dot notation.

INPUT=HDFS-path: specifies the HDFS path to the MapReduce input file.

INPUTFORMAT=class-name: specifies the name of the input format class in dot notation.

JAR='external-file(s)': specifies the locations of the JAR files that contain the MapReduce program and named classes. Include the complete pathname and the filename. Enclose each location in single or double quotation marks.

MAP=class-name: specifies the name of the map class in dot notation. A map class contains elements that are formed by the combination of a key value and a mapped value.

OUTPUT=HDFS-path: specifies a new HDFS path for the MapReduce output.

OUTPUTFORMAT=class-name: specifies the name of the output format class in dot notation.

OUTPUTKEY=class-name: specifies the name of the output key class in dot notation.

OUTPUTVALUE=class-name: specifies the name of the output value class in dot notation.

PARTITIONER=class-name: specifies the name of the partitioner class in dot notation. A partitioner class controls the partitioning of the keys of the intermediate map-outputs.

REDUCE=class-name: specifies the name of the reducer class in dot notation. The reduce class reduces a set of intermediate values that share a key to a smaller set of values.

REDUCETASKS=integer: specifies the number of reduce tasks.

SORTCOMPARE=class-name: specifies the name of the sort comparator class in dot notation.

WORKINGDIR=HDFS-path: specifies the HDFS path of the working directory.
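The options above might be combined as in the following minimal sketch of a word-count job. The configuration file, user name, password, HDFS paths, and JAR location are hypothetical placeholders. The class names assume the WordCount example that is distributed with Apache Hadoop. The PROC HADOOP statement options shown (OPTIONS=, USERNAME=, PASSWORD=, VERBOSE) are assumptions that belong to the PROC HADOOP statement rather than to this section and might differ by SAS release.

```sas
/* Hypothetical Hadoop configuration file; replace with your own. */
filename cfg 'C:\Users\myuser\hadoop\sample_config.xml';

/* Connection options are assumptions and depend on your site. */
proc hadoop options=cfg username="myuser" password="mypassword" verbose;
   /* INPUT= names an existing HDFS file; OUTPUT= names a new HDFS path. */
   mapreduce input='/user/myuser/input.txt'
      output='/user/myuser/wordcount_out'
      /* JAR that contains the map, combine, and reduce classes. */
      jar='C:\Users\myuser\hadoop\jars\WordCount.jar'
      /* Class names are given in dot notation. */
      outputkey="org.apache.hadoop.io.Text"
      outputvalue="org.apache.hadoop.io.IntWritable"
      map="org.apache.hadoop.examples.WordCount$TokenizerMapper"
      combine="org.apache.hadoop.examples.WordCount$IntSumReducer"
      reduce="org.apache.hadoop.examples.WordCount$IntSumReducer";
run;
```

In this sketch the same class serves as both combiner and reducer, which is common for word counts because summing partial counts is associative. Because OUTPUT= specifies a new path, the job would likely fail if the output path already exists.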

Copyright © SAS Institute Inc. All rights reserved.