Hadoop - Unit Testing MapReduce Jobs with MRUnit
We have already seen how to use JUnit to write unit tests for your Java classes. There is a specialized test library for MapReduce jobs, known as MRUnit. In this post we will see how to test the wordcount example using MRUnit.
You need to include the following dependencies in the pom.xml:
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.4</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>1.1.0</version>
    <classifier>hadoop2</classifier>
    <scope>test</scope>
</dependency>
MRUnit provides a driver for testing a mapper, a driver for testing a reducer, and a driver for testing both together:
MapDriver
ReduceDriver
MapReduceDriver
Let's see how to test the mapper class.
Testing Mapper
We provide a record as input and specify the output expected from the map. The expected records must match the actual output, in the same order, for the test to pass.
@Test
public void mapperBreakesTheRecord() throws IOException {
    // One input line in; one (word, 1) pair expected per token, in input order.
    new MapDriver<LongWritable,Text,Text,IntWritable>()
        .withMapper(new WcMapper())
        .withInput(new LongWritable(0), new Text("msg1 msg2 msg1"))
        .withAllOutput(Arrays.asList(
            new Pair<Text,IntWritable>(new Text("msg1"), new IntWritable(1)),
            new Pair<Text,IntWritable>(new Text("msg2"), new IntWritable(1)),
            new Pair<Text,IntWritable>(new Text("msg1"), new IntWritable(1))
        ))
        .runTest();
}
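The source of WcMapper is not shown in this post, so here is a rough plain-Java sketch of the behaviour the test above assumes: one (word, 1) pair per whitespace-separated token, emitted in input order. The class name WordCountMapModel and its map method are made up for illustration and use no Hadoop types; the real mapper may tokenize differently.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of the map step; this is NOT the actual WcMapper.
class WordCountMapModel {

    // Emits one (word, 1) pair per whitespace-separated token, in input order.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) { // a blank line yields a single empty token; skip it
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same record as in the MRUnit test.
        System.out.println(map("msg1 msg2 msg1")); // prints [msg1=1, msg2=1, msg1=1]
    }
}
```

The repeated word appears twice in the output because the map step does no aggregation; summing is left to the reducer.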
Testing Reducer
The wordcount reducer sums the values for a key. To test it, we pass a key and a list of values as input to the reducer.
@Test
public void testSumReducer() throws IOException {
    // Two 1s for the same key should be summed to 2.
    new ReduceDriver<Text,IntWritable,Text,IntWritable>()
        .withReducer(new WcReducer())
        .withInput(new Text("msg1"), Arrays.asList(new IntWritable(1), new IntWritable(1)))
        .withOutput(new Text("msg1"), new IntWritable(2))
        .runTest();
}
Testing MapReduce Job
We can test the mapper and the reducer together as shown below.
@Test
public void testWordCount() throws IOException {
    // The mapper output is grouped by key and fed to the reducer.
    new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>()
        .withMapper(new WcMapper())
        .withReducer(new WcReducer())
        .withInput(new LongWritable(0), new Text("msg1 msg2 msg1"))
        .withAllOutput(Arrays.asList(
            new Pair<Text,IntWritable>(new Text("msg1"), new IntWritable(2)),
            new Pair<Text,IntWritable>(new Text("msg2"), new IntWritable(1))
        ))
        .runTest();
}
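To see why the expected output is msg1 with 2 followed by msg2 with 1, it can help to model the job in plain Java: map each token to a 1, group the 1s by key in sorted key order, then sum each group. This is only a sketch of the idea and not how MRUnit is implemented. WordCountPipelineModel is a hypothetical name, and the sketch sorts plain Java strings, which is assumed to give the same order as Hadoop's Text keys for simple ASCII words like these.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java model of map -> shuffle -> reduce for wordcount.
class WordCountPipelineModel {

    static Map<String, Integer> run(List<String> lines) {
        // Map + shuffle: collect the 1s emitted for each word; TreeMap keeps keys sorted.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    grouped.computeIfAbsent(token, k -> new ArrayList<>()).add(1);
                }
            }
        }
        // Reduce: sum the values collected for each key.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same record as in the MRUnit test.
        System.out.println(run(Arrays.asList("msg1 msg2 msg1"))); // prints {msg1=2, msg2=1}
    }
}
```

Unlike the mapper-only test, each word appears once here, because the grouping step hands the reducer all values for a key in a single call.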
The full project is available on GitHub.