In the previous post we saw how to install Hadoop on Ubuntu; now it's time to run our first Hadoop MapReduce job. We will use the WordCount example job, which reads text files and counts how often words occur. Both the input and the output are text files; each line of the output contains a word and the number of times it occurred, separated by a tab.

1. Download example input data

We will use three ebooks from Project Gutenberg for this example:

- The Outline of Science, Vol. 1 (of 4) by J. Arthur Thomson
- The Notebooks of Leonardo Da Vinci
- Ulysses by James Joyce

Download each ebook and store the files in a local temporary directory of your choice, for example /tmp/gutenberg. We then have to change the ownership of the files to hduser. Open a terminal and run:

sudo chown -R hduser:hadoop /tmp/gutenberg

2. Restart the Hadoop cluster

Open a new terminal and restart your Hadoop cluster if it is not running already:

su - hduser
/usr/local/hadoop/bin/sta
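If you want a quick preview of the kind of result WordCount will produce, you can approximate it locally with a standard shell pipeline. This is only a rough sketch: it splits on whitespace, which is simpler than the tokenization the Hadoop example uses, and it assumes the ebooks are stored as .txt files under /tmp/gutenberg.

```shell
# Rough local approximation of WordCount:
# split the text on whitespace, count each token,
# and show the ten most frequent words.
cat /tmp/gutenberg/*.txt \
  | tr -s '[:space:]' '\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head
```

Each output line shows a count followed by a word, which mirrors the word/count pairs the MapReduce job will write (the job separates them with a tab instead).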