Mapreduce wordcount作业中找不到类的exception

我正在尝试在hadoop中运行wordcount作业。但总是得到一个类未找到exception。我发布了我写的类和我用来运行作业的命令

import java.io.IOException; import java.util.*; import org.apache.hadoop.fs.Path; import org.apache.hadoop.conf.*; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.*; import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; public class WordCount { public static class Map extends Mapper { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String line = value.toString(); StringTokenizer tokenizer = new StringTokenizer(line); while (tokenizer.hasMoreTokens()) { word.set(tokenizer.nextToken()); context.write(word, one); } } } public static class Reduce extends Reducer { public void reduce(Text key, Iterable values, Context context) throws IOException, InterruptedException { int sum = 0; for (IntWritable val : values) { sum += val.get(); } context.write(key, new IntWritable(sum)); } } public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); Job job = new Job(conf, "WordCount"); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); job.setMapperClass(Map.class); job.setReducerClass(Reduce.class); job.setInputFormatClass(TextInputFormat.class); job.setOutputFormatClass(TextOutputFormat.class); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(true); job.setJarByClass(WordCount.class); } } 

wordcount.jar导出到我的下载文件夹这是我用来运行作业的命令

  jeet@jeet-Vostro-2520:~/Downloads$ hadoop jar wordcount.jar org.gamma.WordCount /user/jeet/getty/gettysburg.txt /user/jeet/getty/out 

在这种情况下,我的mapreduce作业已启动但它正在进程的中间结束。打印exception树。

  14/01/27 13:16:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). 14/01/27 13:16:02 INFO input.FileInputFormat: Total input paths to process : 1 14/01/27 13:16:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library 14/01/27 13:16:02 WARN snappy.LoadSnappy: Snappy native library not loaded 14/01/27 13:16:03 INFO mapred.JobClient: Running job: job_201401271247_0001 14/01/27 13:16:04 INFO mapred.JobClient: map 0% reduce 0% 14/01/27 13:16:11 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_0, Status : FAILED java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849) at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847) ... 8 more 14/01/27 13:16:16 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_1, Status : FAILED java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849) at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847) ... 8 more 14/01/27 13:16:20 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_2, Status : FAILED java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849) at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847) ... 8 more 14/01/27 13:16:26 INFO mapred.JobClient: Job complete: job_201401271247_0001 14/01/27 13:16:26 INFO mapred.JobClient: Counters: 7 14/01/27 13:16:26 INFO mapred.JobClient: Job Counters 14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=20953 14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 14/01/27 13:16:26 INFO mapred.JobClient: Launched map tasks=4 14/01/27 13:16:26 INFO mapred.JobClient: Data-local map tasks=4 14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 14/01/27 13:16:26 INFO mapred.JobClient: Failed map tasks=1 somebody please please help i think i am very close of it 

尝试添加此function

 Job job = new Job(conf, "wordcount"); job.setJarByClass(WordCount.class); 

我也遇到了同样的问题并通过从我执行jar的同一目录中删除相同的WordCount.class文件来修复它。 看起来它正在将课程带到jar子外面。 尝试

我怀疑这个:

14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

使用CDH4.6时出现同样的错误,解决上述警告后解决了。

您必须添加此方法

job.setJarByClass(WordCount.class);

在调用方法之前

job.waitForCompletion(真);

如下所示:

job.setJarByClass(WordCount.class);

job.waitForCompletion(真);

尝试job.setJar("wordcount.jar"); ,其中wordcount.jar是您要打包到的jar文件。 这个方法适合我,但setJarByClass用于setJarByClass

使用以下代码解决此问题。 job.setJarByClass(DriverClass.class);

job.setJarByClass(WordCount.class); job.waitForCompletion(真);

虽然MapReduce程序是并行处理。 Mapper,Combiner和Reducer类具有顺序流。 必须等待完成每个流程依赖于其他类所以需要job.waitForCompletion(true); 但是必须在启动Mapper,Combiner和Reducer类之前设置输入和输出路径。 参考

像这样更改你的代码:

  public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); Job job = new Job(conf, "WordCount"); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.setJarByClass(WordCount.class); job.waitForCompletion(true); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); job.setMapperClass(Map.class); job.setReducerClass(Reduce.class); job.setInputFormatClass(TextInputFormat.class); job.setOutputFormatClass(TextOutputFormat.class); } 

我希望这会奏效。

我使用JobConf#setJar(String)