Hadoop map-reduce job fails when writing output

I finally managed to launch a map-reduce job on Hadoop (running on a single Debian machine). However, the job always fails with the following error:

    hadoopmachine@debian:~$ ./hadoop-1.0.1/bin/hadoop jar hadooptest/main.jar nl.mydomain.hadoop.debian.test.Main /user/hadoopmachine/input /user/hadoopmachine/output
    Warning: $HADOOP_HOME is deprecated.
    12/04/03 07:29:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    ****hdfs://localhost:9000/user/hadoopmachine/input
    12/04/03 07:29:35 INFO input.FileInputFormat: Total input paths to process : 1
    12/04/03 07:29:35 INFO mapred.JobClient: Running job: job_201204030722_0002
    12/04/03 07:29:36 INFO mapred.JobClient:  map 0% reduce 0%
    12/04/03 07:29:41 INFO mapred.JobClient: Task Id : attempt_201204030722_0002_m_000002_0, Status : FAILED
    Error initializing attempt_201204030722_0002_m_000002_0:
    ENOENT: No such file or directory
        at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
        at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:692)
        at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:647)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
        at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:239)
        at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:196)
        at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1226)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1201)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1116)
        at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2404)
        at java.lang.Thread.run(Thread.java:636)
    12/04/03 07:29:41 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stdout
    12/04/03 07:29:41 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stderr

Unfortunately, it only says "ENOENT: No such file or directory"; it doesn't say which directory it actually tried to access. Pinging localhost works, and the input directory does exist. The location of the jar is also correct.

Can anyone give me a pointer on how to fix this error, or how to find out which file Hadoop is trying to access?

I found several similar questions on the Hadoop mailing list, but no responses to them...

Thanks!

PS: mapred.local.dir is configured as follows (in mapred-site.xml):

    <property>
      <name>mapred.local.dir</name>
      <value>/home/hadoopmachine/hadoop_data/mapred</value>
      <final>true</final>
    </property>

As requested, the output of ps auxww | grep TaskTracker is:

    1000 4249 2.2 0.8 1181992 30176 ? Sl 12:09 0:00 /usr/lib/jvm/java-6-openjdk/bin/java -Dproc_tasktracker -Xmx1000m -Dhadoop.log.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/../logs -Dhadoop.log.file=hadoop-hadoopmachine-tasktracker-debian.log -Dhadoop.home.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/.. -Dhadoop.id.str=hadoopmachine -Dhadoop.root.logger=INFO,DRFA -Dhadoop.security.logger=INFO,NullAppender -Djava.library.path=/home/hadoopmachine/hadoop-1.0.1/libexec/../lib/native/Linux-i386-32 -Dhadoop.policy.file=hadoop-policy.xml -classpath [omitted very long list of jars] org.apache.hadoop.mapred.TaskTracker

From the job tracker, identify which Hadoop node executed this task. SSH to that node and find the location of its hadoop.log.dir directory (check that node's mapred-site.xml) - my guess is that the hadoop user doesn't have the correct permissions to create subdirectories in that folder.
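Once you're on the node, you can also read the effective log dir straight off the running TaskTracker's JVM flags instead of hunting through config files. A small sketch - the helper name and grep pattern are mine, not part of any Hadoop tooling:

```shell
# Extract the value of -Dhadoop.log.dir from a TaskTracker command line.
# (Assumes the value contains no spaces, which is typical for Hadoop installs.)
extract_log_dir() {
  grep -o 'hadoop\.log\.dir=[^ ]*' | cut -d= -f2
}

# On the node that ran the failed attempt; the [T] trick excludes the grep
# itself, and "|| true" tolerates the case where no TaskTracker is running.
ps auxww | grep '[T]askTracker' | extract_log_dir || true
```

For the ps output shown above, this would print /home/hadoopmachine/hadoop-1.0.1/libexec/../logs.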

The actual folder it tries to create lives under ${hadoop.log.dir}/userlogs - check whether this folder has the correct permissions.

In your case, looking at the ps output, I'd guess this is the folder whose permissions you need to check:

 /home/hadoopmachine/hadoop-1.0.1/libexec/../logs
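A hypothetical repair along those lines, assuming the TaskTracker runs as user "hadoopmachine" (per the ps output) and the log dir above - substitute your own user and path before running:

```shell
# Path taken from -Dhadoop.log.dir in the ps output above (an assumption;
# verify it against your own TaskTracker's flags).
LOG_DIR=/home/hadoopmachine/hadoop-1.0.1/libexec/../logs

ls -ld "$LOG_DIR"                                    # who owns it now, and what mode?
sudo chown -R hadoopmachine:hadoopmachine "$LOG_DIR" # give the TaskTracker user ownership
sudo chmod -R u+rwX "$LOG_DIR"                       # rwx on dirs, rw on files
sudo -u hadoopmachine mkdir -p "$LOG_DIR/userlogs"   # prove the user can create the dir the job needs
```

If the last mkdir succeeds as the hadoop user, the ENOENT from JobLocalizer.initializeJobLogDir should go away on the next job run.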