I am trying to run the main method directly in Java to submit the job to YARN for execution, but I get the following error message:
2018-08-26 10:25:37,544 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1375)) - Job job_1535213323614_0010 failed with state FAILED due to: Application application_1535213323614_0010 failed 2 times due to AM Container for appattempt_1535213323614_0010_000002 exited with exitCode: -1000 due to: File file:/tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar does not exist
.Failing this attempt.. Failing the application.
There is also no log information for this job in the log directory under HADOOP_HOME.
The mapper code is as follows:
import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] words = StringUtils.split(line, " ");
        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }
}
The reducer code is as follows:
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }
        context.write(key, new LongWritable(count));
    }
}
The main method is as follows:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WCRunner {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.jar", "wc.jar");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "hdcluster01");
        conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");

        Job job = Job.getInstance(conf);

        // the jar containing the job classes
        job.setJarByClass(WCRunner.class);

        // the mapper and reducer used by this job
        job.setMapperClass(WCMapper.class);
        job.setReducerClass(WCReducer.class);

        // output key/value types of the reducer (also used for the mapper
        // unless overridden below)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // output key/value types of the mapper
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        // input path
        FileInputFormat.setInputPaths(job, new Path("hdfs://hdcluster01:9000/wc/srcdata"));

        // output path
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hdcluster01:9000/wc/output3"));

        // submit the job and wait for it to complete
        job.waitForCompletion(true);
    }
}
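Since mapreduce.job.jar is set to the relative path "wc.jar", it is resolved against the working directory of the submitting JVM. For reference, here is a quick standalone check I can run (my own helper class, not part of the job) to see where that relative path actually resolves and whether the jar is there:

```java
import java.io.File;

public class JarPathCheck {
    public static void main(String[] args) {
        // mapreduce.job.jar was set to the relative path "wc.jar", so it is
        // resolved against the JVM's current working directory at submit time
        File jar = new File("wc.jar");
        System.out.println("resolves to: " + jar.getAbsolutePath());
        System.out.println("exists: " + jar.exists());
    }
}
```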
The operating system on which my code executes locally is macOS, and my local username is nasuf. The remote Hadoop deployment is in pseudo-distributed mode: HDFS and YARN run on the same server, under the user parallels.
I checked, and the path /tmp/hadoop-yarn/staging/nasuf/.staging/job_1535213323614_0010/job.jar mentioned in the log does not exist; there is no hadoop-yarn directory under /tmp at all.
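My understanding (an assumption based on the default value of yarn.app.mapreduce.am.staging-dir, which is /tmp/hadoop-yarn/staging) is that the failing path is composed from the staging root, the submitting user, and the job ID, roughly like this sketch shows. Note the "file:" scheme in the error, which suggests the path was resolved against a local file system rather than HDFS:

```java
public class StagingPathSketch {
    public static void main(String[] args) {
        // default staging root plus the submitting user's name and ".staging";
        // the user is my local macOS account, not the remote "parallels" user
        String stagingRoot = "/tmp/hadoop-yarn/staging";
        String user = "nasuf";
        String jobId = "job_1535213323614_0010";
        String jarPath = "file:" + stagingRoot + "/" + user + "/.staging/" + jobId + "/job.jar";
        System.out.println(jarPath);
    }
}
```

The composed string matches the path in the error message above exactly.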
What is the cause of this problem?
Thank you.