I wrote a word count program for Spark. It can be debugged in Eclipse using local mode, or run with java -jar after packaging it with Maven:
import java.util.Arrays;
import java.util.regex.Pattern;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;

private static final Pattern SPACE = Pattern.compile(" ");

SparkConf sparkConf = new SparkConf().setAppName("JavaWordCount");
sparkConf.setMaster("local");
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
JavaRDD<String> lines = ctx.textFile("file:///c:/sparkTest.txt");
JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
    @Override
    public Iterable<String> call(String s) {
        return Arrays.asList(SPACE.split(s));
    }
});
System.out.println(words.count());
I assumed that to switch to standalone client mode, all I had to do was change the master setting to
sparkConf.setMaster("spark://localhost:7077");
and everything would work. However, I have read articles saying that Spark programs must be launched with the spark-submit command, or from another Java program via SparkLauncher, so running the jar directly with java -jar is not possible. Doesn't this contradict what I observed in local mode? Or is local mode a special case, and the other deployment modes (standalone client, standalone cluster, and YARN) can only be launched with spark-submit? With spark-submit it is impossible to debug in Eclipse, which is very inconvenient.
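For reference, here is a minimal sketch of the SparkLauncher approach mentioned above, which lets a plain Java program submit an application programmatically instead of calling spark-submit from a shell. The jar path, class name, and Spark home below are placeholders I made up for illustration, not values from my project:

```java
import org.apache.spark.launcher.SparkLauncher;

public class LaunchWordCount {
    public static void main(String[] args) throws Exception {
        // SparkLauncher builds and starts a spark-submit process under the hood,
        // so the environment (classpath, conf) matches a normal spark-submit run.
        Process spark = new SparkLauncher()
                .setSparkHome("/opt/spark")                 // placeholder path
                .setAppResource("/path/to/wordcount.jar")   // placeholder jar
                .setMainClass("com.example.JavaWordCount")  // placeholder class
                .setMaster("spark://localhost:7077")
                .setConf(SparkLauncher.DRIVER_MEMORY, "1g")
                .launch();
        int exitCode = spark.waitFor();
        System.out.println("Spark application finished with exit code " + exitCode);
    }
}
```

Since SparkLauncher just spawns a spark-submit child process, the launcher program itself can be started (and debugged) from Eclipse, though the submitted application runs in a separate JVM.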