Spark and YARN: How does the SparkPi example use the first argument as the master URL?
I'm starting out learning Spark, and I'm trying to replicate the SparkPi example by copying its code into a new project and building a jar. The source for SparkPi is: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala

I have a working YARN cluster (running CDH 5.0.1), I've uploaded the Spark assembly jar, and I've set its HDFS location in SPARK_JAR.
If I run this command, the example works:

$ SPARK_CLASSPATH=/usr/lib/spark/examples/lib/spark-examples_2.10-0.9.0-cdh5.0.1.jar /usr/lib/spark/bin/spark-class org.apache.spark.examples.SparkPi yarn-client 10
However, if I copy the source into a new project, build a jar, and run the same command (with a different jar and classname), I get the following error:

$ SPARK_CLASSPATH=spark.jar /usr/lib/spark/bin/spark-class spark.SparkPi yarn-client 10
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:113)
        at spark.SparkPi$.main(SparkPi.scala:9)
        at spark.SparkPi.main(SparkPi.scala)
Somehow, the first argument isn't being passed as the master to the SparkContext in my version, yet it works fine in the example. Looking at the SparkPi code, it only seems to expect a single numeric argument. So is there something in the Spark examples jar file that intercepts the first argument and somehow sets the spark.master property from it?
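For reference, here is roughly what the code I copied does (condensed from the linked source; the comments are mine): the SparkContext is built from a SparkConf that only sets an app name, and the first argument is used only as the slice count.

    import scala.math.random
    import org.apache.spark.{SparkConf, SparkContext}

    object SparkPi {
      def main(args: Array[String]) {
        // No master URL is set anywhere here: the SparkConf only gets an app
        // name, so unless something external supplies spark.master, the
        // SparkContext constructor throws "A master URL must be set ...".
        val conf = new SparkConf().setAppName("Spark Pi")
        val spark = new SparkContext(conf)

        // The first CLI argument is the number of slices, not a master URL.
        val slices = if (args.length > 0) args(0).toInt else 2
        val n = 100000 * slices
        val count = spark.parallelize(1 until n, slices).map { _ =>
          val x = random * 2 - 1
          val y = random * 2 - 1
          if (x * x + y * y < 1) 1 else 0
        }.reduce(_ + _)
        println("Pi is roughly " + 4.0 * count / n)
        spark.stop()
      }
    }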
This is a recent change: you're running old code in the first case and new code in the second. Here is the change: https://github.com/apache/spark/commit/44dd57fb66bb676d753ad8d9757f9f4c03364113
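Before that commit, the examples took the master as their first command-line argument and passed it straight to the SparkContext constructor, which is why the 0.9 examples jar accepts yarn-client there. A rough sketch of the difference, paraphrasing the two versions of the code:

    // Old (pre-commit, as in spark-examples_2.10-0.9.0): args(0) is the master.
    val spark = new SparkContext(args(0), "SparkPi",
      System.getenv("SPARK_HOME"), SparkContext.jarOfClass(this.getClass))

    // New (post-commit, the code you copied): no master is set in the code;
    // it is expected to arrive via spark-submit / the spark.master property.
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)

So nothing intercepts the first argument; the old examples simply consumed it themselves.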
I think the right command would now be something like:

/usr/lib/spark/bin/spark-submit --class spark.SparkPi --master yarn-client spark.jar 10
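Note how the master URL moves out of the application's argument list: spark-submit takes it via the --master flag (or the spark.master property), and only the arguments after the application jar (here the 10, the slice count) reach your main method.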