spark-user mailing list archives

From Anahita Talebi <anahita.t.am...@gmail.com>
Subject Re: Running a spark code using submit job in google cloud platform
Date Fri, 13 Jan 2017 15:53:16 GMT
Hello,

Thanks a lot Dinko.
Yes, now it is working perfectly.


Cheers,
Anahita

On Fri, Jan 13, 2017 at 2:19 PM, Dinko Srkoč <dinko.srkoc@gmail.com> wrote:

> On 13 January 2017 at 13:55, Anahita Talebi <anahita.t.amiri@gmail.com>
> wrote:
> > Hi,
> >
> > Thanks for your answer.
> >
> > I have chosen "Spark" as the "job type". There is no option where we can
> > choose the version. How can I choose a different version?
>
> There's a "Preemptible workers, bucket, network, version,
> initialization, & access options" link just above the "Create" and
> "Cancel" buttons on the "Create a cluster" page. When you click it,
> you'll find the "Image version" field where you can enter the image
> version.
>
> Dataproc versions:
> * 1.1 includes Spark 2.0.2
> * 1.0 includes Spark 1.6.2
>
> More about versions can be found here:
> https://cloud.google.com/dataproc/docs/concepts/dataproc-versions
>
> Cheers,
> Dinko
>
> >
> > Thanks,
> > Anahita
> >
> >
> > On Thu, Jan 12, 2017 at 6:39 PM, A Shaikh <shaikh.afzal@gmail.com>
> > wrote:
> >>
> >> You may have tested this code against the Spark version on your local
> >> machine, which may be different from the one running in Google Cloud.
> >> You need to select the appropriate Spark version when you submit your job.
> >>
> >> On 12 January 2017 at 15:51, Anahita Talebi <anahita.t.amiri@gmail.com>
> >> wrote:
> >>>
> >>> Dear all,
> >>>
> >>> I am trying to run a .jar file as a job using "Submit a job" in the
> >>> Google Cloud console.
> >>> https://cloud.google.com/dataproc/docs/guides/submit-job
> >>>
> >>> I actually ran the Spark code on my local computer to generate a .jar
> >>> file. Then, in the Arguments field, I give the values of the arguments
> >>> that I used in the Spark code. One of the arguments is the training data
> >>> set, which I put in the same bucket where I saved my .jar file. The
> >>> bucket contains only the .jar file, the training data set and the
> >>> testing data set.
> >>>
> >>> Main class or jar
> >>> gs://Anahita/test.jar
> >>>
> >>> Arguments
> >>>
> >>> --lambda=.001
> >>> --eta=1.0
> >>> --trainFile=gs://Anahita/small_train.dat
> >>> --testFile=gs://Anahita/small_test.dat
> >>>
> >>> The problem is that when I run the job I get the following error, and
> >>> it cannot read my training and testing data sets.
> >>>
> >>> Exception in thread "main" java.lang.NoSuchMethodError:
> >>> org.apache.spark.rdd.RDD.coalesce(IZLscala/math/Ordering;)Lorg/apache/spark/rdd/RDD;
> >>>
> >>> Can anyone help me solve this problem?
> >>>
> >>> Thanks,
> >>>
> >>> Anahita
> >>>
> >>>
> >>
> >
>
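
A note on reading the NoSuchMethodError quoted above: the text after the
method name is a JVM method descriptor, and decoding it shows which overload
the jar expected to find. The sketch below is a small, hypothetical helper
(decode_descriptor is not part of Spark or Dataproc) that turns such a
descriptor into readable types. It handles only primitives, object types and
arrays, which is enough for the descriptor in this thread.

```python
# Hypothetical helper: decode a JVM method descriptor into readable type names.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Return (readable type name, index just past that type)."""
    dims = 0
    while desc[pos] == "[":          # each leading '[' is one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":             # object type: L<class/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                            # single-letter primitive code
        name = PRIMITIVES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def decode_descriptor(desc):
    """Split '(params)return' into (list of parameter types, return type)."""
    if not desc.startswith("("):
        raise ValueError("not a method descriptor: " + desc)
    close = desc.index(")")
    params = []
    pos = 1
    while pos < close:
        name, pos = _read_type(desc, pos)
        params.append(name)
    ret, _ = _read_type(desc, close + 1)
    return params, ret


# Descriptor from the stack trace, rejoined across the mail client's line wrap.
missing = "(IZLscala/math/Ordering;)Lorg/apache/spark/rdd/RDD;"
params, ret = decode_descriptor(missing)
print(params, "->", ret)
# prints: ['int', 'boolean', 'scala.math.Ordering'] -> org.apache.spark.rdd.RDD
```

So the jar was looking for coalesce(int, boolean, Ordering). As far as I know
that is the Spark 1.x form of RDD.coalesce, and Spark 2.0 added a further
parameter to it, which would explain why a jar built against 1.x failed on a
cluster running 2.0.2 and worked once a matching image version was selected.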
