spark-user mailing list archives

From Konstantin Kudryavtsev <kudryavtsev.konstan...@gmail.com>
Subject Re: Unable to run Spark 1.0 SparkPi on HDP 2.0
Date Mon, 07 Jul 2014 12:05:19 GMT
Thank you, Krishna!

Could you please explain why I need to install Spark on each node when the
official Spark site says: "If you have a Hadoop 2 cluster, you can run Spark
without any installation needed"?

I have HDP 2 (YARN), and that's why I hope I don't need to install Spark on
each node.
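
To make concrete what I understand by "without any installation", here is a rough sketch of a submission from a single edge node where the prebuilt Spark tarball is unpacked. It reuses the SparkPi command from my original report. The `/etc/hadoop/conf` fallback path is only an assumption about where the client-side Hadoop configuration lives, and the script only prints the command as a dry run. As far as I understand the Spark on YARN documentation, the client uploads the Spark jar together with the application jar, so the worker nodes would only need YARN itself.

```shell
#!/bin/sh
# Assumption: client-side Hadoop/YARN configuration directory on the edge node.
# /etc/hadoop/conf is a hypothetical default; adjust it for your cluster.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
export HADOOP_CONF_DIR

# Same SparkPi submission as in the original report, kept in a variable
# so it can be inspected before anything is sent to the cluster.
SUBMIT_CMD="./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster --num-executors 3 --driver-memory 2g \
--executor-memory 2g --executor-cores 1 \
./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2"

# Dry run: show what would be executed from the unpacked Spark directory.
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
echo "$SUBMIT_CMD"

# To actually submit, run this from the unpacked Spark directory:
#   eval "$SUBMIT_CMD"
```

If this really is all that is required, then the RPM installation on every node would be a packaging choice rather than a requirement.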

Thank you,
Konstantin Kudryavtsev


On Mon, Jul 7, 2014 at 1:57 PM, Krishna Sankar <ksankar42@gmail.com> wrote:

> Konstantin,
>
>    1. You need to install the hadoop rpms on all nodes. If it is Hadoop
>    2, the nodes would have hdfs & YARN.
>    2. Then you need to install Spark on all nodes. I haven't had
>    experience with HDP, but the tech preview might have installed Spark as
>    well.
>    3. In the end, one should have hdfs, yarn & spark installed on all the
>    nodes.
>    4. After installations, check the web console to make sure hdfs, yarn
>    & spark are running.
>    5. Then you are ready to start experimenting/developing spark
>    applications.
>
> HTH.
> Cheers
> <k/>
>
>
> On Mon, Jul 7, 2014 at 2:34 AM, Konstantin Kudryavtsev <
> kudryavtsev.konstantin@gmail.com> wrote:
>
>> Guys, I'm not talking about running Spark on a VM; I don't have a problem
>> with that.
>>
>> I'm confused about the following:
>> 1) Hortonworks describes the installation process as RPMs on each node
>> 2) the Spark home page says that everything I need is YARN
>>
>> And I'm stuck on understanding what I need to do to run Spark on
>> YARN (do I need the RPM installation, or only to build Spark on an edge node?)
>>
>>
>> Thank you,
>> Konstantin Kudryavtsev
>>
>>
>> On Mon, Jul 7, 2014 at 4:34 AM, Robert James <srobertjames@gmail.com>
>> wrote:
>>
>>> I can say from my experience that getting Spark to work with Hadoop 2
>>> is not for the beginner; after solving one problem after another
>>> (dependencies, scripts, etc.), I went back to Hadoop 1.
>>>
>>> Spark's Maven, ec2 scripts, and others all use Hadoop 1 - not sure
>>> why, but, given so, Hadoop 2 has too many bumps
>>>
>>> On 7/6/14, Marco Shaw <marco.shaw@gmail.com> wrote:
>>> > That is confusing based on the context you provided.
>>> >
>>> > This might take more time than I can spare to try to understand.
>>> >
>>> > For sure, you need to add Spark to run it in/on the HDP 2.1 express VM.
>>> >
>>> > Cloudera's CDH 5 express VM includes Spark, but the service isn't
>>> > running by default.
>>> >
>>> > I can't remember for MapR...
>>> >
>>> > Marco
>>> >
>>> >> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>>> >> <kudryavtsev.konstantin@gmail.com> wrote:
>>> >>
>>> >> Marco,
>>> >>
>>> >> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that
>>> >> you can try from
>>> >> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>> >> HDP 2.1 means YARN, yet at the same time they propose installing an RPM.
>>> >>
>>> >> On the other hand, http://spark.apache.org/ says:
>>> >> "Integrated with Hadoop
>>> >> Spark can run on Hadoop 2's YARN cluster manager, and can read any
>>> >> existing Hadoop data.
>>> >>
>>> >> If you have a Hadoop 2 cluster, you can run Spark without any
>>> >> installation needed."
>>> >>
>>> >> And this is confusing for me... do I need the RPM installation or not?
>>> >>
>>> >>
>>> >> Thank you,
>>> >> Konstantin Kudryavtsev
>>> >>
>>> >>
>>> >>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <marco.shaw@gmail.com>
>>> >>> wrote:
>>> >>> Can you provide links to the sections that are confusing?
>>> >>>
>>> >>> My understanding, the HDP1 binaries do not need YARN, while the
>>> >>> HDP2 binaries do.
>>> >>>
>>> >>> Now, you can also install Hortonworks Spark RPM...
>>> >>>
>>> >>> For production, in my opinion, RPMs are better for manageability.
>>> >>>
>>> >>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>>> >>>> <kudryavtsev.konstantin@gmail.com> wrote:
>>> >>>>
>>> >>>> Hello, thanks for your message... I'm confused: Hortonworks
>>> >>>> suggests installing the Spark RPM on each node, but the Spark main
>>> >>>> page says that YARN is enough and I don't need to install it...
>>> >>>> What's the difference?
>>> >>>>
>>> >>>> sent from my HTC
>>> >>>>
>>> >>>>> On Jul 6, 2014 8:34 PM, "vs" <vinayshukla@gmail.com> wrote:
>>> >>>>> Konstantin,
>>> >>>>>
>>> >>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
>>> >>>>> can try from
>>> >>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>> >>>>>
>>> >>>>> Let me know if you see issues with the tech preview.
>>> >>>>>
>>> >>>>> "spark PI example on HDP 2.0
>>> >>>>>
>>> >>>>> I downloaded Spark 1.0 prebuilt from
>>> >>>>> http://spark.apache.org/downloads.html
>>> >>>>> (for HDP2)
>>> >>>>> Then I ran the example from the Spark web site:
>>> >>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi
>>> >>>>> --master yarn-cluster --num-executors 3 --driver-memory 2g
>>> >>>>> --executor-memory 2g --executor-cores 1
>>> >>>>> ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>>> >>>>>
>>> >>>>> I got this error:
>>> >>>>> Application application_1404470405736_0044 failed 3 times due to AM
>>> >>>>> Container for appattempt_1404470405736_0044_000003 exited with
>>> >>>>> exitCode: 1 due to: Exception from container-launch:
>>> >>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>> >>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>> >>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>> >>>>> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>> >>>>> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>> >>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>> >>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >>>>> at java.lang.Thread.run(Thread.java:744)
>>> >>>>> .Failing this attempt.. Failing the application.
>>> >>>>>
>>> >>>>> Unknown/unsupported param List(--executor-memory, 2048,
>>> >>>>> --executor-cores, 1,
>>> >>>>> --num-executors, 3)
>>> >>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>>> >>>>> Options:
>>> >>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>> >>>>>   --class CLASS_NAME   Name of your application's main class (required)
>>> >>>>> ...bla-bla-bla
>>> >>>>> "
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> --
>>> >>>>> View this message in context:
>>> >>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>>> >>>>> Sent from the Apache Spark User List mailing list archive at
>>> >>>>> Nabble.com.
>>> >>
>>> >
>>>
>>
>>
>
