spark-user mailing list archives

From "Chester @work" <ches...@alpinenow.com>
Subject Re: Unable to run Spark 1.0 SparkPi on HDP 2.0
Date Mon, 07 Jul 2014 13:22:32 GMT
In YARN cluster mode, you can either have Spark installed on all the cluster nodes or supply the Spark
jar yourself. In the second case, you don't need to install Spark on the cluster at all, since you supply
the Spark assembly jar together with your app jar. 
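A minimal sketch of that second option, submitting from an edge node while shipping the assembly jar yourself (the paths are hypothetical, and the SPARK_JAR variable is an assumption based on the Spark 1.0-era "Running on YARN" docs, not something stated in this thread):

```shell
# Sketch only: run SparkPi in yarn-cluster mode without installing
# Spark on the worker nodes. Adjust all paths to your own layout.

# Point YARN at the Spark assembly jar you built or downloaded
# (SPARK_JAR is the mechanism the Spark 1.0 YARN docs describe).
export SPARK_JAR=./lib/spark-assembly-1.0.0-hadoop2.2.0.jar

# spark-submit finds the ResourceManager via the Hadoop config dir.
export HADOOP_CONF_DIR=/etc/hadoop/conf

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 2g \
  --executor-memory 2g \
  --executor-cores 1 \
  ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
```

YARN then distributes both jars to the containers itself, which is why no per-node Spark install is required.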

I hope this makes it clear.

Chester

Sent from my iPhone

> On Jul 7, 2014, at 5:05 AM, Konstantin Kudryavtsev <kudryavtsev.konstantin@gmail.com>
wrote:
> 
> thank you Krishna!
> 
> Could you please explain why I need to install Spark on each node, when the Spark official site
says: "If you have a Hadoop 2 cluster, you can run Spark without any installation needed"?
> 
> I have HDP 2 (YARN), and that's why I hope I don't need to install Spark on each node.

> 
> Thank you,
> Konstantin Kudryavtsev
> 
> 
>> On Mon, Jul 7, 2014 at 1:57 PM, Krishna Sankar <ksankar42@gmail.com> wrote:
>> Konstantin,
>> You need to install the hadoop rpms on all nodes. If it is Hadoop 2, the nodes would
have hdfs & YARN.
>> Then you need to install Spark on all nodes. I haven't had experience with HDP, but
the tech preview might have installed Spark as well.
>> In the end, one should have hdfs, yarn & spark installed on all the nodes.
>> After installation, check the web console to make sure hdfs, yarn & spark are
running.
>> Then you are ready to start experimenting/developing spark applications.
>> HTH.
>> Cheers
>> <k/>
>> 
>> 
>>> On Mon, Jul 7, 2014 at 2:34 AM, Konstantin Kudryavtsev <kudryavtsev.konstantin@gmail.com>
wrote:
>>> guys, I'm not talking about running Spark on a VM, I don't have a problem with that.
>>> 
>>> What confuses me is this:
>>> 1) Hortonworks describes the installation process as RPMs on each node
>>> 2) the Spark home page says that everything I need is YARN
>>> 
>>> And I'm stuck understanding what I need to do to run Spark on YARN (do
I need the RPM installations, or only to build Spark on the edge node?)
>>> 
>>> 
>>> Thank you,
>>> Konstantin Kudryavtsev
>>> 
>>> 
>>>> On Mon, Jul 7, 2014 at 4:34 AM, Robert James <srobertjames@gmail.com>
wrote:
>>>> I can say from my experience that getting Spark to work with Hadoop 2
>>>> is not for the beginner; after solving one problem after another
>>>> (dependencies, scripts, etc.), I went back to Hadoop 1.
>>>> 
>>>> Spark's Maven build, ec2 scripts, and others all use Hadoop 1 - not sure
>>>> why, but, given that, Hadoop 2 has too many bumps.
>>>> 
>>>> On 7/6/14, Marco Shaw <marco.shaw@gmail.com> wrote:
>>>> > That is confusing based on the context you provided.
>>>> >
>>>> > This might take more time than I can spare to try to understand.
>>>> >
>>>> > For sure, you need to add Spark to run it in/on the HDP 2.1 express
VM.
>>>> >
>>>> > Cloudera's CDH 5 express VM includes Spark, but the service isn't running
by
>>>> > default.
>>>> >
>>>> > I can't remember for MapR...
>>>> >
>>>> > Marco
>>>> >
>>>> >> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>>>> >> <kudryavtsev.konstantin@gmail.com> wrote:
>>>> >>
>>>> >> Marco,
>>>> >>
>>>> >> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1
that you
>>>> >> can try
>>>> >> from
>>>> >> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>> >>  HDP 2.1 means YARN, yet at the same time they propose to install an RPM
>>>> >>
>>>> >> On other hand, http://spark.apache.org/ said "
>>>> >> Integrated with Hadoop
>>>> >> Spark can run on Hadoop 2's YARN cluster manager, and can read any
>>>> >> existing Hadoop data.
>>>> >>
>>>> >> If you have a Hadoop 2 cluster, you can run Spark without any installation
>>>> >> needed. "
>>>> >>
>>>> >> And this is confusing for me... do I need the RPM installation or not?
>>>> >>
>>>> >>
>>>> >> Thank you,
>>>> >> Konstantin Kudryavtsev
>>>> >>
>>>> >>
>>>> >>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <marco.shaw@gmail.com>
>>>> >>> wrote:
>>>> >>> Can you provide links to the sections that are confusing?
>>>> >>>
>>>> >>> My understanding is that the HDP1 binaries do not need YARN, while
the HDP2
>>>> >>> binaries do.
>>>> >>>
>>>> >>> Now, you can also install Hortonworks Spark RPM...
>>>> >>>
>>>> >>> For production, in my opinion, RPMs are better for manageability.
>>>> >>>
>>>> >>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>>>> >>>> <kudryavtsev.konstantin@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hello, thanks for your message... I'm confused: Hortonworks
suggests
>>>> >>>> installing the Spark RPM on each node, but the Spark main page says
that YARN is
>>>> >>>> enough and I don't need to install it... What's the difference?
>>>> >>>>
>>>> >>>> sent from my HTC
>>>> >>>>
>>>> >>>>> On Jul 6, 2014 8:34 PM, "vs" <vinayshukla@gmail.com>
wrote:
>>>> >>>>> Konstantin,
>>>> >>>>>
>>>> >>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP
2.1 that you can
>>>> >>>>> try
>>>> >>>>> from
>>>> >>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>> >>>>>
>>>> >>>>> Let me know if you see issues with the tech preview.
>>>> >>>>>
>>>> >>>>> "spark PI example on HDP 2.0
>>>> >>>>>
>>>> >>>>> I downloaded Spark 1.0 pre-built from
>>>> >>>>> http://spark.apache.org/downloads.html
>>>> >>>>> (for HDP2)
>>>> >>>>> Then I ran the example from the Spark web-site:
>>>> >>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 2g --executor-memory 2g --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>>>> >>>>>
>>>> >>>>> I got error:
>>>> >>>>> Application application_1404470405736_0044 failed 3 times due to AM Container for appattempt_1404470405736_0044_000003 exited with exitCode: 1
>>>> >>>>> due to: Exception from container-launch:
>>>> >>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>>> >>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>>> >>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>>> >>>>> at
>>>> >>>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>>> >>>>> at
>>>> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>>> >>>>> at
>>>> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>>> >>>>> at
>>>> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>>> >>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>> >>>>> at
>>>> >>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> >>>>> at
>>>> >>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> >>>>> at java.lang.Thread.run(Thread.java:744)
>>>> >>>>> .Failing this attempt.. Failing the application.
>>>> >>>>>
>>>> >>>>> Unknown/unsupported param List(--executor-memory, 2048, --executor-cores, 1, --num-executors, 3)
>>>> >>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>>>> >>>>> Options:
>>>> >>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>>> >>>>>   --class CLASS_NAME   Name of your application's main class (required)
>>>> >>>>> ...bla-bla-bla
>>>> >>>>> "
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> --
>>>> >>>>> View this message in context:
>>>> >>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>>>> >>>>> Sent from the Apache Spark User List mailing list archive
at
>>>> >>>>> Nabble.com.
>>>> >>
>>>> >
> 
