spark-user mailing list archives

From Sathish Kumaran Vairavelu <vsathishkuma...@gmail.com>
Subject Re: Spark, Mesos, Docker and S3
Date Mon, 01 Feb 2016 23:56:33 GMT
Thank you for the clarification. It works!
On Fri, Jan 29, 2016 at 6:36 PM Mao Geng <mao@sumologic.com> wrote:

> Sathish,
>
> The constraint you described is Marathon's, not Mesos's :)
>
> spark.mesos.constraints is applied to slave attributes like
> tachyon=true;us-east-1=false, as described in
> https://issues.apache.org/jira/browse/SPARK-6707.
>
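> For example, a hedged sketch (the attribute names below are illustrative): if
> your slaves are registered with attributes such as tachyon=true and
> us-east-1=false, you would filter offers with
>
>   /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>     --conf spark.mesos.constraints="tachyon:true;us-east-1:false"
>
> Only offers from slaves whose attributes match every listed key:value pair are accepted.
>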
> Cheers,
> -Mao
>
> On Fri, Jan 29, 2016 at 2:51 PM, Sathish Kumaran Vairavelu <
> vsathishkumaran@gmail.com> wrote:
>
>> Hi
>>
>> Quick question: how do I pass the constraint [["hostname", "CLUSTER",
>> "specific.node.com"]] to Mesos?
>>
>> I was trying --conf spark.mesos.constraints=hostname:specific.node.com,
>> but it didn't seem to work.
>>
>>
>> Please help
>>
>>
>> Thanks
>>
>> Sathish
>>
>> On Thu, Jan 28, 2016 at 6:52 PM Mao Geng <mao@sumologic.com> wrote:
>>
>>> From my limited knowledge, only a limited set of options such as network
>>> mode, volumes, and port mappings can be passed through. See
>>> https://github.com/apache/spark/pull/3074/files.
>>>
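>>> For example, a hedged sketch of the options that do pass through (the paths
>>> and ports below are placeholders); they can be given as --conf flags or set
>>> in spark-defaults.conf / docker.properties:
>>>
>>>   --conf spark.mesos.executor.docker.volumes=/host/data:/container/data:ro \
>>>   --conf spark.mesos.executor.docker.portmaps=8080:80:tcp
>>>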
>>> https://issues.apache.org/jira/browse/SPARK-8734 is open for exposing
>>> all docker options to Spark.
>>>
>>> -Mao
>>>
>>> On Thu, Jan 28, 2016 at 1:55 PM, Sathish Kumaran Vairavelu <
>>> vsathishkumaran@gmail.com> wrote:
>>>
>>>> Thank you. I figured it out: I set the executor memory to a minimal
>>>> value and it works.
>>>>
>>>> Another issue has come up: I have to pass the --add-host option when
>>>> running containers on the slave nodes. Is there any option to pass
>>>> docker run parameters from Spark?
>>>> On Thu, Jan 28, 2016 at 12:26 PM Mao Geng <mao@sumologic.com> wrote:
>>>>
>>>>> Sathish,
>>>>>
>>>>> I guess the mesos resources are not enough to run your job. You might
>>>>> want to check the mesos log to figure out why.
>>>>>
>>>>> I tried to run the docker image with "--conf spark.mesos.coarse=false"
>>>>> and "true". Both are fine.
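>>>>>
>>>>> As a hedged sketch (the values below are illustrative), shrinking the
>>>>> resource ask usually lets the offers satisfy it:
>>>>>
>>>>>   /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>>>>>     --conf spark.mesos.coarse=true \
>>>>>     --conf spark.executor.memory=512m \
>>>>>     --conf spark.cores.max=2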
>>>>>
>>>>> Best,
>>>>> Mao
>>>>>
>>>>> On Wed, Jan 27, 2016 at 5:00 PM, Sathish Kumaran Vairavelu <
>>>>> vsathishkumaran@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On the same Spark/Mesos/Docker setup, I am getting the warning "Initial
>>>>>> job has not accepted any resources; check your cluster UI to ensure that
>>>>>> workers are registered and have sufficient resources". I am running in
>>>>>> coarse-grained mode. Any pointers on how to fix this issue? Please help.
>>>>>> I have updated both docker.properties and spark-defaults.conf with
>>>>>> spark.mesos.executor.docker.image and other properties.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Sathish
>>>>>>
>>>>>> On Wed, Jan 27, 2016 at 9:58 AM Sathish Kumaran Vairavelu <
>>>>>> vsathishkumaran@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks a lot for your info! I will try this today.
>>>>>>> On Wed, Jan 27, 2016 at 9:29 AM Mao Geng <mao@sumologic.com> wrote:
>>>>>>>
>>>>>>>> Hi Sathish,
>>>>>>>>
>>>>>>>> The docker image is normal, no AWS profile included.
>>>>>>>>
>>>>>>>> When the driver container runs with --net=host, the driver host's
>>>>>>>> AWS profile will take effect, so the driver can access the protected
>>>>>>>> S3 files.
>>>>>>>>
>>>>>>>> Similarly, the Mesos slaves also run the Spark executor docker
>>>>>>>> container in --net=host mode, so the AWS profile of the Mesos slaves
>>>>>>>> will take effect.
>>>>>>>>
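>>>>>>>> If an instance profile is not available, a hedged alternative (the key
>>>>>>>> values below are placeholders) is to pass the s3a credentials
>>>>>>>> explicitly on the command line, e.g.:
>>>>>>>>
>>>>>>>>   --conf spark.hadoop.fs.s3a.access.key=<access_key> \
>>>>>>>>   --conf spark.hadoop.fs.s3a.secret.key=<secret_key>
>>>>>>>>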
>>>>>>>> Hope it helps,
>>>>>>>> Mao
>>>>>>>>
>>>>>>>> On Jan 26, 2016, at 9:15 PM, Sathish Kumaran Vairavelu <
>>>>>>>> vsathishkumaran@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Mao,
>>>>>>>>
>>>>>>>> I want to check on accessing S3 from the Spark docker container on
>>>>>>>> Mesos. The EC2 instance that I am using has an AWS profile/IAM role
>>>>>>>> attached. Should we build the docker image with any AWS profile
>>>>>>>> settings, or does the --net=host docker option take care of it?
>>>>>>>>
>>>>>>>> Please help
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Sathish
>>>>>>>>
>>>>>>>> On Tue, Jan 26, 2016 at 9:04 PM Mao Geng <mao@sumologic.com> wrote:
>>>>>>>>
>>>>>>>>> Thank you very much, Jerry!
>>>>>>>>>
>>>>>>>>> I changed to "--jars
>>>>>>>>> /opt/spark/lib/hadoop-aws-2.7.1.jar,/opt/spark/lib/aws-java-sdk-1.7.4.jar"
>>>>>>>>> then it worked like a charm!
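>>>>>>>>>
>>>>>>>>> For reference, a hedged sketch of the full invocation that worked
>>>>>>>>> (image and host names are placeholders):
>>>>>>>>>
>>>>>>>>>   docker run --rm -it --net=host <registry>/<image>:<tag> \
>>>>>>>>>     /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>>>>>>>>>     --conf spark.mesos.executor.docker.image=<registry>/<image>:<tag> \
>>>>>>>>>     --jars /opt/spark/lib/hadoop-aws-2.7.1.jar,/opt/spark/lib/aws-java-sdk-1.7.4.jar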
>>>>>>>>>
>>>>>>>>> From the Mesos task logs below, I saw that the Mesos executor
>>>>>>>>> downloaded the jars from the driver, which is a bit unnecessary (as
>>>>>>>>> the docker image already has them), but that's OK - I am happy to see
>>>>>>>>> Spark + Mesos + Docker + S3 working together!
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Mao
>>>>>>>>>
>>>>>>>>> 16/01/27 02:54:45 INFO Executor: Using REPL class URI: http://172.16.3.98:33771
>>>>>>>>> 16/01/27 02:55:12 INFO CoarseGrainedExecutorBackend: Got assigned task 0
>>>>>>>>> 16/01/27 02:55:12 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
>>>>>>>>> 16/01/27 02:55:12 INFO Executor: Fetching http://172.16.3.98:3850/jars/hadoop-aws-2.7.1.jar with timestamp 1453863280432
>>>>>>>>> 16/01/27 02:55:12 INFO Utils: Fetching http://172.16.3.98:3850/jars/hadoop-aws-2.7.1.jar to /tmp/spark-7b8e1681-8a62-4f1d-9e11-fdf8062b1b08/fetchFileTemp1518118694295619525.tmp
>>>>>>>>> 16/01/27 02:55:12 INFO Utils: Copying /tmp/spark-7b8e1681-8a62-4f1d-9e11-fdf8062b1b08/-19880839621453863280432_cache to /./hadoop-aws-2.7.1.jar
>>>>>>>>> 16/01/27 02:55:12 INFO Executor: Adding file:/./hadoop-aws-2.7.1.jar to class loader
>>>>>>>>> 16/01/27 02:55:12 INFO Executor: Fetching http://172.16.3.98:3850/jars/aws-java-sdk-1.7.4.jar with timestamp 1453863280472
>>>>>>>>> 16/01/27 02:55:12 INFO Utils: Fetching http://172.16.3.98:3850/jars/aws-java-sdk-1.7.4.jar to /tmp/spark-7b8e1681-8a62-4f1d-9e11-fdf8062b1b08/fetchFileTemp8868621397726761921.tmp
>>>>>>>>> 16/01/27 02:55:12 INFO Utils: Copying /tmp/spark-7b8e1681-8a62-4f1d-9e11-fdf8062b1b08/8167072821453863280472_cache to /./aws-java-sdk-1.7.4.jar
>>>>>>>>> 16/01/27 02:55:12 INFO Executor: Adding file:/./aws-java-sdk-1.7.4.jar to class loader
>>>>>>>>>
>>>>>>>>> On Tue, Jan 26, 2016 at 5:40 PM, Jerry Lam <chilinglam@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mao,
>>>>>>>>>>
>>>>>>>>>> Can you try --jars to include those jars?
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>>
>>>>>>>>>> Jerry
>>>>>>>>>>
>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>
>>>>>>>>>> On 26 Jan, 2016, at 7:02 pm, Mao Geng <mao@sumologic.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi there,
>>>>>>>>>>
>>>>>>>>>> I am trying to run Spark on Mesos using a Docker image as the
>>>>>>>>>> executor, as mentioned at
>>>>>>>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#mesos-docker-support.
>>>>>>>>>>
>>>>>>>>>> I built a docker image using the following Dockerfile (which is based on
>>>>>>>>>> https://github.com/apache/spark/blob/master/docker/spark-mesos/Dockerfile):
>>>>>>>>>>
>>>>>>>>>> FROM mesosphere/mesos:0.25.0-0.2.70.ubuntu1404
>>>>>>>>>>
>>>>>>>>>> # Update the base ubuntu image with dependencies needed for Spark
>>>>>>>>>> RUN apt-get update && \
>>>>>>>>>>     apt-get install -y python libnss3 openjdk-7-jre-headless curl
>>>>>>>>>>
>>>>>>>>>> RUN curl http://www.carfab.com/apachesoftware/spark/spark-1.6.0/spark-1.6.0-bin-hadoop2.6.tgz | tar -xzC /opt && \
>>>>>>>>>>     ln -s /opt/spark-1.6.0-bin-hadoop2.6 /opt/spark
>>>>>>>>>> ENV SPARK_HOME /opt/spark
>>>>>>>>>> ENV MESOS_NATIVE_JAVA_LIBRARY /usr/local/lib/libmesos.so
>>>>>>>>>>
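>>>>>>>>>> (Building and pushing the image would be along these lines - a hedged
>>>>>>>>>> sketch, with <registry>/<image>:<tag> as a placeholder:
>>>>>>>>>>   docker build -t <registry>/<image>:<tag> .
>>>>>>>>>>   docker push <registry>/<image>:<tag>
>>>>>>>>>> so that every Mesos slave can pull it.)
>>>>>>>>>>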
>>>>>>>>>> Then I successfully ran spark-shell via this docker command:
>>>>>>>>>> docker run --rm -it --net=host <registry>/<image>:<tag> \
>>>>>>>>>>   /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>>>>>>>>>>   --conf spark.mesos.executor.docker.image=<registry>/<image>:<tag>
>>>>>>>>>>
>>>>>>>>>> So far so good. Then I wanted to call sc.textFile to load a file
>>>>>>>>>> from S3, but I was blocked by some issues I couldn't figure out. I've
>>>>>>>>>> read
>>>>>>>>>> https://dzone.com/articles/uniting-spark-parquet-and-s3-as-an-alternative-to
>>>>>>>>>> and
>>>>>>>>>> http://blog.encomiabile.it/2015/10/29/apache-spark-amazon-s3-and-apache-mesos,
>>>>>>>>>> and learned that I need to add hadoop-aws-2.7.1 and aws-java-sdk-1.7.4
>>>>>>>>>> to the executor's and driver's classpaths in order to access S3 files.
>>>>>>>>>>
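>>>>>>>>>> (As an aside, a hedged alternative would be to let Spark resolve these
>>>>>>>>>> dependencies at launch time, e.g.:
>>>>>>>>>>   /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>>>>>>>>>>     --packages org.apache.hadoop:hadoop-aws:2.7.1
>>>>>>>>>> which should pull the matching aws-java-sdk transitively; baking the
>>>>>>>>>> jars into the image, as below, avoids that download.)
>>>>>>>>>>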
>>>>>>>>>> So, I added the following lines to the Dockerfile and built a new image:
>>>>>>>>>> RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar -o /opt/spark/lib/aws-java-sdk-1.7.4.jar
>>>>>>>>>> RUN curl http://central.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.1/hadoop-aws-2.7.1.jar -o /opt/spark/lib/hadoop-aws-2.7.1.jar
>>>>>>>>>>
>>>>>>>>>> Then I started spark-shell again with the command below:
>>>>>>>>>> docker run --rm -it --net=host <registry>/<image>:<tag> \
>>>>>>>>>>   /opt/spark/bin/spark-shell --master mesos://<master_host>:5050 \
>>>>>>>>>>   --conf spark.mesos.executor.docker.image=<registry>/<image>:<tag> \
>>>>>>>>>>   --conf spark.executor.extraClassPath=/opt/spark/lib/hadoop-aws-2.7.1.jar:/opt/spark/lib/aws-java-sdk-1.7.4.jar \
>>>>>>>>>>   --conf spark.driver.extraClassPath=/opt/spark/lib/hadoop-aws-2.7.1.jar:/opt/spark/lib/aws-java-sdk-1.7.4.jar
>>>>>>>>>>
>>>>>>>>>> But the command below failed when I ran it in spark-shell:
>>>>>>>>>> scala> sc.textFile("s3a://<bucket_name>/<file_name>").count()
>>>>>>>>>> [Stage 0:>    (0 + 2) / 2]16/01/26 23:05:23 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ip-172-16-14-203.us-west-2.compute.internal): java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>>>>>>>>> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2578)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>>>>>>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>>>>>>>>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
>>>>>>>>>> at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:107)
>>>>>>>>>> at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>>>>>>>>>> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
>>>>>>>>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
>>>>>>>>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>>>>>>>>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>>>>>>>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>>>>>>>>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>>>>>>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>>>>>>>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>>>>>>>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>>>>>>>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>>>>>>>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>>>>>>>>> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
>>>>>>>>>> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
>>>>>>>>>> ... 23 more
>>>>>>>>>>
>>>>>>>>>> I checked hadoop-aws-2.7.1.jar; the org.apache.hadoop.fs.s3a.S3AFileSystem
>>>>>>>>>> class file is in it. I also checked the Environment page of the driver's
>>>>>>>>>> Web UI on port 4040; both hadoop-aws-2.7.1.jar and aws-java-sdk-1.7.4.jar
>>>>>>>>>> are in the Classpath Entries (system path). And the following code ran
>>>>>>>>>> fine in spark-shell:
>>>>>>>>>> scala> val clazz = Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem")
>>>>>>>>>> clazz: Class[_] = class org.apache.hadoop.fs.s3a.S3AFileSystem
>>>>>>>>>>
>>>>>>>>>> scala> clazz.getClassLoader()
>>>>>>>>>> res2: ClassLoader = sun.misc.Launcher$AppClassLoader@770848b9
>>>>>>>>>>
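>>>>>>>>>> (For completeness, the jar-content check was along these lines - a hedged sketch:
>>>>>>>>>>   jar tf /opt/spark/lib/hadoop-aws-2.7.1.jar | grep S3AFileSystem
>>>>>>>>>> which lists org/apache/hadoop/fs/s3a/S3AFileSystem.class.)
>>>>>>>>>>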
>>>>>>>>>> So, I am confused: why did the task fail with a
>>>>>>>>>> "java.lang.ClassNotFoundException"? Is there something wrong with the
>>>>>>>>>> command-line options I used to start spark-shell, in the docker image,
>>>>>>>>>> or in the "s3a://" URL? Or is it something related to the Docker
>>>>>>>>>> executor of Mesos? I studied
>>>>>>>>>> https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala
>>>>>>>>>> a bit, but didn't understand it well...
>>>>>>>>>>
>>>>>>>>>> I'd appreciate it if anyone could shed some light on this.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Mao Geng
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>
>>>
>
