spark-user mailing list archives

From Jacob Eisinger <jeis...@us.ibm.com>
Subject Re: Local Dev Env with Mesos + Spark Streaming on Docker: Can't submit jobs.
Date Tue, 20 May 2014 17:30:04 GMT

Howdy Gerard,

Yeah, the docker link feature seems to work well for client-server
interaction.  But peer-to-peer architectures need something more for
service discovery.
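
For the client-server case, here is a minimal sketch of what linking
gives you (container and image names are placeholders):

  # start the "server" container, then link a client to it as "master"
  docker run -d --name spark-master my-spark-image
  docker run -it --link spark-master:master my-spark-image bash
  # inside the client, Docker injects the server's address as
  # MASTER_PORT_*_TCP_ADDR style environment variables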

As for your addressing requirements, I don't completely understand what
you are asking for... you may also want to check out xip.io.  Its wildcard
domains sometimes make for an easy, neat hack.
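
For reference, xip.io resolves any hostname that ends in an embedded IP
back to that IP, so containers get stable names without running your own
DNS (the addresses below are made up):

  $ dig +short namenode.10.0.0.5.xip.io
  10.0.0.5
  $ dig +short datanode1.10.0.0.5.xip.io
  10.0.0.5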

Finally, for the ports, Docker's new host networking feature [1] helps
with building a Spark Docker container.  (Security is still an issue.)
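
As a sketch, it is just a flag on docker run (the image name is a
placeholder); the container then shares the host's network stack, so
Spark's randomly chosen ports are reachable without -p port mappings:

  docker run -d --net=host my-spark-worker-image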

Jacob

[1] http://blog.docker.io/2014/05/docker-0-11-release-candidate-for-1-0/

Jacob D. Eisinger
IBM Emerging Technologies
jeising@us.ibm.com - (512) 286-6075



From:	Gerard Maas <gerard.maas@gmail.com>
To:	user@spark.apache.org
Date:	05/16/2014 10:26 AM
Subject:	Re: Local Dev Env with Mesos + Spark Streaming on Docker: Can't
            submit jobs.



Hi Jacob,

Thanks for the help & answer on the docker question. Have you already
experimented with the new link feature in Docker? It does not help the
HDFS issue, as the DataNode needs the NameNode and vice versa, but it does
facilitate simpler client-server interactions.

My issue described at the beginning is related to networking between the
host and the docker images, but I was losing too much time tracking down
the exact problem, so I moved my Spark job driver into the Mesos node and
it started working.  Sadly, my Mesos UI is partially crippled as workers
are not addressable (therefore Spark job logs are hard to gather).
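
(A rough guess at what is needed in that host-to-container scenario:
making the driver advertise an address the containers can actually reach.
The bridge address below is an assumption about the setup:)

  # before starting the driver from the IDE on the host
  export SPARK_LOCAL_IP=172.17.42.1   # docker0 bridge address on the host
  export LIBPROCESS_IP=172.17.42.1    # address the Mesos scheduler binds to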

Your discussion about dynamic port allocation is very relevant to
understanding why some components cannot talk to each other.  I'll need a
more in-depth read of that discussion to find a better solution for my
local development environment.

regards,  Gerard.



On Tue, May 6, 2014 at 3:30 PM, Jacob Eisinger <jeising@us.ibm.com> wrote:
  Howdy,

  You might find the discussion Andrew and I have been having about Docker
  and network security [1] applicable.

  Also, I posted an answer [2] to your stackoverflow question.

  [1]
  http://apache-spark-user-list.1001560.n3.nabble.com/spark-shell-driver-interacting-with-Workers-in-YARN-mode-firewall-blocking-communication-tp5237p5441.html

  [2]
  http://stackoverflow.com/questions/23410505/how-to-run-hdfs-cluster-without-dns/23495100#23495100


  Jacob D. Eisinger
  IBM Emerging Technologies
  jeising@us.ibm.com - (512) 286-6075

  From: Gerard Maas <gerard.maas@gmail.com>
  To: user@spark.apache.org
  Date: 05/05/2014 04:18 PM
  Subject: Re: Local Dev Env with Mesos + Spark Streaming on Docker: Can't
  submit jobs.




  Hi Benjamin,

  Yes, we initially used a modified version of the AMPLab docker scripts
  [1]. The AMPLab docker images are a good starting point.
  One of the biggest hurdles has been HDFS, which requires reverse DNS,
  and I didn't want to go the dnsmasq route in order to keep the
  containers relatively simple to use, without the need for external
  scripts. I ended up running a 1-node setup (NameNode + DataNode). I'm
  still looking for a better solution for HDFS [2].
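
  A rough sketch of what that 1-node setup looks like (the image name and
  hostname are placeholders); pinning the container hostname keeps HDFS's
  reverse-DNS lookups resolvable inside a single container:

    # NameNode and DataNode in one container, with a fixed hostname
    docker run -d -h hdfs.local --name hdfs \
        -p 50070:50070 -p 8020:8020 my-hdfs-image
    # if the Hadoop version supports it, the NameNode's reverse-DNS check
    # on DataNode registration can also be relaxed in hdfs-site.xml:
    #   dfs.namenode.datanode.registration.ip-hostname-check = false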

  Our use case for Docker is to easily create local dev environments, both
  for development and for automated functional testing (using Cucumber).
  My aim is to drastically reduce the time of the develop-deploy-test
  cycle. That also means that we run the minimum number of instances
  required to have a functionally working setup, e.g. 1 ZooKeeper, 1 Kafka
  broker, ...
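
  As a sketch, that minimal setup is essentially a handful of linked
  containers (all image names below are placeholders):

    docker run -d --name zookeeper my-zookeeper-image
    docker run -d --name kafka        --link zookeeper:zk my-kafka-image
    docker run -d --name mesos-master --link zookeeper:zk my-mesos-master-image
    docker run -d --name mesos-slave  --link mesos-master:master my-mesos-slave-image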

  For the actual cluster deployment we have a Chef-based devops toolchain
  that puts things in place on public cloud providers.
  Personally, I think Docker rocks and would like to replace those complex
  cookbooks with Dockerfiles once the technology is mature enough.

  -greetz, Gerard.

  [1] https://github.com/amplab/docker-scripts
  [2]
  http://stackoverflow.com/questions/23410505/how-to-run-hdfs-cluster-without-dns



  On Mon, May 5, 2014 at 11:00 PM, Benjamin <bbouille@gmail.com> wrote:
        Hi,

        Before considering running on Mesos, did you try to submit the
        application to Spark deployed on Docker containers without Mesos?

        I'm currently investigating this idea in order to quickly deploy a
        complete set of clusters with Docker, and I'm interested in your
        findings on sharing the Kafka and ZooKeeper settings across nodes.
        How many brokers and ZooKeeper nodes do you use?

        Regards,



        On Mon, May 5, 2014 at 10:11 PM, Gerard Maas
        <gerard.maas@gmail.com> wrote:
              Hi all,

              I'm currently working on creating a set of docker images to
              facilitate local development with Spark Streaming on Mesos
              (+ zk, hdfs, kafka).

              After solving the initial hurdles to get things working
              together in docker containers, everything now seems to start
              up correctly and the Mesos UI shows slaves as they are
              started.

              I'm trying to submit a job from IntelliJ and the job
              submissions seem to get lost in Mesos translation. The logs
              are not helping me figure out what's wrong, so I'm posting
              them here in the hope that they ring a bell and somebody can
              give me a hint about what's wrong or missing in my setup.
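
              For context, a minimal sketch of how a driver is typically
              pointed at Mesos in this Spark version (the library path and
              executor URI are placeholders; the master address is the one
              that shows up in the logs below):

                export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
                # and in the driver's SparkConf:
                #   master URL:        mesos://172.17.0.4:5050
                #   spark.executor.uri a spark-*.tgz the slaves can fetch
                #                      (hdfs:// or http://)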


              ---- DRIVER (IntelliJ running a Job.scala main) ----
              14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner
              for SHUFFLE_BLOCK_MANAGER
              14/05/05 21:52:31 INFO BlockManager: Dropping broadcast
              blocks older than 1399319251962
              14/05/05 21:52:31 INFO BlockManager: Dropping non broadcast
              blocks older than 1399319251962
              14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner
              for BROADCAST_VARS
              14/05/05 21:52:31 INFO MetadataCleaner: Ran metadata cleaner
              for BLOCK_MANAGER
              14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner
              for HTTP_BROADCAST
              14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner
              for MAP_OUTPUT_TRACKER
              14/05/05 21:52:32 INFO MetadataCleaner: Ran metadata cleaner
              for SPARK_CONTEXT


              ---- MESOS MASTER ----
              I0505 19:52:39.718080   388 master.cpp:690] Registering
              framework 201405051517-67113388-5050-383-6995 at scheduler
              (1)@127.0.1.1:58115
              I0505 19:52:39.718261   388 master.cpp:493] Framework
              201405051517-67113388-5050-383-6995 disconnected
              I0505 19:52:39.718277   389
              hierarchical_allocator_process.hpp:332] Added framework
              201405051517-67113388-5050-383-6995
              I0505 19:52:39.718312   388 master.cpp:520] Giving framework
              201405051517-67113388-5050-383-6995 0ns to failover
              I0505 19:52:39.718431   389
              hierarchical_allocator_process.hpp:408] Deactivated framework
              201405051517-67113388-5050-383-6995
              W0505 19:52:39.718459   388 master.cpp:1388] Master returning
              resources offered to framework
              201405051517-67113388-5050-383-6995 because the framework has
              terminated or is inactive
              I0505 19:52:39.718567   388 master.cpp:1376] Framework
              failover timeout, removing framework
              201405051517-67113388-5050-383-6995



              ---- MESOS SLAVE ----
              I0505 19:49:27.662019    20 slave.cpp:1191] Asked to shut
              down framework 201405051517-67113388-5050-383-6803 by
              master@172.17.0.4:5050
              W0505 19:49:27.662072    20 slave.cpp:1206] Cannot shut down
              unknown framework 201405051517-67113388-5050-383-6803
              I0505 19:49:28.662153    18 slave.cpp:1191] Asked to shut
              down framework 201405051517-67113388-5050-383-6804 by
              master@172.17.0.4:5050
              W0505 19:49:28.662212    18 slave.cpp:1206] Cannot shut down
              unknown framework 201405051517-67113388-5050-383-6804
              I0505 19:49:29.662199    13 slave.cpp:1191] Asked to shut
              down framework 201405051517-67113388-5050-383-6805 by
              master@172.17.0.4:5050
              W0505 19:49:29.662256    13 slave.cpp:1206] Cannot shut down
              unknown framework 201405051517-67113388-5050-383-6805
              I0505 19:49:30.662443    16 slave.cpp:1191] Asked to shut
              down framework 201405051517-67113388-5050-383-6806 by
              master@172.17.0.4:5050
              W0505 19:49:30.662489    16 slave.cpp:1206] Cannot shut down
              unknown framework 201405051517-67113388-5050-383-6806


              Thanks in advance,

              Gerard.



        --
        Benjamin Bouillé
        +33 665 050 285





