spark-dev mailing list archives

From Pillis Work <pillis.w...@gmail.com>
Subject Re: About Spark job web ui persist(JIRA-969)
Date Fri, 17 Jan 2014 01:07:12 GMT
Hello,
If the changes are acceptable, I would like to request that the JIRA be
assigned to me for implementation.
Regards
pillis


On Thu, Jan 16, 2014 at 9:28 AM, Pillis Work <pillis.work@gmail.com> wrote:

> Hi Junluan,
> 1. Yes, we could persist to HDFS or any FS. I think at a minimum we
> should persist it to local disk, which keeps the core simple. We can
> think of HDFS interactions as level-2 functionality to be implemented
> once we have a good local implementation. The persistence/hydration
> layer of a SparkContextData can be made pluggable as a next step.
> Also, as mentioned in the previous mail, SparkUI will now show multiple
> SparkContexts using data from SparkContextDatas.
>
> 2. Yes, we could.
>
> 3. Yes, SparkUI will need a rewrite to deal with SparkContextDatas (either
> live, or hydrated from historical JSONs).
> Regards
>
>
>
>
> On Thu, Jan 16, 2014 at 8:15 AM, Xia, Junluan <junluan.xia@intel.com> wrote:
>
>> Hi Pillis
>>
>> Sounds good.
>> 1. For SparkContextData, I think we could persist it to HDFS rather
>> than to local disk (one SparkUI service may show more than one
>> SparkContext).
>> 2. We could also treat SparkContextData as a metrics input
>> (MetricsSource); for long-running Spark jobs, SparkContextData would
>> then show up in Ganglia/JMX, etc.
>> 3. If we persist SparkContextData periodically, we need to rewrite the
>> UI logic, as the Spark UI currently shows information for only one
>> point in time.
>>
>> -----Original Message-----
>> From: Pillis Work [mailto:pillis.work@gmail.com]
>> Sent: Thursday, January 16, 2014 5:37 PM
>> To: dev@spark.incubator.apache.org
>> Subject: Re: About Spark job web ui persist(JIRA-969)
>>
>> Hello,
>> I wanted to write down at a high level the changes I was thinking of.
>> Please feel free to critique and suggest changes.
>>
>> SparkContext:
>> SparkContext's start will no longer start the UI. Instead, it will
>> launch a SparkContextObserver (which has the SparkListener trait) that
>> generates a SparkContextData instance. The SparkContextObserver keeps
>> the SparkContextData up to date. SparkContextData will hold all the
>> historical information anyone needs. Stopping a SparkContext stops its
>> SparkContextObserver.
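
The observer design above could be sketched roughly as follows (Python here for brevity; SparkContextObserver, SparkContextData, and the event method are illustrative names, not the actual Spark listener API):

```python
class SparkContextData:
    """Accumulates the historical record of one SparkContext run."""
    def __init__(self):
        self.stages = []

    def record_stage(self, stage_id, name, duration_ms):
        self.stages.append({"id": stage_id, "name": name,
                            "duration_ms": duration_ms})


class SparkContextObserver:
    """Would implement the SparkListener trait in the real design and
    react to events such as stage completion."""
    def __init__(self, data):
        self.data = data

    def on_stage_completed(self, stage_id, name, duration_ms):
        # Each event updates the data object, keeping it current.
        self.data.record_stage(stage_id, name, duration_ms)
```

The key point is that the observer only writes into the data object; nothing in this path touches the UI.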
>>
>> SparkContextData:
>> Holds all the historical information of a SparkContext run.
>> Periodically persists itself to disk as JSON, and can hydrate itself
>> from the same JSON. SparkContextDatas are created without any UI
>> involvement, so SparkContextData can evolve independently of what the
>> UI needs, such as carrying non-UI data needed for third-party
>> integration.
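
The persist/hydrate round trip could be as simple as the following sketch (Python for brevity; the file layout and key names are assumptions, only "persist as JSON, hydrate from the same JSON" comes from the mail):

```python
import json

def persist(snapshot, path):
    """Serialize a run snapshot (a plain dict) to a JSON file."""
    with open(path, "w") as f:
        json.dump(snapshot, f)

def hydrate(path):
    """Rebuild the snapshot from the JSON written by persist()."""
    with open(path) as f:
        return json.load(f)
```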
>>
>> SparkUI:
>> No longer needs a SparkContext. It will instead need an array of
>> SparkContextDatas (obtained either by polling a folder or by other
>> means). UI pages will access the appropriate SparkContextData at
>> render time and produce HTML. SparkUI can be started and stopped
>> independently of SparkContexts, and multiple SparkContexts can be
>> shown in the UI.
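
The "polling a folder" option could be sketched like this (Python for brevity; the directory layout and `.json` suffix are assumptions): scan a history directory and return one file path per persisted SparkContextData.

```python
import os

def list_history_files(history_dir):
    """Return sorted paths of *.json history files in history_dir."""
    try:
        names = os.listdir(history_dir)
    except FileNotFoundError:
        return []
    return sorted(os.path.join(history_dir, n)
                  for n in names if n.endswith(".json"))
```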
>>
>> I have purposefully not gone into much detail. Please let me know if
>> any piece needs elaboration.
>> Regards,
>> Pillis
>>
>>
>>
>>
>> On Mon, Jan 13, 2014 at 1:32 PM, Patrick Wendell <pwendell@gmail.com>
>> wrote:
>>
>> > Pillis - I agree we need to decouple the representation from a
>> > particular history server. But why not provide a simple history
>> > server that people can (optionally) run if they aren't using YARN or
>> > Mesos? For people running the standalone cluster scheduler this
>> > seems important. Giving them only a JSON dump isn't very consumable
>> > for most users.
>> >
>> > - Patrick
>> >
>> > On Mon, Jan 13, 2014 at 10:43 AM, Pillis Work <pillis.work@gmail.com>
>> > wrote:
>> > > The listeners in SparkUI which update the counters can trigger
>> > > saves along the way. The save can be on a 500 ms delay after the
>> > > last update, to batch changes. This solution would not require a
>> > > save on stop().
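
The 500 ms batching described above amounts to a debounce, sketched here with illustrative names (the timer thread that would drive `maybe_save` is assumed, not part of the proposal):

```python
class DebouncedSaver:
    """Each update pushes a save deadline out by delay_ms; a periodic
    check performs the save once updates have gone quiet."""
    def __init__(self, delay_ms, save):
        self.delay_ms = delay_ms
        self.save = save
        self.deadline = None

    def touch(self, now_ms):
        """Called on every counter update; postpones the save."""
        self.deadline = now_ms + self.delay_ms

    def maybe_save(self, now_ms):
        """Called periodically; saves if the quiet period has elapsed."""
        if self.deadline is not None and now_ms >= self.deadline:
            self.save()
            self.deadline = None
            return True
        return False
```

A burst of updates thus produces a single save, roughly 500 ms after the last one.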
>> > >
>> > >
>> > >
>> > > On Mon, Jan 13, 2014 at 6:15 AM, Tom Graves <tgraves_cs@yahoo.com>
>> > wrote:
>> > >
>> > >> So the downside to just saving stuff at the end is that if the
>> > >> app crashes or exits badly you don't have anything. Hadoop has
>> > >> taken the approach of saving events along the way. But Hadoop
>> > >> also uses that history file to start where it left off if
>> > >> something bad happens and it gets restarted. I don't think the
>> > >> latter really applies to Spark, though.
>> > >>
>> > >> Does mesos have a history server?
>> > >>
>> > >> Tom
>> > >>
>> > >>
>> > >>
>> > >> On Sunday, January 12, 2014 9:22 PM, Pillis Work
>> > >> <pillis.work@gmail.com
>> > >
>> > >> wrote:
>> > >>
>> > >> IMHO, from a pure Spark standpoint, I don't know if having a
>> > >> dedicated history service makes sense right now, considering that
>> > >> cluster managers have their own history servers. Just showing the
>> > >> UI of historical runs might be too thin a requirement for a full
>> > >> service. Spark should store history information that can later be
>> > >> exposed in the required ways.
>> > >>
>> > >> Since each SparkContext is the logical entry and exit point for
>> > >> doing something useful in Spark, during its stop() it should
>> > >> serialize that run's statistics into a JSON file, like
>> > >> "sc_run_[name]_[start-time].json". When SparkUI.stop() is called,
>> > >> it in turn asks its UI objects (which should implement a trait)
>> > >> to provide either a flat or hierarchical Map of String key/value
>> > >> pairs. This map is then serialized to a configured path (the
>> > >> default being "var/history").
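
The proposed file name could be generated like this; only the "sc_run_[name]_[start-time].json" pattern comes from the mail, while the helper and its sanitization rule are illustrative assumptions:

```python
import re

def history_file_name(app_name, start_time_ms):
    # Replace anything outside [A-Za-z0-9_-] so the app name is
    # filesystem-safe (assumed rule, not part of the proposal).
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", app_name)
    return "sc_run_%s_%d.json" % (safe, start_time_ms)
```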
>> > >>
>> > >> With regard to Mesos or YARN, their applications can, during
>> > >> shutdown, import this Spark history into their history servers by
>> > >> making API calls, etc.
>> > >>
>> > >> This way, Spark's history information is persisted independently
>> > >> of the cluster framework, and cluster frameworks can import the
>> > >> history when/as needed.
>> > >> Hope this helps.
>> > >> Regards,
>> > >> pillis
>> > >>
>> > >>
>> > >>
>> > >> On Thu, Jan 9, 2014 at 6:13 AM, Tom Graves <tgraves_cs@yahoo.com>
>> > wrote:
>> > >>
>> > >> > Note that it looks like we are planning on adding support for
>> > >> > application-specific frameworks to YARN sooner rather than
>> > >> > later. There is an initial design up here:
>> > >> > https://issues.apache.org/jira/browse/YARN-1530. Note this has
>> > >> > not been reviewed yet, so changes are likely, but it gives an
>> > >> > idea of the general direction. If anyone has comments on how
>> > >> > that might work with Spark, I encourage you to post to the JIRA.
>> > >> >
>> > >> > As Sandy mentioned, it would be very nice if the solution could
>> > >> > be compatible with that.
>> > >> >
>> > >> > Tom
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Wednesday, January 8, 2014 12:44 AM, Sandy Ryza <
>> > >> > sandy.ryza@cloudera.com> wrote:
>> > >> >
>> > >> > Hey,
>> > >> >
>> > >> > YARN-321 is targeted for Hadoop 2.4. The minimum feature set
>> > >> > doesn't include application-specific data, so that probably
>> > >> > won't be part of 2.4 unless other things delay the release for
>> > >> > a while. There are no APIs for it yet, and pluggable UIs have
>> > >> > been discussed but not agreed upon. I think requirements from
>> > >> > Spark could be useful in helping shape what gets done there.
>> > >> >
>> > >> > -Sandy
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Tue, Jan 7, 2014 at 4:13 PM, Patrick Wendell
>> > >> > <pwendell@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> > > Hey Sandy,
>> > >> > >
>> > >> > > Do you know what the status is for YARN-321 and what version
>> > >> > > of YARN it's targeted for? Also, is there any kind of
>> > >> > > documentation or API for this? Does it control the
>> > >> > > presentation of the data itself (e.g. does it actually have
>> > >> > > its own UI)?
>> > >> > >
>> > >> > > @Tom - having an optional history server sounds like a good idea.
>> > >> > >
>> > >> > > One question is what format to use for storing the data, and
>> > >> > > how the persisted format relates to XML/HTML generation in
>> > >> > > the live UI. One idea would be to add JSON as an intermediate
>> > >> > > format inside of the current WebUI; then any JSON page could
>> > >> > > be persisted and rendered by the history server using the
>> > >> > > same code. Once a SparkContext exits, it could dump a series
>> > >> > > of named paths, each with a JSON file. Then the history
>> > >> > > server could load those paths and pass them through the
>> > >> > > second rendering stage (JSON => XML) to create each page.
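
The two-stage idea above could be sketched as follows (Python for brevity; the page names, dict shape, and markup are illustrative assumptions): stage one produces JSON-friendly data, and a shared stage two turns that data into markup, so the live UI and the history server reuse the same rendering code.

```python
def stage_page_data(completed, failed):
    """Stage 1: the page's data, as it would be persisted to JSON."""
    return {"title": "Stages", "completed": completed, "failed": failed}

def render(page_data):
    """Stage 2: JSON-shaped data -> markup, shared by the live UI and
    the history server."""
    rows = "".join("<tr><td>%s</td><td>%s</td></tr>" % (k, page_data[k])
                   for k in sorted(page_data) if k != "title")
    return "<html><body><h1>%s</h1><table>%s</table></body></html>" % (
        page_data["title"], rows)
```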
>> > >> > >
>> > >> > > It would be good if SPARK-969 had a good design doc before
>> > >> > > anyone starts working on it.
>> > >> > >
>> > >> > > - Patrick
>> > >> > >
>> > >> > > On Tue, Jan 7, 2014 at 3:18 PM, Sandy Ryza
>> > >> > > <sandy.ryza@cloudera.com
>> > >
>> > >> > > wrote:
>> > >> > > > As a side note, it would be nice to make sure that whatever
>> > >> > > > is done here will work with the YARN Application History
>> > >> > > > Server (YARN-321), a generic history server that functions
>> > >> > > > similarly to MapReduce's JobHistoryServer. It will
>> > >> > > > eventually have the ability to store application-specific
>> > >> > > > data.
>> > >> > > >
>> > >> > > > -Sandy
>> > >> > > >
>> > >> > > >
>> > >> > > > On Tue, Jan 7, 2014 at 2:51 PM, Tom Graves
>> > >> > > > <tgraves_cs@yahoo.com>
>> > >> > wrote:
>> > >> > > >
>> > >> > > >> I don't think you want to save the HTML/XML files. I would
>> > >> > > >> rather see the info saved into a history file, in
>> > >> > > >> something like a JSON format, that could then be re-read,
>> > >> > > >> with the web UI displaying the info, hopefully without
>> > >> > > >> much change to the UI parts. For instance, perhaps the
>> > >> > > >> history server could read the file and populate the
>> > >> > > >> appropriate Spark data structures that the web UI already
>> > >> > > >> uses.
>> > >> > > >>
>> > >> > > >> I would suggest making the history server an optional
>> > >> > > >> server that could be run on any node. That way, if the
>> > >> > > >> load on a particular node becomes too much, it could be
>> > >> > > >> moved, but you could also run it on the same node as the
>> > >> > > >> Master. All it really needs to know is where to get the
>> > >> > > >> history files from, and it needs access to that location.
>> > >> > > >>
>> > >> > > >> Hadoop actually has a history server for MapReduce which
>> > >> > > >> works very similarly to what I mention above. One thing to
>> > >> > > >> keep in mind here is security. You want to make sure that
>> > >> > > >> the history files can only be read by users who have the
>> > >> > > >> appropriate permissions. The history server itself could
>> > >> > > >> run as a superuser who has permission to serve up the
>> > >> > > >> files based on the ACLs.
>> > >> > > >>
>> > >> > > >>
>> > >> > > >>
>> > >> > > >> On Tuesday, January 7, 2014 8:06 AM, "Xia, Junluan"
>> > >> > > >> <junluan.xia@intel.com> wrote:
>> > >> > > >>
>> > >> > > >> Hi all
>> > >> > > >>          The Spark job web UI is no longer available once
>> > >> > > >> the job is over, but it would be convenient for developers
>> > >> > > >> to debug against a persisted job web UI. I have come up
>> > >> > > >> with a draft for this issue.
>> > >> > > >>
>> > >> > > >> 1.       We could simply save the web pages in HTML/XML
>> > >> > > >> format (stages/executors/storage/environment) to a certain
>> > >> > > >> location when the job finishes.
>> > >> > > >>
>> > >> > > >> 2.       But it is not easy for users to review the job
>> > >> > > >> info with #1; we could build an extra job history service
>> > >> > > >> for developers.
>> > >> > > >>
>> > >> > > >> 3.       But where would we build this history service? On
>> > >> > > >> the Driver node or the Master node?
>> > >> > > >>
>> > >> > > >> Any suggestions about this improvement?
>> > >> > > >>
>> > >> > > >> regards,
>> > >> > > >> Andrew
>> > >> > > >>
>> > >> > >
>> > >> >
>> > >>
>> >
>>
>
>
