My previous post doesn't seem to have been delivered successfully, so I'll
try gzipping the patch. The patch is large since it contains jQuery and viz.js.
On Fri, Mar 15, 2013 at 11:02 PM, Chao Shi <stepinto@live.com> wrote:
> Hey guys,
>
> I have a very simple prototype for this. It uses DotfileWriter to generate
> the dot file and renders it with viz.js.
>
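> Roughly, the serving side looks like this (a minimal sketch, not the
> actual patch: it assumes the plan's DOT source is already available as a
> String, and uses the JDK's built-in HttpServer just for illustration):
>
>   import java.io.IOException;
>   import java.io.OutputStream;
>   import java.net.InetSocketAddress;
>
>   import com.sun.net.httpserver.HttpExchange;
>   import com.sun.net.httpserver.HttpHandler;
>   import com.sun.net.httpserver.HttpServer;
>
>   public class PlanStatusServer {
>     // Serve the raw DOT text; the HTML page fetches it and renders it
>     // client-side with viz.js.
>     public static void start(final String dotSource, int port)
>         throws IOException {
>       HttpServer server =
>           HttpServer.create(new InetSocketAddress(port), 0);
>       server.createContext("/plan.dot", new HttpHandler() {
>         public void handle(HttpExchange exchange) throws IOException {
>           byte[] body = dotSource.getBytes("UTF-8");
>           exchange.getResponseHeaders().set("Content-Type", "text/plain");
>           exchange.sendResponseHeaders(200, body.length);
>           OutputStream out = exchange.getResponseBody();
>           out.write(body);
>           out.close();
>         }
>       });
>       server.start();
>     }
>   }
>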
> There are lots of things that could be improved:
> - show completed/running jobs in different colors, and perhaps job
> progress as a percentage
> - interactive UI elements, e.g. clicking a job navigates to its JT page,
> auto refresh
> - a configurable port
> - ... and more
>
> I'd like to hear what you think of the prototype before continuing. A
> quick way to demo it is to apply the patch and run some integration
> tests. During the integration tests, you can navigate to
> http://localhost:10080.
>
> On Wed, Feb 27, 2013 at 3:30 PM, Matthias Friedrich <matt@mafr.de> wrote:
>
>> On Wednesday, 2013-02-27, Chao Shi wrote:
>> > I'm developing a complex pipeline (30+ MRs plus lots of joins). I have
>> > a hard time understanding which parts of the pipeline spend the most
>> > running time and how much intermediate output they produce. Crunch's
>> > optimization work is great, but it makes the execution plan difficult
>> > to understand. Each time I modify the pipeline, I have to dump the dot
>> > file and run Graphviz to generate a new picture and check whether
>> > anything is wrong.
>> >
>> > About security, I'm not familiar with how Hadoop does it. I will try
>> > to reuse Hadoop's HttpServer (does it have something to do with
>> > security?). The bottom line is to make this feature disabled by
>> > default, and let users enable it at their own risk.
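>> >
>> > Something like this is what I have in mind (a sketch; the property
>> > names and the startStatusServer helper are hypothetical, though
>> > Configuration.getBoolean/getInt are the usual Hadoop calls):
>> >
>> >   // Hypothetical property; the feature stays off unless the user
>> >   // explicitly turns it on.
>> >   boolean enabled = conf.getBoolean("crunch.status.http.enabled", false);
>> >   if (enabled) {
>> >     // Port 0 would mean "let the OS pick a free port".
>> >     startStatusServer(conf.getInt("crunch.status.http.port", 0));
>> >   }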
>>
>> OK, sounds good.
>>
>> > If this feature is enabled, the user can choose either an unused
>> > (randomly picked) port or a specified port. I haven't figured out how
>> > the user would learn the randomly picked port (via the log?). I will
>> > work on a prototype version first and see whether the status page is
>> > generally useful.
>>
>> Yeah, logging the URL would probably be the only thing that works. Not
>> counting fancy stuff like mDNS ;-)
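>>
>> Something along these lines would work (a sketch using the JDK's
>> HttpServer; LOG is a placeholder logger):
>>
>>   // Bind to port 0 so the OS picks a free port, then log the URL.
>>   HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
>>   server.start();
>>   int port = server.getAddress().getPort();
>>   LOG.info("Pipeline status page: http://localhost:" + port + "/");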
>>
>> In my opinion, we should try to get this done with the dependencies that
>> we already get through Hadoop. Each additional library we add to Crunch
>> will cause interoperability problems for someone.
>>
>> Regards,
>> Matthias
>>
>>
>