tez-dev mailing list archives

From Achal Soni <as...@twitter.com>
Subject Re: Using ByteBuffers for Configuration
Date Fri, 26 Jul 2013 20:57:47 GMT
Ah yes I see that makes sense. Although to be fair, if there is an
expectation that there will be processors and such that extend beyond Map
and Reduce, they will still be part of the Hadoop code base and a common
means to configure should be established -- which is the Configuration
interface.
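
For reference, the workaround described in the quoted messages below (deserialize the payload, set the property, serialize it again) can be sketched roughly as follows. This is only an illustration under stated assumptions: java.util.Properties stands in for Hadoop's Configuration so the example is self-contained, and PayloadSketch, toPayload, fromPayload, and withProperty are hypothetical names, not Tez or Hadoop APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: Properties is a stand-in for Hadoop's Configuration.
final class PayloadSketch {

    // Serialize the configuration into an opaque byte[] user payload.
    static byte[] toPayload(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild the configuration from a payload produced by toPayload.
    static Properties fromPayload(byte[] payload) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(payload));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The round trip from the thread: deserialize, set one property,
    // serialize again, and hand the new payload to the task.
    static byte[] withProperty(byte[] payload, String key, String value) {
        Properties conf = fromPayload(payload);
        conf.setProperty(key, value);
        return toPayload(conf);
    }
}
```

A local job runner following this pattern would call withProperty on the vertex payload before passing it to initialize, rather than setting the property on the separate conf argument, since the task reads its settings from the payload whenever one is present.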


On Fri, Jul 26, 2013 at 1:50 PM, Bikas Saha <bikas@hortonworks.com> wrote:

> The payload abstracts out the different ways in which user runtime code
> can be initialized. The default in Hadoop code base is Configuration. So
> we have a helper method to convert conf to userpayload. Other code
> projects may have different means of configuring their code and we
> couldn't possibly define a common interface for all of them.
>
> That being said, we probably need to revisit the APIs and impl of the Tez
> engine so that ambiguities like this are ironed out.
>
> Bikas
>
> -----Original Message-----
> From: Achal Soni [mailto:asoni@twitter.com]
> Sent: Thursday, July 25, 2013 9:31 PM
> To: dev@tez.incubator.apache.org
> Subject: Re: Using ByteBuffers for Configuration
>
> Ah. I am respecting these semantics. Actually in local mode, the
> application attempt property needs to be set. This is not a client side
> property -- it is something the job runner should be setting. I was
> setting it in the configuration that gets passed to the local job runner,
> and then passing that configuration to the
>
> initialize(Configuration conf, byte[] userPayload, Master master)
>
> method of MRRuntimeTask with the hopes that this property would be
> propagated forward. Unfortunately upon looking at MRRuntimeTask, I see
> that the bytebuffer is used as the configuration instead of the
> configuration that is being passed in. I have fixed my issue by
> deserializing the bytebuffer, setting the property, and serializing it
> again. But it just felt as if the original conf is pointless. I know it's
> being passed in for the case where the userpayload is empty, but I just
> viewed the use of UserPayload in a different way. I think I understand
> what you are aiming for, but I am not sure what its use cases may be.
> What are your thoughts on the potential use of the userpayload?
>
>
>
> On Thu, Jul 25, 2013 at 6:17 PM, Bikas Saha <bikas@hortonworks.com> wrote:
>
> > Any user related stuff, including config, is expected to be passed as
> > a user payload via the Vertex API. For the MR case, the only payload
> > that needs to get sent is Configuration and so that is serialized to
> > userpayload and sent over. Not quite sure how your impl of the local
> > mode handles these semantics.
> >
> > -----Original Message-----
> > From: Achal Soni [mailto:asoni@twitter.com]
> > Sent: Thursday, July 25, 2013 5:36 PM
> > To: dev@tez.incubator.apache.org
> > Subject: Using ByteBuffers for Configuration
> >
> > Hello All,
> >
> > I am slightly concerned about using ByteBuffers as a means of
> > localizing the per vertex configuration.
> >
> > I thought the purpose of the byte buffers was to allow Pig/Hive to
> > send additional information (not necessarily in the form of
> > Configuration
> > objects) to the vertices. This seems to defeat the purpose because we
> > are using it exclusively to deserialize the configuration object for
> > the vertex.
> >
> > I am running into this issue at the initialize method of
> > MRRuntimeTask. In local mode, I am setting some additional configs in
> > the Configuration object for that vertex and passing that forward, so
> > it should propagate in the following method:
> >
> > initialize(Configuration conf, byte[] userPayload, Master master)
> >
> > But now I see that I have to instead deserialize the user payload
> > myself, set the configs there, and then reserialize them. This is not
> > a big change or anything but it got me thinking.
> >
> > Thoughts on this would be great!
> >
> > - Achal
> >
>
