tez-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Using ByteBuffers for Configuration
Date Fri, 26 Jul 2013 20:50:43 GMT
The payload abstracts out the different ways in which user runtime code
can be initialized. The default in Hadoop code base is Configuration. So
we have a helper method to convert conf to userpayload. Other code
projects may have different means of configuring their code, and we
could not possibly define a common interface for all of them.

That being said, we probably need to revisit the APIs and impl of the Tez
engine so that ambiguities like this are ironed out.

Bikas

-----Original Message-----
From: Achal Soni [mailto:asoni@twitter.com]
Sent: Thursday, July 25, 2013 9:31 PM
To: dev@tez.incubator.apache.org
Subject: Re: Using ByteBuffers for Configuration

Ah. I am respecting these semantics. Actually in local mode, the
application attempt property needs to be set. This is not a client side
property -- it is something the job runner should be setting. I was
setting it in the configuration that gets passed to the local job runner,
and then passing that configuration to the

initialize(Configuration conf, byte[] userPayload, Master master)

method of MRRuntimeTask with the hopes that this property would be
propagated forward. Unfortunately upon looking at MRRuntimeTask, I see
that the bytebuffer is used as the configuration instead of the
configuration that is being passed in. I have fixed my issues by
deserializing the bytebuffer, setting the property, and serializing it
again. But it just felt as if that original conf is pointless. I know it's
being passed in in case the userpayload is empty, but I just viewed the use
of UserPayload in a different way. I think I understand what you are aiming
for. But I am not sure what its use cases may be. What are your thoughts
on the potential use of the userpayload?
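
For readers following the thread, here is a minimal sketch of the
deserialize, set, reserialize workaround described above. It is not the
actual Tez or Hadoop code: java.util.Properties stands in for Hadoop's
Configuration so the example runs on its own, and the class name, the
method names, and the property key "example.attempt.id" are all
hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PayloadRoundTrip {
    // Assumption: Properties is a stand-in for Hadoop's Configuration.
    // Serialize the configuration into an opaque byte[] user payload.
    static byte[] toPayload(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild the configuration from the payload bytes.
    static Properties fromPayload(byte[] payload) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(payload));
            return conf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The workaround: deserialize, set one property, serialize again.
    static byte[] withProperty(byte[] payload, String key, String value) {
        Properties conf = fromPayload(payload);
        conf.setProperty(key, value);
        return toPayload(conf);
    }

    public static void main(String[] args) {
        byte[] original = toPayload(new Properties());
        // "example.attempt.id" is a hypothetical key, not a real Hadoop property.
        byte[] patched = withProperty(original, "example.attempt.id", "1");
        System.out.println(fromPayload(patched).getProperty("example.attempt.id"));
    }
}
```

The point of the sketch is only that a runner which needs to add a
property has to go through the payload bytes, because the task reads its
configuration from the payload rather than from the conf argument.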



On Thu, Jul 25, 2013 at 6:17 PM, Bikas Saha <bikas@hortonworks.com> wrote:

> Any user related stuff, including config, is expected to be passed as
> a user payload via the Vertex API. For the MR case, the only payload
> that needs to get sent is Configuration and so that is serialized to
> userpayload and sent over. Not quite sure how your impl of the local
> mode handles these semantics.
>
> -----Original Message-----
> From: Achal Soni [mailto:asoni@twitter.com]
> Sent: Thursday, July 25, 2013 5:36 PM
> To: dev@tez.incubator.apache.org
> Subject: Using ByteBuffers for Configuration
>
> Hello All,
>
> I am slightly concerned about using ByteBuffers as a means of
> localizing the per vertex configuration.
>
> I thought the purpose of the byte buffers was to allow Pig/Hive to
> send additional information (not necessarily in the form of
> Configuration
> objects) to the vertices. This seems to defeat the purpose because we
> are using it exclusively to deserialize the configuration object for
> the vertex.
>
> I am running into this issue at the initialize method of
> MRRuntimeTask. In local mode, I am setting some additional configs in
> the Configuration object for that vertex and passing that forward, so
> it should propagate in the following method:
>
> initialize(Configuration conf, byte[] userPayload, Master master)
>
> But now I see that I have to instead deserialize the user payload
> myself, set the configs there, and then reserialize them. This is not
> a big change or anything but it got me thinking.
>
> Thoughts on this would be great!
>
> - Achal
>
