tez-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Using ByteBuffers for Configuration
Date Fri, 26 Jul 2013 20:50:43 GMT
The payload abstracts out the different ways in which user runtime code
can be initialized. The default in Hadoop code base is Configuration. So
we have a helper method to convert conf to userpayload. Other code
projects may have different means of configuring their code and we
couldn't possibly define a common interface for all of them.

That being said, we probably need to revisit the APIs and impl of the Tez
engine so that ambiguities like this are ironed out.


-----Original Message-----
From: Achal Soni [mailto:asoni@twitter.com]
Sent: Thursday, July 25, 2013 9:31 PM
To: dev@tez.incubator.apache.org
Subject: Re: Using ByteBuffers for Configuration

Ah. I am respecting these semantics. Actually in local mode, the
application attempt property needs to be set. This is not a client side
property -- it is something the job runner should be setting. I was
setting it in the configuration that gets passed to the local job runner,
and then passing that configuration to the

initialize(Configuration conf, byte[] userPayload, Master master)

method of MRRuntimeTask with the hopes that this property would be
propagated forward. Unfortunately upon looking at MRRuntimeTask, I see
that the bytebuffer is used as the configuration instead of the
configuration that is being passed in. I have fixed my issues by
deserializing the bytebuffer, setting the property, and serializing it
again. But it just felt as if that original conf is pointless. I know it's
being passed in in case the userpayload is empty, but I just viewed the use
of UserPayload in a different way. I think I understand what you are aiming
for. But I am not sure what its use cases may be. What are your thoughts
on the potential use of the userpayload?
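
For readers of the archive, the workaround described above (deserialize
the payload, set the property, serialize it again) can be sketched
roughly as follows. This is only an illustration under stated
assumptions, not the Tez or MRRuntimeTask code: java.util.Properties
stands in for Hadoop's Configuration so that the sketch runs without
Hadoop on the classpath, and the class name PayloadPatch, its methods,
and the property key used in the note below are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

// Hypothetical stand-in: Properties plays the role of Hadoop's Configuration.
final class PayloadPatch {

    // Serialize the settings into an opaque byte[] user payload.
    static byte[] toPayload(Properties conf) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        conf.store(out, null);
        return out.toByteArray();
    }

    // Rebuild the settings from a user payload.
    static Properties fromPayload(byte[] payload) throws IOException {
        Properties conf = new Properties();
        conf.load(new ByteArrayInputStream(payload));
        return conf;
    }

    // The workaround: deserialize, set one property, serialize again.
    static byte[] withProperty(byte[] payload, String key, String value)
            throws IOException {
        Properties conf = fromPayload(payload);
        conf.setProperty(key, value);
        return toPayload(conf);
    }

    // Behavior as described in the thread: the payload is used as the
    // configuration, and the passed-in conf is only a fallback when the
    // payload is empty.
    static Properties effectiveConf(Properties conf, byte[] payload)
            throws IOException {
        return (payload == null || payload.length == 0)
                ? conf
                : fromPayload(payload);
    }
}
```

With this shape, setting a (hypothetical) key such as
"example.attempt.id" only on the conf argument has no effect whenever a
non-empty payload is present; the key has to be written into the payload
via withProperty to be visible to the task.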

On Thu, Jul 25, 2013 at 6:17 PM, Bikas Saha <bikas@hortonworks.com> wrote:

> Any user related stuff, including config, is expected to be passed as
> a user payload via the Vertex API. For the MR case, the only payload
> that needs to get sent is Configuration and so that is serialized to
> userpayload and sent over. Not quite sure how your impl of the local
> mode handles these semantics.
> -----Original Message-----
> From: Achal Soni [mailto:asoni@twitter.com]
> Sent: Thursday, July 25, 2013 5:36 PM
> To: dev@tez.incubator.apache.org
> Subject: Using ByteBuffers for Configuration
> Hello All,
> I am slightly concerned about using ByteBuffers as a means for
> localizing the per vertex configuration.
> I thought the purpose of the byte buffers was to allow Pig/Hive to
> send additional information (not necessarily in the form of
> Configuration
> objects) to the vertices. This seems to defeat the purpose because we
> are using it exclusively to deserialize the configuration object for
> the vertex.
> I am running into this issue at the initialize method of
> MRRuntimeTask. In local mode, I am setting some additional configs in
> the Configuration object for that vertex and passing that forward, so
> it should propagate in the following method:
> initialize(Configuration conf, byte[] userPayload,
>       Master master)
> But now I see that I have to instead deserialize the user payload
> myself, set the configs there, and then reserialize them. This is not
> a big change or anything but it got me thinking.
> Thoughts on this would be great!
> - Achal
