crunch-user mailing list archives

From Ashish <paliwalash...@gmail.com>
Subject Re: Finding Input Split from DoFn
Date Thu, 22 Nov 2012 13:58:56 GMT
Thanks, Josh!

It worked, and my inverted index example using Crunch is complete. I'm slowly
getting addicted to the Crunch coding style.
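
For anyone who finds this thread later, here is a minimal sketch of the check
Josh describes in the quoted reply. It is not Crunch code: InputSplit,
TaskContext and MapContext here are hypothetical stand-in interfaces so the
example compiles without Hadoop on the classpath, and findInputSplit is a
helper name made up for illustration. In a real DoFn you would call the
inherited getContext() and test it against Hadoop's MapContext instead.

```java
import java.util.Optional;

class SplitLookupSketch {
    // Hypothetical stand-ins for the Hadoop types named in the thread.
    // They exist only so this sketch runs without Hadoop or Crunch jars.
    interface InputSplit {}
    interface TaskContext {}
    interface MapContext extends TaskContext {
        InputSplit getInputSplit();
    }

    // Same shape as the instanceof check from the thread: a split is only
    // available when the context is a map-side context.
    static Optional<InputSplit> findInputSplit(TaskContext context) {
        if (context instanceof MapContext) {
            return Optional.ofNullable(((MapContext) context).getInputSplit());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        InputSplit split = new InputSplit() {};
        TaskContext mapSide = (MapContext) () -> split;   // map-side stand-in
        TaskContext reduceSide = new TaskContext() {};    // reduce-side stand-in

        System.out.println("map side has split: " + findInputSplit(mapSide).isPresent());
        System.out.println("reduce side has split: " + findInputSplit(reduceSide).isPresent());
    }
}
```

Returning an Optional keeps the map-side versus reduce-side decision in one
place, so the rest of the function does not need to repeat the cast. What
concrete type the real split has depends on the input format, so I would not
assume it can be cast to a file-based split without checking first.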


On Thu, Nov 22, 2012 at 4:05 PM, Josh Wills <jwills@cloudera.com> wrote:

> getContext() from inside of a DoFn during or after initialize() will
> return the TaskInputOutputContext, which will be a MapContext when you call
> it from a Mapper, and MapContext has a getInputSplit() method. We don't
> normally want a DoFn to worry about whether it's on the map-side or the
> reduce-side of a MapReduce job, so we don't indicate the distinction by
> default, which means you need to do something like:
>
> if (getContext() instanceof MapContext) {
>   InputSplit split = ((MapContext) getContext()).getInputSplit();
> }
>
> which is a little ugly-- sorry about that.
>
> J
>
>
> On Thu, Nov 22, 2012 at 1:45 AM, Ashish <paliwalashish@gmail.com> wrote:
>
>> Hi All,
>>
>> Is there a way to find the InputSplit from within an implementation of
>> DoFn?
>>
>> I am trying to implement an inverted index example using Crunch. I have
>> tried peeking in the DoFn code, but couldn't find a way to retrieve the
>> InputSplit. Can someone point me in the right direction?
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
>


-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
