crunch-user mailing list archives

From Robert Chu <rob...@wibidata.com>
Subject Re: Saving collection to text files in Scrunch
Date Thu, 29 Nov 2012 21:41:29 GMT
If I remember correctly, this is an issue caused by Avro strings not being
written properly to text files.
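
For anyone reading the archive later, here is a minimal Scala sketch of the
symptom as I understand it. It is only an illustration under stated
assumptions: Wrapper is a hypothetical stand-in for a wrapper type like
AvroWrapper, and toLine / toLineUnwrapped are made-up helpers, not Crunch or
Scrunch functions. It shows how a writer that simply calls toString on a
wrapper with no toString override produces "ClassName@hash" lines, while
unwrapping the datum first gives the expected text.

```scala
// Hypothetical stand-in for a wrapper type such as AvroWrapper.
// It deliberately does not override toString.
final class Wrapper[T](val datum: T)

object TextOutputSketch {
  // Mimics a text writer that formats each record with toString.
  def toLine(record: Any): String = record.toString

  // Unwraps the hypothetical wrapper before formatting.
  def toLineUnwrapped(record: Any): String = record match {
    case w: Wrapper[_] => w.datum.toString
    case other         => other.toString
  }

  def main(args: Array[String]): Unit = {
    val wrapped = new Wrapper("hello")
    println(toLine(wrapped))          // class name, '@', then a hash that varies per run
    println(toLineUnwrapped(wrapped)) // the wrapped string itself
  }
}
```

This does not claim to be what Crunch does internally; it only mirrors the
shape of the output Roman reported.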


On Thu, Nov 29, 2012 at 8:40 AM, Roman V. Shapovalov <
shapovalov@graphics.cs.msu.su> wrote:

> Hi Josh,
>
> The trick you suggested works the way I expected: it saves strings as text.
>
> Thank you!
> Roman
>
> On Thu, Nov 29, 2012 at 8:03 PM, Josh Wills <jwills@cloudera.com> wrote:
> > Hey Roman,
> >
> > While I take a look at that, would you try using the writeTextFile function
> > (e.g., writeTextFile(<pcollection>, args(1))) and let me know if that does
> > the trick?
> >
> > Josh
> >
> >
> > On Thu, Nov 29, 2012 at 6:34 AM, Roman V. Shapovalov
> > <shapovalov@graphics.cs.msu.su> wrote:
> >>
> >> Dear crunch-users,
> >>
> >> I am trying to solve some toy MapReduce problem using Scrunch. When I
> >> write the final result in the pipeline app, i.e. call
> >>
> >> write(to.textFile(args(1)))
> >>
> >> I get object names in the output file, like:
> >>
> >> org.apache.avro.mapred.AvroWrapper@80
> >> org.apache.avro.mapred.AvroWrapper@17a73
> >>
> >> This happens only if I perform some mapping (even identity); reading
> >> and then writing without a mapping produces proper strings in the file.
> >>
> >> It seems that mapping wraps the strings using the AvroWrapper, but
> >> writing to the text file does not unwrap them. Is it supposed to
> >> unwrap them?
> >>
> >> There is a factory method To.formattedFile() in Crunch (I guess it may
> >> help, but it is not documented), but it is not ported to Scrunch. Is
> >> there another idiom for writing strings?
> >>
> >> Thanks in advance,
> >> Roman
> >
> >
> >
> >
> > --
> > Director of Data Science
> > Cloudera
> > Twitter: @josh_wills
> >
>
