crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-97) Add helpers for parsing PCollection<String> instances
Date Wed, 12 Dec 2012 20:06:19 GMT


Gabriel Reid commented on CRUNCH-97:

I don't have any direct use or need for it right now, but I do have the feeling that something
like this would be a really useful addition to Crunch, so I think it would be a shame to close
this out now. I'm fairly indifferent on the Scanner vs Tokenizer discussion, mostly because I
don't have much context to base an opinion on yet.

Even if this would primarily be a help for prototyping, I think that is more than enough
reason to include it. It's a similar situation to reflection-based Avro: it might not be
what you want to use in production, but it's incredibly useful for quick iterations in
development. In any case, I'm very much in favour of adding this (in one of its incarnations)
to Crunch.

> Add helpers for parsing PCollection<String> instances
> -----------------------------------------------------
>                 Key: CRUNCH-97
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.5.0
>         Attachments: CRUNCH-97.patch, CRUNCH-97-take2.patch, CRUNCH-97-Tokenizer-v1.patch, CRUNCH-97v3.patch, CRUNCH-97v4.patch
>
> We should make it a bit easier to parse delimited text files into specific data types
> (e.g., ints, floats) or combinations of types (e.g., pairs of strings and ints, or a Tuple3
> of booleans).
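
For readers unfamiliar with the problem being described, here is a minimal sketch of the
kind of parsing the issue is about: turning delimited text lines into typed pairs. It uses
only the JDK and is not the API from any of the attached patches or from Crunch itself. The
class name DelimitedParseSketch and the methods parsePair and parseAll are hypothetical,
chosen only for illustration, and a plain List<String> stands in for a PCollection<String>.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical illustration only: not Crunch's API and not the CRUNCH-97 patch.
final class DelimitedParseSketch {

    // Parses one line such as "apple,3" into a (String, Integer) pair.
    // Throws NoSuchElementException if a field is missing and
    // NumberFormatException if the second field is not an int.
    public static Map.Entry<String, Integer> parsePair(String line, String delimiter) {
        try (Scanner scanner = new Scanner(line)) {
            // Quote the delimiter so characters like "|" are treated literally.
            scanner.useDelimiter(Pattern.quote(delimiter));
            String key = scanner.next();
            int value = Integer.parseInt(scanner.next().trim());
            return new SimpleImmutableEntry<>(key, value);
        }
    }

    // Applies parsePair to every line; the List stands in for a PCollection<String>.
    public static List<Map.Entry<String, Integer>> parseAll(List<String> lines, String delimiter) {
        List<Map.Entry<String, Integer>> parsed = new ArrayList<>();
        for (String line : lines) {
            parsed.add(parsePair(line, delimiter));
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apple,3", "pear,12");
        for (Map.Entry<String, Integer> entry : parseAll(lines, ",")) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

A helper along these lines mainly saves boilerplate during prototyping, which is the use
case argued for in the comment above; a production job would likely want explicit handling
of malformed lines rather than letting exceptions propagate.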

