cassandra-commits mailing list archives

From "Jeremy Hanna (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1497) Add input support for Hadoop Streaming
Date Fri, 08 Oct 2010 15:19:32 GMT


Jeremy Hanna commented on CASSANDRA-1497:

Just a status update.  I ran into some issues using Avro as an input format for now:
Avro maps currently require strings as their keys.  There is talk of supporting other
types of keys in the future, and I should probably create a ticket on that.

However, for now it looks like I'll need to go with Thrift for the input streaming solution.
So far I've created an AbstractColumnFamilyRecordReader and an AbstractColumnFamilyInputFormat
that pull out a lot of the common code, and then created Thrift- and Avro-specific extensions
of each.  I've created a Thrift input writer, required for input streaming.  I've also created
an initial that uses the Thrift deserialization.
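
As a rough illustration of that split (this is not the actual patch, and every class and
method name below is hypothetical), the common logic can sit in an abstract base class, with
only the serialization-specific conversion left to the Thrift and Avro subclasses.  The
"conversion" here just tags a string; no real Thrift or Avro calls are made.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordReaderSketch {

    // Hypothetical stand-in for the shared base class: it owns the iteration
    // over raw rows and delegates only the format-specific conversion.
    abstract static class AbstractReaderSketch<T> {
        private final List<byte[]> rawRows;

        AbstractReaderSketch(List<byte[]> rawRows) {
            this.rawRows = rawRows;
        }

        // The only step that differs between serialization formats.
        protected abstract T convert(byte[] rawRow);

        List<T> readAll() {
            List<T> out = new ArrayList<>();
            for (byte[] raw : rawRows) {
                out.add(convert(raw));
            }
            return out;
        }
    }

    // Placeholder for a Thrift-specific extension (assumption: tags the row
    // instead of building a real Thrift object).
    static class ThriftReaderSketch extends AbstractReaderSketch<String> {
        ThriftReaderSketch(List<byte[]> rawRows) {
            super(rawRows);
        }

        @Override
        protected String convert(byte[] rawRow) {
            return "thrift:" + new String(rawRow, StandardCharsets.UTF_8);
        }
    }

    // Placeholder for an Avro-specific extension, same caveat as above.
    static class AvroReaderSketch extends AbstractReaderSketch<String> {
        AvroReaderSketch(List<byte[]> rawRows) {
            super(rawRows);
        }

        @Override
        protected String convert(byte[] rawRow) {
            return "avro:" + new String(rawRow, StandardCharsets.UTF_8);
        }
    }

    // Made-up sample data standing in for rows read from a column family.
    static List<byte[]> sampleRows() {
        return Arrays.asList(
                "row1".getBytes(StandardCharsets.UTF_8),
                "row2".getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(new ThriftReaderSketch(sampleRows()).readAll());
        System.out.println(new AvroReaderSketch(sampleRows()).readAll());
    }
}
```

With this shape, adding or dropping a serialization format only touches one small subclass,
which is the reason for keeping both variants around.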

Anyway, that's where things stand.  I'm getting the kinks out and will hopefully have patches
soon.  I'll probably create a separate ticket for a Thrift-based output format for the sake of
completeness.  The Avro-specific CFRR and CFIF will just be there and functional for when
the Avro stuff becomes available.  I already had them done, and they work in Java.

> Add input support for Hadoop Streaming
> --------------------------------------
>                 Key: CASSANDRA-1497
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>            Assignee: Jeremy Hanna
>             Fix For: 0.7.0
>         Attachments: 0001-1497-foundation-changes.patch
> related to CASSANDRA-1368 - create similar functionality for input streaming.

