gora-dev mailing list archives

From Alfonso Nishikawa <alfonso.nishik...@gmail.com>
Subject Re: Gora-174 in Gora-Cassandra
Date Wed, 06 Feb 2013 23:23:12 GMT
Forgot about your last question.

I suggest creating a sub-task. Can you create one? If not, I will create
it for you (menu "More Actions > Create sub-task").

Best regards,

Alfonso Nishikawa

2013/2/6 Alfonso Nishikawa <alfonso.nishikawa@gmail.com>

> Hi Renato,
> I saw in the code that Cassandra has its own serializers. Can you give us
> a short summary of how they work and what was affected by your
> modifications? This will help us understand your approach.
> Does Cassandra incur any penalty for the new column? In HBase that
> approach is not necessary since the union index gets serialized (by Avro)
> and stored before the actual data (I know you know that :) just a
> reminder).
> About generating classes, there may be no need to modify the compiler
> (check if you really need to modify it). Taking into account that a union
> can't contain two branches of the same type (Avro spec):
> - When writing, you can implement the approach Avro shows in
> GenericData#resolveUnion():333 [0] (Avro 1.3.3), called from [1], which
> iterates over the union types until one matches the type of the data
> being written.
> - When reading, you know the index. Avro's approach is in [2].
> I suggest not modifying the compiler (if possible), because for HBase it
> introduces duplicated state: one copy will be ignored and becomes noise
> in the structures.
> My opinion, of course :)
> Thanks for all!!
> Best regards,
> Alfonso Nishikawa
> [0] -
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/avro/1.3.3/org/apache/avro/generic/GenericData.java?av=f#333
> [1] - GenericDatumWriter#write():59 -
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/avro/1.3.3/org/apache/avro/generic/GenericDatumWriter.java?av=f#59
> [2] - GenericDatumReader#read():84 -
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/avro/1.3.3/org/apache/avro/generic/GenericDatumReader.java?av=f#77
> 2013/2/6 Renato Marroquín Mogrovejo <renatoj.marroquin@gmail.com>
>> Hi all,
>> This is a really long overdue email. Finally I got the time to get
>> around to this while I am on holidays (:
>> I've made some changes to Gora-Cassandra to support Avro union data
>> types, even though Cassandra doesn't rely on Avro for serializing data.
>> So what has been done is a workaround to save specialized data
>> types, e.g. unions. I faced the same problems and doubts that Alfonso
>> described, and Alfonso, your post was very illustrative mate ;)
>> I will just explain the general approach so the changes can be
>> understood. The changes themselves can be found in the code, or you can
>> reply to this email to talk about them.
>> ** For storing Union data **
>> We create a new column only at the moment we flush the data into the
>> data store. This generated column stores the index of the schema used
>> within the union data type.
>> ** For retrieving Union data **
>> Gora can already retrieve the data directly from Cassandra by itself.
>> The problem here was to determine which serializer to use while
>> getting this data back. So the first thing to do is to get the value
>> stored in the generated column and use that value to select the
>> appropriate serializer. After that, it is just a matter of using what
>> Gora already has.
>> ** For generating classes **
>> I am not particularly happy with the changes I've made here. I changed
>> GoraCompiler directly to create the extra field that stores the selected
>> schema of the union data type. I tried to only add a new field to the
>> schema before compiling and then let the compiler work, but I kept
>> getting a lock exception from Avro, which didn't let me make this
>> change the way I wanted. If anybody could help me out on how to do it,
>> then give me a shout! :)
>> I didn't know where to upload this patch: either to Gora-174, because
>> it addresses an issue caused by it, or to a new issue created to handle
>> Avro unions per data store.
>> Thanks for reading until the end!
>> Renato M.
> --
> "Drinking bloody marys all night will make you feel like a corpse in the
> morning."
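
The two ideas discussed in the quoted thread (resolving the union branch
by iterating over its types when writing, and storing that index in an
extra column so the right serializer can be picked when reading) could be
sketched roughly as follows. This is only an illustration under stated
assumptions, not Gora or Avro code: plain Java classes stand in for Avro
schemas, a Map stands in for one Cassandra row, and every name here
(UnionIndexSketch, INDEX_SUFFIX, put, get, serialize, deserialize) is
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Java classes stand in for Avro union branches and a
// Map stands in for one Cassandra row. None of these names come from Gora.
final class UnionIndexSketch {

    // Assumed suffix for the generated column that holds the union index.
    static final String INDEX_SUFFIX = "_UnionIndex";

    // Write side: walk the union branches until one matches the datum,
    // in the spirit of Avro's GenericData#resolveUnion().
    static int resolveUnion(List<Class<?>> branches, Object datum) {
        for (int i = 0; i < branches.size(); i++) {
            Class<?> branch = branches.get(i);
            // Void.class plays the role of Avro's "null" branch here.
            boolean matches = (datum == null)
                    ? branch == Void.class
                    : branch.isInstance(datum);
            if (matches) {
                return i;
            }
        }
        throw new IllegalArgumentException("Datum is not in the union: " + datum);
    }

    // Flush: store the value plus a companion column with the branch index.
    static void put(Map<String, byte[]> row, String field,
                    List<Class<?>> branches, Object datum) {
        int index = resolveUnion(branches, datum);
        row.put(field + INDEX_SUFFIX, new byte[] {(byte) index});
        row.put(field, serialize(datum));
    }

    // Read: fetch the index first, then use it to select the deserializer.
    static Object get(Map<String, byte[]> row, String field,
                      List<Class<?>> branches) {
        int index = row.get(field + INDEX_SUFFIX)[0];
        return deserialize(branches.get(index), row.get(field));
    }

    // Toy serializer: the text form of the value, empty for null.
    static byte[] serialize(Object datum) {
        if (datum == null) {
            return new byte[0];
        }
        return datum.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Toy deserializer chosen by the branch type recorded at write time.
    static Object deserialize(Class<?> branch, byte[] bytes) {
        if (branch == Void.class) {
            return null;
        }
        String text = new String(bytes, StandardCharsets.UTF_8);
        if (branch == Long.class) {
            return Long.valueOf(text);
        }
        if (branch == Integer.class) {
            return Integer.valueOf(text);
        }
        return text;
    }
}
```

With branches declared as List.of(Void.class, String.class, Long.class),
calling put(row, "value", branches, 42L) followed by
get(row, "value", branches) should give back a Long, because the stored
index selects the Long branch. Without the index column the reader could
not tell a serialized "42" string from a serialized 42 long, which is the
ambiguity the extra column is meant to remove.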
