gora-dev mailing list archives

From Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com>
Subject Re: Document about GORA-174
Date Wed, 06 Feb 2013 18:43:50 GMT
Hi all,

I am really sorry it has taken me so long to get to this thread.
Anyway, let's get down to the important parts (:
After reviewing Alfonso's emails and some Avro documentation, I think
Alfonso's proposal is the best approach right now.
We will end up paying a price for the chance to persist optional data
types using Avro unions. The price this time is storing an extra
column whenever we use the union data type, in order to keep track of
which type of data we stored.
So, as in Alfonso's example, until now we stored:

 col: This is the text

After implementing this, we would be storing:

 col_name  index  value
 --------  -----  ----------------
 col:      \01    This is the text

I think, though, that we shouldn't use the bare word "index" because it
can be misleading. Maybe "colName_index" instead? I am not sure about
this yet; we should reach a consensus on this one, my friends.
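
To make the extra index column concrete, here is a minimal Python sketch of
the idea. It is not Gora's actual serialization code: the union
["null", "string"], the helper names encode_union/decode_union, and the
"_index" suffix are all assumptions made for illustration, and the column
name itself is still open for discussion.

```python
# Hypothetical sketch only; names and layout are assumptions, not Gora code.

# Branch order as it would be declared in the Avro schema (assumed here).
UNION_TYPES = ["null", "string"]


def encode_union(col, value, index_suffix="_index"):
    """Return the cells to write for a union-typed field.

    Besides the value itself, one extra cell records which union branch
    was stored, so a reader knows how to decode the value.
    """
    if value is None:
        branch, payload = UNION_TYPES.index("null"), b""
    elif isinstance(value, str):
        branch, payload = UNION_TYPES.index("string"), value.encode("utf-8")
    else:
        raise TypeError("value does not match any branch of the union")
    return {col: payload, col + index_suffix: bytes([branch])}


def decode_union(col, cells, index_suffix="_index"):
    """Read the branch index first, then decode the value accordingly."""
    branch = cells[col + index_suffix][0]
    if UNION_TYPES[branch] == "null":
        return None
    return cells[col].decode("utf-8")


cells = encode_union("col", "This is the text")
print(cells)
print(decode_union("col", cells))
```

With this layout the string branch sits at position 1 of the union, which
matches the \01 in the table above.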
I have made several changes to the Cassandra module, which I would
like to discuss in a separate thread, but in general terms
I also think this is the way we should go.

Renato M.

2013/1/17 Alfonso Nishikawa <alfonso.nishikawa@gmail.com>:
> Hi, Lewis.
> It refers to both Gora and Avro.
> About Avro, the documentation [0] mentions, rather hidden, the default
> value in unions.
> About Gora and specifically HBase, it doesn't matter what the value of
> "default":"..." (schema) is, because it is not used to read/store. For
> example: HBaseStore#newInstance() doesn't fill in values for
> "family:column"s that are not present in HBase.
> I am pretty sure the best option will be to implement the "possible
> solution" configuration option. But maybe make it not deprecated, since
> sometimes it will be desirable to write raw data to top-level columns (not
> serialized records), as still happens in the /trunk revision (with the
> restriction about null shown in [0]), so that it can be read directly
> through the HBase interface.
> And maybe it should be something configurable when creating the DataStore.
> What do you think about this?
> Thank you very much for the feedback! :)
> Best,
> Alfonso Nishikawa
> [0] - http://avro.apache.org/docs/current/spec.html#schema_record section
> "Records > fields > default"
> 2013/1/17 Lewis John Mcgibbney <lewis.mcgibbney@gmail.com>
>> Hi Alfonso,
>> When you say that "...the first element in the union is considered as the
>> default element, at this moment it is not implemented nor planned" does
>> this refer to Avro?
>> On Sunday, January 13, 2013, Alfonso Nishikawa <alfonso.nishikawa@gmail.com> wrote:
>> > Hello everybody.
>> >
>> > I wrote an article [0] regarding GORA-174 where I try to explain a
>> > compatibility issue with old data in HBase.
>> > I really don't know how it affects other backends. Need some info if
>> > anyone knows. (@Renato: maybe you can tell me something about how is it in
>> > Cassandra :)
>> > I will appreciate your thoughts :)
>> >
>> > Thank you very much!
>> >
>> > Alfonso Nishikawa
>> >
>> > [0] - http://people.apache.org/~alfonsonishikawa/gora-174.html
>> >
>> --
>> *Lewis*
> --
> "Drinking bloody marys all night will make you feel like a corpse in the
> morning."
