james-server-dev mailing list archives

From Eric Charles <e...@apache.org>
Subject Re: GSoC: Avro Serialization over HBase
Date Tue, 12 Jun 2012 12:40:24 GMT
True.
What do you intend to store in Avro format (these bytes being retrieved 
by any means on the RPC side)?
Thx, Eric

On 06/12/2012 02:14 PM, Ioan Eugen Stan wrote:
> Hi,
>
> From what I know, the Avro deprecation is for RPC communication. The
> Put/Delete/etc. operations are serialized with Avro instead of the
> usual Writables. Regardless of what serialization the RPC sub-system
> uses, the data stored by the operations (Put/Get/Delete) is viewed as
> a byte array. If we store Avro objects as binary blobs in HBase then we
> have no issues.
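
[Editor's sketch, not part of the original message.] The point above, that
the store only ever sees byte arrays whatever serialization produced them,
can be illustrated with a minimal stand-in. Everything here is an
assumption made for illustration: `ByteStore` is a hypothetical in-memory
substitute for an HBase table, `encode`/`decode` use JSON in place of Avro
binary encoding so the example needs only the standard library, and the
row key and record fields are invented.

```python
import json
from typing import Dict, Optional


class ByteStore:
    """Hypothetical stand-in for an HBase table: keys and values are raw bytes."""

    def __init__(self) -> None:
        self._rows: Dict[bytes, bytes] = {}

    def put(self, row: bytes, value: bytes) -> None:
        # The store never inspects the value; it only keeps the bytes.
        self._rows[row] = value

    def get(self, row: bytes) -> Optional[bytes]:
        # Returns None when the row does not exist.
        return self._rows.get(row)


def encode(record: dict) -> bytes:
    # Stand-in for Avro binary encoding (JSON here, an assumption for brevity).
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode(blob: bytes) -> dict:
    # Only the client knows how to interpret the blob it stored.
    return json.loads(blob.decode("utf-8"))


store = ByteStore()
message = {"uid": 42, "subject": "GSoC: Avro Serialization over HBase"}
store.put(b"mailbox1/42", encode(message))
roundtrip = decode(store.get(b"mailbox1/42"))
```

Swapping `encode`/`decode` for another format would not require any change
to `ByteStore`, which is the property the argument above relies on.
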
>
> Cheers,
>
> 2012/6/12 Mihai Soloi <mihai.soloi@gmail.com>:
>> On 12.06.2012 11:30, Eric Charles wrote:
>>>
>>> Hi Mihai,
>>>
>>> Glad to hear your exams are over (I hope they went fine) :)
>>
>> Hi Eric,
>>
>> Thanks, they went very well, I got high marks.
>>
>>>
>>> As Ioan said, Avro serialization in HBase will be deprecated in favor of
>>> Protobuf (if I understand correctly...).
>>
>>
>> I think Avro could be swapped rather easily for Protobuf, as they both
>> do basically the same thing; the difference is that Avro uses JSON
>> schemas and can be used with any other language, which is of no value
>> to the project.
>>
>>>
>>> I also like Avro because it gives you serialization & storage format in
>>> one box, but is this what we want? The key point here is more an effective
>>> access to the persisted data.
>>
>>
>> If the data is passed through Avro, serialization and deserialization
>> are basically handled by Avro, but we'll always have to interact with
>> the schemas. In Protobuf we have the objects compiled into our classes;
>> from what I gather it's mostly useful for RPC. But Avro also has the
>> protocol, for which you can use the avro-maven-plugin to generate your
>> own classes to interact with. I can't say I'm an expert in either, but
>> I fancy Avro.
>>
>>>
>>>
>>> There have been a few attempts so far to marry HBase and Lucene (see [1],
>>> [2], [3] and [4] for example, see also [5] for a more recent article).
>>>
>> Thank you for the GitHub links, I will look thoroughly through the
>> projects. I was already aware of HBasene and Solandra (formerly
>> Lucandra); they have similar approaches.
>>
>>> The questions I am wondering:
>>>
>>> 1. Will you focus on a 'generic' solution (reusable outside James), or on
>>> a very specific one tuned/optimized only for James mailbox needs?
>>
>> I was thinking of writing generic code so that maybe it could be used
>> outside of James but the data format would be specific to James mailbox
>> needs, so the answer in the end is that it will be tuned for James.
>>
>>>
>>> 2. What strategy will you take (custom Directory or custom
>>> IndexReader/Writer, usage of Coprocessor or not...)?
>>
>> I was thinking that a custom Directory was the way to go, but I soon
>> realized that it's not as simple as it sounds and that overriding the
>> higher-level classes IndexReader and IndexWriter would be more
>> appropriate (as in article [5]). So by bypassing the Directory I would
>> have to make use of HBase Coprocessors. As far as I can think of it, a
>> RegionObserver would be
>> employed to gather frequently performed on data for the Lucene queries and
>> Endpoints.
>>
>>
>>
>> [1] https://github.com/akkumar/hbasene
>> [2] https://github.com/thkoch2001/lucehbase
>> [3] https://github.com/jasonrutherglen/HBASE-SEARCH
>> [4] https://github.com/jasonrutherglen/LUCENE-FOR-HBASE
>> [5] http://www.infoq.com/articles/LuceneHbase
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
>> For additional commands, e-mail: server-dev-help@james.apache.org
>>
>
>
>

-- 
eric | http://about.echarles.net | @echarles


