hbase-user mailing list archives

From yavuz gokirmak <ygokir...@gmail.com>
Subject Re: Change data capture tool for hbase
Date Tue, 04 Jun 2013 06:41:22 GMT
Hi Asaf,

This CDC pattern will be used for directing changes to another system.
Assume I have a table "hbase_alarms" in HBase with columns
"Severity,Source,Time", and I am tracking changes to it with this CDC tool.
Some external system is putting alarms, with their severity and source, into
the hbase_alarms table.

Now I have a source system, and I need to take some action based on its
changes. One example may be inserting "some" critical alarms into another
table in an RDBMS database as well. Using this kind of CDC tool, I can write
rules like "if severity=critical and source=router insert record to
psql_alarms".
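
A rule like this could be sketched roughly as follows in Python. This is only
an illustration: the Change and Rule types, the route function, and the
lowercase column names are hypothetical, not part of the tool or of any HBase
or Flume API, and a plain list stands in for the psql_alarms table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Change:
    """One captured row change (hypothetical shape)."""
    table: str
    row_key: str
    columns: Dict[str, str]


@dataclass
class Rule:
    """Fires `action` when every column in `conditions` has the given value."""
    conditions: Dict[str, str]
    action: Callable[[Change], None]

    def matches(self, change: Change) -> bool:
        return all(change.columns.get(k) == v for k, v in self.conditions.items())


def route(change: Change, rules: List[Rule]) -> int:
    """Run every matching rule against the change; return how many fired."""
    fired = 0
    for rule in rules:
        if rule.matches(change):
            rule.action(change)
            fired += 1
    return fired


# Stand-in for the "insert record to psql_alarms" action (assumption:
# a real sink would write to the relational database instead).
psql_alarms: List[Change] = []
rules = [Rule({"severity": "critical", "source": "router"}, psql_alarms.append)]
```

Each captured change would be passed to route(change, rules); only changes
whose severity and source both match end up in psql_alarms.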


This is just an example. As I wrote, I am planning to implement this tool as
a Flume source, so I can take any action on any system using Flume sinks
(calling a web service, making an HTTP request, writing to a file, etc.).

In the RDBMS world the CDC pattern works like a triggering mechanism, but it
is much more efficient than triggers (CDC tools extract change information
from logs asynchronously, so they do not lengthen transactions).
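
The difference from a trigger can be sketched in Python as below. This is a
toy model under stated assumptions, not how any real CDC tool is built: an
in-memory queue stands in for the database log, and write, capture_loop, and
run_demo are hypothetical names. The point is only that the writer returns as
soon as the change is logged, while a separate thread picks the change up
later.

```python
import queue
import threading
from typing import List

change_log: queue.Queue = queue.Queue()  # stand-in for the database log
delivered: List[str] = []                # what the CDC side has processed
_STOP = object()                         # sentinel that ends the capture loop


def write(row: str) -> None:
    """Writer path: record the change and return immediately."""
    change_log.put(row)


def capture_loop() -> None:
    """CDC path: drain the log on its own thread, outside the writer's call."""
    while True:
        item = change_log.get()
        if item is _STOP:
            break
        delivered.append(item)


def run_demo(rows: List[str]) -> List[str]:
    """Write all rows, then wait for the capture thread to catch up."""
    delivered.clear()
    worker = threading.Thread(target=capture_loop)
    worker.start()
    for row in rows:
        write(row)
    change_log.put(_STOP)
    worker.join()
    return list(delivered)
```

A trigger, by contrast, would do the equivalent of delivered.append(...)
inside write itself, so every write would wait for it.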

regards..



On 4 June 2013 06:57, Asaf Mesika <asaf.mesika@gmail.com> wrote:

> What's wrong with HBase native Master Slave replicate, or am I missing
> something here?
>
>
> On Mon, Jun 3, 2013 at 12:16 PM, yavuz gokirmak <ygokirmak@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Currently we are working on a hbase change data capture (CDC) tool. I
> > want to share our ideas and continue development according to your
> > feedback.
> >
> > As you know CDC tools are used for tracking the data changes and take
> > actions according to these changes[1].  For example in relational
> > databases, CDC tools are mainly used for replication. You can replicate
> > your source system continuously to another location or db using CDC
> > tool. So whenever an insert/update/delete is done on the source system,
> > you can reflect the same operation to the replicated environment.
> >
> > As I've said, we are working on a CDC tool that can track changes on a
> > hbase table and reflect those changes to any other system in real-time.
> >
> > What we are trying to implement the tool in a way that he will behave as
> > a slave cluster. So if we enable master-master replication in the source
> > system, we expect to get all changes and act accordingly. Once the proof
> > of concept cdc tool is implemented ( we need one week ) we will convert
> > it to a flume source. So using it as a flume source we can direct data
> > changes to any destination (sink)
> >
> > This is just a summary.
> > Please write your feedback and comments.
> >
> > Do you know any tool similar to this proposal?
> >
> > regards.
> >
> >
> >
> >
> >
> > 1- http://en.wikipedia.org/wiki/Change_data_capture
> >
>
