sqoop-user mailing list archives

From shengjie min <kelvin....@gmail.com>
Subject Using Sqoop to merge/union databases
Date Sat, 03 Aug 2013 21:14:55 GMT
Hi All,

I asked this question on the HBase mailing list, and people suggested I'd be better off asking it here
:) so here I am. I am new to Sqoop and have a use case where a few applications run in house
independently, let's say applications A, B and C. Each has its own database.
I want to create an aggregated view over all the databases so that I don't have to jump between different
DBs to find the info I need. A simple example: all three applications have a table called
"users", and the tables are very similar, so I want to union them.

I've had a look at Sqoop, and it looks like it allows me to move data from databases A, B and C to a single, centralised
place, e.g. HBase?
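
To make the "users" union concrete, here is a minimal Python sketch of what I have in mind. It is not Sqoop code, only an illustration of the idea. The column names, the sample rows, the mapping and the "source:id" row key format are all made up for the example.

```python
# Hypothetical sample rows from the "users" table of each application.
USERS_A = [{"id": 1, "name": "alice", "email": "alice@example.com"}]
USERS_B = [{"user_id": 1, "full_name": "bob", "mail": "bob@example.com"}]
USERS_C = [{"id": 7, "name": "carol", "email": "carol@example.com"}]

# Hypothetical schema mapping: source column name -> unified column name.
SCHEMA_MAP = {
    "A": {"id": "id", "name": "name", "email": "email"},
    "B": {"user_id": "id", "full_name": "name", "mail": "email"},
    "C": {"id": "id", "name": "name", "email": "email"},
}


def unify(source, rows):
    """Rename columns per SCHEMA_MAP and tag each row with its source."""
    mapping = SCHEMA_MAP[source]
    unified = []
    for row in rows:
        record = {mapping[col]: val for col, val in row.items() if col in mapping}
        record["source"] = source
        # Prefix the key with the source so equal ids from A, B, C do not collide.
        record["row_key"] = f"{source}:{record['id']}"
        unified.append(record)
    return unified


def union_users():
    """Return the union of all three "users" tables in the unified schema."""
    return unify("A", USERS_A) + unify("B", USERS_B) + unify("C", USERS_C)


if __name__ == "__main__":
    for user in union_users():
        print(user)
```

The source prefix in the row key is just one possible choice; it keeps user 1 from A and user 1 from B as separate rows in the central store.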

Ideally, the solution I am looking for needs to do the following:

1. The centralised storage stays reasonably up to date as the original DBs (A, B, C) get
updated. I am not looking for a one-time bulk import; I want incremental
updates after the initial import.
2. As long as I provide a schema mapping, A, B and C can be imported into a single place, e.g. a single
HBase table.
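
From what I have read so far, the two requirements above might translate into one incremental Sqoop import per source database, all writing to the same HBase table. The Python sketch below only builds the command lines; it does not run them. The options are from Sqoop 1's import tool as I understand them, while the JDBC URLs, the "users_all" table name, the "id" column and the per-source column family are assumptions for illustration.

```python
import shlex

# Hypothetical JDBC URLs for the three application databases.
SOURCES = {
    "A": "jdbc:mysql://db-a.example.com/app_a",
    "B": "jdbc:mysql://db-b.example.com/app_b",
    "C": "jdbc:mysql://db-c.example.com/app_c",
}


def sqoop_incremental_args(source, jdbc_url, last_value):
    """Build the argument list for one incremental import of "users"."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", "users",
        "--hbase-table", "users_all",         # assumed name of the shared table
        "--column-family", source.lower(),    # assumed: one family per source
        "--hbase-row-key", "id",              # assumed key column
        "--incremental", "append",
        "--check-column", "id",
        "--last-value", str(last_value),      # highest id already imported
    ]


if __name__ == "__main__":
    # Print one command per source; last_value=0 stands for the initial import.
    for name, url in SOURCES.items():
        print(shlex.join(sqoop_incremental_args(name, url, 0)))
```

Two caveats with this sketch: "append" mode only picks up rows whose check column is larger than the last value, so updates to existing rows would need "lastmodified" mode with a timestamp column instead; and rows with the same id in A, B and C would end up in the same HBase row, separated only by column family, which may or may not be what is wanted.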

Now, my question is:

Is Sqoop a suitable tool for this? I was originally considering using MongoDB and writing the
periodic/parallel import piece myself. But for now I am leaning more towards Sqoop, since
we already have Hadoop running in house. Any advice is highly appreciated!

Thanks,
Shengjie 