hbase-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Ad-hoc reports against HBase - any way? any tools?
Date Fri, 25 Feb 2011 21:54:29 GMT
Hi J-D,

Yes, I'm interested in HBase-Hive integration.
Thanks for the pointer to the external tables.  I was aware of that at some
point, but for some reason started thinking that data copying was necessary.

Are there any gotchas or serious limitations around this integration?

Thanks,
Otis
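
For reference, the external-table approach mentioned above can be sketched as a small Python helper that builds the Hive DDL. This is a minimal sketch, not the setup J-D runs: the statement shape (HBaseStorageHandler, "hbase.columns.mapping", "hbase.table.name") follows the Hive HBase integration, while the helper name and the table and column names are hypothetical and chosen only for illustration.

```python
def hbase_external_table_ddl(hive_table, hbase_table, key_column, columns):
    """Build a Hive CREATE EXTERNAL TABLE statement over an existing HBase table.

    key_column: (hive_column_name, hive_type) for the HBase row key.
    columns:    list of (hive_column_name, hive_type, "family:qualifier").
    All names passed in are the caller's; nothing here talks to a cluster.
    """
    key_name, key_type = key_column
    hive_cols = [f"{key_name} {key_type}"]
    hbase_cols = [":key"]  # ":key" maps the first Hive column to the row key
    for name, hive_type, hbase_col in columns:
        hive_cols.append(f"{name} {hive_type}")
        hbase_cols.append(hbase_col)
    col_list = ", ".join(hive_cols)
    mapping = ",".join(hbase_cols)  # no spaces between mapping entries
    return "\n".join([
        f"CREATE EXTERNAL TABLE {hive_table} ({col_list})",
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'",
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}")',
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}")',
    ])


# Hypothetical example: expose HBase table "page_views_hbase" to Hive
# without copying any data.
ddl = hbase_external_table_ddl(
    hive_table="page_views",
    hbase_table="page_views_hbase",
    key_column=("row_key", "string"),
    columns=[("url", "string", "d:url"), ("hits", "bigint", "d:hits")],
)
print(ddl)
```

Because the table is EXTERNAL, Hive reads the existing HBase table in place, and dropping the Hive table leaves the HBase data untouched. Binary cell values and composite row keys need extra handling, which is what the HIVE-1634 patch mentioned below is about.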





----- Original Message ----
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org
> Sent: Fri, February 25, 2011 4:17:09 PM
> Subject: Re: Ad-hoc reports against HBase - any way? any tools?
> 
> We use the HBase+Hive integration here for ad-hoc queries. I don't
> understand the data duplication you're talking about... when you
> create an external table you can directly query your existing tables.
> We run with the latest patch posted in HIVE-1634 since we have a lot
> of binary values, and I made a very, very hacky patch to be able to use
> our binary composite row keys.
> 
> I'll be happy to give you more details if you want to try going down that road.
> 
> J-D
> 
> On Fri, Feb 25, 2011 at 1:02 PM, Otis Gospodnetic
> <otis_gospodnetic@yahoo.com> wrote:
> > Hello,
> >
> > I have an HBase cluster chock-full of data and would like to run canned reports
> > (i.e., reports known ahead of time), but also ad-hoc reports against that data.
> > Are there any open-source or commercial tools one can use?
> >
> > Here's what I *think* I know so far, but please correct me wherever I'm wrong, so
> > I don't spread false info:
> >
> > * Use HBase-Hive Integration
> >   Pluses:
> >    - lots of tools to query Hive are available
> >   Minuses:
> >    - data duplication
> >    - Hive's copy of data is always behind
> >    - I heard the integration is fairly alpha (e.g. you can't copy deltas to
> >      Hive; you have to copy all data every time you want to update your Hive store)
> >
> > * Use Pig
> >  https://issues.apache.org/jira/browse/PIG-970
> >  https://issues.apache.org/jira/browse/PIG-1205
> >  Pluses:
> >     - runs directly against HBase, no need to copy data
> >  Minuses:
> >     - PigLatin learning curve - in my case people wanting ad-hoc reports are not
> >       techies
> >     - No pretty front-end with syntax highlighting or visual querying or that
> >       accepts SQL and translates it to PigLatin
> >
> > * Use PigPen
> >  Pluses:
> >    - Visual == easy
> >  Minuses:
> >    - Looks abandoned judging by http://search-hadoop.com/m/Noacz1MECC7 and
> >      https://issues.apache.org/jira/browse/PIG-366
> >
> > * Use Toad for Cloud
> >  Pluses:
> >    - accepts SQL, runs, and returns data
> >    - runs directly against HBase, no need to copy data
> >  Minuses:
> >    - some people reported it crashes
> >    - it allows the person querying the data to also modify the data, which is
> >      bad in my environment
> >
> > * Datameer DAS, Karmasphere Analyst, Pentaho, Beeswax -- they all seem to be
> > able to get the data out of Hive, but not out of HBase.  More info below:
> >
> > * Pentaho
> >    * http://www.pentaho.com/products/hadoop/ - looks like it supports only Hive
> >    * http://forums.pentaho.com/showthread.php?77926-HBase-and-ETL
> >    * http://search-hadoop.com/?q=pentaho&src=moz-search
> >
> > * Datameer
> >    * http://wiki.datameer.com/display/DAS1/DAS+Supported+Platforms - looks like it
> >      supports only Hive
> >    * http://wiki.datameer.com/display/DAS11/Using+the+Plug-in+SDK - looks like one
> >      can add support for HBase by writing a plugin?
> >
> > * Karmasphere Analyst
> >    * http://www.karmasphere.com/Products-Information/karmasphere-analyst.html - Hive only
> >
> >
> > Is any of the above incorrect?
> > Did I miss a tool, free or non-free, that I could use to run ad-hoc reports
> > against data in HBase?
> >
> > Thanks,
> >  Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
> > Hadoop ecosystem search :: http://search-hadoop.com/
> >
> >
> 

