carbondata-user mailing list archives

From Lewis Goldstein <>
Subject Carbon Data integration with HIVE
Date Fri, 15 Jun 2018 20:00:28 GMT
I happened upon Apache CarbonData while searching for information on columnar data stores on HDFS.
As I am looking for ways to accelerate consumption from Hadoop across batch query, interactive
query, and OLAP workloads, this technology sounds quite promising. On an initial read, CarbonData
appears to be another columnar data store on HDFS, analogous to Parquet and ORC; on further
reading, however, it sounds as though loading data into this format must pass through Spark.
Is that truly the case?

I was hoping it would work similarly to Parquet with Hive, where one simply defines an external
Hive table with CarbonData as the designated file format. Is this possible, or is Spark required
as an intermediary? Is CarbonData actually more like Druid than simply another columnar data
store on HDFS?
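To make the question concrete, the Parquet workflow I have in mind is the standard external-table DDL below. The CarbonData analogue that follows is purely hypothetical: the storage-handler class name and table layout are assumptions on my part, and whether anything like it exists without Spark in the loop is exactly what I am asking.

```sql
-- Standard Hive pattern for Parquet: Hive reads files written by any
-- engine; no Spark intermediary is needed.
CREATE EXTERNAL TABLE sales_parquet (
  id     BIGINT,
  amount DOUBLE
)
STORED AS PARQUET
LOCATION 'hdfs:///data/sales_parquet';

-- Hypothetical CarbonData analogue. The storage-handler class name here
-- is assumed, not confirmed; I do not know whether such a handler ships
-- with CarbonData or whether tables must first be created via Spark.
CREATE EXTERNAL TABLE sales_carbon (
  id     BIGINT,
  amount DOUBLE
)
STORED BY 'org.apache.carbondata.hive.CarbonStorageHandler'
LOCATION 'hdfs:///data/sales_carbon';
```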

