spark-user mailing list archives

From Jörn Franke <>
Subject Re: HIVE SparkSQL
Date Wed, 18 Mar 2015 06:26:07 GMT

Depending on your needs, a search technology such as SolrCloud or
ElasticSearch may make more sense. If you go for the Cassandra solution, you
can use the Lucene text indexer...
I am not sure whether Hive or SparkSQL is well suited to text. However, if
you do not need text search, then feel free to go for them.
What kind of statistics / aggregates do you want to get out of your logs?
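
To make the question concrete: if the goal is simply a count of log lines per IDC for each 5-minute reporting interval (the interval mentioned in the quoted message), the aggregate could look roughly like the sketch below. It is plain Python rather than Hive or SparkSQL, and the record layout (an IDC id plus a timestamp) and the names `window_start` and `count_per_idc` are assumptions made for illustration, since the real log format is not described.

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60  # 5-minute reporting interval from the quoted scenario


def window_start(ts: datetime) -> datetime:
    """Round a timezone-aware timestamp down to the start of its 5-minute window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def count_per_idc(records):
    """Count log lines per (IDC, 5-minute window).

    `records` is an iterable of (idc_id, timestamp) pairs. This layout is
    hypothetical; adapt it to the actual log format.
    """
    counts = Counter()
    for idc_id, ts in records:
        counts[(idc_id, window_start(ts))] += 1
    return counts


# Made-up sample records, only to show the shape of the result.
sample = [
    ("idc-a", datetime(2015, 3, 18, 4, 29, 10, tzinfo=timezone.utc)),
    ("idc-a", datetime(2015, 3, 18, 4, 29, 55, tzinfo=timezone.utc)),
    ("idc-a", datetime(2015, 3, 18, 4, 31, 0, tzinfo=timezone.utc)),
    ("idc-b", datetime(2015, 3, 18, 4, 26, 0, tzinfo=timezone.utc)),
]
result = count_per_idc(sample)
```

An aggregate of this shape (group by IDC and time window, then count or sum) maps naturally onto a SQL GROUP BY, whereas free-text search over the log lines would not.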

Best regards
On 18 March 2015 04:29, "宫勐" <> wrote:

> Hi:
>    I need to migrate a log analysis system from MySQL plus a C++ real-time
> computing framework to the Hadoop ecosystem.
>    I want to build a data warehouse, but I don't know which one is the
> right choice. Cassandra? Hive? Or just SparkSQL?
>     There are few benchmarks for these systems.
>     My scenario is as follows:
>     Every 5 seconds, Flume transfers a log file from an IDC. The log
> file is pre-formatted for loading into MySQL. There are many IDCs, and they
> shut down or reconnect to Flume at random.
>     Every online IDC must receive an analysis of its logs every 5 minutes.
> Any suggestions?
> Thanks
> Yours
> Meng
