spark-user mailing list archives

From Marco Colombo <ing.marco.colo...@gmail.com>
Subject Hive and distributed sql engine
Date Mon, 25 Jul 2016 06:48:10 GMT
Hi all!
Among other use cases, I want to use Spark as a distributed SQL engine
via the Thrift server.
I have some tables in PostgreSQL and Cassandra, and I need to expose them via
Hive for custom reporting.
The basic implementation is simple and works, but I have some concerns and open
questions:
- Is there a better approach than mapping a temp table as a select
of the full table?
- What about query setup cost? Is there a way to avoid database
connection setup costs by using a pre-created connection pool?
- Is it possible from HiveQL to use functions defined in the PostgreSQL database,
or do I have to rewrite them as UDAFs?
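
To make the first question concrete, here is a minimal sketch of the kind of mapping involved, assuming the Spark SQL JDBC data source and the Spark 1.x `CREATE TEMPORARY TABLE ... USING org.apache.spark.sql.jdbc` syntax (Spark 2.0 uses `CREATE TEMPORARY VIEW` instead). The helper `jdbc_temp_table_ddl`, the host `dbhost`, the database `reports`, and the table and column names are all hypothetical. The helper only builds the statement text; it does not connect to anything.

```python
def jdbc_temp_table_ddl(name, url, dbtable):
    """Build a Spark SQL statement that maps a JDBC source to a temp table.

    `dbtable` may be a plain table name or a parenthesized subquery with an
    alias, since the JDBC data source accepts anything valid in a FROM clause.
    """
    # Option values are wrapped in double quotes, so reject embedded ones
    # rather than guess at an escaping rule.
    for value in (url, dbtable):
        if '"' in value:
            raise ValueError("option values must not contain double quotes")
    return (
        "CREATE TEMPORARY TABLE {name} "
        "USING org.apache.spark.sql.jdbc "
        'OPTIONS (url "{url}", dbtable "{dbtable}")'
    ).format(name=name, url=url, dbtable=dbtable)


# Hypothetical connection URL, for illustration only.
URL = "jdbc:postgresql://dbhost:5432/reports"

# Mapping the full table, as described above.
full = jdbc_temp_table_ddl("orders", URL, "public.orders")

# Mapping a subquery instead, so the database receives the projection and
# the filter rather than a scan of the whole table.
pushed = jdbc_temp_table_ddl(
    "orders_2016",
    URL,
    "(SELECT id, total FROM public.orders WHERE year = 2016) AS t",
)

print(full)
print(pushed)
```

The resulting statements could then be run through beeline or any JDBC client connected to the Thrift server.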

Thanks!



-- 
Ing. Marco Colombo
