spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: SparkSQL and star schema
Date Sat, 14 Feb 2015 17:30:17 GMT
Yes.  Though for good performance it is usually important to make sure that
you have statistics for the smaller dimension tables.  Today that can be
done by creating them in the Hive metastore and running "ANALYZE TABLE
table COMPUTE STATISTICS NOSCAN".

In Spark 1.3 this will happen automatically when you create a table using
the datasources API.

CREATE TABLE myTable
USING parquet
OPTIONS (path "/...")
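To illustrate why the statistics matter: a hypothetical star-schema query
(table and column names here are made up for the example).  When Spark SQL
knows the dimension table is small, it can broadcast it to the executors
and join against the large fact table without a shuffle:

```sql
-- Illustrative star-schema query; fact_sales and dim_date are
-- hypothetical tables, not from the original message.
-- With size statistics available for dim_date, Spark SQL can pick a
-- broadcast join (subject to spark.sql.autoBroadcastJoinThreshold)
-- instead of shuffling the much larger fact_sales table.
SELECT d.year, SUM(f.amount) AS total
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
GROUP BY d.year;
```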


On Fri, Feb 13, 2015 at 2:06 AM, Paolo Platter <paolo.platter@agilelab.it>
wrote:

>  Hi,
>
>  is SparkSQL + Parquet suitable to replicate a star schema ?
>
>  Paolo Platter
> AgileLab CTO
>
>
