spark-issues mailing list archives

From "Andre Schumacher (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-1383) Spark-SQL: ParquetRelation improvements
Date Tue, 01 Apr 2014 14:44:26 GMT
Andre Schumacher created SPARK-1383:
---------------------------------------

             Summary: Spark-SQL: ParquetRelation improvements
                 Key: SPARK-1383
                 URL: https://issues.apache.org/jira/browse/SPARK-1383
             Project: Spark
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Andre Schumacher


Improve Spark-SQL's ParquetRelation as follows:
- Instead of individual files, a ParquetRelation should be backed by a directory, which simplifies
importing data from other sources
- The InsertIntoParquetTable operation should support switching between overwriting and appending
(at least in HiveQL)
- Tests should use the new API
- Parquet logging should be forwarded to Log4J
- It should be possible to enable compression (default compression for Parquet files: GZIP,
as in parquet-mr)
- OverwriteCatalog should support dropping of tables
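
The directory-backed layout and the overwrite/append switch described above could be sketched roughly as follows. This is a plain-Python illustration using text files, not Spark or Parquet code: the helpers `insert_into_table` and `read_table` and the `part-NNNNN` file naming are hypothetical, chosen only to show the intended semantics.

```python
import os
import tempfile


def insert_into_table(table_dir, rows, overwrite=False):
    """Hypothetical helper: store rows as a new part file inside table_dir."""
    os.makedirs(table_dir, exist_ok=True)
    if overwrite:
        # Overwrite mode: remove every existing part file first.
        for name in os.listdir(table_dir):
            if name.startswith("part-"):
                os.remove(os.path.join(table_dir, name))
    # Append mode (and the write after an overwrite): use the next free part number.
    existing = [n for n in os.listdir(table_dir) if n.startswith("part-")]
    next_id = 1 + max((int(n.split("-")[1]) for n in existing), default=-1)
    path = os.path.join(table_dir, "part-%05d" % next_id)
    with open(path, "w") as f:
        for row in rows:
            f.write(row + "\n")
    return path


def read_table(table_dir):
    """Hypothetical helper: the table is the concatenation of all part files."""
    rows = []
    for name in sorted(os.listdir(table_dir)):
        if name.startswith("part-"):
            with open(os.path.join(table_dir, name)) as f:
                rows.extend(line.rstrip("\n") for line in f)
    return rows


if __name__ == "__main__":
    table = os.path.join(tempfile.mkdtemp(), "example_table")
    insert_into_table(table, ["a", "b"])              # first part file
    insert_into_table(table, ["c"])                   # appended as a second part file
    print(read_table(table))
    insert_into_table(table, ["z"], overwrite=True)   # replaces all earlier parts
    print(read_table(table))
```

Because the table is a directory, data produced elsewhere can be imported by dropping additional part files into it, which is the simplification the first item refers to.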





--
This message was sent by Atlassian JIRA
(v6.2#6252)
