Hi
We used spark.sql to create a table using DELTA. We also have a Hive metastore attached to the Spark session, so the table gets created in the Hive metastore. We then tried to query the table from Hive and ran into the following issues (a sketch of the create statement follows the list):
  1. The SERDE is SequenceFile; it should have been Parquet.
  2. Schema fields are not passed.
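
For reference, the table was created roughly like the sketch below (the table name, column, and WASB path are placeholders, not our actual values):

// Minimal sketch of the create statement, run through spark.sql
// (table name, column, and location are hypothetical).
spark.sql("""
  CREATE TABLE my_table (col ARRAY<STRING>)
  USING DELTA
  LOCATION 'wasbs://<container>@<account>.blob.core.windows.net/my_table'
""")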
Essentially, the Hive DDL looks like:

CREATE TABLE `TABLE NAME` (
  `col` array<string> COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'path'='WASB PATH')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
  'WASB PATH'
TBLPROPERTIES (
  'spark.sql.create.version'='2.4.0',
  'spark.sql.sources.provider'='DELTA',
  'spark.sql.sources.schema.numParts'='1',
  'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[]}',
  'transient_lastDdlTime'='1556544657')

Is this expected? And will the use case be supported in future releases? 


We are now experimenting 

Best

Ayan 


On Fri, Jun 21, 2019 at 11:06 AM Liwen Sun <liwen.sun@databricks.com> wrote:
Hi James,

Right now we don't have plans for a catalog component as part of Delta Lake, but we are looking to support the Hive metastore and DDL commands in the near future.

Thanks,
Liwen

On Thu, Jun 20, 2019 at 4:46 AM James Cotrotsios <jamescotrotsios@gmail.com> wrote:
Is there a plan to have a business catalog component for Delta Lake? If not, how would someone propose creating an open-source project related to that? I would be interested in building an open-source data catalog that uses the Hive metastore as a baseline for technical metadata.


On Wed, Jun 19, 2019 at 3:04 PM Liwen Sun <liwen.sun@databricks.com> wrote:
We are delighted to announce the availability of Delta Lake 0.2.0!

To try out Delta Lake 0.2.0, please follow the Delta Lake Quickstart:
https://docs.delta.io/0.2.0/quick-start.html

To view the release notes:
https://github.com/delta-io/delta/releases/tag/v0.2.0

This release introduces two main features:

Cloud storage support
In addition to HDFS, you can now configure Delta Lake to read and write data on cloud storage services such as Amazon S3 and Azure Blob Storage. For configuration instructions, please see: https://docs.delta.io/0.2.0/delta-storage.html
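
As a rough illustration (the package coordinates, bucket, and path below are my own assumptions; the exact settings are in the storage doc above), an S3 setup looks roughly like:

// Launch Spark with Delta and the S3 LogStore setting described in the storage guide,
// e.g. something along the lines of:
//   spark-shell --packages io.delta:delta-core_2.11:0.2.0 \
//     --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore

// Then write a Delta table to S3 and read it back (bucket and path are placeholders).
spark.range(0, 5).write.format("delta").save("s3a://my-bucket/delta/events")
spark.read.format("delta").load("s3a://my-bucket/delta/events").show()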

Improved concurrency
Delta Lake now allows concurrent append-only writes while still ensuring serializability. For concurrency control in Delta Lake, please see: https://docs.delta.io/0.2.0/delta-concurrency.html
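
For example (a sketch with a hypothetical table path), two jobs can append to the same table at the same time and both commits succeed:

import org.apache.spark.sql.functions.lit

// Hypothetical path; run the same append from two writers concurrently.
// In 0.2.0, append-only commits no longer conflict with each other, and the
// table history remains serializable.
spark.range(100)
  .withColumn("source", lit("job-A"))   // the second writer would use "job-B"
  .write.format("delta")
  .mode("append")
  .save("s3a://my-bucket/delta/events")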

We have also greatly expanded the test coverage as part of this release.

We would like to acknowledge all community members for contributing to this release.

Best regards,
Liwen Sun



--
Best Regards,
Ayan Guha