spark-dev mailing list archives

From Hyukjin Kwon <>
Subject Inconsistent file extensions and omitting file extensions written by CSV, TEXT and JSON data sources.
Date Wed, 09 Mar 2016 05:49:09 GMT
Hi all,

Currently, the output from CSV, TEXT and JSON data sources does not have
file extensions such as .csv, .txt and .json (except for compression
extensions such as .gz, .deflate and .bz2).

In addition, it looks like Parquet output has extensions such as .gz.parquet
or .snappy.parquet, depending on the compression codec, whereas ORC output
has no codec-specific extension and is always just .orc.
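
As a rough illustration of the naming described above, here is a small sketch. It is not Spark's actual code: the function name, the codec table and the case with no codec for Parquet are assumptions made up for this example.

```python
# Illustration only: mirrors the suffixes described in this message,
# not Spark's real implementation. All names here are hypothetical.
COMPRESSION_EXT = {
    "gzip": ".gz",
    "deflate": ".deflate",
    "bzip2": ".bz2",
    "snappy": ".snappy",
}


def described_suffix(fmt, codec=None):
    """Return the output file suffix as described above for a format and codec."""
    comp = COMPRESSION_EXT[codec] if codec else ""
    if fmt in ("csv", "text", "json"):
        # No format extension; only the compression extension, if any.
        return comp
    if fmt == "parquet":
        # Codec-specific part followed by .parquet.
        # Assumption: with no codec the suffix is just ".parquet".
        return comp + ".parquet"
    if fmt == "orc":
        # Always .orc, regardless of the codec.
        return ".orc"
    raise ValueError("unknown format: " + fmt)


if __name__ == "__main__":
    for fmt in ("csv", "text", "json", "parquet", "orc"):
        print(fmt, repr(described_suffix(fmt, "gzip")))
```

With the same codec, the three text-based sources end up with only the compression extension, Parquet with both, and ORC with only the format extension, which is the inconsistency in question.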

I searched for JIRAs related to this but could not find any yet. However,
I did not open a JIRA directly because I feel like this is already
Could I open a JIRA for these inconsistent file extensions?

I would be thankful if you could give me some feedback.

