> Does parquet file has limit in size ( 1TB ) ?
I didn't see any problem, but 1TB is too big to operate on; you need to divide it into smaller pieces.
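For example, a rough sketch in Scala (the input DataFrame and paths are just placeholders) of spreading a large dataset over many smaller Parquet files:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().getOrCreate()
    val df = spark.read.json("/data/input")   // hypothetical input source

    df.repartition(200)        // more partitions => more, smaller output files
      .write
      .parquet("/data/output")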
> Should we use SaveMode.APPEND for long running streaming app ?
Yes, but you need to partition the data by time so it is easy to maintain, e.g. updating or deleting
a specific time range without rebuilding everything.
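Something along these lines (the "events" DataFrame, its timestamp column and the paths are assumptions, not your actual schema):

    import org.apache.spark.sql.{SaveMode, SparkSession}
    import org.apache.spark.sql.functions.to_date

    val spark = SparkSession.builder().getOrCreate()
    import spark.implicits._

    // derive a partition column from an assumed "timestamp" field
    val events = spark.read.json("/data/incoming")
      .withColumn("date", to_date($"timestamp"))

    events.write
      .mode(SaveMode.Append)      // keep appending from the long-running job
      .partitionBy("date")        // one sub-directory per day
      .parquet("/data/warehouse/events")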
> How should we store in HDFS (directory structure, ... )?
You should partition the data into small files, e.g. with a time-based directory layout so queries only read the partitions they need.
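For example (again with illustrative column names and paths), a year/month/day layout that Spark can prune when you filter on those columns:

    // assuming a DataFrame "events" that already has year/month/day columns
    events.write
      .partitionBy("year", "month", "day")
      .parquet("hdfs:///data/events")

    // resulting layout in HDFS:
    //   hdfs:///data/events/year=2016/month=08/day=29/part-....parquet

    // reading back one day only scans that directory (partition pruning)
    val oneDay = spark.read.parquet("hdfs:///data/events")
      .where("year = 2016 AND month = 8 AND day = 29")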
> On Aug 28, 2016, at 9:43 PM, Kevin Tran <kevintvh@gmail.com> wrote:
>
> Hi,
> Does anyone know what is the best practises to store data to parquet file?
> Does parquet file has limit in size ( 1TB ) ?
> Should we use SaveMode.APPEND for long running streaming app ?
> How should we store in HDFS (directory structure, ... )?
>
> Thanks,
> Kevin.
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org