spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Best practises to storing data in Parquet files
Date Mon, 29 Aug 2016 07:23:39 GMT
Hi Kevin.

When you say Kafka is interacting with an Oracle database, do you mean (if I
understand you correctly) that you are using GoldenGate with its Kafka
interface to push data from Oracle to Kafka?

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 29 August 2016 at 03:23, Chanh Le <giaosudau@gmail.com> wrote:

> > Does a Parquet file have a size limit (1TB)?
> I didn't see any problem, but 1TB is too big to operate on; you need to
> divide it into smaller pieces.
> > Should we use SaveMode.Append for a long-running streaming app?
> Yes, but you need to partition the data by time so that it is easy to
> maintain, e.g. to update or delete a specific time range without
> rebuilding everything.
> > How should we store it in HDFS (directory structure, ...)?
> You should partition the files into small pieces.
>
>
> > On Aug 28, 2016, at 9:43 PM, Kevin Tran <kevintvh@gmail.com> wrote:
> >
> > Hi,
> > Does anyone know what the best practices are for storing data in Parquet
> > files?
> > Does a Parquet file have a size limit (1TB)?
> > Should we use SaveMode.Append for a long-running streaming app?
> > How should we store it in HDFS (directory structure, ...)?
> >
> > Thanks,
> > Kevin.
>
>
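
For illustration only, here is a rough sketch of the time-partitioned layout
suggested in the quoted reply. It assumes one directory per day in the
column=value form that Spark writes when partitionBy is used. The base path
/data/events, the date column and the function names are all hypothetical,
and append_batch assumes df is a Spark DataFrame that already carries a date
column.

```python
from datetime import date, timedelta


def partition_dir(base: str, day: date) -> str:
    """Directory holding one day's Parquet files, in column=value form."""
    return f"{base.rstrip('/')}/date={day.isoformat()}"


def partition_dirs(base: str, first: date, last: date) -> list:
    """Daily partition directories from first to last, inclusive."""
    days = (last - first).days
    return [partition_dir(base, first + timedelta(days=n)) for n in range(days + 1)]


def append_batch(df, base: str) -> None:
    """Append one batch under base, split into one directory per date value.

    Assumption: df is a Spark DataFrame with a `date` column.
    """
    df.write.mode("append").partitionBy("date").parquet(base)
```

With this layout, dropping or rewriting a single day only touches the
directory returned by partition_dir for that day, so the rest of the data
does not need to be rebuilt.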
