spark-user mailing list archives

From Iqbal Singh <iqbalkhattra.2...@gmail.com>
Subject Re: How more than one spark job can write to same partition in the parquet file
Date Sun, 05 Jan 2020 16:47:04 GMT
Hey Chetan,

I did not quite follow your question. Are you trying to write to a
partition from two actions, or from two separate jobs? Apart from having
to maintain state to track dataset completeness in that case, I don't see
any issues.

We are writing data to a partition using two different actions in a single
Spark job. Note that "partition" here means an HDFS directory, not a Hive
partition.
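
For illustration, here is a rough sketch of the pattern in Python, not our
actual code. It assumes PySpark DataFrames and uses the standard
df.write.mode("append").parquet(path) call, so each action adds new files
to the directory instead of replacing it. The helper names
(append_to_dir, mark_done, is_complete) and the ".done" marker files are
hypothetical, and the markers are written to a local directory only to
keep the sketch simple; on HDFS you would keep that state wherever suits
your setup.

```python
import os


def append_to_dir(df, target_dir):
    """Append a DataFrame to target_dir as additional Parquet files.

    `df` is assumed to be a PySpark DataFrame. Append mode keeps the files
    already present in the directory and adds new ones next to them.
    """
    df.write.mode("append").parquet(target_dir)


def mark_done(state_dir, writer_id):
    """Record that one writer has finished (hypothetical marker-file scheme)."""
    os.makedirs(state_dir, exist_ok=True)
    marker = os.path.join(state_dir, f"{writer_id}.done")
    with open(marker, "w"):
        pass  # an empty file is enough; only its existence matters


def is_complete(state_dir, expected_writers):
    """Return True only when every expected writer has left its marker."""
    return all(
        os.path.exists(os.path.join(state_dir, f"{writer_id}.done"))
        for writer_id in expected_writers
    )
```

Each action would call append_to_dir(...) followed by mark_done(...), and
a downstream reader would check is_complete(state_dir, ["action_a",
"action_b"]) before treating the directory as a complete dataset.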



On Thu, Dec 12, 2019 at 1:37 AM ayan guha <guha.ayan@gmail.com> wrote:

> We partitioned data logically for 2 different jobs...in our use case based
> on geography...
>
> On Thu, 12 Dec 2019 at 3:39 pm, Chetan Khatri <chetan.opensource@gmail.com>
> wrote:
>
>> Thanks, If you can share alternative change in design. I would love to
>> hear from you.
>>
>> On Wed, Dec 11, 2019 at 9:34 PM ayan guha <guha.ayan@gmail.com> wrote:
>>
>>> No we faced problem with that setup.
>>>
>>> On Thu, 12 Dec 2019 at 11:14 am, Chetan Khatri <
>>> chetan.opensource@gmail.com> wrote:
>>>
>>>> Hi Spark Users,
>>>> would that be possible to write to same partition to the parquet file
>>>> through concurrent two spark jobs with different spark session.
>>>>
>>>> thanks
>>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>> --
> Best Regards,
> Ayan Guha
>
