spark-user mailing list archives

From Marco Mistroni <mmistr...@gmail.com>
Subject Re: [TorrentBroadcast] Pyspark Application terminated saying "Failed to get broadcast_1_ piece0 of broadcast_1 in Spark 2.0.0"
Date Thu, 29 Dec 2016 19:58:00 GMT
Hello,
No, sorry, I don't have any further insight into that. I have seen similar
errors but for completely different issues, and in most of the cases it had
to do with my data or my processing rather than Spark itself.
What does your program do? You say it runs for 2-3 hours; what is the
logic? Is it just processing a huge amount of data, or doing ML?
I'd say break down your application: reduce functionality, run, and see the
outcome. Then add more functionality, run, and see again.
I found myself doing these kinds of things when I got errors in my Spark
apps.
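
The reduce-then-add-back approach can be sketched roughly as below. This is only an illustration in plain Python, not code from the application in question: the helper name `first_failing_stage` and the three stand-in stages are hypothetical. In a real Spark job each stage would need to end with an action (for example a `count()`), because Spark evaluates lazily and an error otherwise surfaces only at a later step.

```python
from typing import Any, Callable, List, Optional, Tuple

# A stage is a (name, function) pair; each function takes the previous result.
Stage = Tuple[str, Callable[[Any], Any]]


def first_failing_stage(stages: List[Stage], initial: Any = None) -> Optional[str]:
    """Run stages in order and return the name of the first one that raises.

    Returns None when every stage completes.
    """
    value = initial
    for name, fn in stages:
        try:
            value = fn(value)
        except Exception as exc:  # broad on purpose: we only want to locate the failure
            print(f"stage {name!r} failed: {exc!r}")
            return name
        print(f"stage {name!r} ok")
    return None


# Hypothetical stand-in stages; replace each with one piece of the real job.
stages = [
    ("load", lambda _: [1, 2, 0, 4]),
    ("transform", lambda rows: [10 / r for r in rows]),  # divides by the zero row
    ("aggregate", lambda vals: sum(vals)),
]

failed = first_failing_stage(stages)
```

Once the failing stage is known, everything after it can be dropped and the stages before it reduced until only the smallest failing program remains.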

To get concrete help you will have to trim down the code to a few lines
that can reproduce the error. That will be a great start.
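
For a sense of scale, a trimmed-down reproduction might be as small as the sketch below. It assumes the failure can be triggered by loading a single CSV and running one action; the function name `run_minimal_repro`, the application name, and any path passed in are hypothetical and not taken from the original application.

```python
def run_minimal_repro(csv_path):
    """Read one CSV and force a single action; return the row count."""
    # Imported lazily so this function can be defined without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("broadcast-repro").getOrCreate()
    try:
        df = spark.read.csv(csv_path, header=True)
        # count() is an action, so the read is actually executed here.
        return df.count()
    finally:
        spark.stop()
```

Calling it once with a `file://` path and once with an `hdfs://` path would follow the suggestion made earlier in this thread to compare the local filesystem against Hadoop.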

Sorry for not being of much help

hth
 marco





On Thu, Dec 29, 2016 at 12:00 PM, Palash Gupta <spline_palash@yahoo.com>
wrote:

> Hi Marco,
>
> Thanks for your response.
>
> Yes, I tested that before. I am able to load from the Linux filesystem,
> but it also sometimes has a similar issue.
>
> However, in both cases (loading from either Hadoop or the Linux
> filesystem), this error appears in some specific scenarios, as per my
> observations:
>
> 1. When two separate Spark applications are initiated in parallel from one
> driver (not every time, only sometimes).
> 2. If one Spark job runs for longer than expected, say 2-3 hours, the
> second application terminates with the error.
>
> To help me debug the problem, it would be good if you could share some
> possible reasons why the "Failed to get broadcast" error may occur.
>
> Or if you need more logs I can share.
>
> Thanks again Spark User Group.
>
> Best Regards
> Palash Gupta
>
>
>
>
> On Thu, 29 Dec, 2016 at 2:57 pm, Marco Mistroni
> <mmistroni@gmail.com> wrote:
> Hi,
> Please try to read a CSV from the local filesystem instead of Hadoop. If
> you can read it successfully, then your Hadoop file is the issue and you
> can start debugging from there.
> HTH
>
> On 29 Dec 2016 6:26 am, "Palash Gupta" <spline_palash@yahoo.com.invalid>
> wrote:
>
>> Hi Apache Spark User team,
>>
>>
>>
>> Greetings!
>>
>> I started developing an application using Apache Hadoop and Spark with
>> Python. My PySpark application randomly terminates saying "Failed to get
>> broadcast_1*", and I have been searching for suggestions and support on
>> Stack Overflow at Failed to get broadcast_1_piece0 of broadcast_1 in
>> pyspark application
>> <http://stackoverflow.com/questions/41236661/failed-to-get-broadcast-1-piece0-of-broadcast-1-in-pyspark-application>
>>
>>
>> Failed to get broadcast_1_piece0 of broadcast_1 in pyspark application
>> I was building an application on Apache Spark 2.00 with Python 3.4 and
>> trying to load some CSV files from HDFS (...
>>
>>
>> Could you please advise on registering for the Apache user list, or on
>> how I can get suggestions or support to debug the problem I am facing?
>>
>> Your response will be highly appreciated.
>>
>>
>>
>> Thanks & Best Regards,
>> Engr. Palash Gupta
>> WhatsApp/Viber: +8801817181502
>> Skype: palash2494
>>
>>
>>
