spark-user mailing list archives

From Andy Konwinski <>
Subject Fwd: Spark Streaming: Flume stream not found
Date Thu, 22 May 2014 16:46:04 GMT
I'm forwarding this email along which contains a question from a Spark user
Adrien (CC'd) who can't successfully get any emails through to the Apache
mailing lists.

Please reply-all when responding to include Adrien. See below for his
question.

---------- Forwarded message ----------
From: "Adrien Legrand" <>
Date: May 22, 2014 1:06 AM
Subject: Re: Post validation
To: "Andy Konwinski" <>

Thanks, that would be nice! Here is the question:

Title: Spark Streaming: Flume stream not found

Hi everyone,

I am currently trying to process a Flume (Avro) stream with Spark Streaming
on a YARN cluster. Everything works fine when I launch my code locally.
To do so, I use the following arguments:

master = "local"
host, port = the machine and port that Flume sends the stream to (I
triple-checked that the Flume and Spark settings match).

                val ssc = new StreamingContext(master, "FlumeEventCount",

                val stream = FlumeUtils.createStream(ssc, host, port.toInt)

But when I launch the same job in parallel on the cluster (replacing "local"
with "yarn-standalone"), the jar is launched (I can see some print statements
I added to debug the code), but it shows the expected output (from the data
processing) only 1 time out of 5 or 10.
Here is the complete command line:

$SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar /home/www/loganalysis-1.0-SNAPSHOT-jar-with-dependencies.jar \
  --class com.loganalysis.Computation \
  --args yarn-standalone --args 9999 \
  --num-workers 6 \
  --master-memory 4g \
  --worker-memory 2g \
  --worker-cores 1

For no apparent reason, the processing sometimes works. My first guess
was that, since I use one big jar with all dependencies in it, the other
machines don't have those dependencies and thus can't do the processing.
That's why I tried adding my executed jar with --addJars, but the result was
the same.

2014-05-22 7:18 GMT+02:00 Andy Konwinski <>:

> One other idea occurred to me. You can try subscribing to the nabble
> mailing lists and send your emails to those. They will relay emails to the
> apache lists. Not sure if it'll help or not.
> I'm really sorry to hear you're having problems with the lists.
> If you forward me your questions I can relay them to the mailing list for
> you.
> On Wed, May 21, 2014 at 1:56 AM, Adrien Legrand <> wrote:
>> Hello Andy,
>> Thank you for your answer.
>> I used the mailing list's form to submit both topics. Is it still
>> possible that the problem you mentioned is the cause?
>> I also check my mail directly in Gmail; I don't use a mail client
>> like Outlook.
>> The only other possible cause I can think of is my company's proxy,
>> but that seems really unlikely...
>> Thank you,
>> Adrien
>> 2014-05-20 18:36 GMT+02:00 Andy Konwinski <>:
>>> Hi Adrien,
>>> I'm not positive but I suspect something about how you set up your email
>>> client (or forwarding or something) may be triggering spam filters.
>>> In my gmail a note pops up in the email from you that says: "this
>>> message may not have been sent by".
>>> Other than that I'm not sure what else might be going wrong. You can try
>>> pinging the apache infra people for more help too.
>>> On May 20, 2014 2:42 AM, <> wrote:
>>>> Hello Andy,
>>>> I posted 2 different topics (the important one is "Spark streaming:
>>>> flume stream not found") on the spark user mailing list, but none of them
>>>> was accepted.
>>>> I registered before posting each topic and I don't think I made
>>>> any mistakes.
>>>> After posting the last (and important) subject, I received the
>>>> following mail:
>>>> Hi. This is the qmail-send program at <>
>>>> I'm afraid I wasn't able to deliver your message to the following
>>>> addresses.
>>>> This is a permanent error; I've given up. Sorry it didn't work out.
>>>> Did I do something wrong ?
>>>> Thank you,
>>>> Adrien