spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Read from file and broadcast before every Spark Streaming bucket?
Date Fri, 30 Jan 2015 09:22:44 GMT
You should say what errors you see, but I assume they occur because you are
trying to create broadcast variables on the executors. Why do that? It sounds
like you already have the data you want available everywhere to read locally.
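
One way to read this suggestion, as a minimal PySpark sketch rather than a tested recipe: the function passed to `DStream.transform` is invoked on the driver once per batch, so the small file can be re-read there and the resulting set captured in the filter closure, with no broadcast variable created on the executors. The helper names `load_allowed_ads` and `make_batch_filter`, the file path, and the assumption that each record is an `(ad_name, count)` pair are all hypothetical and not from the original thread.

```python
import csv
from typing import Callable, FrozenSet


def load_allowed_ads(path: str) -> FrozenSet[str]:
    """Read ad names from a small CSV file on the machine that calls this."""
    with open(path, newline="") as handle:
        return frozenset(
            cell.strip()
            for row in csv.reader(handle)
            for cell in row
            if cell.strip()
        )


def make_batch_filter(path: str) -> Callable:
    """Build a function suitable for passing to DStream.transform."""

    def filter_batch(rdd):
        # Called once per batch on the driver, so the file is re-read here
        # and edits to it take effect from the next batch onward.
        allowed = load_allowed_ads(path)
        # Assumption: each record is an (ad_name, count) pair.
        # The small set travels to the executors inside the task closure.
        return rdd.filter(lambda record: record[0] in allowed)

    return filter_batch


# Hypothetical wiring, assuming `ad_events` is a DStream of (ad_name, count):
#   filtered = ad_events.transform(make_batch_filter("/path/to/ads.csv"))
```

For a list of four or five names, capturing the set in the closure is probably enough. If the list were large, a broadcast variable could be created at the same point in `filter_batch`, since that code runs on the driver.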
On Jan 30, 2015 4:06 AM, "YaoPau" <jonrgregg@gmail.com> wrote:

> I'm creating a real-time visualization of counts of ads shown on my website,
> using that data pushed through by Spark Streaming.
>
> To avoid clutter, it only looks good to show 4 or 5 lines on my
> visualization at once (corresponding to 4 or 5 different ads), but there are
> 50+ different ads that show on my site.
>
> What I'd like to do is quickly change which ads to pump through Spark
> Streaming, without having to rebuild the .jar and push it to my edge node.
> Ideally I'd have a .csv file on my edge node with a list of 4 ad names, and
> every time a StreamRDD is created it reads from that tiny file, creates a
> broadcast variable, and uses that variable as a filter.  That way I could
> just open up the .csv file, save it, and the stream filters correctly
> automatically.
>
> I keep getting errors when I try this.  Has anyone had success with a
> broadcast variable that updates with each new streamRDD?
