spark-user mailing list archives

From Marcin Cylke <>
Subject Re: Using Neo4j with Apache Spark
Date Thu, 12 Mar 2015 08:17:13 GMT
On Thu, 12 Mar 2015 00:48:12 -0700
d34th4ck3r <> wrote:

> I'm trying to use Neo4j with Apache Spark Streaming, but I am finding
> serializability to be an issue.
> Basically, I want Apache Spark to parse and bundle my data in real
> time. After the data has been bundled, it should be stored in the
> database, Neo4j. However, I am getting this error:


It seems some things in your task aren't serializable. A quick look at
the code suggests graphDB as a likely culprit.

If you want to create it in one place (the driver) and use it later in
a task, you can do something like this:

- create a container class, that you will broadcast

class LazyGraphDB extends Serializable {
  // @transient keeps the handle out of the serialized bytes;
  // lazy re-creates it on first access on each executor
  @transient lazy val graphDB = new GraphDatabase()
}

- then, in the driver code:

val graphDbBc = sc.broadcast(new LazyGraphDB)

- and in the task where you'd like to use it, just write:

graphDbBc.value.graphDB
Just remember the "@transient lazy" modifiers: @transient excludes the
database handle from serialization, and lazy re-initializes it on first
access after deserialization on each executor.
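To see why the pattern works, here is a minimal, Spark-free sketch of the same
idea: a Serializable container holding a "@transient lazy" field, round-tripped
through plain Java serialization. FakeGraphDb and roundTrip are hypothetical
names standing in for the non-serializable Neo4j handle and for what Spark does
when it ships a broadcast to an executor:

```scala
import java.io._

// Stand-in for a non-serializable resource such as a Neo4j database handle.
// Note it does NOT extend Serializable.
class FakeGraphDb {
  val id: Long = FakeGraphDb.counter.incrementAndGet()
}
object FakeGraphDb {
  val counter = new java.util.concurrent.atomic.AtomicLong(0)
}

// The broadcast container: @transient keeps the field out of the serialized
// bytes; lazy re-creates it on first access after deserialization.
class LazyGraphDB extends Serializable {
  @transient lazy val graphDB: FakeGraphDb = new FakeGraphDb
}

// Simulates what happens when Spark ships the broadcast to an executor.
def roundTrip[T <: Serializable](obj: T): T = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(obj)
  out.close()
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  in.readObject().asInstanceOf[T]
}

val original = new LazyGraphDB
val firstId = original.graphDB.id  // forces initialization on the "driver"
val copy = roundTrip(original)     // serializes fine: the db handle is @transient
val secondId = copy.graphDB.id     // a fresh instance is built on the "executor"
```

The key observation is that serialization succeeds even though FakeGraphDb is
not Serializable, and the deserialized copy lazily builds its own fresh
instance instead of carrying the driver's over the wire.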

