spark-dev mailing list archives

From Kyle Ellrott <kellr...@soe.ucsc.edu>
Subject Re: Implementing TinkerPop on top of GraphX
Date Thu, 06 Nov 2014 23:38:19 GMT
I think I've already done most of the work for the OLTP objects (Graph,
Element, Vertex, Edge, Properties) when implementing Tinkerpop2. Singleton
write operations, like addVertex/deleteEdge, were cached locally until a
read operation was requested, then the set of build operations was
parallelized into an RDD and merged with the existing graph.
It's not efficient for large numbers of operations, but it passes unit tests
and works for small graph tweaking.
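
To make the write-caching idea concrete, here is a minimal sketch of the
pattern in plain Python. It is an illustration only, not the sparkgraph code:
the class name BufferedGraph, its methods, and the merges counter are all
hypothetical, and frozensets stand in for the immutable GraphX RDDs. Writes
are only queued; the first read replays the queue in one batch and swaps in a
new snapshot, which is roughly where the real implementation would
parallelize the operations into an RDD and merge them with the existing graph.

```python
class BufferedGraph:
    """Toy graph that queues single writes and merges them lazily on read.

    Hypothetical names throughout; frozensets play the role of the
    immutable distributed datasets holding the existing graph.
    """

    def __init__(self):
        self._vertices = frozenset()
        self._edges = frozenset()   # (src, dst) pairs
        self._pending = []          # queued write operations, in order
        self.merges = 0             # how many batch merges have run

    # Write operations only record intent; nothing is rebuilt here.
    def add_vertex(self, vid):
        self._pending.append(("add_vertex", vid))

    def add_edge(self, src, dst):
        self._pending.append(("add_edge", (src, dst)))

    def remove_edge(self, src, dst):
        self._pending.append(("remove_edge", (src, dst)))

    def _merge_pending(self):
        """Replay all queued writes in one batch and swap in a new snapshot."""
        if not self._pending:
            return
        vertices, edges = set(self._vertices), set(self._edges)
        for op, arg in self._pending:
            if op == "add_vertex":
                vertices.add(arg)
            elif op == "add_edge":
                vertices.update(arg)  # make sure both endpoints exist
                edges.add(arg)
            elif op == "remove_edge":
                edges.discard(arg)
        self._vertices, self._edges = frozenset(vertices), frozenset(edges)
        self._pending.clear()
        self.merges += 1

    # Read operations force the merge before answering.
    def vertices(self):
        self._merge_pending()
        return self._vertices

    def edges(self):
        self._merge_pending()
        return self._edges


g = BufferedGraph()
g.add_vertex(1)
g.add_edge(1, 2)
g.add_edge(2, 3)
g.remove_edge(1, 2)
print(sorted(g.vertices()), sorted(g.edges()), g.merges)
```

Four writes cost a single merge here, which is why the approach is fine for
small tweaks. Interleaving reads and writes triggers one merge per read, and
that is the inefficient case for large numbers of operations.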

OLAP stuff looks completely new, but considering they have a Giraph
implementation, it should be pretty straightforward.

Kyle


On Thu, Nov 6, 2014 at 3:25 PM, York, Brennon <Brennon.York@capitalone.com>
wrote:

> This was my thought exactly with the TinkerPop3 release. Looks like, to
> move this forward, we’d need to implement gremlin-core per <
> http://www.tinkerpop.com/docs/3.0.0.M1/#_implementing_gremlin_core>. The
> real question lies in whether GraphX can only support the OLTP
> functionality, or if we can bake into it the OLAP requirements as well. At
> a first glance I believe we could create an entire OLAP system. If so, I
> believe we could do this in a set of parallel subtasks, those being the
> implementation of each of the individual APIs (Structure, Process, and, if
> OLAP, GraphComputer) necessary for gremlin-core. Thoughts?
>
>
> From: Kyle Ellrott <kellrott@soe.ucsc.edu>
> Date: Thursday, November 6, 2014 at 12:10 PM
> To: Kushal Datta <kushal.datta@gmail.com>
> Cc: Reynold Xin <rxin@databricks.com>, "York, Brennon" <
> brennon.york@capitalone.com>, "dev@spark.apache.org" <dev@spark.apache.org>,
> Matthias Broecheler <matthias@thinkaurelius.com>
> Subject: Re: Implementing TinkerPop on top of GraphX
>
> I still have to dig into the Tinkerpop3 internals (I started my work long
> before it had been released), but I can say that getting the Tinkerpop2
> Gremlin pipeline to work in GraphX was a bit of a hack. The
> whole Tinkerpop2 Gremlin design was based around streaming pipes of
> data, rather than large distributed map-reduce operations. I had to hack
> the pipes to aggregate all of the data and pass a single object wrapping
> the GraphX RDDs down the pipes in a single go, rather than streaming it
> element by element.
> Just based on their description, Tinkerpop3 may be more amenable to the
> Spark platform.
>
> Kyle
>
>
> On Thu, Nov 6, 2014 at 11:55 AM, Kushal Datta <kushal.datta@gmail.com>
> wrote:
>
>> What do you guys think about the Tinkerpop3 Gremlin interface?
>> It has MapReduce to run Gremlin operators in a distributed manner and
>> Giraph to execute vertex programs.
>>
>> Tinkerpop3 is better suited for GraphX.
>>
>> On Thu, Nov 6, 2014 at 11:48 AM, Kyle Ellrott <kellrott@soe.ucsc.edu>
>> wrote:
>>
>>> I've taken a crack at implementing the TinkerPop Blueprints API in
>>> GraphX ( https://github.com/kellrott/sparkgraph ). I've also implemented
>>> portions of the Gremlin Search Language and a Parquet-based graph store.
>>> I've been working to finalize some code details and putting together
>>> better code examples and documentation before I started telling people
>>> about it.
>>> But if you want to start looking at the code, I can answer any questions
>>> you have. And if you would like to contribute, I would really appreciate
>>> the help.
>>>
>>> Kyle
>>>
>>>
>>> On Thu, Nov 6, 2014 at 11:42 AM, Reynold Xin <rxin@databricks.com>
>>> wrote:
>>>
>>> > cc Matthias
>>> >
>>> > In the past we talked with Matthias and there were some discussions
>>> > about this.
>>> >
>>> > On Thu, Nov 6, 2014 at 11:34 AM, York, Brennon <
>>> > Brennon.York@capitalone.com>
>>> > wrote:
>>> >
>>> > > All, was wondering if there had been any discussion around this topic
>>> > > yet? TinkerPop <https://github.com/tinkerpop> is a great abstraction
>>> > > for graph databases and has been implemented across various graph
>>> > > database backends and is gaining traction. Has anyone thought about
>>> > > integrating the TinkerPop framework with GraphX to enable GraphX as
>>> > > another backend? Not sure if this has been brought up or not, but
>>> > > would certainly volunteer to spearhead this effort if the community
>>> > > thinks it to be a good idea!
>>> > >
>>> > > As an aside, wasn't sure if this discussion should happen on the
>>> > > board here or on JIRA, but I made a ticket as well for reference:
>>> > > https://issues.apache.org/jira/browse/SPARK-4279
>>> > >
>>> > > ________________________________________________________
>>> > >
>>> > > The information contained in this e-mail is confidential and/or
>>> > > proprietary to Capital One and/or its affiliates. The information
>>> > > transmitted herewith is intended only for use by the individual or
>>> > > entity to which it is addressed. If the reader of this message is not
>>> > > the intended recipient, you are hereby notified that any review,
>>> > > retransmission, dissemination, distribution, copying or other use of,
>>> > > or taking of any action in reliance upon this information is strictly
>>> > > prohibited. If you have received this communication in error, please
>>> > > contact the sender and delete the material from your computer.
>>> > >
>>> > >
>>> > >
>>> > >
>>> >
>>>
>>
>>
>
