giraph-dev mailing list archives

From Avery Ching
Subject Re: On helping new contributors pitch in quickly...
Date Thu, 05 Apr 2012 15:05:29 GMT
Dan, you're definitely right that this has been mentioned a few times.  
The multigraph issue is one part of it, but a helper VertexInputFormat 
(and maybe VertexOutputFormat) would certainly still help as you 
mention.  Can you please open a JIRA (and help if you have time)?


On 4/5/12 1:49 AM, Dan Brickley wrote:
> On 5 April 2012 05:49, Jakob Homan wrote:
>> Ack! I suck.  Sorry.  I hadn't realized we'd gone through most of
>> them, which itself is a good thing.  I'll get some new ones added
>> first thing in the morning.  Sorry.
> Do we have something around "document a workflow to get RDF graph data
> into Giraph"? A few of us have been talking about it here and there,
> and I've heard various strategies mentioned (e.g. N-Triples, as it's a
> simple line-oriented format; piggybacking on HBase or other storage
> that Giraph already has adaptors for; integrating Apache Jena; ...). I
> can't find much in JIRA but
> touches on the issue
> (we can't currently easily represent fully general RDF graphs,
> since two nodes might be connected by more than one typed edge). Even
> without multigraphs it ought to be possible to bring RDF-sourced data
> into Giraph, e.g. perhaps some app is only interested in, say, the
> Movies + People subset of a big RDF collection. And so perhaps most of
> the work is in preprocessing for now, e.g. via N-Triples + Pig; but
> still it would be great to have a clear HOWTO.
> As an interested party on the periphery, a JIRA for this would give a
> natural place to monitor, read up, maybe even help. And I'm sure I'm
> not alone...
> cheers,
> Dan
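
To make the preprocessing idea above concrete, here is a minimal sketch (not an agreed workflow, and not a Giraph API) of turning N-Triples lines into a simple-graph adjacency list. It assumes only IRI-to-IRI triples matter, drops literals and blank nodes, and collapses multiple typed edges between the same pair into one edge, since multigraphs are not representable. The function names, the `keep_predicates` parameter, and the tab-separated output layout are all hypothetical; a real VertexInputFormat may expect something different.

```python
import re
from collections import defaultdict

# Matches only triples whose subject, predicate and object are all IRIs.
# Literal-valued triples, blank nodes and comment lines do not match
# and are skipped (an assumption made for this sketch).
TRIPLE_RE = re.compile(r'^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')


def ntriples_to_adjacency(lines, keep_predicates=None):
    """Build {vertex: set(neighbours)} from N-Triples lines.

    keep_predicates (hypothetical parameter): if given, only triples
    whose predicate IRI is in this set become edges, e.g. to select a
    Movies + People subset.
    """
    adjacency = defaultdict(set)
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match is None:
            continue
        subj, pred, obj = match.groups()
        if keep_predicates is not None and pred not in keep_predicates:
            continue
        # A set collapses parallel typed edges into a single edge.
        adjacency[subj].add(obj)
        # Touch the target so it appears as a vertex even with no out-edges.
        adjacency[obj]
    return adjacency


def format_adjacency(adjacency):
    """Render one tab-separated line per vertex: id, then sorted neighbours."""
    return ["\t".join([vertex] + sorted(neighbours))
            for vertex, neighbours in sorted(adjacency.items())]
```

The output lines could then be fed to whatever text-based input format is chosen; the edge types are lost in this sketch, which is exactly the limitation the multigraph issue is about.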
