lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8888) Add shortestPath Streaming Expression
Date Mon, 28 Mar 2016 17:15:25 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214407#comment-15214407 ]

Joel Bernstein edited comment on SOLR-8888 at 3/28/16 5:14 PM:
---------------------------------------------------------------

First patch, which implements a breadth-first search using a threaded nested loop join. Each
join in the traversal is split into batches, and the batches are executed in threads within
the worker node. This approach spreads the join across all replicas. The bottleneck in this
scenario will be the network, as potentially dozens of search nodes will be returning nodes in
parallel to the same worker to satisfy the join. This bottleneck can be greatly reduced by
compression: the edges are returned sorted by the toField, so large amounts of repeated data
will be streamed in the same compression block. SOLR-8910 has been opened to add LZ4 compression
to the /export handler.
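
To illustrate why sort order matters for compression, here is a rough sketch. It is not code from
the patch. It uses java.util.zip.Deflater from the JDK as a stand-in for LZ4 (which SOLR-8910
proposes), and the class name, helper names, and sample data are all made up for the example. It
compresses the same list of toField values twice, once in random order and once sorted.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical demo class; Deflater stands in for LZ4, which is not in the JDK.
final class SortedCompressionSketch {

    // Returns the number of bytes produced by deflating the newline-joined values.
    static int deflatedSize(List<String> values) {
        byte[] input = String.join("\n", values).getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer); // only the byte count matters here
        }
        deflater.end();
        return total;
    }

    // Made-up sample data: `edges` toField values drawn from `distinct` node ids.
    static List<String> sampleToFieldValues(int edges, int distinct, long seed) {
        Random random = new Random(seed);
        List<String> values = new ArrayList<>(edges);
        for (int i = 0; i < edges; i++) {
            values.add("node" + random.nextInt(distinct));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> unsorted = sampleToFieldValues(50_000, 500, 42L);
        List<String> sorted = new ArrayList<>(unsorted);
        Collections.sort(sorted); // same values, now grouped into long runs
        System.out.println("unsorted, compressed bytes: " + deflatedSize(unsorted));
        System.out.println("sorted, compressed bytes:   " + deflatedSize(sorted));
    }
}
```

The sorted input consists of long runs of identical values, so the compressor can replace most of
it with back-references. The actual ratio with LZ4 on real edge data would need to be measured.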

In my last comment I mentioned using sorted memory-mapped files for the bookkeeping. In this
patch all bookkeeping is done in memory using HashMaps.
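
For readers who want the general shape of the approach, here is a minimal sketch. It is not the
code in the patch. The EdgeLookup interface is a hypothetical stand-in for one batched join
request against the search nodes, and the class, method, and parameter names are invented for
the example. Each level of the breadth-first search splits the frontier into batches, runs the
batches in a thread pool, and records visited nodes in a HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the patch.
final class ShortestPathSketch {

    // Stand-in for one join request: maps each node in the batch to its outgoing edges.
    interface EdgeLookup {
        Map<String, List<String>> edgesFrom(List<String> batch);
    }

    // In-memory EdgeLookup so the sketch runs without a Solr cluster.
    static EdgeLookup inMemory(Map<String, List<String>> graph) {
        return batch -> {
            Map<String, List<String>> result = new HashMap<>();
            for (String node : batch) {
                List<String> targets = graph.get(node);
                if (targets != null) result.put(node, targets);
            }
            return result;
        };
    }

    // Returns the nodes on a shortest path, or an empty list if none is found within maxDepth hops.
    static List<String> shortestPath(EdgeLookup lookup, String from, String to,
                                     int maxDepth, int batchSize, int threads) {
        if (from.equals(to)) return List.of(from);
        Map<String, String> parent = new HashMap<>(); // bookkeeping: node -> node it was reached from
        parent.put(from, null);
        List<String> frontier = List.of(from);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
                // Split the current frontier into batches and run each join in its own task.
                List<Future<Map<String, List<String>>>> futures = new ArrayList<>();
                for (int i = 0; i < frontier.size(); i += batchSize) {
                    List<String> batch = frontier.subList(i, Math.min(i + batchSize, frontier.size()));
                    Callable<Map<String, List<String>>> task = () -> lookup.edgesFrom(batch);
                    futures.add(pool.submit(task));
                }
                // Merge results on the calling thread, so the HashMap needs no locking.
                List<String> next = new ArrayList<>();
                for (Future<Map<String, List<String>>> future : futures) {
                    for (Map.Entry<String, List<String>> edges : future.get().entrySet()) {
                        for (String child : edges.getValue()) {
                            if (parent.containsKey(child)) continue; // already visited
                            parent.put(child, edges.getKey());
                            if (child.equals(to)) return buildPath(parent, to);
                            next.add(child);
                        }
                    }
                }
                frontier = next;
            }
            return List.of();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("join task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Walks the parent links back from the target to the start node.
    private static List<String> buildPath(Map<String, String> parent, String to) {
        LinkedList<String> path = new LinkedList<>();
        for (String node = to; node != null; node = parent.get(node)) path.addFirst(node);
        return path;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("a", List.of("b", "c"));
        graph.put("b", List.of("d"));
        graph.put("c", List.of("f"));
        graph.put("d", List.of("e"));
        System.out.println(shortestPath(inMemory(graph), "a", "e", 10, 1, 2)); // prints [a, b, d, e]
    }
}
```

Because a node is recorded the first time any level reaches it, the first path that arrives at the
target is a shortest one in terms of hops. Keeping all bookkeeping in a HashMap is simple, but it
limits the traversal to what fits in the worker's heap.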



> Add shortestPath Streaming Expression
> -------------------------------------
>
>                 Key: SOLR-8888
>                 URL: https://issues.apache.org/jira/browse/SOLR-8888
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>         Attachments: SOLR-8888.patch
>
>
> This ticket is to implement a distributed shortest path graph traversal as a Streaming Expression.
> Possible expression syntax:
> {code}
> shortestPath(collection, 
>                      from="colA:node1", 
>                      to="colB:node2", 
>                      fq="limiting query", 
>                      maxDepth="10")
> {code}
> This would start from colA:node1 and traverse from colA to colB iteratively until it finds colB:node2. The shortestPath function would emit Tuples representing the shortest path.
> The optional fq could be used to apply a filter on the traversal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


