lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <>
Subject [jira] [Resolved] (SOLR-6526) Solr Streaming API
Date Thu, 05 Feb 2015 19:56:34 GMT


Joel Bernstein resolved SOLR-6526.
    Resolution: Duplicate

This ticket has been superseded by SOLR-7082. 

> Solr Streaming API
> ------------------
>                 Key: SOLR-6526
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>            Reporter: Joel Bernstein
>             Fix For: Trunk
>         Attachments: SOLR-6526.patch
> It would be great if there were a SolrJ library that could connect to Solr's /export handler (SOLR-5244) and perform streaming operations on the sorted result sets.
> This ticket defines the base interfaces and implementations for the Streaming API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks directly to the /export handler and provides methods to read() Tuples and close() the stream.
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It speaks with ZooKeeper to discover the Solr instances in the collection and then creates SolrStreams to make the requests. The results from the underlying streams are merged inline to produce a single sorted stream of tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream API. It is nested to support grouping and Cartesian product set operations.
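The inline merge described for CloudSolrStream can be pictured as a k-way merge over already-sorted inputs. The following is only a minimal sketch of that general technique, not the code in the attached patch: `MergeSketch`, `SortedSource`, `fromList`, `merge`, and `drain` are hypothetical names invented for illustration, and plain integers stand in for Tuples. It assumes, as in the ticket's read loop, that read() returns null once a stream is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeSketch {

    /** Hypothetical stand-in for one sorted per-instance stream; read() returns null when exhausted. */
    public interface SortedSource<T> {
        T read();
    }

    /** Pairs the most recently read item with the source it came from. */
    private static final class Head<T> {
        final T item;
        final SortedSource<T> source;

        Head(T item, SortedSource<T> source) {
            this.item = item;
            this.source = source;
        }
    }

    /** Wraps an already-sorted list as a source, purely for demonstration. */
    public static <T> SortedSource<T> fromList(List<T> sorted) {
        Iterator<T> it = sorted.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Merges several individually sorted sources into one sorted source. */
    public static <T> SortedSource<T> merge(List<SortedSource<T>> sources, Comparator<T> order) {
        // The heap holds at most one pending item per source, smallest on top.
        PriorityQueue<Head<T>> heap =
                new PriorityQueue<>((a, b) -> order.compare(a.item, b.item));
        for (SortedSource<T> s : sources) {
            T first = s.read();
            if (first != null) {
                heap.add(new Head<>(first, s));
            }
        }
        return () -> {
            Head<T> smallest = heap.poll();
            if (smallest == null) {
                return null; // every source is exhausted
            }
            // Refill the heap from the source that just produced the smallest item.
            T next = smallest.source.read();
            if (next != null) {
                heap.add(new Head<>(next, smallest.source));
            }
            return smallest.item;
        };
    }

    /** Reads a source until it returns null and collects the items in order. */
    public static <T> List<T> drain(SortedSource<T> source) {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = source.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<SortedSource<Integer>> shards = Arrays.asList(
                fromList(Arrays.asList(1, 4, 7)),
                fromList(Arrays.asList(2, 5, 8)),
                fromList(Arrays.asList(3, 6, 9)));
        System.out.println(drain(merge(shards, Comparator.<Integer>naturalOrder())));
    }
}
```

Because only one pending item per source is kept in memory, a merge of this shape can stay streaming no matter how large the underlying result sets are, provided every input is sorted by the same comparator.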
> Once these base classes are implemented, they pave the way for building *Decorator* streams that perform operations on the sorted Tuple sets. For example:
> {code}
> // Create three CloudSolrStreams to different solr cloud clusters. They could be anywhere in the world.
> SolrStream stream1 = new CloudSolrStream(zkUrl1, queryRequest1, "a"); // Alias this stream as "a"
> SolrStream stream2 = new CloudSolrStream(zkUrl2, queryRequest2, "b"); // Alias this stream as "b"
> SolrStream stream3 = new CloudSolrStream(zkUrl3, queryRequest3, "c"); // Alias this stream as "c"
> // Merge Join stream1 and stream2 using a comparator to compare tuples.
> MergeJoinStream joinStream1 = new MergeJoinStream(stream1, stream2, new MyComp());
> // Hash join the tuples from joinStream1 with stream3. The HashKey instances define the hash keys for the tuples.
> HashJoinStream joinStream2 = new HashJoinStream(joinStream1,stream3, new HashKey(), new
> //Sum the aliased fields from the joined tuples.
> SumStream sumStream1 = new SumStream(joinStream2, "a.field1");
> SumStream sumStream2 = new SumStream(sumStream1, "b.field2");
> Tuple t = null;
> //Read from the stream until it's finished.
> while ((t = sumStream2.read()) != null);
> //Get the sums from the joined data.
> long sum1 = sumStream1.getSum();
> long sum2 = sumStream2.getSum();
> {code}
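
The decorator idea in the quoted example (wrap a stream, pass tuples through, accumulate a result on the side) can be sketched in a self-contained form. This is an illustration under stated assumptions rather than the API from the patch: `DecoratorSketch`, `TupleStream`, `SummingStream`, `fromList`, and `tuple` are hypothetical names, a flat `Map<String, Object>` stands in for the ticket's nested Tuple, and read() is assumed to return null at end of stream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DecoratorSketch {

    /** Hypothetical minimal stream contract: read() returns null at end of stream. */
    public interface TupleStream {
        Map<String, Object> read();
    }

    /** Decorator that passes tuples through unchanged while summing one numeric field. */
    public static final class SummingStream implements TupleStream {
        private final TupleStream inner;
        private final String field;
        private long sum;

        public SummingStream(TupleStream inner, String field) {
            this.inner = inner;
            this.field = field;
        }

        @Override
        public Map<String, Object> read() {
            Map<String, Object> tuple = inner.read();
            if (tuple != null) {
                Object value = tuple.get(field);
                // Tuples without a numeric value for the field are passed through but not counted.
                if (value instanceof Number) {
                    sum += ((Number) value).longValue();
                }
            }
            return tuple;
        }

        public long getSum() {
            return sum;
        }
    }

    /** Wraps a list of tuples as a stream, purely for demonstration. */
    public static TupleStream fromList(List<Map<String, Object>> tuples) {
        Iterator<Map<String, Object>> it = tuples.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Builds a two-field tuple; a convenience helper for the demo only. */
    public static Map<String, Object> tuple(String k1, Object v1, String k2, Object v2) {
        Map<String, Object> t = new HashMap<>();
        t.put(k1, v1);
        t.put(k2, v2);
        return t;
    }

    public static void main(String[] args) {
        TupleStream base = fromList(Arrays.asList(
                tuple("a.field1", 1L, "b.field2", 10L),
                tuple("a.field1", 2L, "b.field2", 20L)));
        // Decorators chain: each one reads from the stream it wraps.
        SummingStream sum1 = new SummingStream(base, "a.field1");
        SummingStream sum2 = new SummingStream(sum1, "b.field2");
        while (sum2.read() != null) {
            // Draining the outermost stream drives every decorator beneath it.
        }
        System.out.println(sum1.getSum() + " " + sum2.getSum());
    }
}
```

Only the outermost stream needs to be read: each read() call pulls one tuple through the whole chain, so both sums are complete once the loop ends.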
