lucene-solr-user mailing list archives

From Roman Chyla <roman.ch...@gmail.com>
Subject Re: Processing a lot of results in Solr
Date Wed, 24 Jul 2013 15:23:18 GMT
Mikhail,
It is a slightly hacked JSONWriter. Actually, while poking around, I
discovered that dumping big hit sets would be possible. The main hurdle
right now is that the writer expects to receive documents with their fields
already loaded, but if it received something that loads docs lazily, you could
stream thousands and thousands of records just as it is done with the normal
response - standard operation. Well, people may cry this is not how SOLR is
meant to operate ;-)

roman
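
The lazy-loading idea in the message above can be illustrated with a small sketch. This is not Roman's JSONWriter and not a Solr API: it is a plain-Python stand-in, and `stream_docs`, `load_doc` and `fake_load` are hypothetical names chosen for illustration. It only shows the principle that the writer receives ids and fetches each document at the moment it is written.

```python
import io
import json
from typing import Callable, Iterable, TextIO


def stream_docs(doc_ids: Iterable[int],
                load_doc: Callable[[int], dict],
                out: TextIO) -> int:
    """Write a JSON array to `out`, loading each document only when it is written."""
    count = 0
    out.write("[")
    for doc_id in doc_ids:
        if count:
            out.write(",")
        # The document is fetched here, one at a time, so only a single
        # document is held in memory regardless of the size of the hit set.
        out.write(json.dumps(load_doc(doc_id)))
        count += 1
    out.write("]")
    return count


def fake_load(doc_id: int) -> dict:
    """Hypothetical stand-in for a stored-field lookup."""
    return {"id": doc_id, "title": f"doc {doc_id}"}


buf = io.StringIO()
written = stream_docs(range(3), fake_load, buf)
```

In a real server `out` would be the response stream rather than an in-memory buffer, which is what lets the output grow far beyond available memory.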


On Wed, Jul 24, 2013 at 5:28 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> Roman,
>
> Can you disclose how that streaming writer works? What does it stream:
> a docList or a docSet?
>
> Thanks
>
>
> On Wed, Jul 24, 2013 at 5:57 AM, Roman Chyla <roman.chyla@gmail.com>
> wrote:
>
> > Hello Matt,
> >
> > You can consider writing a batch processing handler, which receives a
> > query and, instead of sending results back, writes them into a file which
> > is then available for streaming (it has its own UUID). I am dumping many
> > GBs of data from Solr in a few minutes - your query + a streaming writer
> > can go a very long way :)
> >
> > roman
> >
> >
> > On Tue, Jul 23, 2013 at 5:04 PM, Matt Lieber <mlieber@impetus.com>
> > wrote:
> >
> > > Hello Solr users,
> > >
> > > Question regarding processing a lot of docs returned from a query: I
> > > potentially have millions of documents returned from a query. What is
> > > the common design to deal with this?
> > >
> > > 2 ideas I have are:
> > > - create a multithreaded client service to handle this
> > > - use the Solr "pagination" to retrieve a batch of rows at a time
> > >   ("start, rows" in Solr Admin console)
> > >
> > > Any other ideas that I may be missing ?
> > >
> > > Thanks,
> > > Matt
> > >
> > >
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  <mkhludnev@griddynamics.com>
>
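
The batch-handler suggestion quoted above (write the results to a file identified by a UUID, then stream that file later) might look roughly like the following. This is a sketch under assumptions, not the handler described in the thread: `dump_results` and `stream_results` are hypothetical names, the one-JSON-document-per-line file format is an arbitrary choice, and a generator stands in for the query results.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Iterable, Iterator


def dump_results(docs: Iterable[dict], out_dir: Path) -> str:
    """Write docs to <out_dir>/<job_id>.jsonl, one JSON document per line."""
    job_id = uuid.uuid4().hex  # the id the client later uses to fetch the dump
    with open(out_dir / f"{job_id}.jsonl", "w", encoding="utf-8") as fh:
        for doc in docs:
            fh.write(json.dumps(doc) + "\n")
    return job_id


def stream_results(job_id: str, out_dir: Path) -> Iterator[dict]:
    """Read a finished dump back line by line, without loading it whole."""
    with open(out_dir / f"{job_id}.jsonl", encoding="utf-8") as fh:
        for line in fh:
            yield json.loads(line)


out_dir = Path(tempfile.mkdtemp())
# A generator stands in for the documents matched by the query.
job = dump_results(({"id": i} for i in range(5)), out_dir)
roundtrip = list(stream_results(job, out_dir))
```

Separating the dump from the download means the query runs once on the server, and the client can fetch or re-fetch the file at its own pace.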

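Matt's second idea in the original question, paging with `start` and `rows`, can be sketched as a client-side loop. The `q`, `start`, `rows` and `wt` parameters and the `response.numFound` / `response.docs` fields follow Solr's standard JSON response; everything else is an assumption made for illustration: the base URL is a placeholder, and `fetch_page` is a hypothetical callable (faked here with an in-memory list so the sketch runs without a server).

```python
from typing import Callable, Iterator
from urllib.parse import urlencode


def page_url(base_url: str, query: str, start: int, rows: int) -> str:
    """Build the select URL for one page of results."""
    params = {"q": query, "start": start, "rows": rows, "wt": "json"}
    return base_url + "?" + urlencode(params)


def iter_docs(fetch_page: Callable[[int, int], dict],
              rows: int = 1000) -> Iterator[dict]:
    """Yield every matching document, requesting `rows` documents per call."""
    start = 0
    while True:
        body = fetch_page(start, rows)["response"]
        docs = body["docs"]
        yield from docs
        start += len(docs)
        # Stop on an empty page or once all numFound documents were seen.
        if not docs or start >= body["numFound"]:
            break


# Fake backend: 25 documents served in the shape of a Solr JSON response.
corpus = [{"id": i} for i in range(25)]


def fake_fetch(start: int, rows: int) -> dict:
    return {"response": {"numFound": len(corpus),
                         "docs": corpus[start:start + rows]}}


all_docs = list(iter_docs(fake_fetch, rows=10))
url = page_url("http://localhost:8983/solr/select", "title:solr", 0, 10)
```

Note that each request with a large `start` makes Solr collect and skip all the preceding hits, so this approach gets slower the deeper it pages; that cost is one reason the server-side dump discussed in the thread is attractive for millions of documents.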