lucene-solr-user mailing list archives

From Raghuveer Kancherla <raghuveer.kanche...@aplopio.com>
Subject Re: Retrieving large num of docs
Date Sat, 05 Dec 2009 17:05:49 GMT
Hi Otis,
I don't think my experiments are conclusive about the reduction in search
time. I was playing around with various configurations to reduce the time to
retrieve documents from Solr. I am sure that after changing the two
multi-valued text fields from stored to unstored, retrieval time (query time +
time to load the stored fields) became very fast. I was expecting the
enableLazyFieldLoading setting in solrconfig to take care of this, but
apparently it is not working as expected.
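
One way to check where the time goes is to compare the QTime that Solr
reports with the wall-clock time of the whole HTTP request, once with and once
without an "fl" restriction. The sketch below is only an illustration: the
base URL and the ResumeAllText field come from the example request quoted
further down, while the "id" field name, wt=json, and the helper names are
assumptions made for this example.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL, copied from the example request quoted in this thread.
SOLR_SELECT = "http://localhost:1212/solr/select/"


def build_url(query, rows=300, fl=None, base=SOLR_SELECT):
    """Build a select URL; pass fl to restrict the returned stored fields."""
    params = {"q": query, "rows": rows, "start": 0, "wt": "json"}
    if fl is not None:
        params["fl"] = fl
    return base + "?" + urllib.parse.urlencode(params)


def split_timing(qtime_ms, wall_ms):
    """Return the part of the wall-clock time not covered by QTime."""
    return max(wall_ms - qtime_ms, 0.0)


def time_request(url):
    """Fetch url and return (QTime in ms, wall-clock time in ms)."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        body = response.read()
    wall_ms = (time.perf_counter() - started) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return qtime_ms, wall_ms


def compare(query):
    """Print timings for the same query with and without fl=id."""
    for fl in (None, "id"):  # "id" is an assumed unique-key field name
        qtime, wall = time_request(build_url(query, fl=fl))
        rest = split_timing(qtime, wall)
        print(f"fl={fl}: QTime={qtime} ms, wall={wall:.0f} ms, rest={rest:.0f} ms")
```

Calling compare('ResumeAllText:"java j2ee"') against a running instance prints
one line per variant. If the "rest" value stays large even with fl=id, the
time is probably going into something other than the search itself, such as
loading stored fields or writing the response.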

Out of curiosity, I removed these 2 fields from the index (this time I am
not even indexing them) and my search time got about 10 times better.
However, I am still trying to isolate the reason for the search time
reduction. It may be because there are 2 fewer fields to search in, because
of the reduction in index size, or maybe something else. I am not
sure whether lazy field loading plays any part in explaining this.

- Raghu



On Fri, Dec 4, 2009 at 3:07 AM, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:

> Hm, hm, interesting.  I was looking into something like this the other day
> (BIG indexed+stored text fields).  After seeing enableLazyFieldLoading=true
> in solrconfig and after seeing "fl" didn't include those big fields, I
> thought "hm, so Lucene/Solr will not be pulling those large fields from disk,
> OK".
>
> You are saying that this may not be true based on your experiment?
> And what I'm calling your "experiment" means that you reindexed the same
> data, but without the 2 multi-valued text fields... and that was the only
> change you made, and you got a roughly 10x search performance improvement?
>
> Sorry for repeating your words, just trying to confirm and understand.
>
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> ----- Original Message ----
> > From: Raghuveer Kancherla <raghuveer.kancherla@aplopio.com>
> > To: solr-user@lucene.apache.org
> > Sent: Thu, December 3, 2009 8:43:16 AM
> > Subject: Re: Retrieving large num of docs
> >
> > Hi Hoss,
> >
> > I was experimenting with various queries to solve this problem and in one
> > such test I remember that requesting only the ID did not change the
> > retrieval time. To be sure, I tested it again using the curl command today
> > and it confirms my previous observation.
> >
> > Also, the enableLazyFieldLoading setting is set to true in my solrconfig.
> >
> > Another general observation (off topic) is that having a moderately large
> > multi valued text field (~200 entries) in the index seems to slow down the
> > search significantly. I removed the 2 multi valued text fields from my index
> > and my search got ~10 times faster. :)
> >
> > - Raghu
> >
> >
> > On Thu, Dec 3, 2009 at 2:14 AM, Chris Hostetter wrote:
> >
> > >
> > > : I think I solved the problem of retrieving 300 docs per request for now. The
> > > : problem was that I was storing 2 moderately large multivalued text fields
> > > : though I was not retrieving them during search time.  I reindexed all my
> > > : data without storing these fields. Now the response time (time for Solr to
> > > : return the http response) is very close to the QTime Solr is showing in the
> > >
> > > Hmmm....
> > >
> > > two comments:
> > >
> > > 1) the example URL from your previous mail...
> > >
> > > : > http://localhost:1212/solr/select/?rows=300&q=%28ResumeAllText%3A%28%28%28%22java+j2ee%22+%28java+j2ee%29%29%29%5E4%29%5E1.0%29&start=0&wt=python
> > >
> > > ...doesn't match your earlier statement that you are only returning the id
> > > field (there is no "fl" param in that URL) ... are you certain you weren't
> > > returning those large stored fields in the response?
> > >
> > > 2) assuming you were actually using an fl param to limit the fields, make
> > > sure you have this setting in your solrconfig.xml...
> > >
> > >    <enableLazyFieldLoading>true</enableLazyFieldLoading>
> > >
> > > ..that should make it pretty fast to return only a few fields of each
> > > document, even if you do have some jumbo stored fields that aren't being
> > > returned.
> > >
> > >
> > >
> > > -Hoss
> > >
> > >
>
>
