lucene-solr-user mailing list archives

From Yandong Yao <yydz...@gmail.com>
Subject Re: SolrCloud result correctness compared with single core
Date Fri, 30 Jan 2015 02:10:34 GMT
Pretty helpful, thanks Erick!

2015-01-24 9:48 GMT+08:00 Erick Erickson <erickerickson@gmail.com>:

> You might, but probably not enough to notice. At 50GB, the tf/idf
> stats will _probably_ be close enough you won't be able to tell.
>
> That said, distributed tf/idf was implemented recently, but you
> need to ask for it; see SOLR-1632. That is Solr 5.0, though.
>
> I've rarely seen it matter except in fairly specialized situations.
> Consider a single core. Deleted documents still count towards
> some of the tf/idf stats. So your scoring could theoretically
> change after, say, an optimize.
>
> The so-called "bottom line" is that yes, the scoring may change, but
> IMO not any more radically than was possible with single cores,
> and I wouldn't worry about it unless I had evidence that it was
> biting me.
>
> Best
> Erick
>
> On Fri, Jan 23, 2015 at 2:52 PM, Yandong Yao <yydzero@gmail.com> wrote:
>
> > Hi Guys,
> >
> > As the main scoring mechanism is based on tf/idf, will the same query run
> > against SolrCloud return different results than when run against a single
> > core with the same data set, since idf only counts df inside one core?
> >
> > E.g., assume I have 100GB of data:
> > A) Index the data using a single core
> > B) Index the data using SolrCloud with two cores (each holding a 50GB
> > index)
> >
> > If I then run the same query, such as 'apple', against both, will I get
> > different results for A and B?
> >
> >
> > Regards,
> > Yandong
> >
>
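
To make the size of the effect discussed in this thread concrete, here is a rough sketch in Python. It assumes the classic Lucene idf formula, 1 + ln(numDocs / (docFreq + 1)). All document counts are made up for illustration, and `classic_idf` is a hypothetical helper, not a Solr API. It compares the idf each of two shards would compute locally for 'apple' against the idf a single core holding all the data would compute, and then shows how purging deleted documents (e.g. by an optimize) can shift idf on a single core.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical counts for the term 'apple' (assumed, not measured).
# Case B: two shards, each holding half of the documents.
idf_shard_1 = classic_idf(doc_freq=900, num_docs=500_000)
idf_shard_2 = classic_idf(doc_freq=1_100, num_docs=500_000)

# Case A: one core holding everything.
idf_single = classic_idf(doc_freq=2_000, num_docs=1_000_000)

# Single core again, after an optimize purges deleted documents.
# Assumed: 50,000 deleted docs in total, 200 of them containing 'apple'.
idf_single_optimized = classic_idf(doc_freq=1_800, num_docs=950_000)

print(f"shard 1:               {idf_shard_1:.3f}")
print(f"shard 2:               {idf_shard_2:.3f}")
print(f"single core:           {idf_single:.3f}")
print(f"single core optimized: {idf_single_optimized:.3f}")
```

With counts like these, the per-shard values sit slightly on either side of the single-core value, and the shift caused by purging deletes is of a similar order. That is consistent with the point above that the difference is usually too small to notice unless the term distribution across shards is heavily skewed.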
