lucene-java-user mailing list archives

From Uwe Schindler <...@thetaphi.de>
Subject Re: Index-time boosting: Deprecated setBoost method
Date Mon, 21 Oct 2019 19:24:07 GMT
No. That's how you do it: BooleanQuery with 2 should clauses.

Or use a different query parser that offers this out of box.
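
As a rough sketch of what two boosted SHOULD clauses amount to, here is a toy model in plain Java. It does not call Lucene. The assumption is that each per-field query is wrapped in a BoostQuery and both are added to a BooleanQuery as SHOULD clauses, so that a matching document's score is roughly the sum of boost times per-field score. The class FieldWeightSketch, the method combine, and the raw scores are made up for illustration; real scores come from the similarity in use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model, not Lucene code: score = sum of (field boost * per-field score). */
class FieldWeightSketch {

    /** Sums boost * score over the fields that matched; a field without a boost counts as 1.0. */
    static double combine(Map<String, Double> fieldScores, Map<String, Double> boosts) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : fieldScores.entrySet()) {
            total += boosts.getOrDefault(e.getKey(), 1.0) * e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new HashMap<>();
        boosts.put("field1", 2.0); // weights taken from the thread
        boosts.put("field2", 1.0);

        // Hypothetical raw scores: doc A matches only in field1, doc B only in field2.
        Map<String, Double> docA = Collections.singletonMap("field1", 0.8);
        Map<String, Double> docB = Collections.singletonMap("field2", 1.2);

        System.out.println("docA = " + combine(docA, boosts));
        System.out.println("docB = " + combine(docB, boosts));
    }
}
```

With these made-up numbers the field1 match ends up ahead although its raw score is lower, which is the effect the 2.0 weight is meant to have.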

Uwe

On October 21, 2019 7:16:01 PM UTC, baris.kazar@oracle.com wrote:
>Hi,-
>
>Thanks.
>
>  lets apply to this case:
>
>QueryParser parser = new QueryParser("field1", analyzer) ;
>parser.setPhraseSlop(2);
>Query query = parser.parse("some string value here"+"*");
>TopDocs hits = indexsearcherObject.search(query, 10);
>
>Now i want to use BoostQuery
>
>QueryParser parser = new QueryParser("field1", analyzerObject) ;
>parser.setPhraseSlop(2);
>Query query = parser.parse("some string value here"+"*");
>
>BoostQuery bq = new BoostQuery(query, 2.0f);
>
>TopDocs hits = indexsearcherObject.search(bq, 10);
>
>
>Now how will i process field2 with boost value 1.0f?
>
>Before, this was being done at index time.
>
>
>As far as i can see, the only way here is a BooleanQuery that combines
>the first BoostQuery object bq with a second one, bq2, that i need to
>define for field2.
>
>Is there any other way?
>
>Best regards
>
>
>
>On 10/21/19 2:33 PM, Uwe Schindler wrote:
>> Hi Baris,
>>
>>> That is ok, and i can see this case would be best with BoostQuery and
>>> also i dont have to use lucene expression jar and its dependents.
>>>
>>> However, i am curious how to do this kind of field based boosting at
>>> index time even though i will prefer the query time boosting methodology.
>> The reason why it was deprecated is exactly the problem I mentioned
>> before: it never did what the user expected. The boost factor given
>> in the document's field was multiplied into the per-document norms.
>> Unfortunately, at the same time, the query normalization was using
>> query statistics and normalized the scores. As Lucene works per field,
>> the same normalization is done per field, so a constant factor per
>> field disappears. There was still some effect of index-time boosting
>> if different documents had different values, but in your case they
>> are all the same. I am not sure how your queries worked before, but
>> constant boost factors per field at index time definitely did not
>> have the effect you were thinking of. Since the earliest versions of
>> Lucene, boosting at query time has been the way to get different
>> weights per field.
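
The cancellation described above can be illustrated with a toy model in plain Java. It shows only the arithmetic idea and is not Lucene's actual scoring or normalization formula: dividing by the largest score stands in for "some normalization step", and the class NormalizationSketch with its methods multiply and normalize is hypothetical.

```java
import java.util.Arrays;

/** Toy model, not Lucene's real scoring: a constant factor vanishes once scores are normalized. */
class NormalizationSketch {

    /** Multiplies each score by the factor for the same document. */
    static double[] multiply(double[] scores, double[] factors) {
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] * factors[i];
        }
        return out;
    }

    /** Divides every score by the largest one, as a stand-in for a normalization step. */
    static double[] normalize(double[] scores) {
        double max = 0.0;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = (max == 0.0) ? 0.0 : scores[i] / max;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] raw = {0.5, 1.0, 0.25};           // hypothetical scores of three documents
        double[] sameBoost = {2.0, 2.0, 2.0};      // identical index-time boost on every document
        double[] perDocBoost = {2.0, 1.0, 1.0};    // boost that differs between documents

        System.out.println(Arrays.toString(normalize(raw)));
        System.out.println(Arrays.toString(normalize(multiply(raw, sameBoost))));
        System.out.println(Arrays.toString(normalize(multiply(raw, perDocBoost))));
    }
}
```

In this model the first two printed arrays are identical, so the constant factor has no effect on ranking, while the third differs because the factor varies per document.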
>>
>> The new feature in Lucene is that you can now change the score per
>> document using docvalues and apply that per document at query time.
>> Previously this was also possible with Document/Field#setBoost, but the
>> flexibility was missing (only multiplication, and limited precision). In
>> addition, the normalization effects made the whole thing unreliable.
>>
>> Uwe
>>
>>> Best regards
>>>
>>>
>>> On 10/21/19 12:54 PM, Uwe Schindler wrote:
>>>> Hi,
>>>>
>>>> As I said before, that is a misuse of index-time boosting. In addition,
>>>> in previous versions it did not even work correctly: because of query
>>>> normalization, it was normalized away anyway. And on top of that, to
>>>> change it you have to reindex.
>>>>
>>>> What you intend to do is a typical use case for query-time boosting with
>>>> BoostQuery. That is explained in almost every book about search, like
>>>> those about Solr or Elasticsearch.
>>>>
>>>> Most query parsers also allow you to add boost factors for fields, e.g.
>>>> SimpleQueryParser (for humans who need a simple syntax without fields).
>>>> There you give a list of fields and boost factors.
>>>> Uwe
>>>>
>>>> -----
>>>> Uwe Schindler
>>>> Achterdiek 19, D-28357 Bremen
>>>> https://www.thetaphi.de
>>>> eMail: uwe@thetaphi.de
>>>>
>>>>> -----Original Message-----
>>>>> From: baris.kazar@oracle.com <baris.kazar@oracle.com>
>>>>> Sent: Monday, October 21, 2019 6:45 PM
>>>>> To: java-user@lucene.apache.org
>>>>> Cc: baris.kazar <baris.kazar@oracle.com>
>>>>> Subject: Re: Index-time boosting: Deprecated setBoost method
>>>>>
>>>>> Hi,-
>>>>>
>>>>> Thanks, and i appreciate the discussion.
>>>>>
>>>>> Let me please ask it this way; i think i gave too much info at one time:
>>>>>
>>>>> Currently i have this:
>>>>>
>>>>> Field f1 = new TextField("field1", "string1", Field.Store.YES);
>>>>> doc.add(f1);
>>>>> f1.setBoost(2.0f);
>>>>>
>>>>> Field f2 = new TextField("field2", "string2", Field.Store.YES);
>>>>> doc.add(f2);
>>>>> f2.setBoost(1.0f);
>>>>>
>>>>> But this fails with Lucene 7.7.2.
>>>>>
>>>>>
>>>>> Probably it is more efficient and more flexible to fix this by using
>>>>> BoostQuery.
>>>>>
>>>>> However, what could be the fix with index time boosting? The code in my
>>>>> previous post was trying to do that.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 10/21/19 12:34 PM, Uwe Schindler wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Sorry, I don't fully understand what you intend to do. If the boost
>>>>>> values per field are static and used with exactly the same value for
>>>>>> every document, they are not needed at index time. You can just boost
>>>>>> the field on the query side (e.g. using BoostQuery). Boosting every
>>>>>> document with the same static values is an anti-pattern; that's
>>>>>> something better suited for the query side, as you are more flexible
>>>>>> there.
>>>>>>
>>>>>> If you need a different boost value per document, you can save that
>>>>>> boost value in the index per document using a docvalues field (this
>>>>>> consumes extra space, of course). Then you need the ExpressionQuery on
>>>>>> the query side. Just because it looks like Javascript does not mean
>>>>>> it's slow: the syntax is compiled to bytecode and directly included
>>>>>> into the query execution as a dynamic Java class, so it's very fast.
>>>>>>
>>>>>> In short:
>>>>>> - If you need to have a different boost factor per field name that's
>>>>>>   constant for all documents, apply it at query time with BoostQuery.
>>>>>> - If you have to boost specific documents (e.g., top selling products),
>>>>>>   index a numeric docvalues field per document. On the query side you
>>>>>>   can use different query types to modify the score of each result
>>>>>>   based on the docvalues field. That can be done with Expression
>>>>>>   modules (using compiled Javascript) or by another query in Lucene
>>>>>>   that operates on ValueSource (e.g., FunctionQuery). The first one is
>>>>>>   easier to use for complex formulas.
>>>>>> Uwe
>>>>>>
>>>>>> -----
>>>>>> Uwe Schindler
>>>>>> Achterdiek 19, D-28357 Bremen
>>>>>> https://www.thetaphi.de
>>>>>> eMail: uwe@thetaphi.de
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: baris.kazar@oracle.com <baris.kazar@oracle.com>
>>>>>>> Sent: Monday, October 21, 2019 5:17 PM
>>>>>>> To: java-user@lucene.apache.org
>>>>>>> Cc: baris.kazar <baris.kazar@oracle.com>
>>>>>>> Subject: Re: Index-time boosting: Deprecated setBoost method
>>>>>>>
>>>>>>> Hi,-
>>>>>>>
>>>>>>> Sorry about the missing parts in the previous post; please accept my
>>>>>>> apologies for that.
>>>>>>>
>>>>>>> i needed to add a few more questions/corrections/additions to the
>>>>>>> previous post:
>>>>>>>
>>>>>>> The main question was: if the boost is a single constant value, do we
>>>>>>> need the Javascript part below?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> === Indexing code snippet for Lucene version 6.6.0 and before===
>>>>>>>
>>>>>>> Document doc = new Document();
>>>>>>>
>>>>>>> Field f1 = new TextField("field1", "string1", Field.Store.YES);
>>>>>>> doc.add(f1);
>>>>>>> f1.setBoost(2.0f);
>>>>>>>
>>>>>>> Field f2 = new TextField("field2", "string2", Field.Store.YES);
>>>>>>> doc.add(f2);
>>>>>>> f2.setBoost(1.0f);
>>>>>>>
>>>>>>> === end of indexing code snippet for Lucene version 6.6.0 and before ===
>>>>>>>
>>>>>>> This turns into this where _boost1 field is associated with
>field1 and
>>>>>>>
>>>>>>> _boost2 field is associated with field2 field:
>>>>>>>
>>>>>>>
>>>>>>> In Indexing code:
>>>>>>>
>>>>>>> === beginning of indexing code snippet ===
>>>>>>> Field f1 = new TextField("field1", "string1", Field.Store.YES);
>>>>>>>
>>>>>>> // The docvalues field name must match the variable bound at query time
>>>>>>> Field _boost1 = new NumericDocValuesField("_boost1", 2L);
>>>>>>> doc.add(_boost1);
>>>>>>>
>>>>>>> // If this boost value needs to be stored, a separate StoredField
>>>>>>> // instance needs to be added as well
>>>>>>> … ( i will post this soon)
>>>>>>>
>>>>>>> Field _boost2 = new NumericDocValuesField("_boost2", 1L);
>>>>>>> doc.add(_boost2);
>>>>>>>
>>>>>>> // If this boost value needs to be stored, a separate StoredField
>>>>>>> // instance needs to be added as well
>>>>>>> … ( i will post this soon)
>>>>>>>
>>>>>>> === end of indexing code snippet ===
>>>>>>>
>>>>>>>
>>>>>>> Now, in the searching code (i.e., at query time), do i need the
>>>>>>> FunctionScoreQuery, given that in this case the boost is just a
>>>>>>> constant value and not a function? However, a constant value can be
>>>>>>> argued to be a function with the same value all the time, too.
>>>>>>>
>>>>>>> === beginning of query time code snippet ===
>>>>>>> Expression expr = JavascriptCompiler.compile("_boost1 + _boost2");
>>>>>>>
>>>>>>> // SimpleBindings just maps variables to SortField instances
>>>>>>> SimpleBindings bindings = new SimpleBindings();
>>>>>>>
>>>>>>> // These have to be LONG type i think, since NumericDocValuesField
>>>>>>> // accepts "long" type only, am i right? Can this be DOUBLE type?
>>>>>>> bindings.add(new SortField("_boost1", SortField.Type.LONG));
>>>>>>> // same question here
>>>>>>> bindings.add(new SortField("_boost2", SortField.Type.LONG));
>>>>>>>
>>>>>>> // create a query that matches based on field1:term_to_look_for but
>>>>>>> // scores using expr
>>>>>>> Query query = new FunctionScoreQuery(
>>>>>>>         new TermQuery(new Term("field1", "term_to_look_for")),
>>>>>>>         expr.getDoubleValuesSource(bindings));
>>>>>>>
>>>>>>> searcher.search(query, 10);
>>>>>>>
>>>>>>> === end of code snippet ===
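
On the LONG versus DOUBLE question in the snippet above: a numeric docvalues slot holds a long, but a fractional factor can still be carried in it by storing the raw IEEE 754 bits. To my understanding this is what Lucene's DoubleDocValuesField does on the indexing side, with SortField.Type.DOUBLE as the matching type on the reading side; please check the Javadocs of your version. The sketch below shows only the bit-level round trip in plain Java; the class DoubleInLongSketch and its methods are hypothetical names.

```java
/** Sketch: carrying a fractional boost in a long-valued slot via its raw bit pattern. */
class DoubleInLongSketch {

    /** Returns the 64-bit pattern of the double, which fits a long-valued field. */
    static long encode(double boost) {
        return Double.doubleToRawLongBits(boost);
    }

    /** Reverses encode(): interprets the stored long as a double again. */
    static double decode(long stored) {
        return Double.longBitsToDouble(stored);
    }

    public static void main(String[] args) {
        double boost = 1.5;

        // A plain cast drops the fraction, so 1.5 would be stored as 1.
        System.out.println("cast:       " + (long) boost);

        // The bit pattern survives the round trip unchanged.
        long stored = encode(boost);
        System.out.println("round trip: " + decode(stored));
    }
}
```

The stored long is not meaningful as a number by itself, so the reading side has to know that the field holds encoded doubles.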
>>>>>>>
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>> On 10/21/19 11:05 AM, baris.kazar@oracle.com wrote:
>>>>>>>> Hi,-
>>>>>>>>
>>>>>>>>     i would like to ask the following to make it clearer (for me at least):
>>>>>>>>
>>>>>>>> Document doc = new Document();
>>>>>>>>
>>>>>>>> Field f1 = new TextField("field1", "string1", Field.Store.YES);
>>>>>>>> doc.add(f1);
>>>>>>>> f1.setBoost(2.0f);
>>>>>>>>
>>>>>>>> Field f2 = new TextField("field2", "string2", Field.Store.YES);
>>>>>>>> doc.add(f2);
>>>>>>>> f2.setBoost(1.0f);
>>>>>>>>
>>>>>>>> This turns into this where _boost1 field is associated with field1 and
>>>>>>>> _boost2 field is associated with field2 field:
>>>>>>>>
>>>>>>>> In Indexing code:
>>>>>>>>
>>>>>>>> Field f1 = new TextField("field1", "string1", Field.Store.YES);
>>>>>>>>
>>>>>>>> Field _boost1 = new NumericDocValuesField("field1", 2L);
>>>>>>>> doc.add(_boost1);
>>>>>>>>
>>>>>>>> // If this boost value needs to be stored, a separate storedField
>>>>>>>> // instance needs to be added as well
>>>>>>>> … ( i will post this soon)
>>>>>>>>
>>>>>>>> Field _boost2 = new NumericDocValuesField("field2", 1L);
>>>>>>>> doc.add(_boost2);
>>>>>>>>
>>>>>>>> // If this boost value needs to be stored, a separate storedField
>>>>>>>> // instance needs to be added as well
>>>>>>>> … ( i will post this soon)
>>>>>>>>
>>>>>>>> Now, in the searching code (i.e., at query time) should i need the
>>>>>>>> FunctionScoreQuery because in this case the boost is just a constant
>>>>>>>> value but not a function? However, constant value can be argued to be
>>>>>>>> a function with the same value all the time, too.
>>>>>>>>
>>>>>>>> Expression expr = JavascriptCompiler.compile("_boost");
>>>>>>>>
>>>>>>>> // SimpleBindings just maps variables to SortField instances
>>>>>>>> SimpleBindings bindings = new SimpleBindings();
>>>>>>>> bindings.add(new SortField("_boost1", SortField.Type.SCORE));
>>>>>>>>
>>>>>>>> // create a query that matches based on body:contents but
>>>>>>>> // scores using expr
>>>>>>>> Query query = new FunctionScoreQuery(
>>>>>>>>        new TermQuery(new Term("field1", "term_to_look_for")),
>>>>>>>>        expr.getDoubleValuesSource(bindings));
>>>>>>>>
>>>>>>>> searcher.search(query, 10);
>>>>>>>>
>>>>>>>> So, if boost is a single constant value, do we need the Javascript
>>>>>>>> part above?
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/18/19 4:07 PM, baris.kazar@oracle.com wrote:
>>>>>>>>> Uwe,-
>>>>>>>>>
>>>>>>>>>     can the
>>>>>>>>> https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html
>>>>>>>>> doc example that You also gave be extended with the
>>>>>>>>> NumericDocValuesField part that needs to be done for indexing time
>>>>>>>>> boosting, too?
>>>>>>>>>
>>>>>>>>> i see now why You meant that this is a mixed type of boosting (i.e.,
>>>>>>>>> both indexing time and search time).
>>>>>>>>>
>>>>>>>>> I then need to include the query mentioned in this example on the
>>>>>>>>> _score field (i would call it _boost field in my case) into my
>>>>>>>>> overall BooleanQuery.
>>>>>>>>>
>>>>>>>>> i will now try to combine these together and post here for future help.
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/18/19 3:18 PM, Uwe Schindler wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Read my original email! The index time values are written using
>>>>>>>>>> NumericDocValuesField. The expressions docs also refer to that when
>>>>>>>>>> the bindings are documented.
>>>>>>>>>>
>>>>>>>>>> It's separate from the indexed data (TextField). Think of it like an
>>>>>>>>>> additional numeric field in your database table with a factor in
>>>>>>>>>> each row.
>>>>>>>>>>
>>>>>>>>>> Uwe
>>>>>>>>>>
>>>>>>>>>> On October 18, 2019 7:14:03 PM UTC, baris.kazar@oracle.com wrote:
>>>>>>>>>>> Uwe,-
>>>>>>>>>>>
>>>>>>>>>>> Two questions there:
>>>>>>>>>>>
>>>>>>>>>>> i guess this is applicable to TextField, too.
>>>>>>>>>>>
>>>>>>>>>>> And i was expecting an index writer object in the example for index
>>>>>>>>>>> time boosting.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10/18/19 2:57 PM, Uwe Schindler wrote:
>>>>>>>>>>>> Sorry, I was imprecise. It's a mix of both. The factors are stored per
>>>>>>>>>>>> document in the index (this is why I called it index time). At query
>>>>>>>>>>>> time, the expression uses the index-time values and folds them into
>>>>>>>>>>>> the query boost.
>>>>>>>>>>>>
>>>>>>>>>>>> What's your problem with that approach?
>>>>>>>>>>>>
>>>>>>>>>>>> Uwe
>>>>>>>>>>>>
>>>>>>>>>>>> On October 18, 2019 6:50:40 PM UTC, baris.kazar@oracle.com wrote:
>>>>>>>>>>>>> Uwe,-
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Thanks, if possible i am looking for a pure Java methodology
>>>>>>>>>>>>> to do the index time boosting.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This example looks like a search time boosting example:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html
>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/18/19 2:31 PM, Uwe Schindler wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there a working example for this? Is this mentioned in the Lucene
>>>>>>>>>>>>>>> Javadocs or any other docs so that i can look it?
>>>>>>>>>>>>>> To index the docvalues, see NumericDocValuesField (it can be added to
>>>>>>>>>>>>>> documents like indexed or stored fields). You may have used them for
>>>>>>>>>>>>>> sorting already.
>>>>>>>>>>>>>>> this methodology seems sort of like discouraging using index time
>>>>>>>>>>>>>>> boosting.
>>>>>>>>>>>>>> Not really. Many use this all the time. It's one of the killer
>>>>>>>>>>>>>> features of both Solr and Elasticsearch. The problem was how
>>>>>>>>>>>>>> Document.setBoost() worked (it did not work correctly, see below).
>>>>>>>>>>>>>>> Previous setBoost method call was fine and easy to use.
>>>>>>>>>>>>>>> Did it have some performance issues and then is that why it was
>>>>>>>>>>>>>>> deprecated?
>>>>>>>>>>>>>> No, it was deprecated for several reasons. setBoost was not doing
>>>>>>>>>>>>>> what the user expected. Internally the boost value was just
>>>>>>>>>>>>>> multiplied into the document norm factor (which is internally also
>>>>>>>>>>>>>> a docvalues field). The norm factors are only very imprecise floats
>>>>>>>>>>>>>> stored in a byte, so precision is poor. If you put some values into
>>>>>>>>>>>>>> it and the length norm was already consuming all bits, the boosting
>>>>>>>>>>>>>> was very coarse. It was also only multiplied in, and most users
>>>>>>>>>>>>>> want to do something like record click counts in the index and then
>>>>>>>>>>>>>> boost, for example, with the logarithm or some other function. If
>>>>>>>>>>>>>> the boost is just multiplied into the length norm, you have no
>>>>>>>>>>>>>> flexibility at all.
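
To get a feeling for how coarse a float becomes when only a few bits remain, here is a toy illustration in plain Java. It is not Lucene's actual one-byte norm encoding; it simply assumes that about three mantissa bits survive and truncates the rest. The class CoarseFloatSketch and the method truncate are hypothetical.

```java
/** Toy illustration, not Lucene's norm encoding: a float reduced to 3 mantissa bits. */
class CoarseFloatSketch {

    /** Keeps sign, exponent and the top 3 of the 23 mantissa bits; clears the remaining 20. */
    static float truncate(float value) {
        int bits = Float.floatToIntBits(value);
        return Float.intBitsToFloat(bits & 0xFFF00000);
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.1f, 1.2f, 2.0f};
        for (float boost : boosts) {
            System.out.println(boost + " -> " + truncate(boost));
        }
    }
}
```

Under this assumption boosts of 1.0 and 1.1 collapse to the same stored value, which is the kind of coarseness described above.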
>>>>>>>>>>>>>> In addition, you can have several docvalues fields and use their
>>>>>>>>>>>>>> values in a function (e.g. one field with click count and another
>>>>>>>>>>>>>> one with product price). After that you can combine click count and
>>>>>>>>>>>>>> price (which can be modified independently during index updates)
>>>>>>>>>>>>>> and change the boost to rank lower price and higher click count up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is what you can do with the expressions module. You just give
>>>>>>>>>>>>>> it a function.
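
A minimal sketch of such a function, in plain Java rather than the expressions module: the formula, the class DocValueBoostSketch, and the inputs clicks and price are invented for illustration and assume price is greater than zero. In the expressions module the same idea would be a formula string over bound variables; if I recall correctly, the Expression Javadoc linked in this thread uses sqrt(_score) + ln(popularity) as its example.

```java
/** Sketch of a per-document score function over two hypothetical docvalues: clicks and price. */
class DocValueBoostSketch {

    /**
     * Made-up formula: more clicks raise the score logarithmically, a higher price lowers it.
     * Assumes price > 0; a document with zero clicks keeps queryScore / price.
     */
    static double boosted(double queryScore, long clicks, double price) {
        return queryScore * (1.0 + Math.log1p(clicks)) / price;
    }

    public static void main(String[] args) {
        double queryScore = 1.0; // hypothetical score from the text query

        System.out.println("few clicks, cheap:      " + boosted(queryScore, 10, 5.0));
        System.out.println("many clicks, cheap:     " + boosted(queryScore, 1000, 5.0));
        System.out.println("many clicks, expensive: " + boosted(queryScore, 1000, 50.0));
    }
}
```

Because clicks and price live in separate fields, either one can be updated without touching the other, and the formula can be changed at query time without reindexing.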
>>>>>>>>>>>>>> Here is an example, the second example is using a FunctionScoreQuery
>>>>>>>>>>>>>> that modifies the score based on the function and the given docvalues:
>>>>>>>>>>>>>> https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html
>>>>>>>>>>>>>>> FunctionScoreQuery usage with MultiFieldQueryParser would also be
>>>>>>>>>>>>>>> nice where
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> MultiFieldQueryParser already has a boosts field to do this in its
>>>>>>>>>>>>>>> constructor.
>>>>>>>>>>>>>> The boosts in the query parser are applied to fields at query
>>>>>>>>>>>>>> time (to have a different weight per field). Index time boosting is
>>>>>>>>>>>>>> per document. So you can combine both.
>>>>>>>>>>>>>>> Maybe it is not needed with MultiFieldQueryParser.
>>>>>>>>>>>>>> You use MultiFieldQueryParser to adjust weights of the fields (e.g.
>>>>>>>>>>>>>> title versus body). The parsed query is then wrapped with an
>>>>>>>>>>>>>> expression that modifies the score per document according to the
>>>>>>>>>>>>>> docvalues.
>>>>>>>>>>>>>> Uwe
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/18/19 1:28 PM, Uwe Schindler wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That's not true. You can do index time boosting, but you need to do
>>>>>>>>>>>>>>>> that using a separate field. You just index a numeric docvalues
>>>>>>>>>>>>>>>> field (which may contain a long or float value per document). Later
>>>>>>>>>>>>>>>> you wrap your query with some FunctionScoreQuery (e.g., use the
>>>>>>>>>>>>>>>> Javascript function query syntax in the expressions module). This
>>>>>>>>>>>>>>>> allows you to compile a Javascript function that calculates the
>>>>>>>>>>>>>>>> final score based on the score returned by the inner query and
>>>>>>>>>>>>>>>> combines it with docvalues that were indexed per document.
>>>>>>>>>>>>>>>> Uwe
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>> Uwe Schindler
>>>>>>>>>>>>>>>> Achterdiek 19, D-28357 Bremen
>>>>>>>>>>>>>>>> https://www.thetaphi.de
>>>>>>>>>>>>>>>> eMail: uwe@thetaphi.de
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: baris.kazar@oracle.com <baris.kazar@oracle.com>
>>>>>>>>>>>>>>>>> Sent: Friday, October 18, 2019 5:28 PM
>>>>>>>>>>>>>>>>> To: java-user@lucene.apache.org
>>>>>>>>>>>>>>>>> Cc: baris.kazar@oracle.com
>>>>>>>>>>>>>>>>> Subject: Re: Index-time boosting: Deprecated setBoost method
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It looks like index-time boosting (field) is not possible since
>>>>>>>>>>>>>>>>> Lucene version 7.7.2 and
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> i was using before for another case the BoostQuery at search time
>>>>>>>>>>>>>>>>> for boosting and
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> this seems to be the only boosting option now in Lucene.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 10/18/19 10:01 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>>>>>>>> Hi,-
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> i saw this in the Field class docs and i am figuring out the
>>>>>>>>>>>>>>>>>> following note in the docs:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> setBoost(float boost)
>>>>>>>>>>>>>>>>>> Deprecated.
>>>>>>>>>>>>>>>>>> Index-time boosts are deprecated, please index index-time scoring
>>>>>>>>>>>>>>>>>> factors into a doc value field and combine them with the score at
>>>>>>>>>>>>>>>>>> query time using eg. FunctionScoreQuery.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I appreciate this note. Is there an example about this? I wish docs
>>>>>>>>>>>>>>>>>> would give a simple example to further help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://lucene.apache.org/core/6_6_0//core/org/apache/lucene/document/Field.html
>>>>>>>>>>>>>>>>>> vs
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/document/Field.html
>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>>>>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de