lucene-solr-user mailing list archives

From MitchK <>
Subject Term Dictionary + scoring
Date Fri, 15 Jan 2010 13:19:19 GMT


I have searched the wiki and the mailing lists, but I can't find any
postings about the following two use cases.

First, I want to create a term dictionary that I can return to my client.
The client should be able to manipulate this response in any way he wants,
so I really need a human-readable dictionary that I can export to a
database if I need to.
I know that Lucene has a term dictionary, but I don't know how to access it.
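
To illustrate the kind of dictionary I mean, here is a minimal sketch in
plain Python. It does not use Lucene's or Solr's API; the tokenizer and the
function names (build_term_dictionary, to_csv) are my own assumptions, and a
real index would apply its configured analyzer instead. It only shows the
output I am after: each term with its document frequency, in a format that
can be loaded into a database.

```python
import csv
import io
import re
from collections import Counter


def build_term_dictionary(docs):
    """Map each term to the number of documents that contain it."""
    doc_freq = Counter()
    for text in docs.values():
        # Hypothetical analysis step: lowercase, split on non-alphanumerics.
        # The set ensures a term is counted once per document.
        terms = set(re.findall(r"[a-z0-9]+", text.lower()))
        doc_freq.update(terms)
    return doc_freq


def to_csv(doc_freq):
    """Serialize the dictionary as CSV, sorted by term, for export."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["term", "doc_freq"])
    for term, freq in sorted(doc_freq.items()):
        writer.writerow([term, freq])
    return out.getvalue()


docs = {
    1: "Star Wars - Episode I - Phantom Menace DVD Extended Edition",
    4: "Star Wars - Heir to the Empire by Timothy Zahn",
}
term_dict = build_term_dictionary(docs)
print(to_csv(term_dict))
```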

Second, I want to manipulate the scoring of a document. Sure, there are some
good ways to do so out of the box, but I want to do it in a special way.
For example, my index contains the following stored documents:
1. "Star Wars - Episode I - Phantom Menace DVD Extended Edition"
2. "Star Wars - Episode I - Phantom Menace Video-box"
3. "Star Wars - Episode V - The Empire Strikes Back Special Edition DVD"
4. "Star Wars - Heir to the Empire by Timothy Zahn"

There have been only three queries since the index was built:
1. "Star Wars Episode I" -> the user clicked document 1
2. "Star Wars" -> the user clicked document 3
3. "Star Wars Empire" -> the user could not remember the title of a
particular Star Wars book, searched with this query, and clicked document 4

Now, I want to do the following:
If someone queries for "Star Wars" again, document 3 should be the first
document returned, because whenever someone has searched for "Star Wars" in
the past, document 3 was the most popular document.
If someone queries for "Star Wars Episode I", document 1 should be returned
first, for the same reason.
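
The behaviour I have in mind can be sketched roughly as follows. This is
plain Python, not Solr code, and every name in it (clicks, record_click,
rerank) is hypothetical. It assumes the search engine has already returned
the document ids in its normal relevance order and simply moves documents
that were clicked for the same query to the front.

```python
from collections import defaultdict

# clicks[query][doc_id] = how often doc_id was clicked for that query
clicks = defaultdict(lambda: defaultdict(int))


def normalize(query):
    """Treat queries that differ only in case or spacing as the same."""
    return " ".join(query.lower().split())


def record_click(query, doc_id):
    clicks[normalize(query)][doc_id] += 1


def rerank(query, results):
    """Reorder results (doc ids in original relevance order) by clicks."""
    per_doc = clicks.get(normalize(query), {})
    # sorted() is stable, so documents with the same click count keep
    # the order the engine gave them.
    return sorted(results, key=lambda doc_id: -per_doc.get(doc_id, 0))


# The three queries from the example above.
record_click("Star Wars Episode I", 1)
record_click("Star Wars", 3)
record_click("Star Wars Empire", 4)

print(rerank("Star Wars", [1, 2, 3, 4]))         # [3, 1, 2, 4]
print(rerank("star wars episode i", [2, 1, 3]))  # [1, 2, 3]
```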

Put simply, I want to boost certain documents depending on the query.
I can't do this with a single popularity field, because if 1,000 queries
were "Star Wars Episode I" and all 1,000 people clicked on document 1, the
popularity of document 1 would be 1,000.
If 500 people searched for "Star Wars" and clicked document 3, the
popularity of document 3 would be 500. The first result for "Star Wars"
would then still be document 1 instead of document 3.
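
The difference between the two approaches, using the numbers above, can be
shown with a small sketch. Again this is plain Python with made-up names
(click_log, global_popularity, per_query), only meant to show why one
counter per document is not enough and the count has to be kept per query.

```python
from collections import Counter

# (query, clicked doc id, number of clicks), numbers from the example.
click_log = [
    ("star wars episode i", 1, 1000),
    ("star wars", 3, 500),
]

global_popularity = Counter()  # one counter per document
per_query = {}                 # one counter per (query, document)
for query, doc_id, n in click_log:
    global_popularity[doc_id] += n
    per_query.setdefault(query, Counter())[doc_id] += n


def top_doc(counter):
    """Return the doc id with the highest count."""
    return counter.most_common(1)[0][0]


# Global popularity puts document 1 first for every query.
print(top_doc(global_popularity))       # 1
# Per-query popularity puts document 3 first for "star wars".
print(top_doc(per_query["star wars"]))  # 3
```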

I have absolutely no idea how to do this without creating a separate file
per document with the needed information. So, if anyone has experience with
such a use case, feel free to share your thoughts and ideas. With a term
dictionary available, one could use an external database to solve this
problem for the moment.
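
As a rough idea of what such an external database could look like, here is
a sketch using Python's built-in sqlite3 module. The table name
(query_clicks), its columns and the helper functions are assumptions of
mine; how the stored counts would then be fed back into the scoring is
exactly the open question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_clicks ("
    " query TEXT NOT NULL,"
    " doc_id INTEGER NOT NULL,"
    " clicks INTEGER NOT NULL,"
    " PRIMARY KEY (query, doc_id))"
)


def record_click(query, doc_id):
    """Create the (query, doc_id) row if needed, then increment it."""
    conn.execute(
        "INSERT OR IGNORE INTO query_clicks VALUES (?, ?, 0)",
        (query, doc_id),
    )
    conn.execute(
        "UPDATE query_clicks SET clicks = clicks + 1"
        " WHERE query = ? AND doc_id = ?",
        (query, doc_id),
    )


def boosts_for(query):
    """Return {doc_id: clicks} for one query, most clicked first."""
    rows = conn.execute(
        "SELECT doc_id, clicks FROM query_clicks"
        " WHERE query = ? ORDER BY clicks DESC",
        (query,),
    )
    return dict(rows.fetchall())


record_click("star wars", 3)
record_click("star wars", 3)
record_click("star wars", 1)
print(boosts_for("star wars"))
```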

Kind regards from Germany