lucene-general mailing list archives

From Mork0075 <mork0...@googlemail.com>
Subject Re: Similarity Search
Date Fri, 14 Mar 2008 14:54:03 GMT
So you mean I should merge the 10 paragraphs into one large query and then
select the top x?

Is this, from a semantic standpoint, the optimal solution?

To illustrate:

Paragraph_A
Paragraph_B
Paragraph_C
....

merge -> Paragraph_A Paragraph_B Paragraph_C ... = query
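
To make the question concrete, here is a minimal sketch that compares the merged-query approach with scoring each paragraph separately and keeping each item's best score. It uses a simplified TF-IDF cosine similarity written in plain Python. This is an assumption for illustration only and is not Lucene's actual scoring formula or API. The function names (rank_merged, rank_per_paragraph) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency computed over the collection."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def vectorize(text, idf):
    """TF-IDF vector as a dict; terms not in the collection are ignored."""
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_merged(paragraphs, docs, top_x):
    """Concatenate all paragraphs into one query and return the top_x item indices."""
    idf = build_idf(docs)
    query = vectorize(" ".join(paragraphs), idf)
    scores = [cosine(query, vectorize(d, idf)) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: (-scores[i], i))
    return [i for i in order[:top_x] if scores[i] > 0]


def rank_per_paragraph(paragraphs, docs, top_x):
    """Score each paragraph on its own and keep each item's best score."""
    idf = build_idf(docs)
    doc_vecs = [vectorize(d, idf) for d in docs]
    best = [0.0] * len(docs)
    for p in paragraphs:
        q = vectorize(p, idf)
        for i, dv in enumerate(doc_vecs):
            best[i] = max(best[i], cosine(q, dv))
    order = sorted(range(len(docs)), key=lambda i: (-best[i], i))
    return [i for i in order[:top_x] if best[i] > 0]
```

In this sketch, the merged query rewards items that share vocabulary with the 10 paragraphs taken as a whole, while the per-paragraph variant also surfaces an item that closely matches only one of them. Which of the two is better "from a semantic standpoint" probably depends on whether the 10 paragraphs cover one topic or several.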

jz@uva wrote:
> Hi,
>
> You can enter the whole paragraph as a query, and then select the top 10.
>
> Cheers.
> jz
>
> On Fri, Mar 14, 2008 at 9:14 AM, Mork0075 <mork0075@googlemail.com> wrote:
>   
>> Hello,
>>
>>  We are using Lucene in one of our applications for full-text search,
>>  which works very well.
>>
>>  I'm now interested in similarity search for whole paragraphs.
>>
>>  For example, there are 1000 textual items in the database, each of which
>>  contains on average perhaps more than 100 words. Now I have a set of 10
>>  textual items and would like to know which of the 1000 textual items
>>  are similar to those 10 (within a certain tolerance).
>>
>>  Is this possible with Lucene?
>>
>>  Thanks in advance
>>  Mark
>>
>>
>>     
>
>   

