lucene-java-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Lucene Text Similarity
Date Wed, 04 Sep 2013 01:26:52 GMT
(13/09/04 2:33), David Miranda wrote:
> Is there any way to check the similarity of texts with Lucene?
>
> I have DBpedia indexed and want to find the DBpedia abstracts that are
> most similar to another text. If I search the abstract field with a
> particular text, the results are not very satisfactory. E.g.:
>
> Abstract DBpedia: "SoundCloud is an online audio distribution platform
> which allows collaboration, promotion and distribution of audio
> recordings."
>
> My Text: "Private Track From DJ Sneak. Download the track now in the
> SoundCloud website."
>
> If I do a search as follows:
>
> Query q = new QueryParser(Version.LUCENE_43, "abstract", analyzer)
>     .parse(mytext);
>
> to search the abstract field for "mytext", no results are returned.
>
> What can I do to implement this feature?
>
> Thanks in advance,
> David

Hi David,

I think you'd do better to search the abstract field with "featured terms"
from your text, e.g. "DJ Sneak SoundCloud", rather than with the whole text.

See the MoreLikeThis source code to learn how to extract "featured terms"
from your indexed text.
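
As a rough illustration of what extracting "featured terms" can mean, here is a minimal, Lucene-free sketch. It is not the MoreLikeThis implementation; it only assumes the general idea of weighting each term of the input text by term frequency times inverse document frequency, and of dropping terms that never occur in the indexed documents. The class name FeaturedTerms, the stop-word list, and the small in-memory "corpus" are all hypothetical stand-ins for a real analyzer and index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FeaturedTerms {
    // Hypothetical stop-word list; a real analyzer would supply its own.
    static final Set<String> STOP = new HashSet<>(Arrays.asList(
            "a", "an", "and", "from", "in", "is", "now", "of", "the", "which"));

    // Lower-case the text, split on non-alphanumerics, drop stop words.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Return up to n terms of `text`, ranked by tf * idf against `corpus`.
    static List<String> topTerms(String text, List<String> corpus, int n) {
        // Document frequency: in how many corpus documents each term occurs.
        Map<String, Integer> df = new HashMap<>();
        for (String doc : corpus) {
            for (String t : new HashSet<>(tokenize(doc))) {
                df.merge(t, 1, Integer::sum);
            }
        }
        // Term frequency within the input text.
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokenize(text)) {
            tf.merge(t, 1, Integer::sum);
        }
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int d = df.getOrDefault(e.getKey(), 0);
            if (d == 0) {
                continue; // term is absent from the corpus, so it cannot match
            }
            // One common idf variant (an assumption of this sketch).
            double idf = Math.log((double) corpus.size() / (d + 1)) + 1.0;
            score.put(e.getKey(), e.getValue() * idf);
        }
        // Highest score first; ties broken alphabetically for stable output.
        List<Map.Entry<String, Double>> entries = new ArrayList<>(score.entrySet());
        entries.sort((a, b) -> {
            int c = Double.compare(b.getValue(), a.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < n; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Tiny made-up corpus standing in for the indexed abstracts.
        List<String> corpus = Arrays.asList(
                "SoundCloud is an online audio distribution platform which allows "
                        + "collaboration, promotion and distribution of audio recordings.",
                "A website is a collection of web pages.",
                "A disc jockey (DJ) plays recorded music for an audience.");
        String myText = "Private Track From DJ Sneak. Download the track now in the "
                + "SoundCloud website.";
        // The joined terms could then be handed to a query parser.
        System.out.println(String.join(" ", topTerms(myText, corpus, 3)));
    }
}
```

With a real index, the document frequencies would come from the index rather than from an in-memory list, which is the part MoreLikeThis handles for you.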

koji
-- 
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html
