lucene-java-user mailing list archives

From Kasun Perera
Subject Indexing with Semantics
Date Sat, 28 Apr 2012 03:02:56 GMT
I'm using Lucene's term frequency vectors to calculate cosine similarity between
documents. Say my documents contain these 3 terms: "owe", "owed", "owing". Lucene
treats these as 3 separate terms, but all 3 of them mean the same thing, "owe". Is
there any functionality in Lucene that can index by semantics, so that
it indexes "owe", "owed", "owing" as the single term "owe" with term frequency = 3?

If not, I'd welcome any suggestions for achieving this.
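
To make the desired behaviour concrete, here is a minimal plain-Java sketch of the idea, not Lucene code. Lucene does ship stemming token filters (PorterStemFilter, for example) that can be placed in an analyzer, but the exact wiring depends on the Lucene version, so the sketch avoids the Lucene API entirely. The class name `StemmedTermFreq` and the `normalize` rule are hypothetical, and the suffix stripping is a toy stand-in for a real stemmer. Note that a stemmer usually produces a stem such as "ow" rather than the dictionary word "owe"; for cosine similarity that is fine as long as every form maps to the same key.

```java
import java.util.HashMap;
import java.util.Map;

public class StemmedTermFreq {

    // Toy normalizer (an assumption for illustration, NOT the Porter algorithm):
    // strips a trailing "ing", "ed", or "e" so that related forms share one key.
    static String normalize(String token) {
        String t = token.toLowerCase();
        if (t.endsWith("ing") && t.length() > 4) {
            return t.substring(0, t.length() - 3);
        } else if (t.endsWith("ed") && t.length() > 3) {
            return t.substring(0, t.length() - 2);
        } else if (t.endsWith("e") && t.length() > 2) {
            return t.substring(0, t.length() - 1);
        }
        return t;
    }

    // Builds a term frequency map keyed by the normalized form of each token.
    static Map<String, Integer> termFreq(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            tf.merge(normalize(token), 1, Integer::sum);
        }
        return tf;
    }

    // Cosine similarity between two sparse term frequency vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty vector has no direction
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> doc1 = termFreq("owe owed owing");
        Map<String, Integer> doc2 = termFreq("owing money");
        System.out.println(doc1); // the three forms collapse into one key with count 3
        System.out.println(cosine(doc1, doc2));
    }
}
```

In a real index the same effect would come from applying the stemming step at analysis time, for both indexing and querying, so that the stored term vectors already contain the merged term.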


Kasun Perera
