lucene-solr-user mailing list archives

From Furkan KAMACI
Subject Document Similarity Algorithm at Solr/Lucene
Date Tue, 23 Jul 2013 09:33:26 GMT

Sometimes a large part of one document also appears in another, as in
student plagiarism or when one blog post quotes another. Do Solr/Lucene or
their companion libraries (UIMA, OpenNLP, etc.) have any class to detect
this?
