lucene-java-user mailing list archives

From Estanislao Oubel <estanislao.ou...@gmail.com>
Subject Re: How to index & search arrays of double?
Date Thu, 06 Aug 2015 14:13:40 GMT
Thanks for responding, Phaneendra.

I know LIRE and have been playing around with the library, but I don't
understand what its added value is. To be more specific, LIRE allows
computing several image features and the similarity between them. No problem
so far. My main concern is that the index used by LIRE is a Lucene index (at
least in the examples). However, a Lucene index is an inverted index, which
seems suitable for indexing terms, but it's not clear to me how arrays of
values (LIRE features, for example) are managed. What is even stranger
is that, when searching for a specific feature, it is compared against all
documents in the index, so I don't see the advantage of
using a Lucene index. Perhaps I am missing something, but my
understanding is that an index should speed up the search for documents,
which does not seem to be the case here.
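
To make the concern concrete, here is a minimal sketch in plain Java of what
"compare the query to every document" amounts to. It is not LIRE or Lucene
code: the class name L1LinearScan and its methods are hypothetical, and the
data are the three feature1 vectors and the query from my original message
(quoted below), with L1 as the distance.

```java
// Illustrative only: a brute-force nearest-neighbour scan under the L1 distance.
// No index structure is used; every stored vector is visited for each query.
final class L1LinearScan {

    // L1 (Manhattan) distance: sum of absolute coordinate differences.
    static double l1(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // Returns the position of the vector closest to the query, or -1 if docs is empty.
    static int nearest(double[][] docs, double[] query) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < docs.length; i++) {
            double d = l1(docs[i], query);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] docs = {
            {0.5, 1.8, 2.4},   // d1_feature1
            {30.1, 0.0, 9.1},  // d2_feature1
            {0.6, 5.8, 2.0},   // d3_feature1
        };
        double[] query = {0.51, 1.79, 2.41};
        int hit = nearest(docs, query);
        System.out.println("closest: d" + (hit + 1));
    }
}
```

With these values the L1 distances are about 0.03 for d1, 38.07 for d2 and
4.51 for d3, so d1 is returned. The cost grows linearly with the number of
documents, which is exactly what I would expect an index to avoid.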

If you have some experience with LIRE, could you please help me understand
all this? The million-dollar question is: do I necessarily have to use LIRE to
solve my specific problem?

If you think this topic is not suitable for the Lucene list, please
tell me and we can continue the discussion off-list. But
I think it is of general interest, because perhaps there are solutions
using native Lucene functions.

Thanks!

Stan





2015-08-06 10:48 GMT+02:00 Phaneendra N <phaneendran.gitam@gmail.com>:

> Hello Stan,
>   Great question. I came across one such implementation based on
> Lucene. It's called LIRE.
> This is an open source project: http://www.lire-project.net/
> You might get some ideas there.
> Please let me know if you find answers to your specific questions there.
> I'm curious.
>
> Thanks
> Phaneendra
>
> On Thu, Aug 6, 2015 at 12:39 PM, Estanislao Oubel <
> estanislao.oubel@gmail.com> wrote:
>
> > Hello everybody,
> >
> > I'm currently investigating methods for content-based image retrieval. In
> > this context, I would like to index documents containing arrays of doubles
> > and then perform an approximate search based on these arrays. For example,
> > I would like to insert in the index three documents (d1, d2, d3) containing
> > a field called feature1, a vector of doubles of dimension 3:
> >
> > d1_feature1  = [0.5 1.8 2.4].
> > d2_feature1  = [30.1 0 9.1].
> > d3_feature1  = [0.6 5.8 2.0].
> >
> > Now, I would like Lucene to return d1 when I search for a document
> > containing [0.51 1.79 2.41] (because d1 is the closest one according to
> > the L1 distance, for example).
> >
> > Is it possible to do this type of thing with Lucene? More specifically:
> > 1. Does Lucene support arrays of doubles as a field type?
> > 2. Is it possible to search documents based on custom distances between
> > these arrays?
> >
> > If so, can you provide some clues about how to implement it? (field types
> > and classes to use, or an example)
> >
> > Thanks!
> >
> > Stan
> >
>
