lucene-solr-user mailing list archives

From Pavel Minchenkov <char...@gmail.com>
Subject Re: Duplicates
Date Fri, 23 Jul 2010 15:16:47 GMT
I mean two use cases.
I can't index folders only, because I have other queries on files. Otherwise I
would need a second index that contains only folders, but then I have to take
care of keeping the folders synchronized between the two indexes.
Are range, spatial, etc. queries supported on multivalued fields?
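
To make the two use cases concrete, here is a minimal client-side sketch of the
postprocessing idea from the quoted message below, using the rows of the example
table. It is only an illustration in plain Python, not Solr code and not what the
SOLR-236 patch does internally. The names DOCS, matched_files,
one_file_per_folder and folders_of are hypothetical. The match condition is also
an assumption: the table shows no prop_2 column, so the sketch matches prop_1
against val1 or val2.

```python
# Rows copied from the example table: (id, type, prop_1, folderId).
# Files 6 and 7 are listed with folderId 6 in the table; kept as given.
DOCS = [
    (0, "folder", None, None),
    (1, "file", "val1", 0),
    (2, "file", "val3", 0),
    (3, "file", "val1", 0),
    (4, "folder", None, None),
    (5, "folder", None, None),
    (6, "file", "val3", 6),
    (7, "file", "val4", 6),
    (8, "folder", None, None),
    (9, "file", "val2", 8),
    (10, "file", "val1", 8),
    (11, "file", "val2", 8),
    (12, "folder", None, None),
]


def matched_files(docs, wanted_values):
    """Stand-in for the search: keep files whose prop_1 is in wanted_values."""
    return [d for d in docs if d[1] == "file" and d[2] in wanted_values]


def one_file_per_folder(files):
    """Use case 1: keep only the first matched file of each folder."""
    first_by_folder = {}
    for doc_id, _type, _prop, folder_id in files:
        # setdefault keeps the first id seen for a folder and ignores the rest.
        first_by_folder.setdefault(folder_id, doc_id)
    return list(first_by_folder.values())


def folders_of(files):
    """Use case 2: return only the ids of folders that contain a matched file."""
    return sorted({folder_id for _id, _type, _prop, folder_id in files})


hits = matched_files(DOCS, {"val1", "val2"})
print(one_file_per_folder(hits))  # one file id per folder
print(folders_of(hits))           # folder ids only
```

The drawback of doing this on the client is that every matched file has to be
fetched before the duplicates can be dropped, so paging and hit counts no longer
come straight from the index.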

2010/7/23 Peter Karich <peathal@yahoo.de>

> Pavel,
>
> hopefully I now understand your use case :-) but one question:
>
> > I always need to select *one* file per folder, or
> > select *only* the folders that contain matched files (without the files).
>
> What do you mean here by 'or'? Do you have 2 use cases, or would one of
> them be sufficient?
> Because the second use case could be solved without the patch: you could
> index folders only,
> then every prop_N will be a multivalued field, and you don't have the problem
> of duplicate folders.
>
> (If you don't mind ugliness, both use cases could even be handled: after you
> get the folders,
>  grabbing the files which matched could be done in postprocessing.)
>
> But I fear the cleanest solution is to use the patch. Hopefully it can be
> applied without hassle
> against 1.4 or the trunk. If not, please ask on the patch site for
> assistance.
>
> Regards,
> Peter.
>
>
> > Thanks, Peter!
> >
> > I'll try collapsing today.
> >
> > Example (sorry if the table is unformatted):
> >
> > id |  type  |   prop_1  | .... |  prop_N |  folderId
> > ________________________________________
> >  0 | folder |           |      |         |
> >  1 | file   |  val1     |      |  valN1  |   0
> >  2 | file   |  val3     |      |  valN2  |   0
> >  3 | file   |  val1     |      |  valN3  |   0
> >  4 | folder |           |      |         |
> >  5 | folder |           |      |         |
> >  6 | file   |  val3     |      |  valN7  |   6
> >  7 | file   |  val4     |      |  valN8  |   6
> >  8 | folder |           |      |         |
> >  9 | file   |  val2     |      |  valN3  |   8
> >  10| file   |  val1     |      |  valN2  |   8
> >  11| file   |  val2     |      |  valN5  |   8
> >  12| folder |           |      |         |
> >
> >
> > I always need to select *one* file per folder, or
> > select *only* the folders that contain matched files (without the files).
> >
> > Query:
> > prop_1:val1 OR prop_2:val2
> >
> > I need results (document ids):
> > 1, 9
> > or
> > 0, 8
> >
> > 2010/7/23 Peter Karich <peathal@yahoo.de>
> >
> >
> >> Hi Pavel!
> >>
> >> The patch can be applied to 1.4.
> >> The performance is OK, but in some situations it could be worse than
> >> without the patch.
> >> For us it works well, but others reported some exceptions
> >> (see the patch site: https://issues.apache.org/jira/browse/SOLR-236)
> >>
> >>
> >>> I need only to delete duplicates
> >>>
> >> Could you give us an example of what exactly you need?
> >> (Maybe you could index each master document of the 'unique' documents
> >> with an extra field and query for that field?)
> >>
> >> Regards,
> >> Peter.
> >>
> >> --
> >>
> > Pavel Minchenkov
> >
> >
>
>
> --
> http://karussell.wordpress.com/
>
>


-- 
Pavel Minchenkov
