lucene-java-user mailing list archives

From ANDREI SOLODIN <asolo...@comcast.net>
Subject Index-time join ToParentBlockJoinQuery query produces incorrect result
Date Wed, 03 Jul 2019 15:11:22 GMT
Hello, I am trying to understand the requirements for properly using the index-time join. In
my use case, I am trying to model a 1-N relationship where a parent document can have 0 to N
child documents. For now I am keeping my data very simple: each child has a single field.
My data currently looks like this:


Parent Doc         Children
--------------------------------------
id=id00000         none
id=id00001         program=P1
id=id00002         program=P1
                   program=P2
id=id00003         none
id=id00004         program=P1
id=id00005         program=P1
                   program=P2


So essentially I have 6 parent docs, doc 0 has no children, doc 1 has 1 child, doc 2 has 2
children, etc.
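
For readers who want to see the layout as a flat document sequence, here is a minimal sketch in plain Java (no Lucene dependency). It assumes the usual index-time join convention that each block is written as the child documents first and the parent document last. The class name BlockLayoutSketch and its methods are hypothetical names used only for illustration; they are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Lucene.
class BlockLayoutSketch {

    // Flattens parents and children into index order.
    // Assumption: within each block the children come first and the parent is last.
    static List<String> flatten(String[] parentIds, String[][] programs) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < parentIds.length; i++) {
            for (String program : programs[i]) {
                docs.add("program=" + program); // child document
            }
            docs.add("id=" + parentIds[i]);     // parent document closes the block
        }
        return docs;
    }

    // The six parents from the table above, with their children.
    static List<String> sampleDocs() {
        String[] parentIds = {"id00000", "id00001", "id00002", "id00003", "id00004", "id00005"};
        String[][] programs = {
            {}, {"P1"}, {"P1", "P2"},
            {}, {"P1"}, {"P1", "P2"}
        };
        return flatten(parentIds, programs);
    }

    public static void main(String[] args) {
        List<String> docs = sampleDocs();
        // Print each document with its position in the flattened sequence.
        for (int docId = 0; docId < docs.size(); docId++) {
            System.out.println(docId + ": " + docs.get(docId));
        }
    }
}
```

Under that assumption the sample data becomes 12 documents, and the childless parent id00003 sits at position 6, directly after the block of id00002.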


Certain queries give me incorrect results. For example:


BitSetProducer parentSet = new QueryBitSetProducer(new TermQuery(new Term("id", "id00003")));
Query q = new ToParentBlockJoinQuery(new WildcardQuery(new Term("program", "*")), parentSet, ScoreMode.None);


This returns "id00003", which is unexpected because id00003 has no children.
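
One way to reason about this result is a simplified model of the join, sketched below in plain Java. This is not Lucene code. It assumes that children are indexed immediately before their parent, and that each matching child is attributed to the next document at or after it whose bit is set in the parent bit set. Whether ToParentBlockJoinQuery behaves exactly this way should be checked against the Lucene documentation. BlockJoinModel, DOCS, bits and joinToParent are hypothetical names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical simplified model, not Lucene's implementation.
class BlockJoinModel {

    // Sample data in assumed index order: children first, parent last in each block.
    static final String[] DOCS = {
        "id=id00000",
        "program=P1", "id=id00001",
        "program=P1", "program=P2", "id=id00002",
        "id=id00003",
        "program=P1", "id=id00004",
        "program=P1", "program=P2", "id=id00005"
    };

    // Builds a bit set with the given document positions set.
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    // Assumption: each matching child maps to the next set bit in parentBits.
    // Children with no later parent bit are skipped in this model.
    static List<String> joinToParent(String[] docs, BitSet childMatches, BitSet parentBits) {
        List<String> hits = new ArrayList<>();
        int lastParent = -1;
        for (int child = childMatches.nextSetBit(0); child >= 0;
                 child = childMatches.nextSetBit(child + 1)) {
            int parent = parentBits.nextSetBit(child);
            if (parent >= 0 && parent != lastParent) {
                hits.add(docs[parent]); // report each parent once
                lastParent = parent;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BitSet children = bits(1, 3, 4, 7, 9, 10);        // docs matching program:*
        BitSet onlyId00003 = bits(6);                      // parent set from the query above
        BitSet allParents = bits(0, 2, 5, 6, 8, 11);       // every parent document
        System.out.println(joinToParent(DOCS, children, onlyId00003));
        System.out.println(joinToParent(DOCS, children, allParents));
    }
}
```

In this model, a parent set that contains only id00003 yields [id=id00003], because the children at positions 1, 3 and 4 all fall through to the only set bit at position 6. A parent set that marks all six parents yields id00001, id00002, id00004 and id00005. If the model holds, the parent bit set would need to identify every parent document rather than only the one of interest, with any restriction to a single parent applied as a separate clause.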


I opened a bug (https://issues.apache.org/jira/browse/LUCENE-8902) in haste earlier (sorry),
and a comment there said that "child free is not supported". I take that to mean that each
parent must have at least one child. So suppose I add a "default" child to each parent:


Parent Doc         Children
--------------------------------------
id=id00000         field1=val1
id=id00001         field1=val1
                   program=P1
id=id00002         field1=val1
                   program=P1
                   program=P2
id=id00003         field1=val1
id=id00004         field1=val1
                   program=P1
id=id00005         field1=val1
                   program=P1
                   program=P2


Now every parent has at least one child. That made no difference; I still get the same result.
What am I doing wrong here?


Thanks
