lucene-solr-user mailing list archives

From David Kramer <David.Kra...@shoebuy.com>
Subject Re: Solr querying nested documents with ChildDocTransformerFactory, get “Parent query yields document which is not matched by parents filter”
Date Thu, 02 Feb 2017 15:20:47 GMT
Thanks for responding, Mikhail.  There are no deleted documents.  Since I’m fairly new to
Solr, one of the things I’ve been paranoid about is that I have no way of validating my
schema.xml, or of knowing whether Solr is even using it (I have evidence it’s not; more below).
So for each test, I’ve wiped out the index, recreated it, and reimported.
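
One way to see which uniqueKey Solr actually loaded, independent of what schema.xml says on disk, is to ask the running core through the Schema API. The following is a minimal sketch, not a tested procedure for this installation: the host `localhost:8983` and the core name `catalog` are assumptions, and it relies on the `schema/uniquekey` endpoint answering with a JSON body that contains a `uniqueKey` entry.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def unique_key_url(base_url: str, core: str) -> str:
    """Build the Schema API URL that reports the active uniqueKey."""
    return f"{base_url.rstrip('/')}/{quote(core)}/schema/uniquekey?wt=json"


def parse_unique_key(body: str) -> str:
    """Pull the uniqueKey field name out of the JSON response body."""
    return json.loads(body)["uniqueKey"]


def fetch_unique_key(base_url: str, core: str) -> str:
    """Ask a running Solr core which field it uses as uniqueKey."""
    with urlopen(unique_key_url(base_url, core)) as resp:
        return parse_unique_key(resp.read().decode("utf-8"))


# Example call (hypothetical host and core name, needs a running Solr):
#   fetch_unique_key("http://localhost:8983/solr", "catalog")
```

If the call returns `id` rather than `uuid`, the running core is not using the uniqueKey that was put into schema.xml.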

Back to whether my schema.xml is being used: I mentioned that I had to come up with a compound
UUID field made of the first character of the docType plus the ID, and we put
“<uniqueKey>uuid</uniqueKey>” (it was id) in our schema.xml.  Then I deleted and recreated the
index and restarted Solr.  To verify it was working, I created an import file whose records had
unique IDs but UUIDs that duplicated those of existing records, and Solr imported the new
records even though the UUIDs already existed in the index.  I’m not sure whether Solr should
have produced an error or not. I’ll research that, but I mention it here in case it’s relevant.
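
The compound key and the duplicate test described above can also be checked on the import data itself, before anything reaches Solr. This is a sketch under assumptions: the helper names `make_uuid`, `flatten` and `duplicate_uuids` are hypothetical, and the sample record is the shortened one from the quoted question below. A check like this sidesteps the question of whether Solr reports duplicate keys on import.

```python
from collections import Counter


def make_uuid(doc: dict) -> str:
    """Compound key as described: first letter of docType plus the id."""
    return f"{doc['docType'][0]}{doc['id']}"


def flatten(doc: dict):
    """Yield a document and every nested _childDocuments_ entry, depth first."""
    yield doc
    for child in doc.get("_childDocuments_", []):
        yield from flatten(child)


def duplicate_uuids(roots: list) -> list:
    """Return the sorted compound keys that occur more than once."""
    counts = Counter(make_uuid(d) for root in roots for d in flatten(root))
    return sorted(key for key, n in counts.items() if n > 1)


# Shortened sample record (Product -> Item -> Sku) from the quoted question.
product = {
    "id": 739063, "docType": "Product",
    "_childDocuments_": [{
        "id": 1537378, "docType": "Item",
        "_childDocuments_": [{"id": 12799578, "docType": "Sku"}],
    }],
}
```

Calling `duplicate_uuids` on the full list of root documents from the import file returns an empty list when every compound key is unique across all three levels.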

Thanks.

On 2/2/17, 6:10 AM, "Mikhail Khludnev" <mkhl@apache.org> wrote:

    David,
    
    Can you make sure your index doesn't have deleted docs? This can be seen
    in the Solr Admin UI.
    And can you merge the index so that it no longer contains them?
    
    On Thu, Feb 2, 2017 at 12:29 AM, David Kramer <David.Kramer@shoebuy.com>
    wrote:
    
    >
    >
    > Some background:
    > * The data involved is catalog data, with three nested objects:
    >   Products, Items, and Skus, in that order. We have a docType field on
    >   each record as a differentiator.
    > * The "id" field in our data is unique within a datatype, but not across
    >   datatypes. The program that generates the Solr import file adds a
    >   "uuid" field, which is the id prefixed by the first letter of the
    >   docType, like P12345. That makes the uuid field unique, and we have
    >   that as the uniqueKey in our schema.xml.
    > * We are trying to retrieve the parent Product and all of its child
    >   documents. As such, we are using the ChildDocTransformerFactory
    >   ([child...]) to retrieve the children along with the parent. We have
    >   not yet solved the problem of getting Skus within Items as nested
    >   documents in the results, and we will have to figure that out at some
    >   point, but for now we get them flattened.
    > * We are building out the proof of concept for this. This is all new
    >   work, so we are free to change a lot.
    > * This is Solr 6.0.0, and we are importing in JSON format, if that
    >   matters.
    > * I submitted this question to StackOverflow
    >   <http://stackoverflow.com/questions/41969353/solr-querying-nested-documents-with-childdoctransformerfactory-get-parent-quer>
    >   but haven’t gotten any answers yet.
    >
    >
    > Our data looks like this (I've removed some fields for simplicity):
    >
    > {
    >   "id": 739063,
    >   "docType": "Product",
    >   "uuid": "P739063",
    >   "_childDocuments_": [
    >     {
    >       "id": 1537378,
    >       "price": 25.45,
    >       "color": "Blush",
    >       "docType": "Item",
    >       "productId": 739063,
    >       "uuid": "I1537378",
    >       "_childDocuments_": [
    >         {
    >           "id": 12799578,
    >           "size": "10",
    >           "width": "W",
    >           "docType": "Sku",
    >           "itemId": 1537378,
    >           "uuid": "S12799578"
    >         }
    >       ]
    >     }
    >   ]
    > }
    >
    >
    >
    > The query to fetch all Products and their children nested inside them is
    > q=docType:Product&fl=title,id,docType,[child parentFilter=docType:Product].
    > When I run that query, all is well, and it returns the first 10 rows.
    > However, if I fetch more rows by adding, say, &rows=500, we get the error
    > "Parent query yields document which is not matched by parents filter,
    > docID=XXX".
    >
    > When we first saw that error, we discovered our id field was not unique
    > across document types, so we added the uuid field as mentioned above, which
    > is. We also set it as the uniqueKey in our schema.xml file, wiped the core,
    > recreated it, and restarted Solr just to make sure it was in effect. We have
    > double checked and are sure that the uuid fields are unique.
    >
    >
    >
    > In all the search results for that error that I've found, the OP did not
    > have a field that could differentiate the different document types, but as
    > you see we do. Since both the query and the parentFilter are searching for
    > docType:Product I don't see how either could possibly return anything but
    > parents. We've also tried adding childFilter=docType:Item and
    > childFilter=docType:Sku but that did not help.  I also tried using title:*
    > for the filter since only products have titles.
    >
    >
    >
    > Is there anything else we can try?
    >
    > Any explanation of this?
    >
    > Is it possible that it's not using uuid as the unique identifier even
    > though it's specified in the schema.xml, and would that even cause this?
    >
    > Thanks.
    >
    >
    >
    
    
    -- 
    Sincerely yours
    Mikhail Khludnev
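
For reproducing the failing request outside the browser, the query from the quoted question can be assembled programmatically so that the `[child ...]` transformer and `rows` are URL-encoded correctly. This is only a sketch: the host and the core name `catalog` are assumptions, and `wt=json` is added for convenience; the `q`, `fl` and `rows` values are the ones given in the question.

```python
from urllib.parse import urlencode


def child_query_params(rows: int = 10) -> dict:
    """Parameters for fetching Products together with their nested children."""
    return {
        "q": "docType:Product",
        "fl": "title,id,docType,[child parentFilter=docType:Product]",
        "rows": rows,
        "wt": "json",
    }


def select_url(base_url: str, core: str, rows: int = 10) -> str:
    """Build the full /select URL with all parameters URL-encoded."""
    query = urlencode(child_query_params(rows))
    return f"{base_url.rstrip('/')}/{core}/select?{query}"


# The request that triggered the error used rows=500 (hypothetical host and core):
url = select_url("http://localhost:8983/solr", "catalog", rows=500)
```

Stepping `rows` up from 10 toward 500 narrows down the first result at which the error appears.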
    
