lucene-dev mailing list archives

From "Thomas Scheffler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-6096) Support Update and Delete on nested documents
Date Mon, 02 Jun 2014 07:56:01 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015235#comment-14015235 ]

Thomas Scheffler commented on SOLR-6096:
----------------------------------------

I would prefer a solrconfig.xml setting that makes all updates be handled as block updates.
I don't understand why that is not the default. Are there performance issues that come with it?
It would even help me to be able to turn on block updates on the client side via SolrJ. But as I saw
in the code, block updates depend on nested documents being present. There are no nested documents
on a delete, and there may be none on an update. So this heuristic for guessing whether something
is a block update is definitely not good enough.

> Support Update and Delete on nested documents
> ---------------------------------------------
>
>                 Key: SOLR-6096
>                 URL: https://issues.apache.org/jira/browse/SOLR-6096
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.7.2
>            Reporter: Thomas Scheffler
>              Labels: blockjoin, nested
>
> When using nested or child documents, update and delete operations on the root document
> should also affect the nested documents, as no child can exist without its parent :-)
> Example
> {code:xml|title=First Import}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article with author</field>
>   <doc>
>     <field name="name">Smith, John</field>
>     <field name="role">author</field>
>   </doc>
> </doc>
> {code}
> If I change my mind and the author was not named *John* but *_Jane_*:
> {code:xml|title=Changed name of author of '1'}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article with author</field>
>   <doc>
>     <field name="name">Smith, Jane</field>
>     <field name="role">author</field>
>   </doc>
> </doc>
> {code}
> I would expect that John is no longer in the index. Currently he still is. It may also
> happen that a subdocument is removed by an update:
> {code:xml|title=Remove author}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article without author</field>
> </doc>
> {code}
> This should cause all nested documents to be deleted, too. In the same way, all nested documents
> should be deleted if I delete the root document:
> {code:xml|title=Deletion of '1'}
> <delete>
>   <id>1</id>
>   <!-- implying also
>     <query>_root_:1</query>
>    -->
> </delete>
> {code}
> Currently all of this can be done on the client side by issuing an additional
> request that deletes the document before every update. It would be more efficient if this were
> handled on the Solr side. Atomic updates would benefit as well. The biggest advantage shows when
> using "delete-by-query":
> {code:xml|title=Deletion of '1' by query}
> <delete>
>   <query>title:*</query>
>   <!-- implying also
>     <query>_root_:1</query>
>    -->
> </delete>
> {code}
> In that case one would not have to first query for all matching documents and then issue deletes
> for their ids and for every document nested in them.
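
For illustration only, the client-side workaround described above (delete the whole block, then
re-add it) could be sketched roughly as follows. This is not part of the issue or of Solr; the
function names delete_block_xml and replace_block_messages are hypothetical, the sketch only
builds the XML update messages and does not send them, and it assumes the root id contains no
characters that would need escaping in a Solr query.

```python
import xml.etree.ElementTree as ET


def delete_block_xml(root_id: str) -> str:
    """Return a <delete> message for a root document and its nested documents."""
    delete = ET.Element("delete")
    # Remove the root document itself by its unique key.
    ET.SubElement(delete, "id").text = root_id
    # Also remove every document whose _root_ field points at that root.
    # Assumption: root_id needs no query-syntax escaping.
    ET.SubElement(delete, "query").text = "_root_:" + root_id
    return ET.tostring(delete, encoding="unicode")


def replace_block_messages(root_id: str, doc_xml: str) -> list:
    """Return the messages to send, in order, to replace a whole block."""
    doc = ET.fromstring(doc_xml)  # raises ParseError on malformed input
    add = ET.Element("add")
    add.append(doc)
    # The delete goes first so that stale children are gone before the re-add.
    return [delete_block_xml(root_id), ET.tostring(add, encoding="unicode")]
```

Sending the two messages in this order for every update is exactly the extra round trip that
the issue asks Solr to make unnecessary.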




