lucene-dev mailing list archives

From "Stewart Sims (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
Date Mon, 12 May 2014 14:52:15 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995131#comment-13995131 ]

Stewart Sims commented on SOLR-3284:
------------------------------------

This has been a source of errors that were difficult for us to track down while using
ConcurrentUpdateSolrServer to index a large volume of data (including some very large individual
documents). For a long time we could not work out why some documents were not being indexed.
The cause turned out to be a combination of data errors and bad request errors due to too many
concurrent requests. Once we switched to HttpSolrServer, the bad request errors went away
and we got more informative exceptions, which helped us find the data problems.

Had my colleague not found the thread below, we would have struggled to figure out the exact
causes of our problems:
http://lucene.472066.n3.nabble.com/Missing-documents-with-ConcurrentUpdateSolrServer-vs-HttpSolrServer-td4033637.html

One alternative to a code change might be to advise caution when using ConcurrentUpdateSolrServer
with large volumes of data. For us (using Solr 4.2.1), HttpSolrServer seems to be more stable
and takes only about 30% longer to index.

> StreamingUpdateSolrServer swallows exceptions
> ---------------------------------------------
>
>                 Key: SOLR-3284
>                 URL: https://issues.apache.org/jira/browse/SOLR-3284
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 3.5, 4.0-ALPHA
>            Reporter: Shawn Heisey
>            Assignee: Shawn Heisey
>         Attachments: SOLR-3284.patch
>
>
> StreamingUpdateSolrServer eats exceptions thrown by lower level code, such as HttpClient,
> when doing adds.  It may happen with other methods, though I know that query and deleteByQuery
> will throw exceptions.  I believe that this is a result of the queue/Runner design.  That's
> what makes SUSS perform better, but it means you sacrifice the ability to programmatically
> determine that there was a problem with your update.  All errors are logged via slf4j, but
> that's not terribly helpful except with determining what went wrong after the fact.
> When using CommonsHttpSolrServer, I've been able to rely on getting an exception thrown
> by pretty much any error, letting me use try/catch to detect problems.
> There's probably enough dependent code out there that it would not be a good idea to
> change the design of SUSS, unless there were alternate constructors or additional methods
> available to configure new/old behavior.  Fixing this is probably not trivial, so it's probably
> a better idea to come up with a new server object based on CHSS.  This is outside my current
> skillset.
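
The queue/Runner trade-off described in the issue can be illustrated with a minimal sketch that
uses only java.util.concurrent. It is not SolrJ code: QueueRunnerSketch, QueuedClient, send,
addSync and handleError are hypothetical names chosen for this example, and an empty string
stands in for a document that the server would reject. The queued client returns from add()
before the document is sent, so a failure can only be recorded by the background runner, while
the synchronous call lets the exception reach the caller's try/catch.

```java
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: none of these names are taken from SolrJ.
class QueueRunnerSketch {

    /** Stand-in for the HTTP request; an empty document plays the role of a bad request. */
    static void send(String doc) {
        if (doc.isEmpty()) {
            throw new IllegalArgumentException("bad document");
        }
    }

    /** Synchronous style: the failure propagates to the caller. */
    static void addSync(String doc) {
        send(doc);
    }

    /** Queued style: add() only enqueues; a background runner does the sending. */
    static class QueuedClient {
        // Sentinel compared by identity, so no real document can collide with it.
        private static final String STOP = new String("stop");

        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final Queue<Throwable> errors = new ConcurrentLinkedQueue<>();
        private final Thread runner = new Thread(this::drain);

        QueuedClient() {
            runner.start();
        }

        /** Runner loop: takes documents off the queue until the sentinel arrives. */
        private void drain() {
            try {
                while (true) {
                    String doc = queue.take();
                    if (doc == STOP) {
                        return;
                    }
                    try {
                        send(doc);
                    } catch (RuntimeException e) {
                        // The caller of add() has already returned, so the error
                        // can only be recorded here, not thrown back.
                        handleError(e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        /** Returns immediately; never throws an error raised by send(). */
        void add(String doc) {
            queue.add(doc);
        }

        /** Hypothetical hook: collects errors instead of only logging them. */
        void handleError(Throwable t) {
            errors.add(t);
        }

        /** Signals the runner to stop and waits for queued documents to be processed. */
        void close() {
            queue.add(STOP);
            try {
                runner.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        QueuedClient client = new QueuedClient();
        client.add("good");
        client.add(""); // fails later in the runner; add() itself returns normally
        client.close();
        System.out.println("errors recorded by runner: " + client.errors.size());

        try {
            addSync("");
        } catch (IllegalArgumentException e) {
            System.out.println("caller saw: " + e.getMessage());
        }
    }
}
```

In this sketch the caller can still find out about failures after close() by inspecting the
collected errors, which is one possible shape for the "additional methods" mentioned in the
issue; whether that suits the real client is a separate design question.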



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

