lucene-solr-user mailing list archives

From Daniel Carrasco <d.carra...@i2tic.com>
Subject Re: Solr node is out of sync (looks Healthy)
Date Tue, 13 Feb 2018 09:22:27 GMT
Hello,

I answer inline ;)

2018-02-12 23:56 GMT+01:00 Emir Arnautović <emir.arnautovic@sematext.com>:

> Hi Daniel,
> Please see inline comments.
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 12 Feb 2018, at 13:13, Daniel Carrasco <d.carrasco@i2tic.com> wrote:
> >
> > Hello,
> >
> > 2018-02-12 12:32 GMT+01:00 Emir Arnautović <emir.arnautovic@sematext.com>:
> >
> >> Hi Daniel,
> >> Maybe it is Monday and I am still not warmed up, but your details seem a
> >> bit imprecise to me. Maybe not directly related to your problem, but just
> >> to exclude that you are having some strange Solr setup, here is my
> >> understanding: You are running a single SolrCloud cluster with 8 nodes. It
> >> has a single collection with X shards and Y replicas. You use DIH to index
> >> data and you use curl to interact with Solr and start the DIH process. You
> >> see some replicas of some shards having less data, and after a node
> >> restart it ends up being ok.
> >
> >
> >> Is this right? If it is, what is X and Y?
> >
> >
> > Near to reality:
> >
> >   - I have a SolrCloud cluster with 8 nodes, but it has multiple collections.
> >   - Every collection has only one shard for performance reasons (I did
> >   some tests splitting shards and queries were slower).
> A distributed request comes with an overhead, and if the collection is
> small, that overhead can be larger than the benefit of parallelising search.
>
> >   - Every collection has 8 replicas (one per node)
> I would compare all shards on all nodes (64 Solr cores) vs. having just
> one replica (16 Solr cores)
>

We have all shards on all nodes because we want to avoid the overhead of
sending data between nodes (latency, network traffic). The site gets a lot of
requests per second and we want the fastest response possible, with HA if
some nodes fail.

Also, we have 8 collections: five are small (less than 15 MB) and three of
them are a few GB (the biggest is about 10 GB).
This is the first SolrCloud cluster I've created, and I decided on this
architecture to avoid what I said above: the overhead of sending data between
nodes when the client asks another node for the data. Maybe it is better to
have a replication factor of 3-4, for example, and create shards for the big
collections?
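If I end up trying that, it looks like the Collections API could create the sharded collections. Just as a sketch (the host, collection name, and config set name below are invented, not our real ones), the CREATE call could be built like this:

```python
from urllib.parse import urlencode

def create_collection_url(host, name, num_shards, replication_factor, config_set):
    """Build a Solr Collections API CREATE call for a sharded collection."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_set,
    })
    return f"http://{host}:8983/solr/admin/collections?{params}"

# e.g. 2 shards x 4 replicas for one of the big collections:
url = create_collection_url("solr1", "descriptions_v2", 2, 4, "descriptions_conf")
```

With 2 shards and 4 replicas each, every node would still hold only one core per collection on average, so the network overhead should stay bounded.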


>
> >   - After restarting, the node starts to recover the collections. I don't
> >   know if Solr serves data directly in that state or gets the data from
> >   other nodes before serving it, but even while it is recovering, the
> >   data looks OK.
> Recovery can be from transaction logs (the logs can tell) and that would
> mean that there was no hard commit after some updates.
>
> >
> >
> >
> >> Do you have autocommit set up or you commit explicitly?
> >
> >
> > I'm not sure about that. How I can check it?
> It is part of solrconfig.xml
>

I've checked the file and it looks like there's no autocommit configuration.
I'll read a bit about how it works to see if it can help.
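From what I've read so far, it would go in solrconfig.xml under <updateHandler>, something like this (the values here are just an illustration, not a recommendation):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk and truncates the transaction log -->
  <autoCommit>
    <maxTime>60000</maxTime>              <!-- at most 60s between hard commits -->
    <openSearcher>false</openSearcher>    <!-- don't reopen searchers here -->
  </autoCommit>
  <!-- Soft commit: makes changes visible to searchers -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>               <!-- new data searchable within ~5s -->
  </autoSoftCommit>
</updateHandler>
```

If I understand it right, a missing hard commit would explain replicas replaying long transaction logs on restart.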


>
> >
> > It is not specified on the curl command, but it will be true by default, right?
> I think it is for full import.
>
> >
> >
> >
> >> Did you check logs on node with less data and did you see any
> >> errors/warnings?
> >
> >
> > I'm not sure when it failed, and the cluster has a lot of warnings and
> > errors all the time (maybe related to queries from the shop), so it is
> > hard to determine whether an import error exists and which error is
> > related to the import. It is like searching for a needle in a haystack.
> Not with some logging solution - one such is Sematext’s Logsene:
> https://sematext.com/logsene/
>
> >
> >
> >> Do you do full imports or incremental imports?
> >>
> >
> > I've checked the curl command and it looks like it is doing full imports
> > without cleaning data:
> > 'http://' . $solr_ip .
> > ':8983/solr/descriptions/dataimport?command=full-import&clean=false&entity=description_'.$idm[$j].'_lastupdate'
> This is not a good strategy, since Solr does not have real updates - an
> update is a delete + insert, and deleted documents are only purged on
> segment merges. Also, this will not remove documents that were deleted in
> the primary storage. It is much better to index into a new collection and
> use aliases to point to the active collection. This way you can even roll
> back if you are not happy with the new index.
>
>
But if you update three products, for example, and you create a new
collection with those updates, how do you point those three products back to
the original collection? Or do you have to reindex the whole collection into
a new collection and then create an alias? I don't know if it is a good idea
to reindex a whole 10 GB collection every 5 minutes just to create another
collection with the updates. You also have to manage deleting the old
collections, to avoid filling the disk.
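If we ever go that route, at least the switch and cleanup themselves look simple with the Collections API. A sketch of building the calls (the host and collection names are made up, and this assumes queries always go through the alias):

```python
import time
from urllib.parse import urlencode

# Any node in the cluster can receive Collections API calls.
ADMIN = "http://solr1:8983/solr/admin/collections"

def createalias_url(alias, collection):
    """CREATEALIAS atomically repoints `alias` at the freshly built collection."""
    return ADMIN + "?" + urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
    })

def delete_url(collection):
    """DELETE drops an old generation so the disk does not fill up."""
    return ADMIN + "?" + urlencode({"action": "DELETE", "name": collection})

# Each full reindex targets a new, timestamped collection; queries keep
# hitting the stable alias name, so the swap is invisible to clients.
new_collection = "descriptions_" + time.strftime("%Y%m%d%H%M%S")
swap = createalias_url("descriptions", new_collection)
```

Rolling back would just be another CREATEALIAS pointing at the previous generation, which is the part of Emir's suggestion I do like.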



> >
> >
> >>
> >> Not related to the issue, but just a note that Solr does not guarantee
> >> consistency at any time - it has something called “eventual consistency” -
> >> once updates stop, all replicas will (should) end up in the same state.
> >> Having said that, using Solr results directly in your UI would either
> >> require you to cache used documents on the UI/middle layer, or implement
> >> some sort of stickiness, or retrieve only the ID from Solr and load data
> >> from primary storage. If you have static data, and you update the index
> >> once a day, you can use aliases and switch between the new and old index,
> >> and you will suffer from this issue only at the time when doing switches.
> >>
> >
> > But is it normal that the data stays inconsistent for a very long time?
> > Because it looks like the data has been inconsistent for about a week…
> It will become consistent once all changes are committed and searchers
> reopened.
>
> >
> > Another question: with HDFS, will the data be consistent? With HDFS the
> > data will be shared between nodes, so updates will be available on all
> > nodes at the same time, right?
> I am not too familiar with running Solr on HDFS, but I doubt that it is
> working the way you expect it to work. You might be able to have multiple
> Solr instances (not part of the same cluster - can be standalone Solr)
> reading from the same HDFS directory and one updating it, but you would
> probably have to reload core on read instances to be aware of changes. Not
> sure if you would get much out of it - just different replication
> mechanism. But it is late here and I never used Solr on HDFS so take this
> with a grain of salt.
>

Today I read a comment saying that it works like a standalone server and
creates a copy of the data for every node that has a replica. Maybe that is
wrong, but it is not what I want.
I want to have a SolrCloud cluster like now, but sharing data through HDFS
and with the ability to autoscale if there is too much load (AWS and GCP
autoscaling groups), but maybe that is like hunting unicorns...
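By the way, about detecting the original out-of-sync problem without restarting nodes: I guess one could query each replica directly with distrib=false and compare the local counts. A rough sketch (the node names and collection are invented; pending soft commits could cause small transient differences, so this only flags candidates):

```python
import json
from urllib.request import urlopen

def local_doc_count(node, collection):
    """Ask one node for its own count only; distrib=false keeps the query
    on that core instead of routing it to another replica."""
    url = (f"http://{node}:8983/solr/{collection}/select"
           "?q=*:*&rows=0&distrib=false&wt=json")
    with urlopen(url) as resp:
        return json.load(resp)["response"]["numFound"]

def out_of_sync(counts):
    """Given {node: numFound}, return the replicas that disagree with the
    most common value across the cluster."""
    values = list(counts.values())
    majority = max(set(values), key=values.count)
    return {node: c for node, c in counts.items() if c != majority}

# counts = {n: local_doc_count(n, "descriptions") for n in ["solr1", "solr2"]}
# print(out_of_sync(counts))
```

That would at least tell us which node to bounce instead of guessing.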


> >
> > Thanks!!
> >
> >
> >>
> >> Regards,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >>> On 12 Feb 2018, at 12:00, Daniel Carrasco <d.carrasco@i2tic.com> wrote:
> >>>
> >>> Hello, thanks for your help.
> >>>
> >>> I answer below.
> >>>
> >>> Greetings!!
> >>>
> >>> 2018-02-12 11:31 GMT+01:00 Emir Arnautović <emir.arnautovic@sematext.com>:
> >>>
> >>>> Hi Daniel,
> >>>> Can you tell us more about your document update process. How do you
> >> commit
> >>>> changes? Since it got fixed after restart, it seems to me that on that
> >> one
> >>>> node index searcher was not reopened after updates. Do you see any
> >>>> errors/warnings on that node?
> >>>>
> >>>
> >>> I've asked the programmers, and it looks like they are updating the
> >>> collections via dataimport using curl. I think the data is imported
> >>> from a Microsoft SQL Server using a Solr plugin.
> >>>
> >>>
> >>>> Also, what do you mean by “All nodes are standalone”?
> >>>>
> >>>
> >>> I mean that the nodes don't share a filesystem (I'm planning to use
> >>> Hadoop, but I have to learn to create and maintain the cluster first).
> >>> All nodes have their own data drive and are connected to the cluster
> >>> using ZooKeeper.
> >>>
> >>>
> >>>>
> >>>> Regards,
> >>>> Emir
> >>>> --
> >>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>>
> >>>>
> >>>>
> >>>>> On 12 Feb 2018, at 11:16, Daniel Carrasco <d.carrasco@i2tic.com>
> >> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> We're using Solr to manage product data on our shop, and last week
> >>>>> some customers called us saying that the price differs between the
> >>>>> shop and the shopping basket. After researching a bit I noticed that
> >>>>> it happens sometimes on page refresh.
> >>>>> After disabling all caches I queried all Solr instances to see if the
> >>>>> data is correct, and I saw that one of them gives a different price
> >>>>> for the product, so it looks like that instance does not have the
> >>>>> updated data.
> >>>>>
> >>>>> - How is it possible that a node in a cluster has different data?
> >>>>> - How can I check if the data is in sync? The cluster looks all
> >>>>> healthy in the admin, and the node is active and OK.
> >>>>> - Is there any way to detect this error, and how can I force a resync?
> >>>>>
> >>>>> After restarting the node it got synced, so the data is OK now, but I
> >>>>> can't restart the nodes every time to check if the data is right (it
> >>>>> takes a lot of time to sync again).
> >>>>>
> >>>>> My configuration is: 8 Solr nodes using v7.1.0 and ZooKeeper v3.4.11.
> >>>>> All nodes are standalone (I'm not using Hadoop).
> >>>>>
> >>>>> Thanks and greetings!
> >>>>> --
> >>>>> _________________________________________
> >>>>>
> >>>>>    Daniel Carrasco Marín
> >>>>>    Ingeniería para la Innovación i2TIC, S.L.
> >>>>>    Tlf:  +34 911 12 32 84 Ext: 223
> >>>>>    www.i2tic.com
> >>>>> _________________________________________
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> _________________________________________
> >>>
> >>>     Daniel Carrasco Marín
> >>>     Ingeniería para la Innovación i2TIC, S.L.
> >>>     Tlf:  +34 911 12 32 84 Ext: 223
> >>>     www.i2tic.com
> >>> _________________________________________
> >>
> >>
> >
> >
> > --
> > _________________________________________
> >
> >      Daniel Carrasco Marín
> >      Ingeniería para la Innovación i2TIC, S.L.
> >      Tlf:  +34 911 12 32 84 Ext: 223
> > www.i2tic.com
> > _________________________________________
>
>
Thanks!!

-- 
_________________________________________

      Daniel Carrasco Marín
      Ingeniería para la Innovación i2TIC, S.L.
      Tlf:  +34 911 12 32 84 Ext: 223
      www.i2tic.com
_________________________________________
