lucene-solr-user mailing list archives

From Ксения Баталова <batalova...@gmail.com>
Subject Re: Solr Atomic Updates
Date Wed, 03 Jun 2015 18:04:55 GMT
Jack,

The decision to use several cores was made to improve indexing and
search performance (determined experimentally).

In my project the index holds about 300-500 million documents (each document
has a rather complex structure), and it may grow larger.

So, during indexing, the documents are added to different cores by
a number of threads.

In other words, each thread collects the necessary information for a list of
documents and generates a create-documents query to a specific core.

At that point it doesn't matter (and it can't be determined) which
document ends up in which core.

And now it is necessary to update (atomic update) this index.

Something like this..

_ _

Batalova Kseniya
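
The lookup-then-update flow described in this thread (query each core for the id, then send the atomic update to the core that holds it) could be sketched roughly as below. This is only an illustration, not a Solr feature: the helper names (`select_url`, `find_core`, `build_atomic_update`) are hypothetical, the core URLs assume the core name is part of the path, and `commit=true` is added only to keep the example short.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def select_url(core_url, doc_id):
    """Build a select URL that only counts matches for the given id."""
    # Assumption: the unique key field is called "id" and the id contains
    # no double quotes (no escaping is done here).
    params = urlencode({"q": 'id:"%s"' % doc_id, "rows": 0, "wt": "json"})
    return "%s/select?%s" % (core_url, params)


def http_get_json(url):
    """Fetch a URL and parse the JSON body (real network call)."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def find_core(core_urls, doc_id, fetch=http_get_json):
    """Return the first core URL whose index contains doc_id, else None."""
    for core_url in core_urls:
        body = fetch(select_url(core_url, doc_id))
        if body["response"]["numFound"] > 0:
            return core_url
    return None


def build_atomic_update(core_url, doc_id, changes):
    """Build (but do not send) a JSON atomic-update request for one document."""
    doc = {"id": doc_id}
    doc.update(changes)  # e.g. {"title": {"set": "test1"}}
    data = json.dumps([doc]).encode("utf-8")
    return Request(
        core_url + "/update?commit=true",
        data=data,
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical core URLs; adjust host, port and core names as needed.
    cores = [
        "http://localhost:8983/solr/Core1",
        "http://localhost:8983/solr/Core2",
    ]
    target = find_core(cores, "Id1")
    if target is None:
        print("Id1 not found in any core; nothing updated")
    else:
        request = build_atomic_update(target, "Id1", {"title": {"set": "test1"}})
        urlopen(request).close()
```

With N cores this costs up to N select queries per updated document, which is exactly the overhead the question is trying to avoid; recording the target core at indexing time, or routing documents to cores by a hash of the id, would remove the lookup.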


Explain a little about why you have separate cores, and how you decide
which core a new document should reside in. Your scenario still seems a bit
odd, so help us understand.


-- Jack Krupansky

On Wed, Jun 3, 2015 at 3:15 AM, Ксения Баталова <batalova.ks@gmail.com>
wrote:

> Hi!
>
> Thanks for your quick reply.
>
> The problem is that my index consists of several parts (several cores),
>
> and while updating I don't know in advance which part (which core) the
> document with the specified id is in.
>
> For example, I have two cores (*Core1* and *Core2*) and I want to
> update the document with id *Id1*, but I don't know where this document
> is.
>
> So, I have to run two select queries against my cores to find out where it is.
>
> And then generate an update query to the right core.
>
> What am I doing wrong?
>
> As a reminder, I'm using SOLR 4.4.0.
>
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> Best regards,
> Batalova Kseniya
>
>
> What exactly is the problem? And why do you care about cores, per se -
> other than to send the update to the core/collection you are trying to
> update? You should specify the core/collection name in the URL.
>
> You should also be using the Solr reference guide rather than the (old)
> wiki:
>
> https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
>
>
> -- Jack Krupansky
>
> On Tue, Jun 2, 2015 at 10:15 AM, Ксения Баталова <batalova.ks@gmail.com>
> wrote:
>
> > Hi!
> >
> > I'm using *SOLR 4.4.0* for searching in my project.
> > Now I am facing a problem of atomic updates in multiple cores.
> > From wiki:
> >
> > curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
> > [
> >  {
> >   "id"        : "TestDoc1",
> >   "title"     : {"set":"test1"},
> >   "revision"  : {"inc":3},
> >   "publisher" : {"add":"TestPublisher"}
> >  },
> >  {
> >   "id"        : "TestDoc2",
> >   "publisher" : {"add":"TestPublisher"}
> >  }
> > ]'
> >
> > As far as I understand, this means that the document with id
> > *TestDoc1*, for example, will be looked up for updating *in only one core*.
> > And if there is no document with id *TestDoc1*, the document will be
> > created.
> > Can I somehow specify a *list of cores* to search and then
> > update the matching document with a specific id?
> >
> > It would be something like the *shards* parameter in a *select* query.
> > From wiki:
> >
> > #now do a distributed search across both servers with your browser or curl
> > curl 'http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=ipod+solr'
> >
> > Or is this planned for a future release?
> >
> > Thanks in advance.
> >
> > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> >
> > Best regards,
> > Batalova Kseniya
> >
>
