lucene-solr-user mailing list archives

From Xi Shen <davidshe...@gmail.com>
Subject Re: duplicated URL sent from Nutch to solr index
Date Mon, 03 Dec 2012 08:11:22 GMT
Then the "URL" must be the same.
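
A minimal sketch of the behavior discussed in this thread, as a toy in-memory model rather than Solr's API. The ToyIndex class, its methods, the example.com URLs, and the "content" and "tstamp" field names are all hypothetical and only for illustration. It assumes the "id" value is compared as an exact string, so two spellings of the same page would count as different ids.

```python
# Toy model of <uniqueKey>id</uniqueKey> overwrite semantics.
# This is NOT Solr's API; every name here is hypothetical.


class ToyIndex:
    """In-memory stand-in for an index whose unique key is the "id" field."""

    def __init__(self):
        self._docs = {}  # maps id string -> stored document

    def add(self, doc):
        # An existing id replaces the whole old document;
        # an unseen id creates a new entry.
        self._docs[doc["id"]] = dict(doc)

    def count(self):
        return len(self._docs)

    def get(self, doc_id):
        return self._docs.get(doc_id)


index = ToyIndex()

# The same URL sent twice: one entry, holding the newer content and timestamp.
index.add({"id": "http://example.com/page", "content": "first crawl",
           "tstamp": "2012-12-02T00:00:00Z"})
index.add({"id": "http://example.com/page", "content": "second crawl",
           "tstamp": "2012-12-03T00:00:00Z"})

# A URL that differs by one character (trailing slash) is a different id
# in this model, so it becomes a second entry.
index.add({"id": "http://example.com/page/", "content": "trailing slash",
           "tstamp": "2012-12-03T00:00:00Z"})
```

In this model the two identical URLs collapse into one entry, while the trailing-slash variant is stored separately, which is why the "URL" has to match exactly for an update to happen.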


On Mon, Dec 3, 2012 at 2:34 PM, Joe Zhang <smartagent@gmail.com> wrote:

> Sorry I didn't make it perfectly clear. The "id" field is URL.
>
> On Sun, Dec 2, 2012 at 11:33 PM, Joe Zhang <smartagent@gmail.com> wrote:
>
> > Thanks!
> >
> >
> > On Sun, Dec 2, 2012 at 11:20 PM, Xi Shen <davidshen84@gmail.com> wrote:
> >
> >> If the value of the "id" field is the same, the old entry will be updated;
> >> if it is new, a new entry will be created & indexed.
> >>
> >> This is my experience. :)
> >>
> >>
> >> On Mon, Dec 3, 2012 at 1:45 PM, Joe Zhang <smartagent@gmail.com> wrote:
> >>
> >> > Dear list,
> >> >
> >> > I just want to confirm an expected behavior of solr:
> >> >
> >> > Assuming we have "<uniqueKey>id</uniqueKey>" in schema.xml for solr,
> >> > when we send the same URL from nutch to solr multiple times, would
> >> > there be ONLY ONE entry for that URL, but the content (if changed) and
> >> > timestamp would be updated?
> >> >
> >> >
> >> > Thanks!
> >> >
> >> > Joe
> >> >
> >>
> >>
> >>
> >> --
> >> Regards,
> >> David Shen
> >>
> >> http://about.me/davidshen
> >> https://twitter.com/#!/davidshen84
> >>
> >
> >
>



-- 
Regards,
David Shen

http://about.me/davidshen
https://twitter.com/#!/davidshen84
