lucene-solr-user mailing list archives

From Gabriele Kahlout <gabri...@mysimpatico.com>
Subject Re: How to reserve ids?
Date Tue, 27 Sep 2011 22:43:50 GMT
Otis,

I'm following up on this because solving my problem through the stopwords
mechanism would be great. *Do stopwords also apply to the url/id field?*

Continuing with the msn.com example: with "msn.com" as a stopword, the
msn.com webpage may still be indexed if neither the title nor the body
contains "msn.com". Is that right?
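
The question above can be illustrated with a toy Python model of per-field
analysis. This is only a sketch, not Solr's own code: the names analyze and
index_document, the whitespace tokenizer, and the choice of which fields are
analyzed are all assumptions made for illustration. In Solr, the stop filter
is part of a field type's analyzer chain, so it only affects fields whose
type includes it.

```python
# Toy model of per-field analysis; NOT Solr's actual implementation.
STOPWORDS = {"msn.com"}


def analyze(text, stopwords=STOPWORDS):
    """Hypothetical analyzer: lowercase, split on whitespace, drop stopwords."""
    return [t for t in text.lower().split() if t not in stopwords]


def index_document(doc, analyzed_fields):
    """Return the terms that would be indexed for each field of doc."""
    terms = {}
    for field, value in doc.items():
        if field in analyzed_fields:
            # Analyzed field: stopwords are removed before indexing.
            terms[field] = analyze(value)
        else:
            # Unanalyzed field: the whole value is kept as a single term.
            terms[field] = [value]
    return terms


doc = {
    "id": "msn.com",
    "title": "Hotmail sign in",
    "body": "Sign in to Hotmail at msn.com",
}

# Assumption: only title and body go through the stopword filter.
indexed = index_document(doc, analyzed_fields={"title", "body"})
```

In this model the document is still indexed: "msn.com" is dropped from the
body terms, "hotmail" remains searchable, and the unanalyzed id keeps its
full value. Whether the real id field behaves this way depends on the field
type it is given in the schema.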

P.S.
I just click on 'reply to all' (or reply on the phone). If it bothers you,
I'll make the less lazy effort of selecting 'reply'.
On Tue, Sep 27, 2011 at 6:40 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:

> Gabriele,
>
> Using "msn.com" as a stopword would simply mean that msn.com would not be
> indexed and therefore a search for "msn.com" would not yield results.  You
> could still search for "hotmail" and it may match documents that have
> "msn.com" token stored in them, even though "msn.com" is a stopword.
>
> Otis
>
> P.S.
> No need to CC me, I'm on the list.
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
> >________________________________
> >From: Gabriele Kahlout <gabriele@mysimpatico.com>
> >To: solr-user@lucene.apache.org; Otis Gospodnetic <
> otis_gospodnetic@yahoo.com>
> >Sent: Tuesday, September 27, 2011 1:58 AM
> >Subject: Re: How to reserve ids?
> >
> >I'm interested in the stopwords solution as it sounds like less work, but
> >I'm not sure I understand how it works. Having msn.com as a stopword
> >doesn't mean I won't get msn.com as a result for, say, 'hotmail'. My
> >understanding is that msn.com will never make it to the similarity
> >function and thus never affect the score calculation. But the url seldom
> >does anyway (in my searches on content)!
> >
> >
>



-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).
