lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: for those of you using gmail...
Date Thu, 18 Jul 2013 13:12:26 GMT
On Wed, Jul 17, 2013 at 10:02 PM, Chris Hostetter
<hossman_lucene@fucit.org> wrote:
>
> : And seems that it returns no result for query:
> :
> :   from:jenkins@thetaphi.de subject:"build 6605"  ANY_WORD_NOT_IN_TITLE
> :
> : Maybe for some mails, only title field are taken into consideration?
>
> Ah ... interesting.  I wonder if maybe something about the content of the
> jenkins emails + the multipart/mixed wrapping done by ezmlm occasionally
> causes gmail to balk at trying to parse the various parts of *some*
> jenkins emails (like maybe just the ones with multi-byte characters?)  so
> you are left with only the headers being searchable?
>
> That still doesn't explain this discrepancy though...

Maybe it's because "build 6605" was a biiig email (1.9 MB), and Google
punted on indexing any text whatsoever from its body?  I see this in the
build 6605 email:

   [Message clipped]  View entire message

> : >     from:jenkins@thetaphi.de regression
> : >
> : > I only get results up to Jul 2, even though there are many build
> : > failures after that.
> :
> : A recent search got results up to #6530. Still no 6605.
>
> Mike says the newest email gmail will return from that search is Jul 2,
> but Han, myself, (and IIRC several other people) are all seeing lots of
> results since then ... just not all of them, notably the specific one
> Mike asked about ("Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b96) - Build
> # 660" sent 13 hours ago)

In fact, now when I run this search, I'm seeing additional results
after Jul 2!  And, they seem to be the smaller emails.

Mike McCandless

http://blog.mikemccandless.com
