lucene-java-user mailing list archives

From "N. Hira" <>
Subject Re: Thoughts on Search Analytics?
Date Sat, 07 May 2011 00:08:13 GMT

On 06-May-2011, at 2:04 AM, Paul Libbrecht wrote:

> On 6 May 2011, at 00:20, Otis Gospodnetic wrote:
>>> thus far, only search-testing has provided some analytics measures for us  
>>> (precision and recall ones). We, of course, construct the test-suites from the
>>> logs.
>> Interesting.  It sounds like you don't currently utilize any sort of reporting
>> tool that would tell you more about users' search experience other than what you
>> can glean from the logs (number of hits, query string, any other params like hl,
>> fq, etc., request handler used, and latency)?
> Correct, there was no plan for such operation yet. It can sure be a useful thing.
> I wonder if there's a lot of literature about such best practice btw. 
> This I would be very interested to hear.
>> So you can't really tell, for
>> example, what percentage of queries are getting DYM suggestions or what
>> percentage of queries originate from DYM suggestion, right?  You also can't tell
>> which queries result in very low click-through rate? etc.
> I think that one of the keys here is that we would need externally linkable queries, and
> in the ActiveMath engine I've been working on there were none, unfortunately.
>> Those are some examples of search analytics reports that I was thinking people 
>> might be running in order to better understand how their search is behaving and 
>> what users are experiencing.
> It sure is desirable, but the toolset was not readily available for this.
> Even the simple log counts that François Schiettecatte just described on the solr list would
> be useful but, again, that needs a little extra programming.
> Everyone I have met recommends Google Analytics for this. I tend to dislike their approach
> and do not know of any open-source, "my server side" solutions. Hence I made this thread public
> to hear about "satellites" (which are somewhat Lucene-related).
> paul

At Jostens, we have been using Solr on our corporate site for just under a year now.  We pick
up log data every night and summarize using a combination of grep/sed/awk and gnuplot.

The key metrics we use (for now):
1.  Query Response time average during the day (ten minute intervals)
2.  Queries run during the day (ten minute intervals)
3.  Top N queries during the day that resulted in too many/too few hits
4.  Top N queries during the day that resulted in a reasonable number of hits
5.  Top N queries for a given period (say 30 days)
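
As a rough illustration of metrics 1 through 4, a nightly summary of this kind could also be sketched in Python instead of grep/sed/awk. This is only a sketch, not the pipeline described here: the one-line log format (timestamp, then params={...} hits=N ... QTime=N), the hit thresholds `low` and `high`, and the names `parse` and `summarize` are all assumptions made for the example. Real Solr log layouts vary by version and logging configuration.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime
from urllib.parse import parse_qs

# Assumed (hypothetical) log line layout; adjust the pattern to the real logs.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"params=\{(?P<params>[^}]*)\} hits=(?P<hits>\d+) .*?QTime=(?P<qtime>\d+)"
)


def parse(lines):
    """Yield (timestamp, query string, hits, QTime in ms) for matching lines."""
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that are not query requests
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        q = parse_qs(m.group("params")).get("q", [""])[0]
        yield ts, q, int(m.group("hits")), int(m.group("qtime"))


def summarize(lines, low=1, high=1000, top_n=10):
    """Return per-interval (count, mean QTime) plus top-N outlier and reasonable queries.

    `low` and `high` are illustrative bounds for a "reasonable" hit count.
    """
    buckets = defaultdict(list)
    outliers, reasonable = Counter(), Counter()
    for ts, q, hits, qtime in parse(lines):
        # Round the timestamp down to the start of its ten-minute interval.
        bucket = ts.replace(minute=ts.minute - ts.minute % 10, second=0)
        buckets[bucket].append(qtime)
        (reasonable if low <= hits <= high else outliers)[q] += 1
    per_interval = {
        b: (len(v), sum(v) / len(v)) for b, v in sorted(buckets.items())
    }
    return per_interval, outliers.most_common(top_n), reasonable.most_common(top_n)
```

Passing a full day of log lines gives the per-interval counts and averages (metrics 1 and 2) and the two top-N lists (metrics 3 and 4); passing 30 days of lines and combining the two counters would approximate metric 5. The per-interval dictionary can be written out as two columns for gnuplot.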

What we would like to do next is to look at correlations between search terms and results
viewed to improve our content, but that requires result tracking.  Something that helps us
identify synonyms would also be extremely useful.

Hope this helps.

Hira, N.R.
Cognocys, Inc.