lucene-java-user mailing list archives

From "Ganesh" <>
Subject Re: Re: Facet search
Date Thu, 24 Feb 2011 06:21:25 GMT
My requirement is to display the top terms, with their counts, for every field. I have
10 fields, and for each field the top 3 terms should be displayed with their counts. When the
user selects a term, a search is performed to filter the results by that term.

I could use term vectors, enumerate the term frequencies, and sort them, but that may be time-consuming.
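
For illustration, the approach described in the quoted reply further down (iterate over a field's terms and intersect each term's document set with the current result set) can be sketched without term vectors. This is only a minimal sketch under stated assumptions: it uses a toy in-memory map from term to java.util.BitSet in place of a real Lucene index, and the names FacetSketch, TermCount, topTerms and docs are hypothetical. In a real application the per-term document sets would come from the index itself.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

public class FacetSketch {

    /** One facet row: a term and the number of matching documents containing it. */
    static final class TermCount {
        final String term;
        final int count;
        TermCount(String term, int count) { this.term = term; this.count = count; }
        @Override public String toString() { return term + " (" + count + ")"; }
    }

    /**
     * Returns the topN terms of one field, counted only over matchingDocs.
     * postings is a toy stand-in for the index: term -> ids of docs containing it.
     */
    static List<TermCount> topTerms(Map<String, BitSet> postings, BitSet matchingDocs, int topN) {
        // Min-heap on count, so the weakest candidate is evicted first.
        PriorityQueue<TermCount> heap =
                new PriorityQueue<>(Comparator.comparingInt((TermCount tc) -> tc.count));
        for (Map.Entry<String, BitSet> e : postings.entrySet()) {
            BitSet hits = (BitSet) e.getValue().clone(); // do not mutate the index
            hits.and(matchingDocs);                      // document set intersection
            int count = hits.cardinality();
            if (count == 0) continue;
            heap.add(new TermCount(e.getKey(), count));
            if (heap.size() > topN) heap.poll();         // keep at most topN candidates
        }
        List<TermCount> result = new ArrayList<>(heap);
        // Highest count first; ties broken alphabetically for a stable display order.
        result.sort(Comparator.comparingInt((TermCount tc) -> tc.count).reversed()
                .thenComparing(tc -> tc.term));
        return result;
    }

    /** Helper that builds a document set from doc ids. */
    static BitSet docs(int... ids) {
        BitSet set = new BitSet();
        for (int id : ids) set.set(id);
        return set;
    }

    public static void main(String[] args) {
        // Hypothetical "Country" field over ten documents (ids 0..9).
        Map<String, BitSet> country = new TreeMap<>();
        country.put("India", docs(0, 1, 2, 3));
        country.put("US", docs(4, 5, 6));
        country.put("Russia", docs(7, 8));
        country.put("Japan", docs(9));

        BitSet allDocs = new BitSet();
        allDocs.set(0, 10);
        System.out.println(topTerms(country, allDocs, 3)); // prints [India (4), US (3), Russia (2)]
    }
}
```

Only topN candidates are held at any time, so no full sort over all terms is needed. Calling topTerms once per field would give the 10 x 3 display described above.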

Field_1          Field_2          Field_N
Term_1_1 (100)   Term_2_1 (389)   Term_N_1 (216)
Term_1_2 (78)    Term_2_2 (134)   Term_N_2 (156)
Term_1_3 (56)    Term_2_3 (78)    Term_N_3 (89)

Top users      Country         PageAccessed
UserA (100)    India (1000)    /Articles/abc (200)
UserB (100)    US (500)        /Articles/xyz (200)
UserC (100)    Russia (200)    /Articles/aaa (100)

When a particular user is clicked, the results should be grouped for that user:
Top users      Country        PageAccessed
UserA (100)    India (100)    /Articles/abc (55)
               US (50)        /Articles/xyz (25)
                              /Articles/aaa (10)
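
The drill-down step can be sketched the same way: clicking a term narrows the current result set to the documents containing that term, and the counts for the other fields are then recomputed against the narrowed set. Again this is only a sketch under assumptions, with a toy term-to-BitSet map standing in for the index and hypothetical names (DrillDownSketch, select, counts, docs) and sample data.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

public class DrillDownSketch {

    /** Narrows the current result set to documents that also contain the selected term. */
    static BitSet select(BitSet current, Map<String, BitSet> postings, String term) {
        BitSet narrowed = (BitSet) current.clone();
        narrowed.and(postings.getOrDefault(term, new BitSet())); // unknown term -> empty set
        return narrowed;
    }

    /** Counts, per term of one field, the docs in current containing it (zero counts omitted). */
    static Map<String, Integer> counts(Map<String, BitSet> postings, BitSet current) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, BitSet> e : postings.entrySet()) {
            BitSet hits = (BitSet) e.getValue().clone();
            hits.and(current);
            if (!hits.isEmpty()) out.put(e.getKey(), hits.cardinality());
        }
        return out;
    }

    /** Helper that builds a document set from doc ids. */
    static BitSet docs(int... ids) {
        BitSet set = new BitSet();
        for (int id : ids) set.set(id);
        return set;
    }

    public static void main(String[] args) {
        // Six hypothetical log records (doc ids 0..5), indexed by user and by country.
        Map<String, BitSet> user = new TreeMap<>();
        user.put("UserA", docs(0, 1, 2));
        user.put("UserB", docs(3, 4));
        user.put("UserC", docs(5));

        Map<String, BitSet> country = new TreeMap<>();
        country.put("India", docs(0, 1, 3));
        country.put("US", docs(2, 4));
        country.put("Russia", docs(5));

        BitSet allDocs = new BitSet();
        allDocs.set(0, 6);

        // The user clicks "UserA": filter first, then recompute the Country facet.
        BitSet afterClick = select(allDocs, user, "UserA");
        System.out.println(counts(country, afterClick));
    }
}
```

Further clicks can be chained by passing the narrowed set back into select, so each selection only ever shrinks the set that the facet counts are computed over.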

----- Original Message ----- 
From: "Chris Hostetter" <>
To: "Lucene Users" <>
Sent: Thursday, February 24, 2011 7:29 AM
Subject: [Bulk] Re: Facet search

> : This is another indicator that we should really try to extract Solr's
> : capabilities like Faceting into modules! Solr should not be required
> : if you want to use the faceting stuff we already have.
>
> The most basic logic of (field) faceting used by Solr is simple TermEnum
> iteration and document set intersection.  Any Lucene application can do
> that w/o really refactoring any code out of Solr.  It's very
> straightforward.
>
> The real value adds that Solr provides are:
> * DocSet caching and cache warming, which Solr can do for you because it
> knows when the index changes (because it manages all the writes and reader
> reopening).
> * selecting alternate facet algorithms based on schema knowledge -- looking
> at field types and value cardinality to determine when FieldCache or
> UnInvertedField would be more efficient than TermEnumeration and DocSets
> * accurate counts when doing distributed searching
>
> These aren't things that seem like they could really be extracted in a very
> reusable manner -- the prerequisites and scaffolding you'd need to
> set up and use these pieces in a meaningful way outside of Solr would
> probably wind up being just like Solr.
>
> There are however lots of pieces that could be extracted and reused -- but
> those things have already been started/discussed (DocSets, hooks for
> generic caches that are notified when IndexReaders are reopened or
> segments are changed, multivalue support in FieldCache, etc...)
>
> : >> I am using Lucene for my project and we have new requirement to present
> : >> data in the form of Analytics. Facet could be used for that but for this
>
> That's kind of a vague requirement -- if you can elaborate a bit on what
> types of info you actually want to compute/return, there may be a very
> straightforward way to do it.
>
> Like I said: the basics of faceting over all terms in a field is *really*
> trivial ... the original implementation in Solr was about 40 lines of
> code...
> -Hoss

