lucene-java-user mailing list archives

From Doron Cohen <>
Subject Re: lucene newbie question
Date Mon, 02 Oct 2006 19:04:39 GMT
The SSN case is actually a common situation.

Assume you have a (relational) database with a table of products with three
columns:
- SSN, which is also the primary key for that table,
- DESCRIPTION, which has free text (i.e. unformatted text) describing the
  product,
- OTHER, additional info.
Also assume you want to allow users of your application to search for a product
by its description. For each product found, you intend to fetch the data on
that product from the database and display it to the users.

This can be done in the following setup:
Create a Lucene index with two fields:
- ssn - stored, but not indexed
- description - tokenized (hence indexed) but not stored.

Now the application would send the user query to Lucene, using the
description field. For each document found, the application would fetch its
ssn (which is available from the Lucene index since it was stored). Using
this ssn, the application would fetch all sorts of data on that product and
display it to the user.
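
To make the setup above concrete, here is a toy model in plain Java. It is
only a sketch and not Lucene's API: the ToyIndex class, its Field type, the
crude tokenizer and the sample product values are all made up for
illustration. It shows that "stored" and "indexed" are two independent
switches: the description can be searched but not read back, while the ssn can
be read back but not searched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only (NOT Lucene's API): every name here is hypothetical.
final class ToyIndex {

    /** A field with two independent switches: keep the value and/or make it searchable. */
    static final class Field {
        final String name;
        final String value;
        final boolean stored;
        final boolean indexed;

        Field(String name, String value, boolean stored, boolean indexed) {
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    // Per document id: original values of the stored fields only.
    private final List<Map<String, String>> storedValues = new ArrayList<>();
    // Inverted index: "fieldName:term" -> ids of documents containing the term.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Adds a document and returns its internal id. */
    int addDocument(List<Field> fields) {
        int docId = storedValues.size();
        Map<String, String> kept = new HashMap<>();
        for (Field f : fields) {
            if (f.stored) {
                kept.put(f.name, f.value);
            }
            if (f.indexed) {
                // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics.
                for (String term : f.value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                    if (term.isEmpty()) {
                        continue;
                    }
                    List<Integer> docs =
                            postings.computeIfAbsent(f.name + ":" + term, k -> new ArrayList<>());
                    if (!docs.contains(docId)) {
                        docs.add(docId);
                    }
                }
            }
        }
        storedValues.add(kept);
        return docId;
    }

    /** Returns ids of documents whose indexed field contains the single term. */
    List<Integer> search(String fieldName, String term) {
        String key = fieldName + ":" + term.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(key, Collections.emptyList());
    }

    /** Returns the stored value, or null if the field was not stored. */
    String storedValue(int docId, String fieldName) {
        return storedValues.get(docId).get(fieldName);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument(Arrays.asList(
                new Field("ssn", "A-100", true, false),                   // stored, not indexed
                new Field("description", "Red wool sweater", false, true) // indexed, not stored
        ));
        index.addDocument(Arrays.asList(
                new Field("ssn", "B-200", true, false),
                new Field("description", "Blue cotton shirt", false, true)
        ));
        for (int docId : index.search("description", "wool")) {
            // The ssn would then be used as the key for the database lookup.
            System.out.println(index.storedValue(docId, "ssn")); // prints A-100
        }
    }
}
```

In this sketch, searching on "ssn" finds nothing and asking for the stored
"description" returns null, which is exactly the trade-off described above:
the index stays small, and the database remains the place to read the full
product data.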

There are other possible designs of course - you may want to have
additional data in the Lucene index, but this hopefully gives a
feeling for how different fields with different settings are used in an
application.

I think you would find LIA ("Lucene In Action" book) very useful.

"Los Morales" <> wrote on 02/10/2006 11:46:45:

> Hi Erik,
> Thanks for the response.
> >Consider the index in the back of a book.  You could tear that out  and
> >still use it to tell what page something is on, but you have no  actual
> >content in hand.
> So, I guess what I'm having a hard time trying to figure out is, what's the
> point of having an index when you can't search/retrieve the contents of a
> field in the index since it is not stored?  Isn't the whole point of
> an index to be able to search and retrieve the contents efficiently?
> Basically I'm not sure of the point of the UnIndexed and UnStored field types.
> Say I use a field type "unindexed" for my SSN.  I know it's stored in the
> index but how am I supposed to retrieve it?
> As for the unstored, it's like the scenario I described above... I see the
> fields in the index but I won't be able to search/retrieve it since I don't
> have the contents.  The "text" field type makes sense to me (with data as
> a String), as well as the type "keyword".
> Is there a scenario or scenarios you can describe where these field types
> will be useful?  Thanks in advance!
> -los
> >From: Erik Hatcher <>
> >Subject: Re: lucene newbie question
> >Date: Mon, 2 Oct 2006 14:12:25 -0400
> >
> >
> >On Oct 2, 2006, at 2:08 PM, Los Morales wrote:
> >>I'm new to Lucene and IR in general.  I'm a bit confused about
> >>fields.  From what I've read, a field does not have to be indexed but
> >>its value can be stored in an index.  Likewise a field can be indexed but
> >>its value is not stored in an index.  Now how can a field be indexed
> >>when its value is not stored in the index and vice-versa?  Again, I'm new
> >>to the Index/Search paradigm.  Thanks in advance.
> >
> >Consider the index in the back of a book.  You could tear that out  and
> >still use it to tell what page something is on, but you have no  actual
> >content in hand.  When a field is tokenized (and therefore  implicitly
> >indexed), it is run through the specified Analyzer and the  terms
> >are indexed, but the original text may or may not also be stored in the
> >index.
> >
> >Make sense?
> >
> >   Erik
> >
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail:
> >For additional commands, e-mail:
> >
