lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kainth, Sachin" <>
Subject 'a', 's' and 't' don't index properly
Date Thu, 08 Feb 2007 13:45:01 GMT
> Hello,
> I have a database of tracks, artists and albums and I'm indexing these
> 3 attributes plus also the first letter of the track thus (incidently
> I'm using dotlucene but the implementation of dotlucene is similar to
> the Java one):
>    Document Doc = new Document();
>    String Album = ...
>    String Artist = ...
>    String Track = ...
>    Doc.Add(Field.Text("album", Album));
>    Doc.Add(Field.Text("artist", Artist));
>    Doc.Add(Field.Text("track", Track));
>    Doc.Add(Field.Text("firstletter", Track.Substring(0,1)));
> Problem is I don't think certain first letters are being indexed
> properly or at all, either that or there is some problem elsewhere.  I
> have noticed that the letters 'a', 's' and 't' (there may be others)
> cause me problems.  I shall explain the problem I have.  When I search
> for the documents I perform a sorting operation on the firstletter
> field but where the firstletter was 'a', 's' or 't' the returned list
> does not contain those records in sorted order (all other records are
> sorted correctly).
> Here is my search command:
> Hits hits = searcher.Search(query, new Sort(new SortField[] { new
> SortField("firstletter", SortField.STRING)}));
> What I don't know is whether the fault lies in the indexing or in this
> or other code.  Does anyone know what could have happened.
> Thanks
> Sachin

This email and any attached files are confidential and copyright protected. If you are not
the addressee, any dissemination of this communication is strictly prohibited. Unless otherwise
expressly agreed in writing, nothing stated in this communication shall be legally binding.

The ultimate parent company of the Atkins Group is WS Atkins plc.  Registered in England No.
1885586.  Registered Office Woodcote Grove, Ashley Road, Epsom, Surrey KT18 5BW.

Consider the environment. Please don't print this e-mail unless you really need to. 

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message