lucene-java-user mailing list archives

From Paul Taylor
Subject Re: StandardFilter not handling dots as exptected ?
Date Thu, 06 Aug 2009 15:20:47 GMT
Ian Lea wrote:
> See which appears to
> be talking about the same sort of thing, and
> StandardAnalyzer.setReplaceInvalidAcronym(b).
> Quite how you deal with this in your own analyzer is left as an exercise ...

Yes, I think you are right, though I don't understand it fully.

        TokenStream ts = analyzer.tokenStream("content", new StringReader("R.E.S."));
        Token t;
        // TokenStream.next() returns null once the stream is exhausted
        while ((t = ts.next()) != null) { System.out.println("R.E.S. parsed to :" + t); }

        ts = analyzer.tokenStream("content", new StringReader("R.E.S"));
        while ((t = ts.next()) != null) { System.out.println("R.E.S parsed to :" + t); }

this code outputs

R.E.S. parsed to :(res,0,6,type=<ACRONYM>)
R.E.S parsed to :(r.e.s,0,5,type=<HOST>)

So from my perspective I cannot see why it thinks R.E.S is a HOST; it should
be an acronym. Also, for the one that is recognized as an acronym, I thought
it would end up as r.e.s, not res.
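
One possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: the tokenizer grammar appears to require a trailing dot for an acronym (a letter followed by a dot, repeated), while dot-separated alphanumerics without a trailing dot fall under the host rule, and StandardFilter then strips the dots from tokens typed as acronyms. The regular expressions below are assumed approximations of those rules, and the class and method names (AcronymVsHostSketch, classify, normalize) are hypothetical, not Lucene API.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class AcronymVsHostSketch {
    // Assumed approximation of the ACRONYM rule: letter + dot, at least twice.
    static final Pattern ACRONYM = Pattern.compile("[A-Za-z]\\.(?:[A-Za-z]\\.)+");
    // Assumed approximation of the HOST rule: alphanumeric runs joined by dots,
    // with no trailing dot.
    static final Pattern HOST = Pattern.compile("[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)+");

    // Returns the token type this sketch would assign to the whole input.
    static String classify(String text) {
        if (ACRONYM.matcher(text).matches()) return "<ACRONYM>";
        if (HOST.matcher(text).matches()) return "<HOST>";
        return "<ALPHANUM>";
    }

    // Mimics the filter chain: lowercase everything, drop dots only for acronyms.
    static String normalize(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        return classify(text).equals("<ACRONYM>") ? lower.replace(".", "") : lower;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"R.E.S.", "R.E.S"}) {
            System.out.println(s + " -> " + normalize(s) + " " + classify(s));
        }
    }
}
```

Under these assumptions, "R.E.S." comes out as res with type <ACRONYM> and "R.E.S" as r.e.s with type <HOST>, which matches the two lines printed by the analyzer. An analyzer that should keep the dots, or treat "R.E.S" as an acronym, would presumably need its own filter step after the tokenizer.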
