lucene-general mailing list archives

From Erik Hatcher
Subject Re: Analyzer for code?
Date Wed, 07 Sep 2011 14:37:53 GMT
You can customize the stop word list on the analyzer (see the various constructors).

However, with code one generally wants it to be "parsed" and split into separate fields, not
just tokenized/analyzed into a single field.  For example, you might want the class names
separately searchable from the inner code. 
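
As a rough illustration of the "separate fields" idea, the sketch below pulls type names out of a Java source string so they could be indexed apart from the full text. It is not the doclet approach mentioned in this message, and it does not call any Lucene API: `CodeFieldSplitter`, `splitIntoFields`, and the field names `className` and `body` are hypothetical names chosen for this example. A regular expression stands in for a real parser, so declarations that appear inside comments or string literals would also be picked up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits Java source into named fields before indexing. */
final class CodeFieldSplitter {

    // Matches "class Foo", "interface Foo", or "enum Foo" and captures the name.
    // Assumption: a regex is good enough for a sketch; it is not a real parser.
    private static final Pattern TYPE_DECL =
        Pattern.compile("\\b(?:class|interface|enum)\\s+([A-Za-z_$][A-Za-z0-9_$]*)");

    private CodeFieldSplitter() {}

    /**
     * Returns a map from field name to the values that would be indexed in it:
     * "className" holds every declared type name, "body" holds the whole source.
     */
    static Map<String, List<String>> splitIntoFields(String source) {
        List<String> classNames = new ArrayList<>();
        Matcher m = TYPE_DECL.matcher(source);
        while (m.find()) {
            classNames.add(m.group(1));
        }

        // LinkedHashMap keeps the fields in insertion order for predictable output.
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("className", classNames);
        fields.put("body", Collections.singletonList(source));
        return fields;
    }
}
```

Each map entry would then become its own field on the indexed document, so a query could target `className` alone or fall back to `body`. How those fields are analyzed (for example, with no stop words at all for `className`) is a separate choice that this sketch leaves open.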

I've written, in the past, a Java doclet hook to do this sort of thing, letting the javadoc
engine do the heavy lifting.


On Sep 7, 2011, at 09:03 , Alan Williamson (aw2.0 cloud experts) wrote:

> Good Afternoon.
> We find ourselves indexing blocks of literal Java / CFML code. We are using the StandardAnalyzer,
> but it seems a little too keen with respect to stop words. Has anyone come across an Analyzer
> designed for code?
> thanks
> a
