lucene-java-user mailing list archives

From Cyril Gabathuler
Subject Typo Analyzer
Date Wed, 02 Dec 2020 06:51:07 GMT
Hi all

This is the first time I’m using a mailing list, so please bear with me if I’m doing anything wrong.

I’m looking for some support for a specific use case I have.

On our webpage we implemented an “auto suggestion” search based on the AnalyzingInfixSuggester.
As we don’t have a lot of data I used the in-memory approach of Lucene. The final product
looks something like this:

Now I was wondering whether I can make the search more robust, for example against typos.
Is it possible to get the “same” search results for the word sonstige (correct spelling)
and sonnstige (incorrect spelling)?
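
To make the requirement concrete: sonstige and sonnstige differ by a single inserted character, so their edit distance is 1. The sketch below is plain Java, not Lucene API. The class name TypoDistance is made up for illustration, and it only assumes that Levenshtein distance is a reasonable measure of a "typo" here.

```java
/** Illustration only: plain dynamic-programming Levenshtein distance, no Lucene involved. */
class TypoDistance {

    /** Number of single-character insertions, deletions, or substitutions needed to turn a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // match or substitution
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("sonstige", "sonnstige")); // prints 1
    }
}
```

So a lookup that tolerates one edit would, under this assumption, be enough to treat both spellings alike.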

To give a better understanding of how I implemented this (maybe there are other things that
can be improved), here are the important code snippets:

Creating the SearchIndex
private InMemoryLuceneIndex createSearchIndex(int productVersion, LanguageEnum languageEnum) {
    // Pick the language-specific analyzer; each case ends with break so it cannot fall through.
    StopwordAnalyzerBase analyzer = null;
    switch (languageEnum) {
        case DE:
            analyzer = new GermanAnalyzer();
            break;
        case EN:
            analyzer = new EnglishAnalyzer();
            break;
        case IT:
            analyzer = new ItalianAnalyzer();
            break;
        case FR:
            analyzer = new FrenchAnalyzer();
            break;
    }
    InMemoryLuceneIndex inMemoryLuceneIndex = new InMemoryLuceneIndex(new RAMDirectory(), analyzer);
    // The remaining arguments of this call are truncated in the original message.
    final AwsXmlResponseGetRisikoklassifizierungen riskClassification = riskClassificationService.getRiskClassification(productVersion, ...);
    // Map the service response to the objects that are stored in the suggester.
    final List<RiskClassification> riskClassifications = riskClassification.getRisikoklassifizierungen().stream()
            .map(risk -> RiskClassification.builder().isActive(!risk.isInNegativliste()).nogaCode(risk.getRisikonummer()).nogaDescription(risk.getBetriebsart()).nogaKeywords(risk.getStichworte()).build())
            .collect(Collectors.toList());
    inMemoryLuceneIndex.indexRiskClassifications(riskClassifications);
    return inMemoryLuceneIndex;
}

public class InMemoryLuceneIndex {

    private Optional<AnalyzingInfixSuggester> analyzingInfixSuggester;

    public InMemoryLuceneIndex(Directory memoryIndex, StopwordAnalyzerBase analyzer) {
        try {
            analyzingInfixSuggester = Optional.of(new AnalyzingInfixSuggester(memoryIndex, analyzer));
        } catch (IOException e) {
            log.error("unable to create the search index", e);
            analyzingInfixSuggester = Optional.empty();
        }
    }

    /**
     * Ask for a suggestion
     * @param searchTerm
     * @return
     */
    public Optional<List<RiskClassification>> suggest(String searchTerm) {
        if (analyzingInfixSuggester.isPresent()) {
            final List<Lookup.LookupResult> lookupResults;
            try {
                lookupResults = analyzingInfixSuggester.get().lookup(searchTerm, true, 10);
                log.info("found {} results", lookupResults.size());
                return Optional.of(lookupResults.stream()
                        .map(result -> {
                            // Honour the BytesRef offset and length instead of reading the whole backing array.
                            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(
                                    result.payload.bytes, result.payload.offset, result.payload.length))) {
                                return (RiskClassification) in.readObject();
                            } catch (IOException | ClassNotFoundException e) {
                                // Keep the cause so the decoding failure can be diagnosed.
                                throw new IllegalStateException("Could not decode payload", e);
                            }
                        })
                        .collect(Collectors.toList()));
            } catch (IOException e) {
                log.error("unable to lookup", e);
            }
        }
        return Optional.empty();
    }

    /**
     * build index for suggestion search
     * @param riskClassifications
     */
    public void indexRiskClassifications(List<RiskClassification> riskClassifications) {
        log.info("add {} risks to index", riskClassifications.size());
        analyzingInfixSuggester.ifPresent(suggester -> {
            try {
                // ... (the indexing call is missing from the original message)
            } catch (IOException e) {
                log.error("unable to build the index", e);
            }
        });
    }
}

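The suggest method above reads each payload with an ObjectInputStream, which implies that the payload bytes were written with Java serialization when the index was built. That part is not shown, so the following is only a minimal sketch of such a round trip. PayloadRoundTrip, Risk, encode and decode are hypothetical names, Risk is a stand-in for the real RiskClassification, and the values are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class PayloadRoundTrip {

    /** Hypothetical stand-in for RiskClassification; it must be Serializable to be used as a payload. */
    static final class Risk implements Serializable {
        private static final long serialVersionUID = 1L;
        final String nogaCode;
        final String nogaDescription;

        Risk(String nogaCode, String nogaDescription) {
            this.nogaCode = nogaCode;
            this.nogaDescription = nogaDescription;
        }
    }

    /** Serializes a value into the byte array that would be stored as the suggestion payload. */
    static byte[] encode(Serializable value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Deserializes a payload, reading only the slice [offset, offset + length) of the array. */
    static Object decode(byte[] bytes, int offset, int length) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes, offset, length))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Could not decode payload", e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = encode(new Risk("12345", "example description"));
        Risk decoded = (Risk) decode(payload, 0, payload.length);
        System.out.println(decoded.nogaCode + " " + decoded.nogaDescription);
    }
}
```

Passing offset and length explicitly mirrors how a BytesRef exposes its data (bytes, offset, length), so the decoder does not depend on the payload starting at index 0 of the backing array.
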
thanks a lot for any pointers!