lucene-solr-user mailing list archives

From Alexis Aravena Silva <aarav...@itsofteg.com>
Subject Re: Problems creating index for suggestions
Date Wed, 05 Apr 2017 13:41:36 GMT
Hi Erick,


numDocs and maxDocs are both 8.


This is the content of the field _sugerencia_:


[inline image not preserved in the archive]



I've noticed that the problem occurs when Solr builds the fuzzySuggester index. For this type of
suggester, the temp file grows enormously, and it disappears when the process finishes.
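
One possible explanation, offered as an untested assumption rather than a confirmed diagnosis: the index analyzer of text_suggestion includes EdgeNGramFilterFactory (minGramSize="1", maxGramSize="15"), while FuzzyLookupFactory already matches prefixes of the analyzed text on its own, so the n-grams may be inflating what the suggester has to process during the build. A minimal sketch of a field type without that filter is shown here; the name text_suggestion_fuzzy is hypothetical and nothing in it has been run against this index.

```xml
<!-- Hypothetical field type for the fuzzy suggester only: same tokenizer and
     filters as text_suggestion, minus EdgeNGramFilterFactory. -->
<fieldType name="text_suggestion_fuzzy" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

To try it, the fuzzySuggester entry would point suggestAnalyzerFieldType at text_suggestion_fuzzy instead of text_suggestion, leaving infixSuggester unchanged.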



Regards.


________________________________
From: Erick Erickson <erickerickson@gmail.com>
Sent: Tuesday, April 4, 2017 8:05:42 PM
To: solr-user
Subject: Re: Problems creating index for suggestions

Something's indeed not what I'd expect here. One note: buildOnCommit
will rebuild the suggester every time the index has a document
committed _anywhere_. So if there's any activity at all in terms of
indexing your suggester is being built. I.e. if you have your
autocommit interval set to 1 minute and are actively indexing, your
suggester gets rebuilt every minute.
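
As a sketch of the alternative, assuming the suggester only needs to be refreshed on a schedule, the fuzzySuggester entry from the quoted configuration could turn automatic rebuilds off. Every setting other than buildOnCommit is copied from that configuration.

```xml
<!-- Sketch only: same suggester as in the quoted config, but it is no longer
     rebuilt on every commit. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">fuzzySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="indexPath">fuzzy_suggestions</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">_sugerencia_</str>
    <str name="payloadField">idTipoRegistro</str>
    <str name="suggestAnalyzerFieldType">text_suggestion</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

With this in place, a rebuild would be requested explicitly, for example from a weekly job, by calling the /suggest handler with suggest=true&suggest.build=true&suggest.dictionary=fuzzySuggester.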

But that's not your problem. How big is the index this suggester is
part of? You say 8 documents. Exclusive of the suggester parts of the
index, how big is the rest of your index on disk?

The suggester re-reads all of the stored values in your entire base
index for the field _sugerencia_ to build itself. So I'm guessing that
when you say the index is 8 documents it's not quite what you think it
is.

On the admin screen, what are numDocs and maxDocs for the index in question?

Best,
Erick

On Tue, Apr 4, 2017 at 2:11 PM, Alexis Aravena Silva
<aaravena@itsofteg.com> wrote:
> Hi,
>
>
> I'm creating an index for suggestions. When I rebuild the index with 8 documents, Solr creates a temp file that consumes over 20GB in the process, and it takes more than 10 minutes to reindex. What is the problem? It makes no sense for Solr to take so long and consume so much of my disk:
>
>
>
> Field Type Definition:
>
>
> <fieldType name="text_suggestion" class="solr.TextField" positionIncrementGap="100" multiValued="true">
>       <analyzer type="index">
>         <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>         <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/>
>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>       </analyzer>
>     </fieldType>
>
>
> Suggester Configuration:
>
>
> <searchComponent name="suggest" class="solr.SuggestComponent">
>     <lst name="suggester">
>       <str name="name">fuzzySuggester</str>
>       <str name="lookupImpl">FuzzyLookupFactory</str>
>       <str name="indexPath">fuzzy_suggestions</str>
>       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>       <str name="field">_sugerencia_</str>
>       <str name="payloadField">idTipoRegistro</str>
>       <str name="suggestAnalyzerFieldType">text_suggestion</str>
>       <str name="buildOnStartup">false</str>
>       <str name="buildOnCommit">true</str>
>     </lst>
>     <lst name="suggester">
>       <str name="name">infixSuggester</str>
>       <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
>       <str name="indexPath">infix_suggestions</str>
>       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>       <str name="field">_sugerencia_</str>
>       <str name="payloadField">idTipoRegistro</str>
>       <str name="suggestAnalyzerFieldType">text_suggestion</str>
>       <str name="buildOnStartup">false</str>
>       <str name="buildOnCommit">true</str>
>     </lst>
>   </searchComponent>
>   <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
>     <lst name="defaults">
>       <str name="suggest">true</str>
>       <str name="suggest.dictionary">infixSuggester</str>
>       <str name="suggest.dictionary">fuzzySuggester</str>
>       <str name="suggest.onlyMorePopular">true</str>
>       <str name="suggest.count">10</str>
>       <str name="suggest.collate">true</str>
>     </lst>
>     <arr name="components">
>       <str>suggest</str>
>     </arr>
>   </requestHandler>
>
>
>
> I rebuild the suggestions once a week, which is why I set buildOnCommit = true.
>
>
> Regards.
