lucene-dev mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr Size Limitation upto 32 kb limitation
Date Fri, 04 Jan 2019 16:02:55 GMT
First off, the field in question is "FileContent", so why do you think
the field "text" is the problem?
Try switching FileContent to a text-based type.

If that's not the case, then depending on the tokenizer and the input, you
may _still_ have an immense term even with a text-based field. For example,
the data could be something like base64-encoded content, which has no spaces,
and you could be using a tokenizer that only breaks on whitespace.

You simply have to look at the input data to make sense of the problem.
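
As a rough way to inspect the input, here is a minimal Python sketch that splits
text on whitespace and reports any token whose UTF-8 length exceeds 32766 bytes.
The function name oversized_tokens and the sample strings are made up for
illustration, and a plain whitespace split only approximates a tokenizer that
breaks on whitespace; your actual analysis chain may behave differently.

```python
import base64
import re

# The 32766-byte per-term limit reported in the error.
MAX_TERM_BYTES = 32766


def oversized_tokens(text, limit=MAX_TERM_BYTES):
    """Return (char_offset, byte_length) for each whitespace-delimited
    token whose UTF-8 encoding is longer than `limit` bytes."""
    results = []
    for match in re.finditer(r"\S+", text):
        byte_length = len(match.group().encode("utf-8"))
        if byte_length > limit:
            results.append((match.start(), byte_length))
    return results


# Ordinary prose: many short tokens, so nothing exceeds the limit.
prose = "the quick brown fox " * 5000

# Hypothetical base64 payload: 30000 raw bytes encode to 40000
# characters with no whitespace, so they form one single token.
blob = base64.b64encode(bytes(30000)).decode("ascii")

print(oversized_tokens(prose))
print(oversized_tokens("header " + blob))
```

Running something like this over the text extracted from one of the failing
files should show whether a single unbroken run of characters is the culprit.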

Best,
Erick

On Fri, Jan 4, 2019 at 3:32 AM Kranthi Kumar K
<KranthiKumar.K@ccubefintech.com> wrote:
>
> Hi team,
>
>
>
> We are currently using Solr 4.2.1 in our project and everything has been going well.
> But recently we have been facing an issue with Solr Data Import: it does not import files
> larger than 32766 bytes (i.e., 32 KB) and shows 2 exceptions:
>
>
>
> java.lang.IllegalArgumentException
> org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException
>
>
>
> Please find the attached screenshot for reference.
>
>
>
> We have searched for solutions in many forums and didn’t find an exact solution for
> this issue. Interestingly, we found an article suggesting that changing the type of the ‘field’
> from string to ‘text_general’ might solve the issue. Please have a look at the forum post below:
>
>
>
> https://stackoverflow.com/questions/29445323/adding-a-document-to-the-index-in-solr-document-contains-at-least-one-immense-t
>
>
>
> Schema.xml:
>
> Changed from:
>
> ‘<field name="text" type="string_rev" indexed="true" stored="false" multiValued="true" />’
>
>
>
> Changed to:
>
> ‘<field name="text" type="text_general " indexed="true" stored="false" multiValued="true" />’
>
>
>
> We tried this, but it still does not import files larger than 32 KB (32766 bytes).
>
>
>
> Could you please let us know the solution to fix this issue? We’ll be awaiting your
> reply.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
