Please do not cross-post; this thread belongs on the users mailing list, not dev.

You have already received the answer several times: clean your input data. You are evidently parsing a PDF containing bad data that results in a single token (word) larger than 32 KB. Clean your input data either in your application or with an Update Processor or a TokenFilter in Solr.
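
As a rough illustration of the application-side option, the Python sketch below drops any whitespace-delimited token whose UTF-8 encoding exceeds 32766 bytes before the text is sent to Solr. The function name drop_immense_tokens and the whitespace split are assumptions made for this example; Solr's analyzers tokenize differently, so treat it as a starting point rather than a drop-in fix.

```python
# Per-term byte limit, taken from the 32766-byte figure quoted in the original report.
MAX_TERM_BYTES = 32766


def drop_immense_tokens(text: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Return text with every whitespace-delimited token over max_bytes removed.

    Hypothetical helper: splitting on whitespace only approximates what a
    Solr analyzer would treat as a single token.
    """
    kept = [
        token
        for token in text.split()
        # The limit is measured in bytes, so check the UTF-8 encoded length.
        if len(token.encode("utf-8")) <= max_bytes
    ]
    # Rejoin with single spaces; the original whitespace layout is not preserved.
    return " ".join(kept)
```

Calling drop_immense_tokens(extracted_text) on the text extracted from each PDF, before building the Solr document, would keep the normal words and discard only the oversized runs.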

Jan Høydahl

On 11 Feb 2019, at 06:27, Kranthi Kumar K <KranthiKumar.K@ccubefintech.com> wrote:

Hi Team,

 

We have not received any suggested solutions. Could you help us by providing a better approach or a solution to fix the issue?

We’ll be awaiting your reply.

 


Thanks & Regards,

Kranthi Kumar.K,

Software Engineer,

Ccube Fintech Global Services Pvt Ltd.,

Email/Skype: kranthikumar.k@ccubefintech.com,

Mobile: +91-8978078449.

 

 

From: Kranthi Kumar K <KranthiKumar.K@ccubefintech.com>
Sent: Friday, February 1, 2019 10:26 AM
To: dev@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida <Anandababu.medida@ccubefintech.com>; Srinivasa Reddy Karri <srinivasareddy.karri@ccubefintech.com>; Ravi Vangala <Ravi.Vangala@ccubefintech.com>; Suresh Malladi <suresh@ccubefintech.com>; Vijay Nandula <vijay.nandula@ccubefintech.com>; Michelle Ngo <Michelle.Ngo@ccube.com.au>
Subject: Re: Solr Size Limitation upto 32 kb limitation

 

Hi Team,

 

Thanks for the suggestions you've posted, but none of them has fixed our issue. Could you please suggest another way to address it?

 

We'll be awaiting your reply.

 

Thanks,

Kranthi kumar.K


From: Michelle Ngo
Sent: Thursday, January 24, 2019 12:00:06 PM
To: Kranthi Kumar K; dev@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida; Srinivasa Reddy Karri; Ravi Vangala; Suresh Malladi; Vijay Nandula
Subject: RE: Solr Size Limitation upto 32 kb limitation

 

Thanks @Kranthi Kumar K for following up

 

From: Kranthi Kumar K <KranthiKumar.K@ccubefintech.com>
Sent: Thursday, 24 January 2019 4:51 PM
To: dev@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida <Anandababu.medida@ccubefintech.com>; Srinivasa Reddy Karri <srinivasareddy.karri@ccubefintech.com>; Michelle Ngo <Michelle.Ngo@ccube.com.au>; Ravi Vangala <Ravi.Vangala@ccubefintech.com>; Suresh Malladi <suresh@ccubefintech.com>; Vijay Nandula <vijay.nandula@ccubefintech.com>
Subject: RE: Solr Size Limitation upto 32 kb limitation

 

Thank you, Bernd Fehling, for your suggested solution. I tried it by changing the type and setting multiValued to true in the schema.xml file, i.e.,

Changed from:

 

<field name="FileContent" type="text_general" indexed="true" stored="true" />

 

Changed to:

 

<field name="FileContent" type="text_general" indexed="true" stored="true" multiValued="true" />

 

Even after this change, we are still unable to import files larger than 32 KB. Please find the solution suggested by Bernd at the URL below:

 

http://lucene.472066.n3.nabble.com/Re-Solr-Size-Limitation-upto-32-kb-limitation-td4421569.html

 

Bernd Fehling, could you please suggest an alternative solution to resolve our issue? That would help us a lot.

 

Please let me know if you have any questions.

 

Thanks & Regards,

Kranthi Kumar.K

 

 

From: Kranthi Kumar K
Sent: Friday, January 18, 2019 4:22 PM
To: dev@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida <Anandababu.medida@ccubefintech.com>; Srinivasa Reddy Karri <srinivasareddy.karri@ccubefintech.com>; Michelle Ngo <Michelle.Ngo@ccube.com.au>; Ravi Vangala <Ravi.Vangala@ccubefintech.com>
Subject: RE: Solr Size Limitation upto 32 kb limitation

 

Hi team,

 

Thank you, Erick Erickson, Bernd Fehling, and Jan Hoydahl, for your suggested solutions. I’ve tried them, and we are still unable to import files larger than 32 KB; the same error is displayed.

 

The link below has the suggested solutions. Please have a look.

 

http://lucene.472066.n3.nabble.com/Solr-Size-Limitation-upto-32-KB-files-td4419779.html

 

  1. As per Erick Erickson, I’ve changed the field from a string type to a text-based type, and the issue still occurs.

I changed from:

 

<field name="FileContent" type="string_rev" indexed="true" stored="true" />

 

Changed to:

 

<field name="FileContent" type="text" indexed="true" stored="true" />

 

With this change, an error appears in the log; please find it in the attachment.

 

If I change to:

 

<field name="FileContent" type="text_general" indexed="true" stored="true" />

 

It does not show any error, but the issue still exists.

 

  2. As per Jan Hoydahl, I have gone through the link that you provided and checked the ‘requestParsers’ tag in solrconfig.xml.

 

The requestParsers tag in our application is as follows:

 

<requestParsers enableRemoteStreaming="true"
                multipartUploadLimitInKB="2048000"
                formdataUploadLimitInKB="2048"
                addHttpRequestToContext="false"/>

The request parsers we are using are similar to the ones in the link you provided, and we are still unable to import files larger than 32 KB.

 

  3. As per Bernd Fehling: we are using Solr 4.10.2. You mentioned,

If you are trying to add larger content then you have to "chop" that 
by yourself and add it as multivalued. Can be done within a self written loader. 

 

I’m new to Solr, and I didn’t understand what exactly a ‘self written loader’ is.

 

Could you please provide some sample code to help us go further?

 

 

Thanks & Regards,

Kranthi Kumar.K

 

 

From: Kranthi Kumar K <KranthiKumar.K@ccubefintech.com>
Sent: Thursday, January 17, 2019 12:43 PM
To: dev@lucene.apache.org; solr-user@lucene.apache.org
Cc: Ananda Babu medida <Anandababu.medida@ccubefintech.com>; Srinivasa Reddy Karri <srinivasareddy.karri@ccubefintech.com>; Michelle Ngo <Michelle.Ngo@ccube.com.au>
Subject: Re: Solr Size Limitation upto 32 kb limitation

 

Hi Team,

 

Are there any updates on the issue below? We are awaiting your reply.

 

Thanks,

Kranthi kumar.K


From: Kranthi Kumar K
Sent: Friday, January 4, 2019 5:01:38 PM
To: dev@lucene.apache.org
Cc: Ananda Babu medida; Srinivasa Reddy Karri
Subject: Solr Size Limitation upto 32 kb limitation

 

Hi team,

 

We are currently using Solr 4.2.1 in our project, and everything has been going well. Recently, however, we have run into an issue with Solr Data Import: it does not import files larger than 32766 bytes (i.e., 32 KB) and shows two exceptions:

 

  1. java.lang.IllegalArgumentException
  2. org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException

 

Please find the attached screenshot for reference.

 

We have searched many forums and did not find an exact solution for this issue. We did find an article suggesting that changing the type of the field from string to ‘text_general’ might solve the issue. Please have a look at the forum thread below:

 

https://stackoverflow.com/questions/29445323/adding-a-document-to-the-index-in-solr-document-contains-at-least-one-immense-t  

 

Schema.xml:

Changed from:

<field name="text" type="string_rev" indexed="true" stored="false" multiValued="true" />

 

Changed to:

<field name="text" type="text_general" indexed="true" stored="false" multiValued="true" />

 

We tried this, but it still does not import files larger than 32 KB (32766 bytes).

 

Could you please let us know the solution to fix this issue? We’ll be awaiting your reply.