lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: Tika HTTP 400 Errors with DIH
Date Thu, 04 Dec 2014 16:43:21 GMT
A 400 error means the server rejected the request itself as malformed
(Bad Request); a missing resource would normally produce a 404 instead.
So, it would be useful to see what URL is actually being requested.

Can you run some sort of network tracer to see the actual network
request (dtrace, Wireshark, etc)? That will cut the problem in
half for you.
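
A quick check that can complement a network trace is to look at the
stored value for invisible characters, since a stray newline or space in
the URL is one plausible cause of a 400. The following is only a sketch in
Python: find_url_problems is a hypothetical helper, not part of Solr, DIH,
or Tika, and the URL is the placeholder quoted in this thread. It assumes
the field value can be copied out of the database or Solr unchanged.

```python
from urllib.parse import urlsplit


def find_url_problems(raw):
    """Return a list of human-readable problems found in a URL string.

    Hypothetical helper for illustration only; not part of Solr or Tika.
    """
    problems = []

    # A trailing newline or space is invisible in most query output.
    if raw != raw.strip():
        problems.append("leading or trailing whitespace")

    # Whitespace or control characters inside the URL must be percent-encoded.
    for ch in raw.strip():
        if ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F:
            problems.append("embedded whitespace or control character %r" % ch)
            break

    # Non-ASCII characters also need percent-encoding before being sent.
    if any(ord(ch) > 0x7E for ch in raw):
        problems.append("non-ASCII character (needs percent-encoding)")

    # A value without a scheme and host is not an absolute HTTP URL.
    parts = urlsplit(raw.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        problems.append("missing or unexpected scheme/host")

    return problems


# Placeholder URL taken from the thread; substitute the real field value.
url = "http://www.someaddress.com/documents/document1.docx"
print(repr(url), find_url_problems(url))
print(find_url_problems(url + "\r\n"))
```

Printing the value with repr() makes hidden characters visible, which a
browser address bar tends to strip silently when a URL is pasted in.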

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 4 December 2014 at 09:42, Teague James <teaguej@insystechinc.com> wrote:
> The database stores the URL as a CLOB. Querying Solr shows that the field value is "http://www.someaddress.com/documents/document1.docx"
> The URL works if I copy and paste it to the browser, but Tika gets a 400 error.
>
> Any ideas?
>
> Thanks!
> -Teague
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafalov@gmail.com]
> Sent: Tuesday, December 02, 2014 1:45 PM
> To: solr-user
> Subject: Re: Tika HTTP 400 Errors with DIH
>
> On 2 December 2014 at 13:19, Teague James <teaguej@insystechinc.com> wrote:
>> clob="true"
>
> What is ClobTransformer doing on the DownloadURL field? Is it possible it is corrupting
> the value somehow?
>
> Regards,
>    Alex.
>
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
