lucene-solr-user mailing list archives

From "Glock, Thomas" <thomas.gl...@pfizer.com>
Subject RE: Solr under tomcat - UTF-8 issue
Date Sat, 24 Oct 2009 17:44:35 GMT
Thanks -

I agree.  However, my application requires that results be trimmed for each user based on
roles.  The roles are stored as repeating values on the documents.  Users have many different
role combinations, as do documents.
I recognize that using POST is going to hamper caching - but a GET tends to limit the size
of the search phrase once it is combined with the boolean role clause, and I am concerned
about hitting URL length limits.
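
The trade-off above can be sketched as a client that prefers GET and falls back to POST only
when the URL would exceed a length budget. This is an illustration in Python, not the Flex
client from this thread. The names `MAX_URL_LENGTH` and `choose_method` and the 2000-character
figure are assumptions; the real limit depends on the browser, any proxies, and the servlet
container's configuration.

```python
from urllib.parse import urlencode

# Assumed budget, not a documented limit for any particular browser or server.
MAX_URL_LENGTH = 2000


def choose_method(base_url, params, max_url_length=MAX_URL_LENGTH):
    """Return (method, url, body): GET when the full URL fits, otherwise POST."""
    encoded = urlencode(params)  # percent-encodes values as UTF-8
    url = f"{base_url}?{encoded}"
    if len(url) <= max_url_length:
        return "GET", url, None
    # Too long for the budget: send the same parameters as a form body instead.
    return "POST", base_url, encoded.encode("ascii")
```

Short queries then stay cacheable and visible in the HTTP logs, and only the queries with a
long role clause pay the cost of POST.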

At any rate I solved it thanks to Yonik's recommendation.  

By default, my Flex client's HTTPService sets the Content-Type request header to just
"application/x-www-form-urlencoded".  What Tomcat needed was for the Content-Type request
header to be set to "application/x-www-form-urlencoded; charset=UTF-8".

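A minimal sketch of the same fix, assuming a Python client rather than Flex. The endpoint is
the one quoted further down in this thread, and the query is the decoded form of the
percent-encoded phrase shown there. The second half illustrates why the charset matters: as
far as I understand it, a servlet container that is told no charset falls back to ISO-8859-1
when decoding a form body, which turns the UTF-8 bytes into a different string. Nothing is
sent here, since that would need a running server.

```python
from urllib.parse import urlencode, unquote_plus
from urllib.request import Request

# Endpoint taken from the example URL in this thread; adjust for your install.
SOLR_URL = "http://localhost:8080/hranswers/elevate"

params = {
    "q": "Добавить новых кандидатов",
    "fq": "language_cd:ru",
    "rows": 20,
}

# urlencode percent-encodes the UTF-8 bytes of each value.
body = urlencode(params).encode("ascii")

request = Request(
    SOLR_URL,
    data=body,  # supplying data makes urllib issue a POST
    headers={
        # The charset parameter is the part the Flex default left out.
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    },
)

# The same encoded bytes, decoded under each charset assumption.
encoded_q = urlencode({"q": params["q"]}).split("=", 1)[1]
as_utf8 = unquote_plus(encoded_q, encoding="utf-8")         # the original phrase
as_latin1 = unquote_plus(encoded_q, encoding="iso-8859-1")  # mojibake, matches nothing
```

Passing `request` to `urllib.request.urlopen` would send it.
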
If you have any suggestions regarding limiting results based on user and document role permutations
- I'm all ears.  I've been to the Search Summit in NYC and no vendor seemed to grasp
the concept.  

The problem statement is this: I have users around the globe who need to search for content
tailored to them.  Users searching for 'Holiday' don't get any value from 10,000 documents
containing the word holiday. What they need are documents authored for their population.  The
documents carry the associated role information as metadata, so users will get only the documents
they have access to and that are relevant to them.  That's the plan anyway!  
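
One way to express that plan is a filter query (`fq`) built from the user's roles and sent
alongside the search phrase. The sketch below is in Python and makes several assumptions: the
field name `role`, the role values, and the helper names are all hypothetical, and the real
schema may differ.

```python
from urllib.parse import urlencode


def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes inside it."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_role_filter(user_roles, field="role"):
    """Return an fq clause matching documents tagged with any of the user's roles."""
    if not user_roles:
        # An empty clause would be malformed; fail loudly instead.
        raise ValueError("user has no roles")
    # Deduplicate and sort so the same role set always yields the same string.
    terms = [quote_term(r) for r in sorted(set(user_roles))]
    return f"{field}:(" + " OR ".join(terms) + ")"


def build_query_params(phrase, user_roles, rows=20):
    """Combine the search phrase with the role filter."""
    return {"q": phrase, "fq": build_role_filter(user_roles), "rows": rows}


params = build_query_params("Holiday", ["manager", "hr_us", "manager"])
post_body = urlencode(params)
# params["fq"] is 'role:("hr_us" OR "manager")'
```

Keeping the roles in `fq` rather than in `q` means they restrict the result set without
affecting relevance scoring. Producing an identical string for an identical role set should
also give Solr's filter cache a chance to reuse the filter, though I have not measured that.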

By chance I stumbled on Solr a month or so ago and I think it's awesome.  I got the book two
days ago too - fantastic!

Thanks again,
Tom

-----Original Message-----
From: Walter Underwood [mailto:wunder@wunderwood.org] 
Sent: Saturday, October 24, 2009 1:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Don't use POST. That is the wrong HTTP semantic for search results.  
Use GET. That will make it possible to cache the results, will make your HTTP logs useful,
and all sorts of other good things.

wunder

On Oct 24, 2009, at 10:11 AM, Glock, Thomas wrote:

>
> Thanks - I now think it must be due to my client not sending enough ( 
> or correct ) headers in the request.
>
> Tomcat does work when using an HTTP GET but is failing the POST from 
> my flash client.
>
> For example, putting this in the address bar of both Firefox and IE works
> correctly:
>
> http://localhost:8080/hranswers/elevate?fl=*%20score&indent=on&start=0
> &q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%
> B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%B
> E%D0%B2&fq=language_cd:ru&rows=20
>
> The POST information my client is sending looks like this and it
> fails:
>
> POST /hranswers/elevate HTTP/1.1
> Accept: */*
> Accept-Language: en-US
> x-flash-version: 10,0,32,18
> Content-Type: application/x-www-form-urlencoded
> Content-Encoding: UTF-8
> Content-Length: 209
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; 
> .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 
> 3.0.04506.648; MS-RTC LM 8; .NET CLR 3.0.4506.2152; .NET CLR 
> 3.5.30729; UserABC123)
> Host: localhost:8080
> Connection: Keep-Alive
> Pragma: no-cache
>
> fq=language%5Fcd%3Aru&rows=20&start=0&fl=%2A%20score&indent=on&q=
> %D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE
> %D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD
> %D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2
>
> I will keep digging - and let you know how it turns out.
>
> Thanks!
>
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik 
> Seeley
> Sent: Saturday, October 24, 2009 12:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr under tomcat - UTF-8 issue
>
> Try using example/exampledocs/test_utf8.sh to narrow down if the 
> charset problems you're hitting are due to servlet container 
> configuration.
>
> -Yonik
> http://www.lucidimagination.com
>
>
> 2009/10/24 Glock, Thomas <thomas.glock@pfizer.com>:
>>
>> Thanks but not working...
>>
>> I did have the URIEncoding in place and just again moved the 
>> URIEncoding attribute to be the first attribute - ensured I saved 
>> server.xml, shut down tomcat, deleted logs and cache and still no 
>> luck....  It's probably something very simple and I'm just missing it.
>>
>> Thanks for your help.
>>
>>
>> -----Original Message-----
>> From: Zsolt Czinkos [mailto:czinkos@gmail.com]
>> Sent: Saturday, October 24, 2009 11:36 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr under tomcat - UTF-8 issue
>>
>> Hello
>>
>> Have you set URIEncoding attribute to UTF-8 in tomcat's server.xml 
>> (on connector element)?
>>
>> Like:
>>
>> <Connector URIEncoding="UTF-8" connectionTimeout="20000" port="8080"
>> protocol="HTTP/1.1" redirectPort="8443"/>
>>
>> Hope this helps.
>>
>> Best regards
>>
>> czinkos
>>
>>
>> 2009/10/24 Glock, Thomas <thomas.glock@pfizer.com>:
>>>
>>> Hoping someone can help -
>>>
>>> Problem:
>>>        Querying for non-English phrases such as Добавить does not 
>>> return any results under Tomcat but does work when using the Jetty 
>>> example.
>>>
>>>        Both tomcat and jetty are being queried by the same custom
>>> (flash) client and both reference the same solr/data/index.
>>>
>>>        I'm using an http POST rather than http GET to do the query 
>>> to solr.  I believe the problem must be in how tomcat is configured 
>>> and had hoped the -Dfile.encoding=UTF-8 would solve it
>>> - but no luck.  I've stopped and restarted tomcat and deleted the work 
>>> directory as well.
>>>
>>>        Results are the same in both IE6 and Firefox and I've used 
>>> both firebug and fiddler to view the http request/responses.  It is 
>>> consistent - jetty works, tomcat does not.
>>>
>>> Environment:
>>>        Tomcat 6 as a service on WinXP Professional 2002 sp 2
>>>        Tomcat Java properties -
>>>
>>>        -Dcatalina.home=C:\Program Files\Apache Software 
>>> Foundation\Tomcat 6.0
>>>        -Dcatalina.base=C:\Program Files\Apache Software 
>>> Foundation\Tomcat 6.0
>>>        -Djava.endorsed.dirs=C:\Program Files\Apache Software 
>>> Foundation\Tomcat 6.0\endorsed
>>>        -Djava.io.tmpdir=C:\Program Files\Apache Software 
>>> Foundation\Tomcat 6.0\temp
>>>
>>> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>>>        -Djava.util.logging.config.file=C:\Program Files\Apache 
>>> Software Foundation\Tomcat 6.0\conf\logging.properties
>>>        -Dfile.encoding=UTF-8
>>>
>>> Thanks in advance.
>>> Tom Glock
>>>
>>>
>>
>

