lucene-solr-user mailing list archives

From "Lance Norskog" <>
Subject RE: problems with german "Umlaute"
Date Thu, 06 Sep 2007 18:35:26 GMT
I researched this problem before. What I found is that Python strings are
byte strings, not Unicode, by default. You have to decode them explicitly to make them Unicode.
Here are the links I found:

We UTF-8 encode our strings and then submit them, so they end up badly encoded
when stored. We are seeing the problem described by Marc-Andre Lemburg in the link: an e with an acute accent (é) becomes some Japanese character.
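
To illustrate how an accented character can turn into something else, here is a minimal sketch of a codec mismatch. It is written for Python 3, where byte strings and text are separate types, and it assumes the reader wrongly treats the UTF-8 bytes as Latin-1. That codec is only an assumption for illustration; which wrong character shows up in practice depends on the codec the reading side actually assumes.

```python
# Sketch of a codec mismatch (Python 3). The choice of Latin-1 as the
# "wrong" codec is an assumption made only for this illustration.
text = "\u00e9"                            # e with acute accent

utf8_bytes = text.encode("utf-8")          # two bytes: 0xC3 0xA9
misread = utf8_bytes.decode("latin-1")     # each byte becomes its own character
double_encoded = misread.encode("utf-8")   # encoding the misread text again
round_trip = utf8_bytes.decode("utf-8")    # the correct codec restores the text

print(repr(utf8_bytes), repr(misread), repr(double_encoded), repr(round_trip))
```

Once the misread text is encoded and stored again, the stored data is wrong, so the mismatch has to be fixed where the bytes are first decoded.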

-----Original Message-----
From: news [] On Behalf Of Christian Klinger
Sent: Thursday, September 06, 2007 2:55 AM
Subject: problems with german "Umlaute"

Hi all,

I am trying to add/update documents with
the Python API.

Everything works fine so far,
but when I try to add a document that contains German umlauts (ö, ä, ü, ...), I
get errors.

Maybe someone has an idea how I could convert my data?
Should I post this to JIRA?

Thanks for help.

Btw: I have no .

This is my script:
from solr import *
kw = {'id':'12','title':title,'system':'plone','url':''}
c = SolrConnection('')

This is the error:

   File "", line 5, in ?
   File "/usr/local/lib/python2.4/site-packages/", line 596, in
     self.__add(lst, doc)
   File "/usr/local/lib/python2.4/site-packages/", line 710, in __add
     lst.append('<field name=%s>%s</field>' % (
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
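
The error above is what Python raises when a UTF-8 byte string is decoded with the ASCII codec. In Python 2 this happens implicitly when a byte string and a Unicode string are mixed in `%` formatting. The following minimal sketch, written for Python 3 where the decode has to be explicit, reproduces the message and shows one possible fix: decode byte strings to text before building the XML. The helper `build_field` and the sample title are hypothetical and are not part of the Solr Python client.

```python
from xml.sax.saxutils import escape, quoteattr


def build_field(name, value, encoding="utf-8"):
    """Hypothetical helper (not part of solr.py): build one <field> element.

    Byte strings are decoded to text first so that formatting never has to
    guess an encoding.
    """
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return "<field name=%s>%s</field>" % (quoteattr(name), escape(value))


# A title starting with an umlaut, held as UTF-8 bytes (assumed sample data).
raw_title = "Über".encode("utf-8")

# Decoding with the ASCII codec fails on the first byte, 0xc3.
try:
    raw_title.decode("ascii")
except UnicodeDecodeError as exc:
    error_text = str(exc)

# Decoding with the codec the bytes were written in works.
field_xml = build_field("title", raw_title)

print(error_text)
print(field_xml)
```

In the Python 2 script this would correspond to passing Unicode values, for example `title.decode('utf-8')` if `title` holds UTF-8 bytes, although whether that alone is enough depends on how the client library encodes the request.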
