lucene-solr-user mailing list archives

From "Peter Kiraly" <pkir...@tesuji.eu>
Subject date field type problem
Date Wed, 02 Sep 2009 10:00:30 GMT
Hi Solr users,

I have a lot of dates from a library catalog in a format that is
not compatible with solr.DateField. I wrote a new <fieldType>
definition inside the solrconfig.xml, which creates
e.g. 1991-01-01T00:00:01Z from the input string '[c1991.]'.
It works fine when I try typical values at
http://localhost:8983/solr/admin/analysis.jsp,
but it always throws an exception when I try to index
the records.

<fieldType name="trickyDate" class="solr.DateField"
  sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="sh..?wa \d\d? " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="june (\d\d), " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="september (\d\d), " replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="(\D)" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="^(\d{4})\d*$" replacement="$1-01-01T00:00:01"
      replace="all"/>
  </analyzer>
</fieldType>

It is quite possible that I misunderstand something. What I would
like to do is 'normalize' the input data somehow, and I thought
that it would be more efficient to do this on the Solr side than in the client.
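
For comparison, the client-side alternative could look roughly like the
Python sketch below. It only mirrors the filter chain of the fieldType
above, step by step; the function name normalize_date and the STEPS table
are hypothetical names chosen for illustration. Note that the replacement
string of the last filter, as written, has no trailing "Z", so the sketch
returns 1991-01-01T00:00:01 for '[c1991.]'.

```python
import re

# Each (pattern, replacement, count) entry mirrors one
# PatternReplaceFilterFactory from the fieldType above.
# count=1 stands for replace="first", count=0 for replace="all".
STEPS = [
    (r"sh..?wa \d\d? ", "", 1),
    (r"june (\d\d), ", "", 1),
    (r"september (\d\d), ", "", 1),
    (r"(\D)", "", 0),
    (r"^(\d{4})\d*$", r"\1-01-01T00:00:01", 0),
]


def normalize_date(raw):
    """Apply the same steps as the analyzer chain to one raw catalog value."""
    # KeywordTokenizer keeps the whole value as a single token, so we
    # lowercase and trim it, as LowerCaseFilter and TrimFilter would.
    value = raw.lower().strip()
    for pattern, replacement, count in STEPS:
        value = re.sub(pattern, replacement, value, count=count)
    return value


if __name__ == "__main__":
    print(normalize_date("[c1991.]"))
```

A value that contains fewer than four digits does not match the last
pattern and comes back as whatever digits remain (possibly an empty
string), so the client would have to check the result before sending it
to Solr.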

Do you have any advice on how I should continue?

Péter

