lucene-solr-user mailing list archives

From <paul.d...@ub.unibe.ch>
Subject Re: RegexReplaceProcessorFactory pattern to detect multiple \n
Date Thu, 07 Feb 2019 13:13:45 GMT
You don’t say what happens, just that it is not working. I assume nothing is replaced? Perhaps
the pattern should be

   <str name="pattern">"(\n\s*){2,}"</str>

??
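
For what it's worth, here is a minimal sketch of why the doubled backslash may matter. It assumes the processor hands the text of the pattern element unchanged to java.util.regex.Pattern, which I have not checked against the Solr source. The class and method names (PatternEscapeCheck, finds) are made up for illustration. XML does not treat the backslash specially, so whatever is typed in solrconfig.xml is what the regex engine would see under that assumption.

```java
import java.util.regex.Pattern;

public class PatternEscapeCheck {
    // Returns true if the regex finds at least one match in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Sample field value containing real newline characters.
        String text = "line one\n \nline two";

        // Java source needs every backslash doubled; the regex text that
        // the engine actually sees is shown in the trailing comments.
        String single  = "(\\n\\s*){2,}";      // regex text: (\n\s*){2,}
        String doubled = "(\\\\n\\s*){2,}";    // regex text: (\\n\s*){2,}
        String quoted  = "\"(\\n\\s*){2,}\"";  // regex text: "(\n\s*){2,}"

        System.out.println(finds(single, text));   // true: \n is a newline
        System.out.println(finds(doubled, text));  // false: \\n is a literal backslash followed by 'n'
        System.out.println(finds(quoted, text));   // false: the quotes are literal characters
    }
}
```

If that assumption holds, the surrounding double quotes would also be part of the regex, so it may be worth trying the pattern without them as well.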



From: Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
Sent: Thursday, 7 February 2019 14:08
To: solr-user@lucene.apache.org
Subject: RegexReplaceProcessorFactory pattern to detect multiple \n



Hi,

I am trying to use the RegexReplaceProcessorFactory to replace two or more
\n, with any number of spaces between them (e.g. \n\n, \n \n, \n \n  \n \n),
with two <br>.

I use the following regex pattern, and it works when I test it on
regex101.com, but it does not work when I put it inside the
RegexReplaceProcessorFactory as below:

<updateRequestProcessorChain name="removeCode">
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">content</str>
    <str name="pattern">"(\\n\s*){2,}"</str>
    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
  </processor>
</updateRequestProcessorChain>

To explain my regex pattern further: \s* tells the regex to match any
whitespace that follows a \n, and {2,} tells it to match 2 or more
occurrences of that group (a \n followed by optional whitespace).
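
The intended behaviour can be sketched in plain Java, outside Solr. This is only an illustration of what the pattern is meant to do, using java.util.regex directly. The class and method names (NewlineCollapse, collapse) are hypothetical, and the pattern is written with a single backslash before n at the regex level.

```java
import java.util.regex.Pattern;

public class NewlineCollapse {
    // Regex text: (\n\s*){2,}  (backslashes are doubled only because this is Java source)
    private static final Pattern BLANK_RUN = Pattern.compile("(\\n\\s*){2,}");

    // Replaces every run of two or more newlines, with optional whitespace
    // between and after them, by two <br> tags.
    static String collapse(String input) {
        return BLANK_RUN.matcher(input).replaceAll("<br><br>");
    }

    public static void main(String[] args) {
        System.out.println(collapse("a\n\nb"));          // a<br><br>b
        System.out.println(collapse("a\n \nb"));         // a<br><br>b
        System.out.println(collapse("a\n \n  \n \nb"));  // a<br><br>b
        // A single newline is not a run of two or more, so it is left alone.
        System.out.println(collapse("a\nb").equals("a\nb"));  // true
    }
}
```

Note that the trailing \s* also swallows any spaces that follow the last \n of a run, so leading indentation on the next line would be removed along with the newlines.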

Please let me know what is wrong and how I should do this.

I am using Solr 7.6.0.

Regards,
Edwin
