lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject RegexReplaceProcessorFactory pattern to detect multiple \n
Date Thu, 07 Feb 2019 13:08:15 GMT
Hi,

I am trying to use the RegexReplaceProcessorFactory to remove more than two
\n with any number of spaces between them (Eg: \n\n, \n \n, \n \n  \n \n),
and replace it with two <br>.

I use the following regex pattern and it is working when I test it in
regex101.com. But it is not working when I put it inside the
RegexReplaceProcessorFactory as below:

<updateRequestProcessorChain name="removeCode">
<processor class="solr.RegexReplaceProcessorFactory">
   <str name="fieldName">content</str>
   <str name="pattern">"(\\n\s*){2,}"</str>
   <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
</processor>
          </updateRequestProcessorChain>

To explain further about my regex pattern, \s* is instructing the regex to
match any \n that have space after and {2,} is instructing the regex to
match 2 or more occurrence of such pattern (\n).

Please kindly let me know what is wrong and how should I do it?

I am using Solr 7.6.0.

Regards,
Edwin

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message