maven-issues mailing list archives

From "Wolfgang Illmeyer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DOXIA-542) Markdown module converts all apostrophes to quotation mark
Date Tue, 05 Jul 2016 12:34:11 GMT
Wolfgang Illmeyer created DOXIA-542:
---------------------------------------

             Summary: Markdown module converts all apostrophes to quotation mark
                 Key: DOXIA-542
                 URL: https://issues.apache.org/jira/browse/DOXIA-542
             Project: Maven Doxia
          Issue Type: Bug
          Components: Module - Markdown
    Affects Versions: 1.7, 1.4
            Reporter: Wolfgang Illmeyer


Whenever text in a markdown file contains an apostrophe (U+0027, e.g. »don't«),
the apostrophe is replaced, seemingly unconditionally, by a »right single quotation mark« (U+2019).

The problem seems to be an out-of-whack »smart« feature of the underlying pegdown library,
which is supposed to perform all kinds of typographic black magic. I'd suggest disabling it
(or at least making it configurable), because apostrophes are not quotation marks and modern
keyboard layouts already make all the fancy typographic characters easily available, such as
dashes of different lengths, ellipses, and all sorts of quotation marks.

The fix is relatively trivial:

{code}
--- a/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
+++ b/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
@@ -70,7 +70,7 @@ public class MarkdownParser
      * The {@link PegDownProcessor} used to convert Pegdown documents to HTML.
      */
     protected static final PegDownProcessor PEGDOWN_PROCESSOR =
-        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS, Long.MAX_VALUE );
+        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS & ~Extensions.SMARTYPANTS, Long.MAX_VALUE );
 
     /**
      * Regex that identifies a multimarkdown-style metadata section at the start of the document
{code}
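
For readers unfamiliar with the option mask, the patch above works by clearing bits: `& ~FLAG` removes a flag from the set passed to the processor. The following is a minimal, self-contained sketch of that arithmetic. The class name, the `isEnabled` helper, and the numeric constant values are assumptions made for illustration; they are stand-ins and should not be read as pegdown's actual definitions.

```java
// Sketch only: the constant values below are hypothetical stand-ins,
// not taken from pegdown's Extensions interface.
final class ExtensionMaskSketch {
    static final int SMARTS = 0x01;                  // assumed: dashes, ellipses
    static final int QUOTES = 0x02;                  // assumed: curly quotes, apostrophes
    static final int SMARTYPANTS = SMARTS | QUOTES;  // assumed: both of the above
    static final int HARDWRAPS = 0x08;               // assumed value
    static final int ALL = 0xFFFF;                   // assumed: every flag set

    // True if every bit of 'flag' is set in 'options'.
    static boolean isEnabled(int options, int flag) {
        return (options & flag) == flag;
    }

    public static void main(String[] args) {
        int before = ALL & ~HARDWRAPS;                // mask before the patch
        int after = ALL & ~HARDWRAPS & ~SMARTYPANTS;  // mask after the patch

        System.out.println("before: SMARTYPANTS enabled = " + isEnabled(before, SMARTYPANTS));
        System.out.println("after:  SMARTYPANTS enabled = " + isEnabled(after, SMARTYPANTS));
    }
}
```

Because each `& ~FLAG` only clears its own bits, the remaining extensions stay enabled, which is why the change is a one-liner.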

But this makes some tests fail and I didn't have the time to figure out how to fix them.
Also, the resulting apostrophes probably need to be escaped in the HTML.
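
To illustrate the escaping concern, here is a rough sketch of what escaping a plain apostrophe for HTML could look like, together with a check for the unwanted U+2019 that a fixed test might use. Both helper names are hypothetical and nothing here is taken from Doxia or pegdown; whether Doxia actually needs such escaping is an open question.

```java
// Sketch only: hypothetical helpers, not part of Doxia or pegdown.
final class ApostropheEscapeSketch {

    // Replaces each plain apostrophe (U+0027) with the numeric character
    // reference &#39;, which is valid in HTML attribute values and text.
    static String escapeApostrophes(String text) {
        return text.replace("'", "&#39;");
    }

    // True if the text contains a right single quotation mark (U+2019),
    // i.e. the character the »smart« feature substitutes for apostrophes.
    static boolean containsRightSingleQuote(String text) {
        return text.indexOf('\u2019') >= 0;
    }

    public static void main(String[] args) {
        String input = "don't";
        System.out.println(escapeApostrophes(input));
        System.out.println(containsRightSingleQuote(input));
    }
}
```

The numeric reference is used rather than `&apos;` because the latter is not defined in HTML 4.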

I tested the patch with 1.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
