tika-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks
Date Thu, 23 Oct 2014 15:47:34 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181483#comment-14181483 ]

ASF GitHub Bot commented on TIKA-1446:
--------------------------------------

GitHub user thaichat04 opened a pull request:

    https://github.com/apache/tika/pull/20

    TIKA-1446

    TIKA-1430, TIKA-1446, TIKA-1447, TIKA-1448: CHM Parser improvement

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tika 1.6

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/20.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20
    
----
commit 58a465391d128c2aa9b11c9f5a986f6bcd28abca
Author: Chris Mattmann <mattmann@apache.org>
Date:   2014-07-28T00:45:03Z

    [maven-release-plugin]  copy for tag 1.6
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/tags/1.6@1613865 13f79535-47bb-0310-9956-ffa450edef68

commit c98da37a4b83bdad6aa86ccc6aaec6b0d647c59a
Author: David Meikle <dmeikle@apache.org>
Date:   2014-07-31T18:29:32Z

    TIKA-1381 - Added Lingo24Translator implementation
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/tags/1.6@1614950 13f79535-47bb-0310-9956-ffa450edef68

commit d831ac12be2fc3303f5dab45b00b53b53b6a67e9
Author: Nick Burch <nick@apache.org>
Date:   2014-08-04T15:41:54Z

    Create a branch for 1.6, to backport the POI upgrade to
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615619 13f79535-47bb-0310-9956-ffa450edef68

commit e2d10e633d38c52b0f490a09043fb43176d26fbe
Author: Nick Burch <nick@apache.org>
Date:   2014-08-04T15:54:55Z

    Merge the POI 3.11 beta 1 upgrade from Trunk to the 1.6 branch (TIKA-1380), ready for
    inclusion in rc2
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615636 13f79535-47bb-0310-9956-ffa450edef68

commit a5942c11cd6a3e75304ce0267c1fc4b5e979c66c
Author: Tim Allison <tallison@apache.org>
Date:   2014-08-04T16:51:40Z

    TIKA-1317 extract contents from SDTs within cells in tables in XWPF (docx) files
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615675 13f79535-47bb-0310-9956-ffa450edef68

commit 68f9a11926946bdea29ab757a8275149d8d057e9
Author: Nick Burch <nick@apache.org>
Date:   2014-08-04T21:27:41Z

    Merge r1615631 from Trunk to 1.6 - Upgrade the Commons Codec version to match that in
    Apache POI, upgraded in TIKA-1380
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615800 13f79535-47bb-0310-9956-ffa450edef68

commit ee988d4daa5b451a51b799b0ec790b88ca7fc111
Author: Tim Allison <tallison@apache.org>
Date:   2014-08-05T13:03:05Z

    TIKA-1275 upgrade Commons Compress to 1.8.1; updated CHANGES.txt, too
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615923 13f79535-47bb-0310-9956-ffa450edef68

commit 9d27e1379fba530def45b470a92ce5052078021c
Author: Tim Allison <tallison@apache.org>
Date:   2014-08-05T18:17:39Z

    TIKA-1380; fix for null ole.getLabel()
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1615970 13f79535-47bb-0310-9956-ffa450edef68

commit 2ee02d85aa703e65607a707ee171c166017916ab
Author: Nick Burch <nick@apache.org>
Date:   2014-08-20T14:16:06Z

    Merge r1619108 from Trunk to the 1.6 branch ready for release - Bump the POI dependency
    to 3.11-beta2, and remove the Geronimo stax one which is no longer required by anything now
    we are on Java 1.6 TIKA-1380
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1619109 13f79535-47bb-0310-9956-ffa450edef68

commit a3eac367cd560c20da4231f45eb18d638d4f91a1
Author: Chris Mattmann <mattmann@apache.org>
Date:   2014-08-31T19:36:36Z

    Bring 1.6 branch up to date with trunk in prep for 1.6 RC #2.
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1621623 13f79535-47bb-0310-9956-ffa450edef68

commit dd2a2b5bad7e363c5ab74db69b89b6083f6fc8ff
Author: Chris Mattmann <mattmann@apache.org>
Date:   2014-08-31T19:44:11Z

    [maven-release-plugin] prepare release 1.6-rc2
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1621627 13f79535-47bb-0310-9956-ffa450edef68

commit 5f9845759fb7839298ac5ee3abb11667035faac3
Author: Chris Mattmann <mattmann@apache.org>
Date:   2014-08-31T19:44:17Z

    [maven-release-plugin] prepare for next development iteration
    
    git-svn-id: https://svn.apache.org/repos/asf/tika/branches/1.6@1621629 13f79535-47bb-0310-9956-ffa450edef68

----


> CHM parser : wrong decompression of aligned blocks
> --------------------------------------------------
>
>                 Key: TIKA-1446
>                 URL: https://issues.apache.org/jira/browse/TIKA-1446
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Bin Hawking
>            Priority: Critical
>         Attachments: chm.zip
>
>
> If an embedded file contains aligned blocks, the parser outputs garbled or empty text
> for that file.
> I have fixed this myself by correcting decompressAlignedBlock() and its preparation methods.
> The bug is mostly due to misuse of the main tree, aligned tree, and length tree. In addition,
> some of the trees are built incorrectly.
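
For readers unfamiliar with the three trees mentioned in the description, the following is a minimal Java sketch of how a match offset is typically assembled in an LZX aligned offset block, based on the publicly documented LZX format. It is not the code from the pull request, and it does not use Tika's API: the class name AlignedOffsetSketch, the method formattedOffset, and the readBits and alignedTree callbacks are all hypothetical names introduced for illustration. It covers only ordinary position slots (the repeated-offset slots 0 to 2 are handled separately in LZX), and it assumes the main tree has already produced the position slot, from which extraBits and positionBase are looked up.

```java
import java.util.function.IntSupplier;
import java.util.function.IntUnaryOperator;

public class AlignedOffsetSketch {

    /**
     * Assembles the formatted offset for one match in an aligned offset block.
     *
     * extraBits    number of footer bits for the position slot (assumed input)
     * positionBase base offset of the position slot (assumed input)
     * readBits     hypothetical raw bit reader: applyAsInt(n) returns the next n bits
     * alignedTree  hypothetical decoder returning the next aligned-tree symbol (0..7)
     */
    static int formattedOffset(int extraBits, int positionBase,
                               IntUnaryOperator readBits, IntSupplier alignedTree) {
        if (extraBits > 3) {
            // High bits are read verbatim; the low 3 bits come from the aligned tree.
            int verbatim = readBits.applyAsInt(extraBits - 3) << 3;
            int aligned = alignedTree.getAsInt();
            return positionBase + verbatim + aligned;
        } else if (extraBits == 3) {
            // Exactly 3 footer bits: all of them come from the aligned tree.
            return positionBase + alignedTree.getAsInt();
        } else if (extraBits > 0) {
            // Fewer than 3 footer bits: read verbatim, aligned tree is not used.
            return positionBase + readBits.applyAsInt(extraBits);
        } else {
            // No footer bits at all.
            return positionBase;
        }
    }

    public static void main(String[] args) {
        // Slot with 5 footer bits and base 64; the stub reader returns 2 for the
        // two verbatim bits and the stub aligned tree returns symbol 5.
        int offset = formattedOffset(5, 64, n -> 2, () -> 5);
        System.out.println(offset); // prints 85 (64 + (2 << 3) + 5)
    }
}
```

In this scheme the main tree selects the position slot and length header, the length tree is consulted only for longer matches, and the aligned tree supplies the low three offset bits. Mixing up which tree decodes which field would produce the kind of garbled output the report describes, though the exact defects fixed in the patch are not stated in this message.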



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
