tika-dev mailing list archives

From "Luis Filipe Nassif (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-1007) Improve Concurrency of ParsingReader
Date Sun, 14 Oct 2012 20:31:03 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luis Filipe Nassif updated TIKA-1007:

    Attachment: ParsingReaderTest.java
> Improve Concurrency of ParsingReader
> ------------------------------------
>                 Key: TIKA-1007
>                 URL: https://issues.apache.org/jira/browse/TIKA-1007
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.2
>         Environment: jre 1.7.0_05 x64, Windows 7 Enterprise x64
>            Reporter: Luis Filipe Nassif
>         Attachments: FastPipedReader.java, FastPipedWriter.java, ModifiedParsingReader.java,
>                      ModifiedParsingReaderTest.java, ParsingReaderTest.java
> As discussed in TIKA-885, the PipedReader and PipedWriter classes have a bug that prevents
> them from running concurrently: they notify each other only when the pipe is full or empty,
> not after each char is read from or written to the pipe. This limits the concurrency of the
> reader and writer sides of ParsingReader. Run the attached ParsingReaderTest.java and you
> will see that only one processor is used (25% CPU on my quad-core machine). So I modified
> ParsingReader to use modified versions of PipedReader and PipedWriter that work concurrently.
> Run the attached ModifiedParsingReaderTest.java and you will see that two processors are
> used (50% CPU on my machine). The attached FastPipedReader.java and FastPipedWriter.java are
> for demonstration purposes only: I took the base code from the net and changed it, so it
> may be subject to license restrictions.
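
The idea of notifying the other side after every transfer, not only when the pipe is full or empty, can be illustrated with a minimal sketch. This is not the attached FastPipedReader/FastPipedWriter code, which is not reproduced in this message. The class name NotifyingCharPipe and the helper roundTrip are hypothetical, and the sketch assumes Java 8 or later (it uses a lambda) and a single reader and a single writer thread.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical bounded char pipe that signals the peer after every transfer. */
public class NotifyingCharPipe {
    private final char[] buffer;
    private int head;       // index of the next char to read
    private int count;      // number of chars currently buffered
    private boolean closed; // set by the writer when it has finished
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();

    public NotifyingCharPipe(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        buffer = new char[capacity];
    }

    /** Writer side: blocks only while the buffer is full. */
    public void write(char c) throws InterruptedException {
        lock.lock();
        try {
            while (count == buffer.length) {
                notFull.await();
            }
            buffer[(head + count) % buffer.length] = c;
            count++;
            notEmpty.signal(); // wake the reader after every write
        } finally {
            lock.unlock();
        }
    }

    /** Reader side: returns -1 once the pipe is closed and drained. */
    public int read() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                if (closed) {
                    return -1;
                }
                notEmpty.await();
            }
            char c = buffer[head];
            head = (head + 1) % buffer.length;
            count--;
            notFull.signal(); // wake the writer after every read
            return c;
        } finally {
            lock.unlock();
        }
    }

    /** Marks the end of the stream and wakes a reader waiting for data. */
    public void close() {
        lock.lock();
        try {
            closed = true;
            notEmpty.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Writes text from a second thread and reads it back on the calling thread. */
    public static String roundTrip(String text, int capacity) throws InterruptedException {
        NotifyingCharPipe pipe = new NotifyingCharPipe(capacity);
        Thread writer = new Thread(() -> {
            try {
                for (char c : text.toCharArray()) {
                    pipe.write(c);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                pipe.close();
            }
        });
        writer.start();
        StringBuilder out = new StringBuilder();
        for (int c = pipe.read(); c != -1; c = pipe.read()) {
            out.append((char) c);
        }
        writer.join();
        return out.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(roundTrip("parsed text", 4));
    }
}
```

Running main should print the input text unchanged. Because each side signals the other on every char, a waiting reader resumes as soon as data arrives and a waiting writer resumes as soon as space is free. Whether this matches the CPU usage reported above would have to be measured with the attached tests.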

