tika-dev mailing list archives

From "Chris Mattmann" <mattm...@apache.org>
Subject Re: Review Request 31758: TIKA-1330: tika batch code
Date Thu, 05 Mar 2015 03:56:58 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75289
-----------------------------------------------------------



trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java
<https://reviews.apache.org/r/31758/#comment122268>

    awesome! :) I wrote that one.



trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java
<https://reviews.apache.org/r/31758/#comment122269>

    Why not use commons-io? http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html


- Chris Mattmann


On March 5, 2015, 3:07 a.m., Tim Allison wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31758/
> -----------------------------------------------------------
> 
> (Updated March 5, 2015, 3:07 a.m.)
> 
> 
> Review request for tika.
> 
> 
> Repository: tika
> 
> 
> Description
> -------
> 
> TIKA-1330: tika batch code integrated into tika-app.  This offers robust batch processing code (filesystem input -> filesystem output) on a single machine.  The goals are:
> 
> 1) make the code robust against permanent hangs and out-of-memory (OOM) errors
> 2) enable easy(ish) extensibility
> 3) include robust logging
> 
> 
> Diffs
> -----
> 
>   trunk/pom.xml 1664211
>   trunk/tika-app/pom.xml 1664211
>   trunk/tika-app/src/main/java/org/apache/tika/cli/BatchCommandLineBuilder.java PRE-CREATION
>   trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java 1664211
>   trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLIBatchCommandLineTest.java PRE-CREATION
>   trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java 1664211
>   trunk/tika-batch/pom.xml PRE-CREATION
>   trunk/tika-batch/src/main/examples/batchExecutor.sh PRE-CREATION
>   trunk/tika-batch/src/main/examples/log4j.xml PRE-CREATION
>   trunk/tika-batch/src/main/examples/log4j_driver.xml PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/AutoDetectParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchNoRestartError.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcess.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/CommandLineInterrupter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ConsumersManager.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileConsumerFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawlerFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileStarted.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IFileProcessorFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IInterrupter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IStatusReporter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/InterrupterFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/OutputStreamFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ParallelFileProcessingResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/PoisonFileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/SimpleLogStatusReporter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/StatusReporterFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/AbstractConsumersBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/BatchProcessBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineInterrupterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineParserBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ContentHandlerFactoryBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/DefaultContentHandlerFactoryBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ICrawlerBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/IInterupterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMAndQueueBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/SimpleLogReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/StatusReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/BasicTikaFSConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSBatchProcessCLI.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSConsumersManager.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDirectoryCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDocumentSelector.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSFileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSListCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSOutputStreamFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSProperties.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/BasicTikaFSConsumersBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/FSCrawlerBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/strawman/StrawManTikaAppDriver.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/BatchLocalization.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/ClassLoaderUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/DurationFormatUtils.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/PropsUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/XMLDOMUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/resources/org/apache/tika/batch/fs/default-tika-batch-config.xml PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/CommandLineParserBuilderTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchDriverTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchProcessTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/FSBatchTestBase.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/HandlerBuilderTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/OutputStreamFactoryTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/StringStreamGobbler.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/strawman/StrawmanTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/parser/evil/EvilParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/assertion_error.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/hang_heavy_load.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/no_problem.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/oom_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/runtime_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/sleep.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/sleep_2000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika-evil-config.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika-evil-mimetypes.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/log4j.properties PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/basic/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load1.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load2.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load3.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load4.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load5.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test2.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/hang_heavy_load1.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test2.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test4.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/hang_heavy_load.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1b_oom_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test2.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test4.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/timeout_after_early_termination/asleep_60000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/wait_after_early_termination/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-basic-test.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-evil-test.xml PRE-CREATION
>   trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java 1664211
> 
> Diff: https://reviews.apache.org/r/31758/diff/
> 
> 
> Testing
> -------
> 
> Code has been in development as part of another fielded project for the last two years.  Numerous unit tests... could always use more.
> 
> 
> Thanks,
> 
> Tim Allison
> 
>

