tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules
Date Wed, 06 Jan 2016 18:08:39 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085946#comment-15085946 ]

Tim Allison edited comment on TIKA-1824 at 1/6/16 6:08 PM:
-----------------------------------------------------------

[~bobpaulin], this is an awesome step forward.  Must have been a fair amount of work. Thank
you!

A few questions...not just for you, but for all.  I'm happy to submit/commit patches, but I
want to make sure I don't do anything objectionable to the community.

* This is probably user error, but I'm getting: \[ERROR\] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:2.10:unpack
(unpack) on project tika-text-module: Unable to find artifact. Could not find artifact org.apache.tika:tika-test-resources:jar:tests:2.0-SNAPSHOT
in apache.snapshots (http://repository.apache.org/snapshots)
* Perhaps rename the artifacts in the parser sub-components to include "Parser(s?)", e.g. Apache
Tika Parser Advanced Module, so that the names sort more clearly (at least in the Maven window
in IntelliJ)?
* Perhaps add "parser(s?)" to the artifactId, e.g. tika-parser-cad-module?
* Perhaps lowercase the names in the parser sub-components so that they're in line with legacy: "Apache
Tika parser advanced module"?
* Pkcs7Parser ... should that be under advanced...or somewhere else ...own crypto package?
* iwork ...should we move that to office?
* tika-test-resources...should we move TikaTest into that and change the name to tika-test?
 I have a vague memory of wanting to carve out a separate test package earlier and adding
TikaTest and something else...
* OutlookPSTParser...move that to office?  
* Does MBox belong in web?  Not sure where to put it.
* Move CommonsDigester to core _if_ we're willing to add a dependency on commons-codec into
core?
* Move Activator to tika-bundle?
* Move pot to multimedia or add tika-parsers-multimedia-advanced-module?
* Move geo.topic to "advanced"...perhaps we rename "advanced" to ner?
* Move ctakes to "advanced/ner"?
* Collapse web and text?

Again, this is fantastic.  Thank you!




was (Author: tallison@mitre.org):
[~bobpaulin], this is an awesome step forward.  Must have been a fair amount of work. Thank
you!

A few questions...not just for you, but for all.  I'm happy to submit/commit patches, but I
want to make sure I don't do anything objectionable to the community.

* This is probably user error, but I'm getting: \[ERROR\] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:2.10:unpack
(unpack) on project tika-text-module: Unable to find artifact. Could not find artifact org.apache.tika:tika-test-resources:jar:tests:2.0-SNAPSHOT
in apache.snapshots (http://repository.apache.org/snapshots)
* Perhaps rename the artifacts in the parser sub-components to include "Parser(s?)", e.g. Apache
Tika Parser Advanced Module, so that the names sort more clearly (at least in the Maven window
in IntelliJ)?
* Perhaps add "parser(s?)" to the artifactId, e.g. tika-parser-cad-module?
* Perhaps lowercase the names in the parser sub-components so that they're in line with legacy: "Apache
Tika parser advanced module"?
* Pkcs7Parser ... should that be under advanced...or somewhere else ...own crypto package?
* iwork ...should we move that to office?
* tika-test-resources...should we move TikaTest into that and change the name to tika-test?
 I have a vague memory of wanting to carve out a separate test package earlier and adding
TikaTest and something else...
* OutlookPSTParser...move that to office?  
* Does MBox belong in web?  Not sure where to put it.
* Move CommonsDigester to core _if_ we're willing to add a dependency on commons-digest into
core?
* Move Activator to tika-bundle?
* Move pot to multimedia or add tika-parsers-multimedia-advanced-module?
* Move geo.topic to "advanced"...perhaps we rename "advanced" to ner?
* Move ctakes to "advanced/ner"?
* Collapse web and text?

Again, this is fantastic.  Thank you!



> Tika 2.0 -  Create Initial Parser Modules
> -----------------------------------------
>
>                 Key: TIKA-1824
>                 URL: https://issues.apache.org/jira/browse/TIKA-1824
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 2.0
>            Reporter: Bob Paulin
>            Assignee: Bob Paulin
>
> Create initial break down of parser modules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
