tika-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2016) A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.
Date Wed, 03 May 2017 20:15:04 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995572#comment-15995572 ]

ASF GitHub Bot commented on TIKA-2016:
--------------------------------------

chrismattmann commented on issue #169: TIKA-2016 Sentiment Analysis Parser Contributed by amensiko and thammegowda
URL: https://github.com/apache/tika/pull/169#issuecomment-299022223
 
 
   ## Build passes:
   
   ```
   [INFO] ------------------------------------------------------------------------
   [INFO] Building Apache Tika 1.15-SNAPSHOT
   [INFO] ------------------------------------------------------------------------
   [INFO] 
   [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ tika ---
   [INFO] 
   [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika ---
   [INFO] 
   [INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ tika ---
   [INFO] 
   [INFO] --- forbiddenapis:2.2:check (default) @ tika ---
   [INFO] Skipping execution for packaging "pom"
   [INFO] 
   [INFO] --- forbiddenapis:2.2:testCheck (default) @ tika ---
   [INFO] Skipping execution for packaging "pom"
   [INFO] 
   [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ tika ---
   [INFO] Installing /Users/mattmann/git/tika-gh/pom.xml to /Users/mattmann/.m2/repository/org/apache/tika/tika/1.15-SNAPSHOT/tika-1.15-SNAPSHOT.pom
   [INFO] ------------------------------------------------------------------------
   [INFO] Reactor Summary:
   [INFO] 
   [INFO] Apache Tika parent ................................. SUCCESS [  1.047 s]
   [INFO] Apache Tika core ................................... SUCCESS [ 22.697 s]
   [INFO] Apache Tika parsers ................................ SUCCESS [03:23 min]
   [INFO] Apache Tika XMP .................................... SUCCESS [  1.649 s]
   [INFO] Apache Tika serialization .......................... SUCCESS [  1.436 s]
   [INFO] Apache Tika batch .................................. SUCCESS [01:50 min]
   [INFO] Apache Tika language detection ..................... SUCCESS [  3.730 s]
   [INFO] Apache Tika application ............................ SUCCESS [ 32.442 s]
   [INFO] Apache Tika OSGi bundle ............................ SUCCESS [ 17.231 s]
   [INFO] Apache Tika translate .............................. SUCCESS [  1.825 s]
   [INFO] Apache Tika server ................................. SUCCESS [ 35.426 s]
   [INFO] Apache Tika examples ............................... SUCCESS [  9.609 s]
   [INFO] Apache Tika Java-7 Components ...................... SUCCESS [  1.738 s]
   [INFO] Apache Tika eval ................................... SUCCESS [ 24.480 s]
   [INFO] Apache Tika ........................................ SUCCESS [  0.022 s]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time: 07:48 min
   [INFO] Finished at: 2017-05-03T13:03:29-07:00
   [INFO] Final Memory: 164M/1530M
   [INFO] ------------------------------------------------------------------------
   LMC-053601:tika-gh mattmann$ 
   ```
   
   I also tried it myself on the following files:
   
   `sample.sent`
   ``` 
   Man I'm so tired of battling against OSGI!
   ```
   
   `sample2.sent`
   ```
   Whatever, I need some cooling off time!
   ```
   
   # Binary sentiment
   ```
   LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar \
   >          --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp.xml \
   >          -m sample.sent
   WARN  JBIG2ImageReader not loaded. jbig2 files will be ignored
   INFO  Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/en-netflix-sentiment.bin
   Content-Length: 43
   Content-Type: application/sentiment
   Sentiment: negative
   X-Parsed-By: org.apache.tika.parser.CompositeParser
   X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
   resourceName: sample.sent
   LMC-053601:tika-gh mattmann$ 
   ```
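A minimal sketch of how the `-m` output above could be consumed by a script, assuming the metadata is printed one `Key: value` pair per line, as in the transcript. The helper name `parse_tika_metadata` is hypothetical and is not part of Tika; repeated keys such as `X-Parsed-By` are collected into lists.

```python
from collections import defaultdict

# Sample taken from the transcript above (shell prompt lines omitted).
SAMPLE_OUTPUT = """\
WARN  JBIG2ImageReader not loaded. jbig2 files will be ignored
Content-Length: 43
Content-Type: application/sentiment
Sentiment: negative
X-Parsed-By: org.apache.tika.parser.CompositeParser
X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
resourceName: sample.sent
"""


def parse_tika_metadata(output: str) -> dict[str, list[str]]:
    """Collect 'Key: value' lines into a dict of lists (hypothetical helper)."""
    metadata = defaultdict(list)
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the WARN/INFO log lines mixed into the output.
        if not line or line.startswith(("WARN", "INFO")):
            continue
        key, sep, value = line.partition(": ")
        if sep:  # keep only lines that actually contain a 'Key: value' pair
            metadata[key].append(value)
    return dict(metadata)


if __name__ == "__main__":
    meta = parse_tika_metadata(SAMPLE_OUTPUT)
    print(meta["Sentiment"])
```

The same function should work unchanged on the categorical output, since only the value of `Sentiment` differs there.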
   
   # Categorical (multi-class sentiment)
   Changing to use `sample2.sent`
   
   ```
   LMC-053601:tika-gh mattmann$ java -jar tika-app/target/tika-app-1.15-SNAPSHOT.jar --config=tika-parsers/src/test/resources/org/apache/tika/parser/sentiment/analysis/tika-config-sentiment-opennlp-cat.xml -m sample2.sent
   WARN  JBIG2ImageReader not loaded. jbig2 files will be ignored
   INFO  Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/ht-sentiment-categ.bin
   Content-Length: 39
   Content-Type: application/sentiment
   Sentiment: angry
   X-Parsed-By: org.apache.tika.parser.CompositeParser
   X-Parsed-By: org.apache.tika.parser.sentiment.analysis.SentimentParser
   resourceName: sample2.sent
   LMC-053601:tika-gh mattmann$ 
   ```
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2016
>                 URL: https://issues.apache.org/jira/browse/TIKA-2016
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Anastasija Mensikova
>            Assignee: Chris A. Mattmann
>              Labels: analysis, gsoc2016, memex, parser, sentiment
>             Fix For: 1.15
>
>
> A new project that implements a parser that uses Apache OpenNLP and Apache Tika to perform Sentiment Analysis.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
