tika-dev mailing list archives

From "Ray Gauss II (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1320) extract text from jpeg in solr tika
Date Wed, 04 Jun 2014 11:05:02 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017613#comment-14017613 ]

Ray Gauss II commented on TIKA-1320:

I'm not sure we have enough context in the description of this issue to help much here.

As [~thaichat04] points out, OCR is one way of obtaining text from an image, but there are
also several forms of embedded metadata that can be extracted.
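
To illustrate the embedded-metadata route (as opposed to OCR), here is a minimal sketch in plain Python of where such metadata usually sits in a JPEG file. It is not Tika's implementation and does not use Tika's API. The function name `list_metadata_segments` and the sample bytes are made up for illustration. It only walks the segment headers before the image scan and reports comment, Exif, XMP, and Photoshop/IPTC payloads without decoding them.

```python
import struct

# Well-known identifiers that prefix metadata payloads inside JPEG APPn segments.
EXIF_PREFIX = b"Exif\x00\x00"
XMP_PREFIX = b"http://ns.adobe.com/xap/1.0/\x00"
PHOTOSHOP_PREFIX = b"Photoshop 3.0\x00"


def list_metadata_segments(data: bytes):
    """Return (kind, payload) pairs for metadata segments found before the image scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xFE:
            found.append(("comment", payload))
        elif marker == 0xE1 and payload.startswith(EXIF_PREFIX):
            found.append(("exif", payload[len(EXIF_PREFIX):]))
        elif marker == 0xE1 and payload.startswith(XMP_PREFIX):
            found.append(("xmp", payload[len(XMP_PREFIX):]))
        elif marker == 0xED and payload.startswith(PHOTOSHOP_PREFIX):
            found.append(("photoshop/iptc", payload[len(PHOTOSHOP_PREFIX):]))
        pos += 2 + length
    return found


def segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment (hypothetical helper for the demo below)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Hand-built sample: SOI, a comment, an XMP packet stub, EOI. Not a viewable image.
sample = (
    b"\xff\xd8"
    + segment(0xFE, b"scanned invoice")
    + segment(0xE1, XMP_PREFIX + b"<x:xmpmeta/>")
    + b"\xff\xd9"
)

for kind, payload in list_metadata_segments(sample):
    print(kind, payload)
```

Text stored this way (comments, XMP descriptions, IPTC captions) can be read directly from the file, whereas text that is only visible in the pixels needs OCR.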

Is there specific text you're looking to extract?

> extract text from jpeg in solr tika
> -----------------------------------
>                 Key: TIKA-1320
>                 URL: https://issues.apache.org/jira/browse/TIKA-1320
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: muruganv
>              Labels: features
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> How to extract text from jpeg or image format or tiff in solr tika

This message was sent by Atlassian JIRA
