jackrabbit-oak-issues mailing list archives

From "Michael Marth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-653) Improve binaries handling in Solr index
Date Wed, 01 Apr 2015 13:49:55 GMT

     [ https://issues.apache.org/jira/browse/OAK-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-653:
------------------------------
    Fix Version/s:     (was: 1.2)
                   1.3.1

> Improve binaries handling in Solr index
> ---------------------------------------
>
>                 Key: OAK-653
>                 URL: https://issues.apache.org/jira/browse/OAK-653
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: oak-solr
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.3.1
>
>
> Solr provides SolrCell (integration with Apache Tika, http://wiki.apache.org/solr/ExtractingRequestHandler), which would be easy to leverage. It'd also be nice to have that working at the Lucene level as a specific set of analyzers/tokenizers, so that it'd be transparent (no special URI would be needed for binaries indexing) once those are configured in a Solr schema.
> It'd also be good to be able to extract the text from within the SolrIndexEditor (like LuceneIndexEditor does) without having to rely on SolrCell on the Solr side, as it's not always exposed (it depends on whether it's explicitly configured).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
