lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (SOLR-9601) DIH: Radically simplify Tika example to only show relevant configuration
Date Sat, 01 Apr 2017 23:08:42 GMT


ASF subversion and git services commented on SOLR-9601:

Commit b02626de5071c543eb6e8deea450266218238c9e in lucene-solr's branch refs/heads/master
from [~arafalov]

SOLR-9601: DIH Tika example is now minimal
Only keep definitions and files required to show Tika-extraction in DIH

> DIH: Radically simplify Tika example to only show relevant configuration
> ------------------------------------------------------------------------
>                 Key: SOLR-9601
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - DataImportHandler, contrib - Solr Cell (Tika extraction)
>    Affects Versions: 6.x, master (7.0)
>            Reporter: Alexandre Rafalovitch
>            Assignee: Alexandre Rafalovitch
>              Labels: examples, usability
>         Attachments: tika2_20170308.tgz, tika2_20170316.tgz
> Solr DIH examples are legacy examples that show how DIH works. However, they include full
configurations that may obscure the teaching points. This is no longer needed, as we have 3 full-blown
examples in the configsets.
> Specifically for Tika, the field type definitions were at some point simplified to require
fewer support files in the configuration directory. This, however, means that we now have field
definitions that have the same names as those in other examples, but different definitions.
> Importantly, Tika does not use most (any?) of those modified definitions. They are there
just for completeness. Similarly, the solrconfig.xml includes the extract handler even though
we are demonstrating a different path of using Tika. Somebody grepping through config files
may get confused about which configuration aspects contribute to which experience.
> I am planning to significantly simplify the configuration and schema of the Tika example to **only**
show the DIH Tika extraction path. It will end up as a very short and focused example.

This message was sent by Atlassian JIRA
