tika-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Convert file before Tika processes it?
Date Thu, 21 Jun 2012 12:08:11 GMT
Hi,

On Thu, Jun 21, 2012 at 4:35 AM, 122jxgcn <ywpark90@gmail.com> wrote:
> Hi, I'm currently working on getting Tika to properly process a custom file
> type (*.hwp files). I have a binary executable that converts an hwp file
> into an XML file. I'm not sure how to include this binary so that when Tika
> encounters an hwp file, it automatically converts it to XML using the
> binary and passes the document to XMLParser.

The best approach would be for you to write a custom Parser class for
this file type. That class would call your executable to convert the
file to XML and would then invoke the standard XMLParser on the
result.
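
As a rough illustration of that convert-then-parse flow, here is a minimal
sketch that uses only the Java standard library. It is not Tika code: in a
real custom Parser, the same steps would sit inside the
parse(InputStream, ContentHandler, Metadata, ParseContext) method, and the
converted XML would be handed to XMLParser instead of the plain SAX handler
used here. The class name HwpConvertSketch, the converter command, and its
"<input> <output>" argument order are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is made up for this sketch.
class HwpConvertSketch {

    // Runs the external converter and waits for it to finish.
    // Assumption: the binary is invoked as "<command...> <input> <output>".
    static void convert(List<String> commandPrefix, Path input, Path output)
            throws IOException, InterruptedException {
        List<String> command = new ArrayList<>(commandPrefix);
        command.add(input.toString());
        command.add(output.toString());
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                // Discard console output so the child cannot block on a full pipe.
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("converter exited with status " + exit);
        }
    }

    // Parses an XML file with SAX and collects its character data.
    // This stands in for the step where a Tika parser would call XMLParser.
    static String extractText(Path xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try (InputStream in = Files.newInputStream(xml)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
        return text.toString();
    }

    // Converts the input to a temporary XML file, parses it, and cleans up.
    static String parseHwp(List<String> commandPrefix, Path hwp) throws Exception {
        Path xml = Files.createTempFile("converted", ".xml");
        try {
            convert(commandPrefix, hwp, xml);
            return extractText(xml);
        } finally {
            Files.deleteIfExists(xml);
        }
    }
}
```

With a converter installed under some name (say, a hypothetical hwp2xml),
a call would look like HwpConvertSketch.parseHwp(List.of("hwp2xml"), path).
Writing the converted output to a temporary file and deleting it in a
finally block keeps failed conversions from leaving files behind.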

BR,

Jukka Zitting
