tika-dev mailing list archives

From "Mattmann, Chris A (3010)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: tika-python
Date Fri, 04 Nov 2016 14:23:00 GMT
Thanks Jorg, appreciate it.
I’ll check out:

https://github.com/TalmarGrosskotz/teacher-shelf.git

And get back to you.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-502
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 

On 11/4/16, 3:52 AM, "Jörg Bilert" <bilert@gmail.com> wrote:

    Wow,
    
    thank you for your quick answer. I think I might have given you the wrong 
    impression. I think your code would work perfectly if I knew how to use 
    it properly. :)
    
    Nevertheless, I have added a link to my GitHub repository to give you a brief 
    impression of the project I am planning and the first steps I have taken 
    so far in Python 3 and Tika.
    
    I would be glad for any help you could give me on how to use the 
    different parsers (or the parser for different filetypes).
    
    Thank you in advance,
    
    Jörg
    
    
    On 04.11.2016 at 04:34, Mattmann, Chris A (3010) wrote:
    > Dear Jorg,
    >
    > Thank you very much for sending this. I have been meaning to reply to your prior
    > emails on the same subject. Yes, it will work for other file types. Can you open a
    > GitHub issue and upload an example of a file it’s not working for?
    > I can take a look.
    >
    > Cheers,
    > Chris
    >
    >
    > On 11/3/16, 5:15 PM, "Jörg Bilert" <bilert@gmail.com> wrote:
    >
    >      Hello Mr Mattmann,
    >      
    >      I have just been looking into your Python wrapper for Tika and I like
    >      it a lot.
    >      But there is one thing I just don't see. According to the Apache Tika
    >      website, Tika supports a lot of file formats (even audio and video). But I
    >      don't know how to parse them in Python. ODT and PDF work fine, as in
    >      the sample code on your GitHub page.
    >      
    >      Could you give me a clue where to start to handle other file-types?
    >      
    >      Yours, Jörg Bilert
    >      
    >
    
    

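For readers of this thread, here is a minimal sketch of what "it will work for other file types" can look like in practice. It assumes tika-python is installed (`pip install tika`) and a Java runtime is available. `parser.from_file` is the call shown in the tika-python README; the helper names `tika_parse` and `parse_many` and the summary fields are illustrative assumptions, not part of the library.

```python
from typing import Callable, Dict, Iterable


def tika_parse(path: str) -> dict:
    """Parse one file with tika-python, whatever its format."""
    # Imported lazily so this sketch loads even where tika is not installed.
    from tika import parser
    # The same call is used for ODT, PDF, audio, video, and so on;
    # Tika detects the type itself and picks a matching parser.
    return parser.from_file(path)


def parse_many(
    paths: Iterable[str],
    parse: Callable[[str], dict] = tika_parse,
) -> Dict[str, dict]:
    """Summarize the detected type and extracted text length per file."""
    results = {}
    for path in paths:
        parsed = parse(str(path)) or {}
        metadata = parsed.get("metadata") or {}
        # Media files often yield metadata but no text, so content may be None.
        content = parsed.get("content") or ""
        results[str(path)] = {
            "content_type": metadata.get("Content-Type"),
            "characters": len(content.strip()),
        }
    return results
```

Calling `parse_many` on a mixed list of documents and media files (for example a hypothetical `notes.odt` and `talk.mp3`) should return one summary per file. For audio and video the useful output is usually in the metadata rather than in the extracted text.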