tika-dev mailing list archives

From Nick Burch <apa...@gagravarr.org>
Subject Re: Tika OneNote Support
Date Sun, 25 Nov 2012 19:33:14 GMT
On Wed, 14 Nov 2012, 122jxgcn wrote:
> Is there anyone who worked on extracting contents from MS OneNote file? 
> (*.one) It will be great if someone can tell me how to work with parsing 
> OneNote files programmatically.

I'm not aware of anything. The good news is that the file format is fully 

You'll need to use the specification to write some code to read the 
format, then you can feed it to Tika. My hunch is you're looking at 5-15 
days of work.
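
As a rough idea of what a first step might look like, here is a minimal 
Java sketch that only checks whether a stream starts with the 16-byte 
file type GUID of a .one file. The class name, the method name and the 
byte values are assumptions on my part (the bytes are what I believe the 
header signature to be, so check them against the specification before 
relying on them). It does not parse any content, and it needs Java 9+ 
for readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Tika or POI.
public class OneNoteHeaderSketch {

    // Assumed on-disk form of the .one file type GUID
    // {7B5C52E4-D88C-4DA7-AEB1-5378D02996D3}: the first three GUID
    // fields are stored little-endian, the rest in written order.
    private static final byte[] ONE_FILE_TYPE = {
        (byte) 0xE4, 0x52, 0x5C, 0x7B,
        (byte) 0x8C, (byte) 0xD8,
        (byte) 0xA7, 0x4D,
        (byte) 0xAE, (byte) 0xB1,
        0x53, 0x78, (byte) 0xD0, 0x29, (byte) 0x96, (byte) 0xD3
    };

    // Reads up to 16 bytes and compares them with the assumed signature.
    // Returns false for streams shorter than 16 bytes.
    public static boolean looksLikeOneNote(InputStream in) throws IOException {
        byte[] head = new byte[ONE_FILE_TYPE.length];
        int read = in.readNBytes(head, 0, head.length);
        return read == head.length && Arrays.equals(head, ONE_FILE_TYPE);
    }
}
```

Real extraction code would carry on from there and walk the structures 
the specification describes, which is where most of the effort goes.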

Apache POI would probably be a good home for most of the OneNote code if 
you do get it working. Please consider contributing it there if you make 

