openoffice-dev mailing list archives

From Peter Kelly <kelly...@gmail.com>
Subject DocFormats - Open source OOXML implementation
Date Fri, 15 Aug 2014 06:56:54 GMT
Those of you interested in OOXML may want to have a look at my own implementation of (a subset
of) the spec, which is part of a library I've just made available as open source (license
is ASLv2):

https://github.com/uxproductivity/DocFormats

I started working on this around two years ago as part of UX Write, and it's been included
in the version shipping on the iOS app store since February 2013. I've recently finished removing
all dependencies on iOS/OS X APIs, and converting all the code from Objective-C to plain C99.
It now also builds on Linux, and Windows support is not far off.

The design is based on bidirectional transformation, as a way of achieving non-destructive
editing of foreign file formats. This permits incremental implementation of a given spec without
risking data loss due to incomplete features, since unsupported features of a given file format
are left untouched on save. UX Write uses HTML as both its native file format and in-memory
data model (via WebKit), but relies on DocFormats to read & write .docx files, as well
as export to LaTeX. The next major task I plan to work on (hopefully with help from others!)
is .odt support.
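
To make the idea of bidirectional transformation a little more concrete, here is a minimal sketch in C99. It is not taken from DocFormats and does not reflect its actual API; every name in it (ConcreteNode, AbstractNode, doc_get, doc_put) is hypothetical, and the "document" is just a flat array rather than real XML. It only illustrates the general pattern: a "get" step projects the supported parts of a foreign document into a simpler view, and a "put" step merges edits from that view back into the original, so anything the converter does not understand is never touched.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_TEXT 64

/* A node in the foreign ("concrete") document. In this toy model only
   paragraphs are understood; everything else is an opaque payload. */
typedef enum { KIND_PARAGRAPH, KIND_UNSUPPORTED } NodeKind;

typedef struct {
    NodeKind kind;
    char text[MAX_TEXT];   /* paragraph text, or unsupported raw content */
} ConcreteNode;

/* A node in the simplified ("abstract") view handed to the editor. */
typedef struct {
    size_t source_index;   /* position of the originating concrete node */
    char text[MAX_TEXT];
} AbstractNode;

/* get: copy every supported node into the abstract view, remembering
   where it came from. Returns the number of abstract nodes written.
   The caller must supply room for at least n entries in out. */
size_t doc_get(const ConcreteNode *concrete, size_t n, AbstractNode *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (concrete[i].kind != KIND_PARAGRAPH)
            continue;                      /* unsupported: not exposed */
        out[count].source_index = i;
        snprintf(out[count].text, MAX_TEXT, "%s", concrete[i].text);
        count++;
    }
    return count;
}

/* put: merge the (possibly edited) abstract view back into the original
   concrete document in place. Only nodes that the view refers to are
   updated; unsupported nodes are left exactly as they were. */
void doc_put(ConcreteNode *concrete, size_t n,
             const AbstractNode *view, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t idx = view[i].source_index;
        if (idx < n && concrete[idx].kind == KIND_PARAGRAPH)
            snprintf(concrete[idx].text, MAX_TEXT, "%s", view[i].text);
    }
}
```

A caller would run doc_get on a loaded document, change the text of entries in the resulting view, and then pass the view and the original document to doc_put before saving. The key design point is that put updates the original rather than regenerating it from the view, which is why an incomplete converter does not lose data.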

Now that this is open source, the eventual goal is for it to be generally usable by any app
which has a need to support multiple file formats, such as OOXML and ODF. Currently it is
limited to word processing formats only, but I'm interested in expanding it to cover spreadsheets,
presentations, and drawings. Aside from editors, it could also be used for batch conversion
tools, document analysis, web publishing, and other purposes.

There are minimal dependencies (basically only libxml and zlib), to make it easy to integrate
into different apps. I'm not a fan of huge monolithic architectures, and have kept it very
independent of other aspects of UX Write for this very purpose. Note that this means
there is no editing or rendering code; it deals solely with conversion. UX Write uses WebKit
for the rendering, but there are many other ways in which one could build on top of this.

I'll be presenting on this at ApacheCon EU this November - see the talk "Addressing File Format
Compatibility in Word Processors" at http://apacheconeu2014.sched.org.

Comments/questions are welcome.

--
Dr. Peter M. Kelly
Founder, UX Productivity
peter@uxproductivity.com
http://www.uxproductivity.com/
http://www.kellypmk.net/

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
