openoffice-dev mailing list archives

From Greg Stein
Subject Re: Consequences of Working in Office Documents Here
Date Wed, 22 Jun 2011 06:01:01 GMT
I think that the important part here is that others can review the
work being done. When that work is encapsulated behind binary formats,
then it makes it *very* difficult to perform that review.

Sure, some artifacts in the repository *need* to be binary. Nobody
will dispute that.

But when the primary work of this PMC can be done in a reviewable
format, then it helps all of us to make that happen.


On Wed, Jun 22, 2011 at 01:29, Dave Fisher wrote:
> On Jun 21, 2011, at 8:58 PM, Daniel Shahaf wrote:
>> Dennis E. Hamilton wrote on Tue, Jun 21, 2011 at 19:20:13 -0700:
>>> On a different list, not just here on ooo-dev, there has been some
>>> surprise to see us putting binaries (ODF documents) into some SVN
>>> locations used by the PPMC.
>>> My impression is that the experienced hands here in ASF are expecting
>>> to see DIFFs in commit messages on SVN, but binaries don't get DIFFed
>>> since it is usually unintelligible and almost always uninteresting.
>>> For some, it is new news that ODF packages are not XML files.
>>> Someone suggested that one could unpack the Zip of these documents and
>>> then do diffs of the respective XML parts and that could serve as
>>> a DIFF on what the changes are.  They also noticed they'd never seen
>>> that done.
>>> On seeing that suggestion (clearly the kinds of things developers
>>> think of, it being what we do), it struck me that we have a geeks are
>>> from Mars, users are from Venus situation here.
>>> I think the clash of expectations has to do with the differences in
>>> tools that are applicable at the level we work at, and how we see what
>>> it is we are at work on.
>>> We need to understand that we really have different experience sets,
>>> and they all are important in the context of the
>>> project.
>>> Here is a geeky explanation of why it does no good to figure out
>>> a better way to show DIFFs of the XML inside an ODF package if you
>>> want to know what an author contributor/committer changed.  (You might
>>> want that as a forensics tool, but not for knowing what someone
>>> changed in the course of their work on a document.)
>>> My (updated) explanation:
>> Long email.  In the end, the expectation is for commit mails to contain
>> reviewable diffs, I don't think you've addressed how that might be done?
> As far as I know binary files are acceptable elsewhere in SVN.
>> (as opposed to how it shouldn't be done)
> Generally ODF files will be documentation and testcases, and generally consistent, like
> PNGs, JPEGs, etc. No one complains about PDFs or any of the MS Office formats in SVN. We haven't
> seemed to care about that in the Apache POI project; I can't answer for PDFBox.
> I unzipped an ODF zip, and each part is a huge set of verbose XML on two lines: header
> and data. For example, content.xml:
> <?xml version="1.0" encoding="UTF-8"?>
> <office:document-content xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
> xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
> xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
> xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" xmlns:xlink=""
> xmlns:dc="" xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
> xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
> xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
> xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0" xmlns: ....
> Diff won't work easily. Maybe SVN needs to provide "zip" storage and then "xml" diff
> within. Could the Subversion project whip that out now? We'll wait until they do before we
> proceed. I'm being sarcastic here. But if it is available now, that would be pretty cool.
> The real issue is that a binary document was used to update a table where everyone made
> changes. Changes that were important to those viewing the commit messages. I know we all love
> office documents around here, but ...
> Maybe we should be exchanging that particular file as a CSV.
> (BTW - I notice that Calc's save options don't include XLSX, etc.)
> Best Regards,
> Dave
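
For reference, the "unpack the Zip and diff the XML parts" idea raised in the thread can be sketched roughly as below. This is only an illustration under stated assumptions, not something anyone in the thread wrote or something Subversion provides: the function names `xml_parts` and `odf_diff` are made up here, only package members ending in `.xml` are compared, and each part is pretty-printed first because, as noted above, the XML otherwise sits on one or two very long lines.

```python
import difflib
import io
import zipfile
from xml.dom import minidom


def xml_parts(odf_bytes):
    """Map each XML member of an ODF package to its pretty-printed lines.

    Hypothetical helper: takes the raw bytes of an .odt/.ods/.odp file.
    """
    parts = {}
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        for name in package.namelist():
            if name.endswith(".xml"):
                dom = minidom.parseString(package.read(name))
                # Re-indent so a line-based diff has lines to work with.
                parts[name] = dom.toprettyxml(indent="  ").splitlines()
    return parts


def odf_diff(old_bytes, new_bytes):
    """Return unified-diff lines for every XML part of two ODF packages."""
    old, new = xml_parts(old_bytes), xml_parts(new_bytes)
    lines = []
    for name in sorted(set(old) | set(new)):
        lines.extend(difflib.unified_diff(
            old.get(name, []), new.get(name, []),
            fromfile="a/" + name, tofile="b/" + name, lineterm=""))
    return lines
```

Calling `odf_diff(open("old.ods", "rb").read(), open("new.ods", "rb").read())` (file names are placeholders) yields diff lines that could be printed or pasted into a mail. As Dennis points out above, output like this is closer to a forensics aid than to a reviewable record of what an author changed, since a small edit in the office application may touch markup in many places.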
