openoffice-dev mailing list archives

From "Dennis E. Hamilton" <>
Subject RE: Consequences of Working in Office Documents Here
Date Wed, 22 Jun 2011 06:06:44 GMT
It's true that we could bring things around and shoe-horn them into the SVN DIFF model, for
example by using a CSV, or by accepting that HTML diffs aren't that illuminating and taking
them for what they are worth.

Of course, in my chosen example, converting to CSV loses a lot of information, including
anything that was done in the design of the spreadsheet to facilitate its use.

Now, in this case, the spreadsheet was from a committer, and other committers knew how to
retrieve it, update it, and resubmit it to SVN with an informative enough commit message.  This
is not a complex case; it was just illustrative of the different levels at which we work.

The question I did not answer, because I do not know the answer:  What is a straightforward
way for someone who was not raised as a Martian to contribute without being compelled to commit
unnatural (for a Venusian) acts?  What is a way to contribute that does not require an unnatural
change in already-successful ways of working?  And what is the cutover where the contribution
is substantial enough that an iCLA is required anyhow?

It seems to me there is an impedance mismatch for non-developer contributions of content that
becomes part of an Apache deliverable.  I don't question the policies involved.  I am
wondering about the logistics and the friction of shoe-horning contributors into a practice
that is designed around the submission of patches and requires arcane Martian technology.

Perhaps this is too hypothetical.

I would like to hear from non-developer members of ooo-Dev who want to contribute, and what
the nature of the envisioned contribution is.  Maybe some concrete use cases can clear this
up for all of us.

 - Dennis

-----Original Message-----
From: Dave Fisher [] 
Sent: Tuesday, June 21, 2011 22:30
Cc: Dennis E. Hamilton
Subject: Re: Consequences of Working in Office Documents Here

On Jun 21, 2011, at 8:58 PM, Daniel Shahaf wrote:

> Dennis E. Hamilton wrote on Tue, Jun 21, 2011 at 19:20:13 -0700:
>> On a different list, not just here on ooo-dev, there has been some
>> surprise to see us putting binaries (ODF documents) into some SVN
>> locations used by the PPMC. 
>> My impression is that the experienced hands here in ASF are expecting
>> to see DIFFs in commit messages on SVN, but binaries don't get DIFFed
>> since it is usually unintelligible and almost always uninteresting.
>> For some, it is new news that ODF packages are not XML files.
>> Someone suggested that one could unpack the Zip of these documents and
>> then do diffs of the respective XML parts and that could serve as
>> a DIFF on what the changes are.  They also noticed they'd never seen
>> that done.
>> On seeing that suggestion (clearly the kinds of things developers
>> think of, it being what we do), it struck me that we have a geeks are
>> from Mars, users are from Venus situation here.
>> I think the clash of expectations has to do with the differences in
>> tools that are applicable at the level we work at, and how we see what
>> it is we are at work on.
>> We need to understand that we really have different experience sets,
>> and they all are important in the context of the
>> project.
>> Here is a geeky explanation of why it does no good to figure out
>> a better way to show DIFFs of the XML inside an ODF package if you
>> want to know what an author contributor/committer changed.  (You might
>> want that as a forensics tool, but not for knowing what someone
>> changed in the course of their work on a document.)
>> My (updated) explanation:
> Long email.  In the end, the expectation is for commit mails to contain
> reviewable diffs, I don't think you've addressed how that might be done?

As far as I know binary files are acceptable elsewhere in SVN.

> (as opposed to how it shouldn't be done)

Generally, ODF files will be documentation and test cases, and generally consistent, like PNGs,
JPEGs, etc. No one complains about PDFs or any of the MS Office formats in SVN. We haven't
seemed to care about that in the Apache POI project; I can't answer for PDFBox.

I unzipped an ODF package, and each part is a huge run of verbose XML on two lines: a header
and the data. For example, content.xml:

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" xmlns:xlink=""
xmlns:dc="" xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0" xmlns: ....

Diff won't work easily. Maybe SVN needs to provide "zip" storage and then "xml" diff within.
Could the Subversion project whip that out now? We'll wait until they do before we proceed.
I'm being sarcastic here. But if it is available now, that would be pretty cool.
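
For what it's worth, the unpack-and-diff idea can be tried outside of SVN. What follows is
only a minimal sketch, assuming Python's standard library; the function names pretty_part and
diff_odf are made up for illustration and are not part of Subversion or any ODF toolkit. It
re-indents one XML part so that a line-based diff has something to work with.

```python
import difflib
import io
import zipfile
from xml.dom import minidom


def pretty_part(odf_bytes: bytes, part: str = "content.xml") -> list:
    """Read one XML part from an ODF package and return it as indented lines."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        raw = package.read(part)
    # The parts are stored on very few lines, so re-indent them before
    # handing them to a line-based diff.
    return minidom.parseString(raw).toprettyxml(indent="  ").splitlines()


def diff_odf(old: bytes, new: bytes, part: str = "content.xml") -> str:
    """Unified diff of one XML part between two versions of an ODF package."""
    return "\n".join(
        difflib.unified_diff(
            pretty_part(old, part),
            pretty_part(new, part),
            fromfile="old/" + part,
            tofile="new/" + part,
            lineterm="",
        )
    )
```

Calling diff_odf(open("old.ods", "rb").read(), open("new.ods", "rb").read()) would print the
markup-level changes (the file names are placeholders). That is still a view of the XML, not
of what the author meant to change, so it is closer to the forensics tool Dennis describes
than to a reviewable commit diff.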

The real issue is that a binary document was used to update a table where everyone made
changes, and those changes were important to those viewing the commit messages. I know we all
love office documents around here, but ...

Maybe we should be exchanging that particular file as a CSV.

(BTW - I notice that Calc's save options don't include XLSX, etc.)

Best Regards,
