subversion-users mailing list archives

From Branko Čibej <>
Subject Re: SVN Blame Returns Corrupt Data
Date Fri, 11 Oct 2013 14:59:19 GMT
On 11.10.2013 16:55, Bob Archer wrote:
>> On 11.10.2013 15:58, Bob Archer wrote:
>>>> On Thu, Oct 10, 2013 at 5:49 PM, Bob Archer <>
>>>> wrote:
>>>> I assume he was asking how to "fix" the blame. Cause, sure, he could
>>>> open the file, convert it back to UTF-8 with CRLF line endings... and
>>>> commit it... of course, now blame is going to show him on every line,
>>>> since he just changed every line.
>>>> That's exactly what I meant.  You're correct with how the blame is
>>>> handled.  I committed the UTF-8 copy to a test branch, diff'd, and it
>>>> showed every line as being changed.  Unfortunately it looks like this is
>>>> best option.
>>> Yep, we have done the same thing. As a matter of fact, I just over the past
>>> few days rescripted all our database scripts to be UTF-8 since merging them
>>> just doesn't work correctly when they are UTF-16 even if you remove the
>>> binary mime type.
>>>> On Thu, Oct 10, 2013 at 7:07 PM, Ben Reser <> wrote:
>>>> At current blame is not UTF-16 aware.
>>> It's not just blame that isn't... the diff engine, or whatever detects file
>>> types always considers UTF-16 files to be binary. If you "add" a UTF-16 file
>>> you see that svn adds the application/octet-stream mime type.  There is an
>>> issue in the bug database about this from when I reported/complained about
>>> it... however it hasn't been addressed. I'm surprised still at this time that svn
>>> still can't support UTF-16 text files as text wrt adding, diffing, blaming, etc.
>> It's quite simple: no-one has written the necessary code. While I can
>> understand it's an interesting feature for Windows users, most Subversion
>> developers have other things to do. This being a volunteer project, and most
>> of us do not use Windows, you can hardly expect anyone to spend several
>> weeks on solving a problem that has a perfectly simple workaround. Since
>> UTF-8 and UTF-16 can be interchanged without data loss, there are other,
>> much more important things to do in Subversion.
> I appreciate all that you said. I didn't expect that UTF-16 was so uncommon in non-Windows
> OSes. A large number of dev tools that I work with on Windows, especially the Microsoft tools,
> default to creating UTF-16 files.
> I disagree with your "can be converted without data loss". If you need UTF-16 then you
> need it. Also, if you are working in an international team and you have developers with other
> language OSes which have different code pages then what you see when you look at a UTF-8 file
> might be different than what I see.

I don't follow. Both UTF-16 and UTF-8 are complete representations of
the Unicode character set. Exactly the same code sequences can be
represented in both encodings. You can convert from UTF-16 to UTF-8 and
back and get exactly the same sequence of bytes.
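
A minimal sketch of that round trip, written in Python purely for
illustration (the helper names utf16_to_utf8 and utf8_to_utf16 are
hypothetical, not part of Subversion). It assumes well-formed input and
that the original byte order and BOM are reproduced when converting
back; under those assumptions the bytes come out identical.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 (byte order taken from the BOM) and re-encode as UTF-8."""
    # The "utf-16" codec consumes a leading BOM and uses it to pick the byte order.
    return data.decode("utf-16").encode("utf-8")


def utf8_to_utf16(data: bytes, bom: bytes = codecs.BOM_UTF16_LE) -> bytes:
    """Decode UTF-8 and re-encode as UTF-16 with an explicit BOM and byte order."""
    codec = "utf-16-le" if bom == codecs.BOM_UTF16_LE else "utf-16-be"
    return bom + data.decode("utf-8").encode(codec)


# Sample text with non-ASCII characters, a character outside the BMP, and CRLF.
text = "SELECT 'Čibej', '日本語', '😀';\r\n"
original = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

as_utf8 = utf16_to_utf8(original)   # what you would commit
restored = utf8_to_utf16(as_utf8)   # what a UTF-16-only tool would read back

assert restored == original
```

The version committed to the repository would be the UTF-8 form; the
conversion back to UTF-16 only matters for tools that insist on it.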

-- Brane

Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
