commons-dev mailing list archives

From Alex Herbert <>
Subject Re: [CODEC] CRLF files in macOS checkout
Date Tue, 18 Jun 2019 15:01:48 GMT

On 18/06/2019 15:38, sebb wrote:
> On Tue, 18 Jun 2019 at 12:58, Alex Herbert <> wrote:
>> On 18/06/2019 11:00, sebb wrote:
>>> On Tue, 18 Jun 2019 at 10:40, Alex Herbert <> wrote:
>>>> On 18/06/2019 09:55, sebb wrote:
>>>>> On Tue, 18 Jun 2019 at 08:15, Julian Reschke <>
>>>>>> On 17.06.2019 23:26, sebb wrote:
>>>>>>> Most of the files in my clone of codec have LF endings, however a few are CRLF:
>>>>>>> ./
>>>>>>> ./src/assembly/bin.xml
>>>>>>> ./src/assembly/src.xml
>>>>>>> ./src/changes/changes.xml
>>>>>>> ./src/main/java/org/apache/commons/codec/cli/
>>>>>>> ./src/main/java/org/apache/commons/codec/language/
>>>>>>> ./src/main/resources/org/apache/commons/codec/language/bm/lang.txt
>>>>>>> ./src/test/java/org/apache/commons/codec/digest/
>>>>>>> ./src/test/java/org/apache/commons/codec/digest/
>>>>>>> ./src/test/java/org/apache/commons/codec/digest/
>>>>>>> ./src/test/java/org/apache/commons/codec/language/
>>>>>>> This causes spurious differences when the files are updated.
>>>>>>> Can these files be easily fixed without causing huge diffs to be generated?
>>>>>>> Also, is there any way to prevent such files being committed to the repo?
>>>>>>> S.
>>>>>> If svn:eol-style is set to "native", it shouldn't matter. I think
>>>>>> can be defaulted for newly added files.
>>>>> Thanks, but this is Git, not SVN.
>>>>>> In Jackrabbit, I regularly run a script to spot new files missing
>>>>>> property.
>>>>> Are you willing to share the script?
>>>> This was recently a problem in [statistics]. It was fixed using a
>>>> .gitattributes file [1] containing:
>>>> * text=auto
>>>> You can fix all the existing files following the steps detailed on the
>>>> git documentation:
>>>> $ echo "* text=auto" >.gitattributes
>>>> $ git add --renormalize .
>>>> $ git status        # Show files that will be normalized
>>>> $ git commit -m "Introduce end-of-line normalization"
>>> Thanks, though that did not pick up two of the files.
>> Oh dear.
>> When I tried this locally it misses from your list:
>> ./src/changes/changes.xml
>> ./src/test/java/org/apache/commons/codec/language/
>> Those files are also ignored on my machine (linux) by dos2unix. They are
>> not found by any of the following [1]:
>> $ grep -IUr --color "^M" src
>> $ find src -type f | xargs file | grep CRLF
>> $ grep -IUlr $'\r' src
>> So are they a problem?
> I don't know if this causes an issue.
> I used file on macOS to detect the problem files.
> Also my editor (BBEdit) shows the EOL as CRLF for them.

I am currently on Linux. I don't have any line-ending settings
configured for git [1], i.e. the core.autocrlf property is unset. So, if
I am correct, what I pulled from the master repo is unchanged on checkout.
The two spurious files seem OK for me, and the other 9 require updating.

I can try it again on macOS later. Maybe something is different there
and this is very platform specific.
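
One way to check the two spurious files in the same way on both platforms
is to count the raw CR and LF bytes, rather than relying on file, dos2unix
or an editor. A minimal sketch, assuming a POSIX shell; the function name
eol_counts is my own and not part of any tool:

```shell
# Hypothetical helper: print the number of CR and LF bytes in each file given.
# LC_ALL=C makes tr treat the input as plain bytes, whatever the encoding.
eol_counts() {
    for f in "$@"; do
        # Delete every byte except CR (or LF), then count what is left.
        # The trailing tr strips the padding some wc implementations add.
        cr=$(LC_ALL=C tr -cd '\r' < "$f" | wc -c | tr -d ' ')
        lf=$(LC_ALL=C tr -cd '\n' < "$f" | wc -c | tr -d ' ')
        printf 'cr=%s lf=%s %s\n' "$cr" "$lf" "$f"
    done
}
```

For example, `eol_counts src/changes/changes.xml` should report cr=0 if the
file really has no CRLF endings in that checkout. A cr count equal to the lf
count suggests pure CRLF, and a value in between suggests mixed endings.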

>>> However it looks like the commit message will show huge diffs for each file.
>>> Is that unavoidable?
>> The diff is done line-by-line. So if each line changes then it is a big
>> diff. I don't know a way around that.
>> The alternative would be to leave the .gitattributes file and not commit
>> the normalised files. The next time someone commits each of the
>> offending files the normalisation will occur as git sends it back to the
>> repo. So this just delays the big diff. At least if it is all done at once
>> then it makes more sense and avoids the issue of a big diff occurring
>> some time in the future and someone has to figure it out all over again.
> Agreed it's best done all at once.
> I remember fixing EOLs on SVN but as I recall it did not create the
> huge diffs so long as it was done on the appropriate OS.
> Maybe doing it on Windows won't cause the diffs to be created? I may
> be able to try that later.

Since Windows is the culprit for the CRLF endings, it makes sense to try.
In this case, if you create the .gitattributes file (or configure
core.autocrlf), git will know to send the files back to the repo
normalised. So you may have to make a trivial change to each of the
offending files to force a commit. The diff should then show only the
trivial change you made and not the big diff with all the lines.

I don't know what happens on the server side. If you do it in a branch
on GitHub you could compare the two side by side. It will show either
the trivial change or the big diff, the latter because the CRLF was
changed on the server side but not locally (on Windows).
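
To see what git itself has stored, as opposed to what is in the working
tree, `git ls-files --eol` (available since Git 2.8, I believe) reports the
line endings in the index and in the checkout for each tracked file. A
sketch that filters for files stored with CRLF or mixed endings; the
function name index_crlf_files is hypothetical:

```shell
# Hypothetical helper: list tracked files in the repository at "$1" whose
# content in the index (what gets committed) has CRLF or mixed line endings.
# Each output row looks like: i/<index-eol> w/<worktree-eol> attr/<attrs> <path>
index_crlf_files() {
    git -C "$1" ls-files --eol | grep -E '^i/(crlf|mixed)'
}
```

Running `index_crlf_files .` in the clone before and after the
normalization commit should show whether the stored content changed. A row
starting with i/lf but showing w/crlf would mean the CRLF only exists in
the checkout.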

>> [1]
>>>> [1]
>>>>>> Best regards, Julian
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail:
>>>>> For additional commands, e-mail: