Linux – ways to automatically fix line endings in shell scripts or files that break with ^M

bash cron incrontab linux shell-scripting

Problem

We regularly break the line endings of our files, and things stop working without us noticing.

Bash complains about "invalid option" or ": command not found" as described here: http://thinkinginsoftware.blogspot.ca/2012/11/linux-server-cries-for-linux-desktop.html

I'm concerned this could break other text files as well (conf, crons…)

How we break it (I suppose)

We are a group of people using Windows, Mac or Linux to edit Linux files on one server.
We edit these files manually (ssh + vi/nano, or locally + ftp).
Sometimes we copy/paste and I think this is what's causing the issue.
Yes, sometimes we don't test our changes for not-so-good reasons: the same script works on the replicate server, the change is just indenting some lines, etc. I agree this should be addressed.

Using Chef/Puppet-like solutions is not planned.

Update

TLDR copy-paste is not an issue, FTP is.

I did some testing with copy/paste Windows line endings CRLF on Windows + Notepad++ + PuTTY + nano and vi.
It looks like the CR (^M) character is filtered out and only LF gets pasted into the files. Thanks ewwhite for making me doubt the copy/paste theory!

I transferred a CRLF-ended file via FTP using FileZilla, with the "Send mode" option set to automatic. The CRLFs were preserved. I wonder if FileZilla could convert them to LFs.
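
For this kind of test, a quick way to confirm whether a file kept its carriage returns after a paste or a transfer is to count the lines that end in CR. This is only a minimal sketch: `count_crlf_lines` is a made-up helper name, and it assumes a POSIX shell with a standard grep.

```shell
#!/bin/sh
# A literal carriage return (what editors display as ^M).
CR=$(printf '\r')

# Hypothetical helper: print how many lines of the file end with CR.
count_crlf_lines() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that from
    # looking like an error to the caller.
    grep -c "${CR}\$" -- "$1" || true
}

# Usage (the path is only an example):
#   count_crlf_lines /path/to/uploaded-script.sh
```

A result of 0 means the file is LF-only; anything else means some lines still end with CRLF.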

Mitigation

We can't ban non-Linux OSes or forbid copy-paste.

I thought of these solutions:

  1. Build a cron.minutely that runs dos2unix or sed on all scripts. Cons: we need to maintain a list of "modifiable text files", as I don't want it to run on /
  2. Use a text editor that would support additional commands after file change. Cons: could break files that legitimately use non-Linux line endings, doesn't work when we ftp scripts.
  3. Use a trigger system like http://inotify.aiken.cz/?section=incron&page=about&lang=en. Cons: ?

Pros of #2 and #3: we could also use these to add a final blank line for programs that need it.
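
To make options 1 and 3 more concrete, here is a rough sketch of a single script that either a cron job or incron could call. Everything named here is an assumption for illustration: the function name `fix_crlf`, the install path `/usr/local/bin/fix-crlf` and the directory `/srv/scripts` are hypothetical, and the script relies on GNU grep (`-I`) and GNU sed (`-i`). It only rewrites files that actually contain a CR at the end of a line, so clean files are never touched.

```shell
#!/bin/sh
# A literal carriage return (what editors display as ^M).
CR=$(printf '\r')

# Hypothetical helper: strip the trailing CR from each line of the given files.
fix_crlf() {
    for f in "$@"; do
        [ -f "$f" ] || continue                  # skip directories and missing paths
        # -I makes GNU grep treat binary files as non-matching, so they are skipped.
        grep -Iq "${CR}\$" -- "$f" || continue   # no CRLF line: leave the file alone
        sed -i "s/${CR}\$//" -- "$f"             # GNU sed in-place edit (assumption)
    done
}

# Convert whatever paths were passed on the command line (none means nothing to do).
fix_crlf "$@"

# Option 1, crontab entry limited to an explicit directory and file pattern:
#   * * * * * find /srv/scripts -type f -name '*.sh' -exec /usr/local/bin/fix-crlf {} +
#
# Option 3, incrontab entry ("<path> <mask> <command>"; $@/$# expands to the changed file):
#   /srv/scripts IN_CLOSE_WRITE,IN_NO_LOOP /usr/local/bin/fix-crlf $@/$#
```

The `IN_NO_LOOP` flag is there because the script's own write would otherwise raise another `IN_CLOSE_WRITE` event; the "only rewrite when a CR is found" check guards against the same loop.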

Using bash, version 4.2.37(1)-release

Edit: I got one downvote, could you please explain why?

Best Answer

I have to deal with this on occasion with some legacy systems. Sometimes the files retained in the organization's source control (Borland StarTeam) were set to the wrong line-ending configuration.

But in my experience working in a number of cross-platform environments, copy/paste alone should not cause this issue. Try to identify the trends based on the output from the following and deal with the worst offenders appropriately.

Periodically search for files with DOS line endings:

find /var/www -not -type d -exec file "{}" ";" | grep CRLF

Example:

# find /ppro/bin -not -type d -exec file "{}" ";" | grep CRLF
/ppro/bin/compile/save/srcfix.c: ASCII C program text, with CRLF line terminators
/ppro/bin/compile/bldtag.c: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/compile/bldtag.c.sav: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/compile/dbcsum2.c: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/hphw/print_sv.c: ASCII text, with CRLF line terminators
/ppro/bin/linuxhw/dhcpd.conf: ASCII text, with CRLF line terminators
/ppro/bin/linuxhw/dhcpd.conf.mult_subnet: ASCII text, with CRLF line terminators

Then BURN them!!

Remember that dos2unix on some systems will modify permissions...
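
If the permission change is a concern, one way around it is to avoid replacing the file at all: convert into a scratch file, then write the result back into the original, so the inode (and with it the mode and owner) stays the same. This is a sketch rather than a drop-in replacement for dos2unix; `crlf_to_lf_keep_perms` is a hypothetical name, and `mktemp` is assumed to be available.

```shell
#!/bin/sh
# A literal carriage return (what editors display as ^M).
CR=$(printf '\r')

# Hypothetical helper: convert CRLF to LF without replacing the file itself.
crlf_to_lf_keep_perms() {
    f=$1
    tmp=$(mktemp) || return 1
    # Strip the CR only at the end of each line, writing to a scratch file first,
    # then truncate and refill the original. No rename happens, so the mode and
    # ownership of "$f" are left as they were.
    sed "s/${CR}\$//" -- "$f" > "$tmp" && cat -- "$tmp" > "$f"
    status=$?
    rm -f -- "$tmp"
    return "$status"
}

# Usage on one of the files reported by the find command above:
#   crlf_to_lf_keep_perms /ppro/bin/linuxhw/dhcpd.conf
```

The trade-off is that the file is briefly truncated while it is being rewritten, so this is better suited to a maintenance pass than to files that are read constantly.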