How to programmatically convert putty logs to human-readable, searchable files

command-line-interface, grep, log-files, putty, shell-scripting

Background:
I have a Windows 7 workstation and use PuTTY, with session logging enabled, for SSH connectivity to Linux servers. I previously used the Printable output option. It has the benefit of logging no escape characters, but the drawback is that commands I typed become unsearchable whenever I used tab to autocomplete or backspace to correct a typo (or 3) while typing.

NOTE: I have cygwin installed for additional command-line tool support (e.g. grep, find, etc.).

Recently, I had to go back and find some commands to set the record straight with a coworker about something that happened on a server. Not being able to see the final commands I issued is a problem: it makes the logs harder to search, and it makes it much harder to demonstrate to my coworker what actually happened.

Example #1:
This is an actual PuTTY log file of the 'pwd' command initially misspelled as 'pdw' and then corrected to 'pwd' with Printable output enabled when viewed with cat or in less.

NOTE: There is no difference between less and cat in this case because there are no ESC codes and only printable output was captured.

$ cat 20151112.170705.log
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.11.12 17:07:05 =~=~=~=~=~=~=~=~=~=~=~=
[root@eye ~]# pdwwd
/root
[root@eye ~]# exit
logout
$

As you can see, if you were to search for pwd you would get no matching results. I have used iTerm on Mac and know that it can automatically replay logs and it just seems like there should be a way to see the end result of what I eventually used.

Enter the PuTTY All session output logging option. The catch is that with the All session output option enabled, the log file gets inundated with ESC codes for terminal color and with non-printable characters like backspace.

Example #2:
This is an actual PuTTY log file of the 'pwd' command initially misspelled as 'pdw' and then corrected to 'pwd' with All session output enabled when viewed in less.

$ less 20151112.170457.log
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.11.12 17:04:57 =~=~=~=~=~=~=~=~=~=~=~=
Using username "root".
Authenticating with public key "ssh2_rsa_2048_private_key_20111128.ppk"
ESC[?1034hESC]0;root:~^GESC[1;30m[ESC[1;35mrootESC[1;30m@ESC[1;35meye ESC[1;34m~ESC[1;30m]ESC[1;35m# ESC[0mpdESC[ESC[Kwd
/root
ESC]0;root:~^GESC[1;30m[ESC[1;35mrootESC[1;30m@ESC[1;35meye ESC[1;34m~ESC[1;30m]ESC[1;35m# ESC[0mexit
logout
$

Ok, so I'm almost to the actual problem. If I use cat on a log recorded with the All session output option enabled, it looks perfect. It is exactly what I want to see and work with.

Example #3:
This is an actual PuTTY log file of the 'pwd' command initially misspelled as 'pdw' and then corrected to 'pwd' with All session output enabled when viewed using cat.

NOTE: This is the exact same log file as above. This is also the exact visible output and format that I want to be able to search.

$ cat 20151112.170457.log
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2015.11.12 17:04:57 =~=~=~=~=~=~=~=~=~=~=~=
Using username "root".
Authenticating with public key "ssh2_rsa_2048_private_key_20111128.ppk"
[root@eye ~]# pwd
/root
[root@eye ~]# exit
logout

$

The Real Problem I Need to Solve:
How can I programmatically translate, convert, or update the content of these log files (recorded with All session output enabled in PuTTY session logging) so that they contain only what is actually visible to the user when the log is displayed with cat?

For the record, I have spent more than a few hours researching and testing possible solutions. Things I have tried which have not worked suitably (or at all):

  • Straightforward redirect of stdout to a new file called test.log. The test.log was identical to the original log file. No benefit.
$ cat 20151112.170457.log > test.log
  • installing xclip and redirecting cat output to xclip. The xclip program complains because I don't have X11 support. Didn't work. No benefit.
$ cat 20151112.170457.log | xclip
Error: Can't open display: (null)
  • using built-in Windows clip command. No complaints from the OS when I run this command, but it has the two problems below instead.

    • No programmatic way to get the data out of the Windows clipboard
    • Even if I could get the data out, the clipboard content is identical to the original log file (with ESC codes). No benefit.
  • I do own a Mac and some of the other posts I read suggest using pbcopy but I'm skeptical it would work any different than 'clip' on Windows.

  • I am aware of and have used less -R and less -r to allow handling of terminal color ESC codes (using -R) and all ESC codes (using -r) but, again, this only helps with presentation to the user and does not allow programmatic searching of hundreds of log files for the command pwd if I made a mistake and corrected it before pressing Enter.

  • I also talked to a coworker. No benefit. 😉

In essence, I just want the PuTTY log equivalent of a copy/paste of a web page into notepad. The web page source has lots of <html> tags but the user doesn't see any of this and if they highlight the page, click copy and paste into Notepad, all they get is the text they saw on the page.

I want to be able to programmatically create the Notepad equivalent of all these PuTTY log files for effective searching (e.g. using grep).

NOTE: If it isn't obvious from my total StackExchange reputation of 1, this is my first question or post on StackExchange sites. I'm looking for helpful answers and not responses like "Switch to Linux" or "RTFM".

Best Answer

Assuming the escape codes in the PuTTY file are stored as raw binary control characters (I'm a little confused why your example shows them as text; I'm guessing less did that), you can try col:

http://man7.org/linux/man-pages/man1/col.1.html

The control sequences for carriage motion that col understands and their decimal values are listed in the following table:

          ESC-7             reverse line feed (escape then 7)
          ESC-8             half reverse line feed (escape then 8)
          ESC-9             half forward line feed (escape then 9)
          backspace         moves back one column (8); ignored in the
                            first column
          newline           forward line feed (10); also does carriage
                            return
          carriage return   (13)
          shift in          shift to normal character set (15)
          shift out         shift to alternate character set (14)
          space             moves forward one column (32)
          tab               moves forward to next tab stop (9)
          vertical tab      reverse line feed (11)
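
Note that the table covers backspace and carriage return but not the color sequences (ESC[...m) or the window-title sequences (ESC]0;...^G) that appear in Example #2, so those probably have to be removed before col sees the file. Below is a minimal sketch of that idea, assuming a POSIX shell with sed (as cygwin provides). The function names strip_escapes and apply_backspaces are made up for illustration, and the patterns only cover the kinds of sequences shown in the question, not every terminal escape.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of PuTTY or col.

# Literal control characters, built with printf so the sed patterns stay portable.
ESC=$(printf '\033')
BEL=$(printf '\007')
BS=$(printf '\010')
CR=$(printf '\015')

# Remove escape sequences but leave backspaces in place:
#   ESC ] ... BEL          window-title (OSC) sequences
#   ESC [ params letter    color and cursor (CSI) sequences, e.g. ESC[1;35m, ESC[K
# Also drops the carriage return at the end of each line.
strip_escapes() {
    sed -e "s/${ESC}][^${BEL}]*${BEL}//g" \
        -e "s/${ESC}\[[0-9;?]*[A-Za-z]//g" \
        -e "s/${CR}\$//"
}

# Alternative to `col -b`: treat every backspace as deleting the character
# before it, repeating until no "character + backspace" pair is left.
apply_backspaces() {
    sed -e ':a' -e "s/[^${BS}]${BS}//" -e 'ta' -e "s/${BS}//g"
}

# Possible usage (file names follow the question's examples):
#   strip_escapes < 20151112.170457.log | col -b > 20151112.170457.txt
#   for f in *.log; do
#       strip_escapes < "$f" | apply_backspaces > "${f%.log}.txt"
#   done
#   grep -l 'pwd' *.txt
```

The two variants differ in one case. col -b keeps the last character written to each column, so characters that were erased with backspace and never retyped can survive in its output. apply_backspaces deletes them, which matches typo corrections made at the end of a command line but is not a full terminal emulation (cursor movement within a line is ignored).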