IIS – nxlog parse_csv fails on odd character in IIS log file

iis logstash

I am parsing IIS logs.

This error shows in my nxlog eventlog:

2013-12-24 18:40:20 ERROR if-else failed at line 50, character 351 in C:\Program Files (x86)\nxlog\conf\nxlog.conf. statement execution has been aborted; procedure 'parse_csv' failed at line 50, character 225 in C:\Program Files (x86)\nxlog\conf\nxlog.conf. statement execution has been aborted; Invalid CSV input: '2012-06-20 14:31:37 10.1.0.16 GET /App_Themes/Authenticated/Styles/index.jsp - 80 - 192.168.0.93 "|dir 302 0 0 62'

Here is a clearer view of the offending line in the log file:

2012-06-20 14:31:37 10.1.0.16 GET /App_Themes/Authenticated/Styles/index.jsp - 80 - 192.168.0.93 |dir 302 0 0 62

This is my nxlog.conf file; I cut it off at line 51 because the rest is just routing.

define ROOT C:\Program Files (x86)\nxlog

Moduledir %ROOT%\modules
CacheDir %ROOT%\data
Pidfile %ROOT%\data\nxlog.pid
SpoolDir %ROOT%\data
LogFile %ROOT%\data\nxlog.log

SuppressRepeatingLogs TRUE
LogLevel INFO


#<Extension fileop>
#    Module      xm_fileop
#    <Schedule>
#        Every   1 hour
#        Exec    file_cycle('%ROOT%\data\nxlog.log', 5);
#    </Schedule>
#</Extension>

<Extension syslog>
    Module      xm_syslog
</Extension>

<Extension json>
    Module      xm_json
</Extension> 

<Extension w3c>
    Module      xm_csv
    Fields  $date, $time, $s-ip, $cs-method, $cs-uri-stem, $cs-uri-query, $s-port, $cs-username, $c-ip, $cs-User-Agent, $sc-status, $sc-substatus, $sc-win32-status, $time-taken
    FieldTypes  string, string, string, string, string, string, string, string, string, string, string, string, string, string
    Delimiter   ' '
    QuoteChar   '"'
    EscapeControl FALSE
    UndefValue  -
</Extension>


<Input iis_in>
    Module  im_file
    File    "f:\\iislogs\\u_*.log"
    ReadFromLast FALSE
    Exec    if $raw_event =~ /^#/ drop();                        \
                else                                             \
                {                                                \
                    w3c->parse_csv();                            \
                    $EventTime = parsedate($date + " " + $time); \
                    to_json(); \
                }
</Input>
My questions:

  1. Is this something I should worry about?
  2. Is this bombing out because of the " character?
  3. How can I avoid this?

EDIT: Further research shows that this is probably caused by incorrectly encoded characters in the cs(User-Agent) field. I'm not sure what to do about that. A sample from the log:

#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken
2012-02-04 22:09:37 10.1.0.16 GET /login.aspx - 80 - 192.168.0.93 +xa7 403 4 5 218
2012-02-04 22:09:37 10.1.0.16 GET /signup/ - 80 - 192.168.0.93 +xa7 302 0 0 15
2012-02-04 22:09:37 10.1.0.16 GET /signup/ - 80 - 192.168.0.93 " 302 0 0 15
2012-02-04 22:09:37 10.1.0.16 GET /signup/ - 80 - 192.168.0.93 ߧߢ 302 0 0 0
2012-02-04 22:09:37 10.1.0.16 GET /signup/ - 80 - 192.168.0.93 𧧰"" 302 0 0 15
2012-02-04 22:09:38 10.1.0.16 GET /webresource.axd - 80 - 192.168.0.93 +xa7 404 0 0 0
2012-02-04 22:09:38 10.1.0.16 GET /webresource.axd - 80 - 192.168.0.93 " 404 0 0 0
2012-02-04 22:09:38 10.1.0.16 GET /webresource.axd - 80 - 192.168.0.93 ߧߢ 404 0 0 0
2012-02-04 22:09:38 10.1.0.16 GET /webresource.axd - 80 - 192.168.0.93 𧧰"" 404 0 0 0
2012-02-04 22:09:41 10.1.0.16 GET /contactus.aspx - 80 - 192.168.0.93 +xa7 302 0 0 15
2012-02-04 22:09:41 10.1.0.16 GET /contactus.aspx - 80 - 192.168.0.93 " 302 0 0 15
2012-02-04 22:09:41 10.1.0.16 GET /contactus.aspx - 80 - 192.168.0.93 ߧߢ 302 0 0 15
2012-02-04 22:09:41 10.1.0.16 GET /contactus.aspx - 80 - 192.168.0.93 𧧰"" 302 0 0 0

Best Answer

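It's a little unclear whether your input actually contains a " character. If it does, parse_csv most likely treats it as the start of a quoted field that is never closed, which would explain the "Invalid CSV input" error. Try setting QuoteChar to something other than the double quote.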
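
To illustrate why a stray quote character can break a quote-aware CSV parser, here is a minimal sketch using Python's csv module. This is an analogy only: Python's csv module is not nxlog's xm_csv and the two may differ in detail. The helper name split_fields and the choice of "\x01" as a replacement quote character are assumptions made for the example.

```python
import csv


def split_fields(line, quotechar):
    """Split one space-delimited log line into fields.

    strict=True makes Python's csv module raise csv.Error on an
    unterminated quoted field instead of silently accepting it.
    """
    reader = csv.reader([line], delimiter=" ", quotechar=quotechar, strict=True)
    return next(reader)


# The line quoted in the error message from the question.
line = ('2012-06-20 14:31:37 10.1.0.16 GET '
        '/App_Themes/Authenticated/Styles/index.jsp - 80 - '
        '192.168.0.93 "|dir 302 0 0 62')

# With the double quote as quote character, the field that starts
# with " is never closed, so the strict parser rejects the line.
try:
    split_fields(line, '"')
except csv.Error as exc:
    print("double quote as quote character fails:", exc)

# With a quote character that does not occur in the data (an assumed
# choice), the " is treated as ordinary text and all fields are found.
fields = split_fields(line, "\x01")
print(len(fields), "fields; user agent field =", fields[9])
```

In this sketch the second call yields the 14 fields declared in the Fields directive, with the stray quote kept as part of the user-agent value. The same idea applies to the QuoteChar suggestion above: pick a character that should never appear in the logs.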