Log Message Parsers

This chapter has discussed parsing log files into log entries using regular expressions or conversion specifiers.  Log entries often contain a message which may contain a significant amount of data.  Because log entries are parsed into an envelope and message, it can be difficult to extract data from the message itself.  This is especially true when a log file contains a lot of variation between log entry messages.

To solve this problem, LogViewPlus employs a two-stage parsing process.  First, the log entries are parsed using the techniques discussed in this chapter.  Once a log entry has been parsed, its log message is parsed as a separate step.  By default, log entry messages are parsed using an automatic message parse, but you can use the Parse Message Filter to configure one of three options:  automatic parse, pattern parse or regex parse.  These options are discussed below.

Once configured, message parsers can be associated with a log file parser configuration and will be saved as Message Parser Settings.  A log file can contain multiple message parsers which you can manually configure using the Parse Message Filter.  After creating or editing a message parser, all log files using the target configuration will need to be refreshed.

Information extracted from a parsed message will be displayed in the Log Entry Grid when the Parse Message Filter is applied.  When viewing parsed messages, the Message column will be temporarily removed.  Extracted information can also be used in Reports & Dashboards.

Automatic Message Parse

Automatic message parsing occurs when no configured message parser matches the current log message.  This means that all configured log message parsers must be attempted before the decision to use an automatic parse can be made.

An automatic parse will scan the log message for text it finds interesting and attempt to extract this information.  Examples of interesting text include:

1.  Numbers.
2.  Words in all caps.
3.  Words which contain numbers or symbols.
4.  Text in brackets, parentheses or quotes.
5.  Text after a colon.

Information extracted from an automatic parse will usually be given a generic column name.  For example, Column 1, Column 2, etc.  However, a column name may be found in the data if a key/value pair is detected.  LogViewPlus may detect a key/value pair when:

1.  The key and value are separated by an equals sign.  For example: key=value.
2.  The key and value are separated by a colon.  For example: key: value.
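
The two key/value forms above can be sketched with a single regular expression.  This is only an illustration of the idea, not the heuristic LogViewPlus actually uses: the pattern, the extract_pairs function and the sample message are all assumptions made for the example.

```python
import re
from typing import Dict

# Hypothetical pattern: a word-like key, then '=' or ':' (with optional
# spaces around it), then a value running up to whitespace, ',' or ';'.
KEY_VALUE = re.compile(r"\b(?P<key>[A-Za-z_]\w*)\s*[=:]\s*(?P<value>[^\s,;]+)")


def extract_pairs(message: str) -> Dict[str, str]:
    """Return the key/value pairs found in a log message (illustrative only)."""
    return {m.group("key"): m.group("value") for m in KEY_VALUE.finditer(message)}


# Both "key=value" and "key: value" are picked up; other text is ignored.
pairs = extract_pairs("Request finished user=alice status: 200 elapsed=35ms")
print(pairs)
```

In this sketch each detected key would become the column name, and anything that is not part of a pair would fall back to a generic column name.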

Occasionally, an automatic parse may extract an excessive amount of information, for example when a log entry contains an XML or JSON statement.  In these scenarios, CPU and memory are spent processing log message data which may not ultimately be needed.  To limit this kind of excessive parsing, an automatic parse will mark any message with more than 10 parameters as an advanced message, labeled 'Adv. Message'.  These messages are best processed manually.

Automatic message parsing is always on a best-effort basis.  Usually, useful results can be obtained using the heuristics described above, but sometimes you may need to manually extract the target information.  This can be done using either pattern or regex message parsing, both discussed below.

Message parsers can be added or changed only by using the Parse Message Filter.  Existing message parsers can be viewed or removed in the Message Parser Settings.

Pattern Message Parse

A pattern log message parser uses string conversion specifiers to define a parsing configuration.  The generated configuration pattern will be very similar to the configuration generated when using the Pattern Parser but only string conversion specifiers should be used.

When a manual message parse is needed, we recommend using the pattern parser instead of the regex parser.  The pattern parse is generally significantly more performant.  This can be particularly important when multiple log message parsers need to be associated with the log file configuration.

Regex Message Parse

A regex log message parser will use a regular expression to extract information from the log entry message.  LogViewPlus uses Microsoft's regex parser internally, so only .NET regular expressions are supported.
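
As a rough illustration of extracting named values from a message with a regular expression, the sketch below uses Python's re module.  The message text, the group names and the parse_message function are hypothetical.  Note that Python writes named groups as (?P<Name>...), whereas the .NET syntax supported by LogViewPlus is (?<Name>...).

```python
import re
from typing import Dict, Optional

# Hypothetical message format and group names, for illustration only.
# In .NET syntax the groups would be written (?<User>\w+) and (?<Millis>\d+).
PATTERN = re.compile(r"User (?P<User>\w+) logged in after (?P<Millis>\d+) ms")


def parse_message(message: str) -> Optional[Dict[str, str]]:
    """Return the named values, or None if the message does not match."""
    match = PATTERN.search(message)
    return match.groupdict() if match else None


print(parse_message("User alice logged in after 42 ms"))
print(parse_message("Disk full"))
```

A message that does not match returns None here, which corresponds to the parser simply not applying to that log entry.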

As discussed above, the pattern message parser is recommended when a manual parse is needed.  However, the regex parser can add significant value for users who are already familiar with regular expressions.

Data Table Parse

A Data Table message parse is an advanced operation that allows you to use JSON to configure the column settings and one or more parse instructions.   Configuring Data Table parsers takes more time than Pattern or Regex parsers, but it gives you a much higher degree of control over how log entry messages are parsed.

Let's start with an example.  Consider a log message that might look like:

123: this is a, 456 test message, 789

I happen to know that in this message 123 is actually a time span measured in seconds.  789 actually represents a size measured in megabytes.  All other data in the message can be ignored.  I only care about these two values.

In this case, values 123 and 789 could be parsed with:

{
    "name": "My Data Table Parser",
    "searchPatterns": [
        {
            "isRegex": true,
            "pattern": "(?<Time>\\d+): "
        },
        {
            "isRegex": true,
            "pattern": ", (?<Size>\\d+)$"
        }
    ],
    "operation": "sequentialregex",
    "columns": [
        {
            "name": "Time",
            "unitType": "timespan",
            "commonUnit": "s",
            "defaultValue": "0"
        },
        {
            "name": "Size",
            "unitType": "bytes",
            "commonUnit": "mb",
            "defaultValue": "0"
        }
    ]
}

There are four parts to this parser configuration, all of which are required. 

The first field is the name.  The value provided here will be used to identify the filter when it is displayed.

The second field is the searchPatterns.  This is an array of custom parse configurations to be executed.  These configurations can optionally be regular expressions if the isRegex property is set to true.

The third field is the operation type.  This can be one of three values: and, or, sequentialregex.  The operation defines how the searchPatterns should be processed when multiple search patterns are used.  The and operation is used when all patterns must be matched.  The or operation is used when only one pattern must match.

The sequentialregex operation is the same as and but the searchPatterns provided must all be regular expressions.  Each expression must be matched in order where the next expression will start where the prior expression ends.  In other words, expression 2 must be found after expression 1.
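
The ordering rule can be sketched in Python as follows.  This is a minimal illustration of the behavior described above, not the LogViewPlus implementation; the sequential_regex function is hypothetical, and the patterns use Python's (?P<Name>...) spelling of the .NET named groups from the example.

```python
import re
from typing import Dict, List, Optional


def sequential_regex(message: str, patterns: List[str]) -> Optional[Dict[str, str]]:
    """Match each pattern in order, starting where the previous match ended."""
    values: Dict[str, str] = {}
    position = 0
    for pattern in patterns:
        match = re.compile(pattern).search(message, position)
        if match is None:
            return None  # like "and": every pattern must be found
        values.update(match.groupdict())
        position = match.end()  # the next search begins after this match
    return values


patterns = [r"(?P<Time>\d+): ", r", (?P<Size>\d+)$"]
result = sequential_regex("123: this is a, 456 test message, 789", patterns)
print(result)
```

With the sample message, the first expression consumes "123: " and the second is then searched for in the remainder, so 456 is skipped because it is not at the end of the message.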

The final field is columns, which defines the data model.  The primary purpose of this field is to give names to the data extracted by the searchPatterns.  All columns must have a name, and the number of columns must match the expected number of values extracted.

Additionally, the columns field can optionally define a unitType which identifies and converts standard units of measure.  When using a unitType, a commonUnit must be defined to transform and normalize the data if necessary.  For example, if LogViewPlus reads a time value in milliseconds, but the commonUnit is seconds, the time will be converted to seconds.  A defaultValue should also be provided in case no data is found. 

Unit types are optional.  Currently, only two units of measure are supported: time and bytes.  Bytes is used when a data size representation is needed and time is used to represent a time span.

When using a unitType, a commonUnit must be provided.  The following commonUnits are supported.

unitType    commonUnit    Meaning
time        d             days
            h             hours
            m             minutes
            s             seconds
            f             fractional seconds
bytes       b             bytes
            kb            kilobytes
            mb            megabytes
            gb            gigabytes
            tb            terabytes

When interpreting the parsed value, data values will be scanned to determine if a unit type can be identified.  Unit types in data values are often identified only by the first character to give more flexibility.  For example, if the measurement type is bytes and the commonUnit is mb, then a value of 2048K will be converted and LogViewPlus will display a value of 2.  Data values of "2048 kilobytes" or "2048 KB" will produce the same result, as only a case-insensitive 'k' is used for identification.  If a unit type cannot be identified, the commonUnit will be assumed and the data will not be modified.
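
The conversion for the bytes unit type can be sketched as below.  This is an approximation written for illustration, not LogViewPlus code: the normalize_bytes function and its regular expression are hypothetical, and it assumes binary multiples (1 KB = 1024 bytes), which is consistent with 2048K being displayed as 2 when the commonUnit is mb.

```python
import re

# Assumption: binary multiples, keyed by the first letter of the unit.
BYTE_FACTORS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

# A number, optionally followed by a unit word such as "K", "KB" or "kilobytes".
VALUE = re.compile(r"^\s*(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]*)\s*$")


def normalize_bytes(raw: str, common_unit: str) -> float:
    """Convert a raw size such as '2048K' into the given common unit."""
    match = VALUE.match(raw)
    if match is None:
        raise ValueError(f"unrecognised value: {raw!r}")
    number = float(match.group("number"))
    unit = match.group("unit")[:1].lower()  # only the first character is used
    if unit not in BYTE_FACTORS:
        return number  # no unit identified: assume the common unit, leave as-is
    target = common_unit[:1].lower()
    return number * BYTE_FACTORS[unit] / BYTE_FACTORS[target]


for raw in ("2048K", "2048 kilobytes", "2048 KB", "2048"):
    print(raw, "->", normalize_bytes(raw, "mb"))
```

The first three inputs all convert to 2 megabytes because only the leading 'k' is inspected, while the bare 2048 is assumed to already be in megabytes and is returned unchanged.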

Data Table parsers are new in LogViewPlus 3.1.  If you are working with a Data Table parser and you have any questions or issues, please contact support.

