DSV Parser

DSV stands for delimiter separated values.  We originally built the DSV Parser because we needed a "CSV" parser: a log parser capable of reading comma separated value (CSV) files.  As soon as that parser was finished, however, we realized we also needed one for tab separated files.  And what if a customer wants to separate values with a pipe character, or a tilde?

The DSV parser solves this problem by reading the separating character from the parser arguments.  When reading the parser arguments, LogViewPlus assumes that the first non-space character which is not part of a conversion specifier is the character used to separate values.

For example, consider the following conversion pattern:

%d, %t, %p, %c, %m

Here, the first non-space character which is not part of a conversion specifier is a comma.  So, a comma will be used as the separating character.  Similarly, if we consider the pattern:

%d\t%t\t%p\t%c\t%m

We can see that the tab character '\t' is the first non-space character which is not part of a conversion specifier.
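
The detection rule can be sketched in a few lines of Python.  This is only an illustration, not the LogViewPlus implementation.  The function name detect_delimiter is hypothetical, and the sketch assumes that a conversion specifier is a percent sign followed by one letter and an optional {...} argument, and that the two-character text \t in a pattern stands for a tab.

```python
import re

# Assumption: a conversion specifier is '%' + one letter + optional {...}.
# The real LogViewPlus specifier grammar may be richer than this.
SPECIFIER = re.compile(r"%[A-Za-z](\{[^}]*\})?")


def detect_delimiter(pattern: str) -> str:
    """Return the first non-space character outside any conversion specifier."""
    # Assumption: the two characters backslash + 't' denote a tab.
    pattern = pattern.replace("\\t", "\t")
    i = 0
    while i < len(pattern):
        match = SPECIFIER.match(pattern, i)
        if match:
            i = match.end()      # skip the whole specifier
        elif pattern[i] == " ":
            i += 1               # spaces never count as the delimiter
        else:
            return pattern[i]    # first remaining character wins
    raise ValueError("no delimiter found in pattern")


print(repr(detect_delimiter("%d, %t, %p, %c, %m")))      # ','
print(repr(detect_delimiter("%d\\t%t\\t%p\\t%c\\t%m")))  # '\t'
```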

Why not just use the Pattern Parser?  Why do we need a separate parser for delimiter separated values?  It is a fair question, because the Pattern Parser could certainly read and interpret the conversion patterns outlined above.  However, a delimiter separated value file may contain values which are surrounded by quotes, and some fields may even contain the delimiter.  Also, note that the above conversion patterns do not include a new line (%n) conversion specifier.  In a delimiter separated value file, a log entry is complete only when all expected fields have been read successfully.  This means that fields can span multiple lines.

Fields that span multiple lines, and fields that contain the delimiter as part of their value, must be enclosed in quotes.  Quotes can be either single or double; this is determined inline.  If a field starts with a quote character, then the same character is expected to close the field.  To escape a quote character within a field value, use two quote characters back to back.  For example, '' (two singles) or "" (two doubles).

The above rules conform to how some programs, like Microsoft Excel, write DSV files.
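
To see these quoting rules from the writing side, Python's standard csv module can serve as a stand-in for a program that produces such files.  It is used here purely as an illustration and has no connection to LogViewPlus.  With minimal quoting and doubled quotes enabled, a field is quoted only when it contains the delimiter, a quote, or a line break, and embedded quotes are doubled.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(
    buf,
    delimiter=",",
    quotechar='"',
    doublequote=True,            # escape a quote by writing it twice
    quoting=csv.QUOTE_MINIMAL,   # quote only fields that need it
    lineterminator="\n",
)

# The second field contains quotes, the delimiter, and a line break.
writer.writerow(["INFO", 'my "LogMessage", my app\nis running.'])

line = buf.getvalue()
print(line)
# INFO,"my ""LogMessage"", my app
# is running."
```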

For example, the following is a valid CSV entry:

"2014-05-16 10:50:14,125", Thread 1, INFO, MyLogger, "This is
 my ""LogMessage"", my app is running."

This log message could be parsed using the first conversion pattern given above: %d, %t, %p, %c, %m.  LogViewPlus will automatically remove all surrounding quotes, but whitespace will be preserved.  In the above example, the Message field would contain:

This is
 my "LogMessage", my app is running.
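
The reading rules above can be sketched as a small Python function.  This is a minimal illustration under stated assumptions, not the LogViewPlus parser.  The name read_record is hypothetical.  The sketch assumes that lines end with \n, that spaces directly after a delimiter are padding (as in the pattern "%d, %t"), and that the expected number of fields is known from the conversion pattern.

```python
QUOTES = "'\""


def read_record(text, pos, delimiter, field_count):
    """Read field_count fields starting at pos; return (fields, next_pos)."""
    fields = []
    end = len(text)
    for index in range(field_count):
        # Assumption: spaces between the delimiter and the field are padding.
        while pos < end and text[pos] == " ":
            pos += 1
        if pos < end and text[pos] in QUOTES:
            quote = text[pos]          # the same character must close the field
            pos += 1
            chars = []
            while True:
                if pos >= end:
                    raise ValueError("quoted field is never closed")
                if text[pos] != quote:
                    chars.append(text[pos])   # delimiters and newlines are data
                    pos += 1
                elif text[pos + 1:pos + 2] == quote:
                    chars.append(quote)       # doubled quote is an escaped quote
                    pos += 2
                else:
                    pos += 1                  # closing quote
                    break
            fields.append("".join(chars))
            while pos < end and text[pos] == " ":
                pos += 1
        else:
            start = pos
            while pos < end and text[pos] not in (delimiter, "\n"):
                pos += 1
            fields.append(text[start:pos])
        # A delimiter follows every field except the last, which ends the line.
        expected = "\n" if index == field_count - 1 else delimiter
        if pos < end:
            if text[pos] != expected:
                raise ValueError(f"unexpected character at offset {pos}")
            pos += 1
        elif index < field_count - 1:
            raise ValueError("input ended before all fields were read")
    return fields, pos


entry = ('"2014-05-16 10:50:14,125", Thread 1, INFO, MyLogger, "This is\n'
         ' my ""LogMessage"", my app is running."\n')
fields, next_pos = read_record(entry, 0, ",", 5)
print(fields[0])   # 2014-05-16 10:50:14,125
print(fields[4])   # the two-line message with single "LogMessage" quotes
```

Because the record ends only after the fifth field has been read, the line break inside the quoted message does not terminate the entry.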

What about headers?  Delimiter separated value files frequently use a header row to define the columns.  For example, column headers for the log entry above might be: Date, Thread, Priority, Logger and Message.  Column headers are neither expected nor required.  If LogViewPlus finds a header row, it will ignore it.  Delimiter separated files are always parsed according to the provided conversion pattern.

