Multi-Line Regex Parser


aendyp
New Member (15 reputation)
Group: Forum Members
Posts: 2, Visits: 11
Hi,
I have a problem with the regex parser. I would like to parse the following logfile:

06.10.2020 16:38:18.17 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] []
Added AuthorizationPolicy:
Name = Default_Reader
Authentication schemes = Test Scheme
Requirement with scopes = ifm.suite.services.drivermanagement.reader
06.10.2020 16:38:18.239 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] []
Added AuthorizationPolicy:
Name = Default_Writer
Authentication schemes = Test Scheme
Requirement with scopes = ifm.suite.services.drivermanagement.reader,ifm.suite.services.drivermanagement.writer
06.10.2020 16:38:18.241 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] []
added HomePolicy
06.10.2020 16:38:18.242 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] []
DefaultPolicy with schemes Test Scheme
Using SPA folder: wwwroot
Updating Index.html
Replacing file wwwroot\index.html
Error manipulating spa:

{"ClassName":"System.IO.FileNotFoundException","Message":"Could not find file 'C:\\d\\SingleProcess\\src\\Services\\ifm.Suite.Services.SingleProcess\\bin\\Debug\\netcoreapp3.1\\wwwroot\\index.html'.","Data":null,"InnerException":null,"HelpURL":null,"StackTraceString":" at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)\r\n at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)\r\n at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)\r\n at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)\r\n at System.IO.StreamReader..ctor(String path, Encoding encoding, Boolean detectEncodingFromByteOrderMarks)\r\n at System.IO.File.InternalReadAllText(String path, Encoding encoding)\r\n at System.IO.File.ReadAllText(String path)\r\n at ifm.Suite.Services.DriverManagement.SuiteCommunication.Startup.UpdateSpaFolderEcontent()","RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":null,"HResult":-2147024894,"Source":"System.Private.CoreLib","WatsonBuckets":null,"FileNotFound_FileName":"C:\\d\\SingleProcess\\src\\Services\\ifm.Suite.Services.SingleProcess\\bin\\Debug\\netcoreapp3.1\\wwwroot\\index.html","FileNotFound_FusionLog":null}


06.10.2020 16:38:19.234 +03:00 [Information] [drivermanagement] [ifm.Suite.Services.DriverManagement.SuiteCommunication.Startup] []
Setting up message bus...
Hosting environment: Production
Content root path: C:\d\SingleProcess\src\Services\ifm.Suite.Services.SingleProcess\bin\Debug\netcoreapp3.1
Now listening on: http://localhost:5000
Application started. Press Ctrl+C to shut down.
06.10.2020 16:38:21.353 +03:00 [Information] [drivermanagement] [ifm.Suite.BuildingBlocks.Provisioning.Hosting.RestoreTenantState] []
Got 1 tenants to restore

(excerpt I use in the parser wizard)

I use the following regex:

(?<d>.*?) \[(?<p>.*?)]\ \[(?<Service>.*?)]\ \[(?<c>.*?)]\ \[(?<unkown>.*?)][\n\r]+\s+?(?<m>.*)


It works fine in an online regex tester, but the parser still has trouble parsing the file.



How can I debug the issue?

Cheers!

LogViewPlus Support
Supreme Being (5.3K reputation)
Group: Moderators
Posts: 1.1K, Visits: 3.7K
Hi,

Thanks for reporting this issue.

The problem here is the date format. Sometimes the milliseconds have three digits and sometimes only two. In other words, it starts as "dd.MM.yyyy %H:mm:ss.ff zzzz" and then changes to "dd.MM.yyyy %H:mm:ss.fff zzzz". I assume the millisecond value could also be missing entirely.
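
To make the varying width concrete, here is a minimal Python sketch (not LogViewPlus code) that reports how many fractional-second digits each header line carries. The `TIMESTAMP` pattern and the `fraction_digits` helper are illustrative assumptions based only on the sample log above.

```python
import re
from typing import Optional

# Illustrative pattern for the sample log: date, time, optional fraction, UTC offset.
TIMESTAMP = re.compile(
    r"^\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}(?:\.(?P<frac>\d+))? [+-]\d{2}:\d{2}"
)

def fraction_digits(line: str) -> Optional[int]:
    """Return the number of fractional-second digits, or None for non-header lines."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    # A missing fraction counts as zero digits.
    return len(match.group("frac") or "")

sample = [
    "06.10.2020 16:38:18.17 +03:00 [Debug] [drivermanagement]",
    "Added AuthorizationPolicy:",
    "06.10.2020 16:38:18.239 +03:00 [Debug] [drivermanagement]",
]

for line in sample:
    print(fraction_digits(line), line)
```

Running a check like this over a log file before configuring a parser shows quickly whether a single fixed-width date format can cover every entry.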

Unfortunately, I have not seen this scenario before and LogViewPlus is not handling it well.  I would expect two things:
1.  The generic '%d' (which you are implicitly using in the Regex Parser) should handle all of these situations.  This does not appear to be the case.
2.  It should be possible to create a multi-pattern to parse a line twice if necessary.  However, this does not appear to be working either.  Possibly because the two formats are too similar.

These are both serious issues and I will get them fixed ASAP.  I should have something for you early next week.

In the meantime, the best I can recommend is to use a Pattern Parser with the configuration:
%d{dd.MM.yyyy %H:mm:ss.fff zzzz} [%p] [%S{Service}] [%c] [%S{Unknown}]%m%n



This will unfortunately cause some log entries to be merged together.  In the example above, the first and second will be merged.  The merging of log entries does not impact search.

I apologize for the poor result on this one.  I will get these issues fixed ASAP.

Thanks again,

Toby
aendyp
Hi,

Thanks for the quick reply and fix. I thought it was a problem with the newline. I tried the Pattern Parser first, but could not find a working configuration.

Are you sure this is causing the problem? When I remove the line break in the log file, it parses just fine:

06.10.2020 16:38:18.17 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] [] Added AuthorizationPolicy:
Name = Default_Reader
Authentication schemes = Test Scheme
Requirement with scopes = ifm.suite.services.drivermanagement.reader
06.10.2020 16:38:18.239 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] [] Added AuthorizationPolicy:
Name = Default_Writer
Authentication schemes = Test Scheme
Requirement with scopes = ifm.suite.services.drivermanagement.reader,ifm.suite.services.drivermanagement.writer
06.10.2020 16:38:18.241 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] [] added HomePolicy
06.10.2020 16:38:18.242 +03:00 [Debug] [drivermanagement] [ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] [] DefaultPolicy with schemes Test Scheme
Using SPA folder: wwwroot
Updating Index.html
Replacing file wwwroot\index.html
Error manipulating spa:

{"ClassName":"System.IO.FileNotFoundException","Message":"Could not find file 'C:\\d\\SingleProcess\\src\\Services\\ifm.Suite.Services.SingleProcess\\bin\\Debug\\netcoreapp3.1\\wwwroot\\index.html'.","Data":null,"InnerException":null,"HelpURL":null,"StackTraceString":" at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)\r\n at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)\r\n at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)\r\n at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)\r\n at System.IO.StreamReader..ctor(String path, Encoding encoding, Boolean detectEncodingFromByteOrderMarks)\r\n at System.IO.File.InternalReadAllText(String path, Encoding encoding)\r\n at System.IO.File.ReadAllText(String path)\r\n at ifm.Suite.Services.DriverManagement.SuiteCommunication.Startup.UpdateSpaFolderEcontent()","RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":null,"HResult":-2147024894,"Source":"System.Private.CoreLib","WatsonBuckets":null,"FileNotFound_FileName":"C:\\d\\SingleProcess\\src\\Services\\ifm.Suite.Services.SingleProcess\\bin\\Debug\\netcoreapp3.1\\wwwroot\\index.html","FileNotFound_FusionLog":null}


06.10.2020 16:38:19.234 +03:00 [Information] [drivermanagement] [ifm.Suite.Services.DriverManagement.SuiteCommunication.Startup] [] Setting up message bus...
Hosting environment: Production
Content root path: C:\d\SingleProcess\src\Services\ifm.Suite.Services.SingleProcess\bin\Debug\netcoreapp3.1
Now listening on: http://localhost:5000
Application started. Press Ctrl+C to shut down.
06.10.2020 16:38:21.353 +03:00 [Information] [drivermanagement] [ifm.Suite.BuildingBlocks.Provisioning.Hosting.RestoreTenantState] [] Got 1 tenants to restore


This is the regex I am currently using:

^(?<d>.*?) \[(?<p>.*?)]\ \[(?<Service>.*?)]\ \[(?<c>.*?)]\ \[(?<CorrelationId>.*?)](?<m>.*)


LogViewPlus Support
Thanks for pointing this out.  The newline could very well be contributing to the problem.  

Looking again, I just noticed that I can get a very good parse by ignoring the timezone:
%d %s [%p] [%S{Service}] [%c] [%S{Unknown}]%m%n


I suspect there are several 'small' things which are adding up to make this file difficult to parse.  The behavior is definitely not what I would expect.  We aim to make configuring parsers intuitive and things like two decimal vs three decimal dates shouldn't really matter.  I would like to take a closer look at how we can improve things.
LogViewPlus Support
We have now released LogViewPlus v2.4.43 into BETA. Using this version of LogViewPlus, the sample log file above can be parsed using the Pattern Parser with either:

%d [%p] [%S{Service}] [%c] [%S{Unknown}]%m%n


or

%d{dd.MM.yyyy %H:mm:ss.fff zzzz} [%p] [%S{Service}] [%c] [%S{Unknown}]%m%n
%d{dd.MM.yyyy %H:mm:ss.ff zzzz} [%p] [%S{Service}] [%c] [%S{Unknown}]%m%n


The first pattern is recommended as it should also handle the scenario where only one millisecond digit is used.  I am including the multi-pattern here just to highlight that the issue with multi-pattern parsing has also been resolved.
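
The multi-pattern idea (try one format, fall back to the next) can be sketched in Python as follows. This is only an illustration of the fallback technique, not how LogViewPlus implements it. The `FORMATS` tuple and `parse_timestamp` are hypothetical names, and Python's `strptime` directives differ from the `ff`/`fff` specifiers used above: `%f` accepts one to six digits, so the fallback here only covers the case where the fraction is missing entirely.

```python
from datetime import datetime

# Hypothetical format list, tried in order.
# Python's %f accepts one to six digits, so ".17" and ".239" share one format;
# the second entry covers timestamps with no fractional seconds at all.
FORMATS = (
    "%d.%m.%Y %H:%M:%S.%f %z",
    "%d.%m.%Y %H:%M:%S %z",
)

def parse_timestamp(text: str) -> datetime:
    """Try each format in order and return the first successful parse."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not fit; try the next one
    raise ValueError(f"no format matched: {text!r}")

for stamp in (
    "06.10.2020 16:38:18.17 +03:00",
    "06.10.2020 16:38:18.239 +03:00",
    "06.10.2020 16:38:18 +03:00",
):
    print(parse_timestamp(stamp).isoformat())
```

Ordering the formats from most to least specific keeps the first match unambiguous, which is the same concern raised earlier about two patterns being too similar.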

Parsing this file with a regex should now be easier, as both the timestamp and the time zone can be recognized automatically as a date. However, the trailing newline you mentioned is problematic. The Regex Parser parses line by line, so it expects all fields to be on the same line. In the example log entries above, this is not the case: the Message field begins on a new line. It would be fine if the message field started on the header line and continued on the next, but this may not be the case.
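
As a rough illustration of line-by-line parsing, the Python sketch below matches each line against a header regex and appends any non-matching line to the previous entry's message. It is an assumption-laden sketch, not the Regex Parser's actual logic: the regex is the one from this thread rewritten in Python's `(?P<name>...)` syntax, with the date group tightened so that continuation lines cannot match, and `group_entries` is a hypothetical helper.

```python
import re

# Header regex from the thread, adapted to Python named-group syntax.
# The date group is tightened (an assumption) so only header lines match.
HEADER = re.compile(
    r"^(?P<d>\d{2}\.\d{2}\.\d{4} \S+ \S+) "
    r"\[(?P<p>.*?)\] \[(?P<Service>.*?)\] \[(?P<c>.*?)\] "
    r"\[(?P<CorrelationId>.*?)\](?P<m>.*)$"
)

def group_entries(lines):
    """Start a new entry on each header line; append other lines to its message."""
    entries = []
    for line in lines:
        match = HEADER.match(line)
        if match:
            entry = match.groupdict()
            entry["m"] = entry["m"].strip()
            entries.append(entry)
        elif entries:
            # Continuation line: it belongs to the most recent header.
            current = entries[-1]
            current["m"] = f'{current["m"]}\n{line}' if current["m"] else line
    return entries

sample = [
    "06.10.2020 16:38:18.241 +03:00 [Debug] [drivermanagement] "
    "[ifm.Suite.BuildingBlocks.Security.Authorization.DefaultMockAuthorizationPolicyProvider] []",
    "added HomePolicy",
    "06.10.2020 16:38:21.353 +03:00 [Information] [drivermanagement] "
    "[ifm.Suite.BuildingBlocks.Provisioning.Hosting.RestoreTenantState] []",
    "Got 1 tenants to restore",
]

for entry in group_entries(sample):
    print(entry["d"], entry["p"], repr(entry["m"]))
```

Because the header regex never has to span a line break, the same pattern works whether the message starts on the header line or on the line after it.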

Hope that helps.  Thanks for bringing this problem to our attention!

Toby


