Tuesday, Nov 03, 2015

Using GREP + SORT + UNIQ to find occurrences of a repeated event by an id field

Use Case: You have a log file with a bunch of entries that each start with a timestamp. A process crashed repeatedly, and every time it restarted it reprinted the same occurrence. We want just one match per occurrence, from the first time it occurred. Luckily, I had an identifier in my log file to key off of.

My Log File

grep "Description Mismatch" logfile.log

[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.073] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
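
The first three lines and the last one are the same occurrence logged at different times. First we sort using the -k option. 8,8 restricts the sort key to the 8th whitespace-separated field, which holds the id (e.g. "562b955c8c01d13309889115"). When two keys tie, sort falls back to comparing the whole line, so entries with the same id stay in timestamp order. The entries for the first id are now grouped together:

grep "Description Mismatch" logfile.log | sort -k 8,8
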
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.073] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
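
A quick way to double-check the field number before sorting is to print that field on its own. The sketch below assumes a throwaway temp file holding three of the log lines shown above; the variable names are only for illustration.

```shell
# Throwaway sample file with three of the log lines from this post.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
EOF

# Print only field 8 to confirm that it really is the id.
ids=$(awk '{ print $8 }' "$log")
echo "$ids"

# Sort on field 8 only. LC_ALL=C is an optional extra that makes the
# ordering byte-wise and reproducible across machines.
sorted=$(LC_ALL=C sort -k 8,8 "$log")
echo "$sorted"

rm -f "$log"
```

If the awk output shows something other than the id, adjust the field number in both the awk call and the sort key.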

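Next, let's use uniq with -f to collapse the duplicates. -f N skips the first N fields before comparing, so -f 7 starts the comparison at the id in field 8 (-f 8 would skip the id itself and compare only the text after it). uniq keeps the first line of each run of duplicates, which here is the earliest entry for each id. Now we have:

grep "Description Mismatch" logfile.log | sort -k 8,8 | uniq -f 7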

[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 18:00:04.052] [WARN] scheduler - Description Mismatch 562ba7a41cce2fd10843296f CRM Description: -- Auto Created by Callinize Callinize Description: sold
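
As an alternative (a different technique from the sort and uniq pipeline, shown only for comparison), awk can keep the first line seen for each id in a single pass. This is a minimal sketch assuming the id is always field 8 and that the file is already in timestamp order; the temp file and variable names are made up for the example.

```shell
# Throwaway sample file with three of the log lines from this post.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
EOF

# seen[$8]++ is 0 (false) the first time an id appears, so !seen[$8]++
# is true only for the first line carrying that id.
first=$(grep "Description Mismatch" "$log" | awk '!seen[$8]++')
echo "$first"

rm -f "$log"
```

Unlike uniq, this compares the id alone, so a later line with the same id but a different description is still treated as a repeat.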

Final Command

Tack on a wc -l to get a count output.

grep "Description Mismatch" logfile.log | sort -k 8,8 | uniq -f 7 | wc -l
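
For repeated use, the pipeline could be wrapped in a small shell function. The sketch below is one possible shape, not a definitive version: the function name count_first_occurrences is hypothetical, the id is assumed to be field 8, and uniq -f 7 is used so that the seven fields before the id are skipped and the comparison starts at the id.

```shell
# count_first_occurrences PATTERN FILE   (hypothetical helper name)
# Counts distinct occurrences among the lines of FILE that contain PATTERN.
# grep -F treats PATTERN as a fixed string; tr strips the padding that
# some wc implementations put in front of the number.
count_first_occurrences() {
    grep -F -- "$1" "$2" |
        sort -k 8,8 |
        uniq -f 7 |
        wc -l |
        tr -d ' '
}

# Throwaway sample file with the five log lines from the top of this post.
log=$(mktemp)
cat > "$log" <<'EOF'
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.073] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
EOF

count=$(count_first_occurrences "Description Mismatch" "$log")
echo "$count"

rm -f "$log"
```

On these five sample lines the function prints 2, one for each distinct id.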
