Grep last matching string in huge file
09-06-06 12:30 PM
Hello, All!
I need to grep for a value that is contained within a string that
matches a certain pattern. The string looks something like this:
***cookie*** cookie_id_name=XXX
So I have a pattern consisting of the string "***cookie*** cookie_id_name"
(that's the whole string, no problem defining that). What I need is the
XXX value, which I later derive from the matching line.
The command I use is this:
grep '"***cookie*** cookie_id_name' log_file_name.log | tail -1
The problem is that those log files are huge, and there are about 20 of
them in all. When I run the above grep command, it uses an unacceptable
amount of system resources.
Could anyone please suggest how to make this cheaper?
Note: this type of line is not the only thing in the log files, which
means that I never know how many lines from the end of the file
it may be found (maybe 10, maybe 10000). Using a command like
tail -10000 log_file_name.log | grep '"***cookie*** cookie_id_name' | tail -1
is the best I can think of, but not good enough in my view.
Suggestions are appreciated!

Re: Grep last matching string in huge file
09-06-06 12:30 PM
Sergei,
to me, it seems that you need to solve two problems ...
firstly ... the logfiles are huge ... you don't want to
read the whole file, you only want to read the smallest
possible part of it.
secondly ... you look for a pattern.
The solution would be to read the lines in the file in
reverse order (back to front) rather than in normal
order...
AND
stop reading the file at the first occurrence
of the pattern you search for.
This might eventually require a 'roll your own'
solution. It shouldn't be too hard to write
a program that reads the file from back to
front and outputs complete lines.
Pattern matching could be done with the
functions from regex.h.
Another solution might be to keep the logfiles shorter.
Rainer

Re: Grep last matching string in huge file
09-06-06 06:37 PM
sergei.sheinin@db.com wrote:
> Hello, All!
>
> I need to grep for a value that is contained within a string that
> matches a certain pattern. The string looks something like this:
>
> ***cookie*** cookie_id_name=XXX
>
> so, i have a pattern consisting of string "***cookie*** cookie_id_name"
> -that's the whole string, no problem defining that. What I need is the
> XXX value, which I later derive from the string.
>
> The command I use is this:
>
> grep '"***cookie*** cookie_id_name' log_file_name.log | tail -1
>
>
> The problem is those log files are huge, and there are about 20 of them
> in all. When I run the above grep command, it takes unacceptably
> sizeable system resources to perform.
>
>
> Could anyone pls make a suggestion on how to make this easier?
>
>
> Note: this type of string is not the only thing in the log files, which
> means that I never know how many strings away from the end of the file
> it may be found (maybe 10 maybe 10000). Idea to use a command like
> tail -10000 log_file_name.log | grep '"***cookie*** cookie_id_name' |
> tail -1
> is the best I can think of, but not good enough in my view.
>
tac file | grep -m 1 <pattern>
Ed.
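
A minimal sketch of how the one-liner above might be applied to the original question, assuming GNU tac and GNU grep are installed (neither is guaranteed on other systems). The function name last_cookie_value is made up for illustration, and the sketch assumes the value runs to the end of the line. grep -F is used because the asterisks in the pattern would otherwise be read as regex operators.

```shell
#!/bin/sh
# last_cookie_value FILE: print the XXX part of the last matching line.
# Hypothetical helper; assumes GNU tac and a grep that supports -m.
last_cookie_value() {
    # tac emits the file last line first, so the first hit that grep sees
    # is the last match in the original order; -m 1 stops right there.
    tac "$1" |
        grep -F -m 1 -e '***cookie*** cookie_id_name=' |
        sed 's/.*cookie_id_name=//'   # keep only what follows the "="
}
```

Called as last_cookie_value log_file_name.log, this prints only the value. Because grep exits after the first hit, tac should stop soon afterwards, so little of the file is read when the match sits near the end.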

Re: Grep last matching string in huge file
09-06-06 06:37 PM
Ed Morton wrote:
> tac file | grep -m 1 <pattern>
Bingo ...
tac ... cat ... nice little pun ...
when I thought about the problem I somehow
felt 'there should be a program for this already' ...
but didn't find anything, and reversing cat didn't come
to my mind.
Learned something new today ;-)
Rainer

Re: Grep last matching string in huge file
09-06-06 06:37 PM
>
> tac file | grep -m 1 <pattern>
>
> Ed.
Sounds sweet, but what's "tac"? I don't have it as a recognized
command...

Re: Grep last matching string in huge file
09-06-06 06:37 PM
Rainer Temme wrote:
> Ed Morton wrote:
>
> Bingo ...
>
> tac ... cat ... nice little pun ...
> when I thought about the problem I somehow
> felt 'there should be a program for this already' ...
> but didn't find anything, but reversing cat did't come
> to my mind.
>
> Learned something new today ;-)
>
> Rainer
"pun" as in "fun"? how do I get this to work? also, grep doesn't have
the "-m" option.

Re: Grep last matching string in huge file
09-06-06 06:37 PM
On 2006-09-06, sergei.sheinin@db.com wrote:
> Ed Morton wrote:
>
> sounds sweet, but what's "tac"? i don't have it as a recognized
> command...
Neither tac (concatenate and print files in reverse) nor the -m
option to grep are standard. They are part of the GNU utilities.
--
Chris F.A. Johnson, author | <http://cfaj.freeshell.org>
Shell Scripting Recipes: | My code in this post, if any,
A Problem-Solution Approach | is released under the
2005, Apress | GNU General Public Licence
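
For systems without the GNU tools, a portable fallback can be sketched with awk alone. This is not something proposed in the thread, and it rests on assumptions: it still reads the whole file front to back, so it only replaces the grep | tail pipeline with a single process and does not reduce I/O. The name last_match_awk is hypothetical, and ENVIRON needs a POSIX awk (on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
#!/bin/sh
# last_match_awk PATTERN FILE: print the last line of FILE that contains
# PATTERN as a plain substring. Hypothetical helper for illustration.
last_match_awk() {
    # The pattern travels through the environment so that awk does not
    # interpret backslashes in it; index() does a literal substring test.
    PAT="$1" awk '
        index($0, ENVIRON["PAT"]) { last = $0; found = 1 }
        END { if (found) print last }
    ' "$2"
}
```

For example, last_match_awk '***cookie*** cookie_id_name' log_file_name.log prints the whole matching line, from which the value can then be cut.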

Re: Grep last matching string in huge file
09-06-06 06:37 PM
sergei.sheinin@db.com wrote:
> sorry, guys, but it's probably not an option under the circumstances. i
> work in an environment where all env changes are looked down upon with
> a frown (that's for a reason, btw). so i need to make do with what's
> available on solaris 5.8.
A pity ... are you only allowed to use what's already there, or
can you at least introduce small self-written programs?
If you have to 'eat what's on the table' and your approach
with tail isn't good enough, you might try this:
- define a blocksize (say 8K)
- get the file length of the logfile in bytes (ls -l) and
  calculate the size in blocks (e.g. with expr)
- use dd to read N blocks
  from EOF_minus_N_blocks to EOF
  and pipe the output to your grep.
- if the line is found, you're done.
- if nothing is found, dd from
  EOF_minus_(2N-1)_blocks to EOF_minus_(N-1)_blocks
  (note: yes, there's an overlap, to handle the
  rare case that the line crosses a block border)
- repeat this procedure until either a line is found or
  you're at the beginning of the file.
(use dd's count=xxx bs=xxx skip=xxx options to position
in the file)
Rainer
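
The steps above can be sketched roughly as follows. This is an untested-on-Solaris illustration under several assumptions: a shell with $(( )) arithmetic (ksh or a POSIX sh), a grep that accepts -F (fgrep is the traditional equivalent), and lines shorter than one block. The function name last_match_blocks and the window size of 128 blocks are invented for the example.

```shell
#!/bin/sh
# last_match_blocks PATTERN FILE: search FILE backwards in windows of
# $chunk blocks and print the last line containing PATTERN (fixed string).
# Hypothetical helper; returns 1 if nothing matches.
last_match_blocks() {
    pat=$1 file=$2
    bs=8192       # block size in bytes
    chunk=128     # blocks per window (N); must be at least 2
    size=$(ls -l "$file" | awk '{ print $5 }')   # byte length from ls -l
    end=$(( (size + bs - 1) / bs ))              # block count, rounded up
    while [ "$end" -gt 0 ]; do
        start=$(( end - chunk ))
        [ "$start" -lt 0 ] && start=0
        # Read blocks start..end-1 and keep the last match in this window.
        hit=$(dd if="$file" bs="$bs" skip="$start" count=$(( end - start )) 2>/dev/null |
              LC_ALL=C grep -F -e "$pat" | tail -1)
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
            return 0
        fi
        [ "$start" -eq 0 ] && break
        end=$(( start + 1 ))   # one block of overlap for lines on a border
    done
    return 1
}
```

One caveat of this design: a window usually starts in the middle of a line, so a hit on the first line of a window may be printed with its beginning cut off. The text after the pattern, which is where the value sits, is still intact in that case.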

Re: Grep last matching string in huge file
09-06-06 06:37 PM
sergei.sheinin@db.com wrote:
>
>
>
> sorry, guys, but it's probably not an option under the circumstances. i
> work in an environment where all env changes are looked down upon with
> a frown (that's for a reason, btw). so i need to make do with what's
> available on solaris 5.8.
I don't know if it'll be any faster, but you could do something like
this (untested) to step back "delta" lines at a time:
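size=`wc -l < file`
delta=1000
end="$size"
while (( end > 0 ))
do
    start=$(( end - delta + 1 ))
    (( start < 1 )) && start=1
    # print lines start..end only, then quit instead of reading on to EOF
    line=`sed -n -e "${start},${end}p" -e "${end}q" file | grep pattern | tail -1`
    if [ -n "$line" ]; then echo "$line"; break; fi
    end=$(( start - 1 ))
done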
Check the logic.
You could also try using awk or sed instead of grep to find your pattern
to see if they're any faster.
Ed.

Re: Grep last matching string in huge file
09-06-06 06:37 PM
Ed Morton wrote:
> sergei.sheinin@db.com wrote:
>
> I don't know if it'll be any faster, but you could do something like
> this (untested) to step back "delta" lines at a time:
>
> size=`wc -l < file`
> delta=1000
> end="$size"
> start=$(( end - delta ))
> while (( start > 0 ))
> do
> start=$(( end - delta ))
> sed -n "${start},${end}p" | grep pattern
> end=$(( start - 1 ))
> done
>
> Check the logic.
>
> You could also try using awk or sed instead of grep to find your pattern
> to see if they're any faster.
>
> Ed.
Something like that. The problem is, using wc -l on that file is also not a
good idea, as it takes (I just checked) about 10 seconds and 2% CPU.
What I'll probably do is this:
for (my $i = 1; $i < 4; $i++)
{
    my $lines = 1000 * $i;
    # I tested this command, the performance is great!
    my $cmd = "tail -$lines logfile | grep 'pattern' | tail -1";
    my @result = `$cmd`;
    if (@result > 0)
    {
        # blalbalba...
        last;   # stop at the first window that contains a match
    }
}
This loop should succeed on the first iteration in 90+% of the cases.
If after three iterations it still doesn't get what it's looking for,
then a warning message will be emailed. It's acceptable under the
circumstances ;)
Sergei.