Unix Shell - Re: Read strings from one file and search for them in a directory containing htm files


Author Re: Read strings from one file and search for them in a directory containing htm files
Meghavvarnam

2005-11-29, 7:51 am

Ed Morton wrote:
> Meghavvarnam wrote:
>
> <snip>
> <snip>
>
> Note that the above is now:
>
> for (string in strings)
> print string
> if (index...) {
> }
>
> By adding "print string" between the "for.." and the "if..", you've
> taken the "if..." outside of the loop. Add braces to make what you want
> explicit {...}.
>
>
> Yes, there is. You now only have "print string" in the "for" loop. The
> "if ..." is outside of it.
>
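
The point about braces can be reproduced in isolation. The following is a minimal sketch, not code from the thread: the function names loop_without_braces and loop_with_braces and the variables n and seen are made up for illustration. It uses plain awk rather than gawk, on the assumption that any POSIX awk behaves the same way here.

```shell
# Without braces, only the first statement after "for" is the loop body.
loop_without_braces() {
    printf 'a\n' | awk '{
        n = 0
        for (i = 1; i <= 3; i++)
            seen = i          # loop body: runs 3 times
            n++               # NOT in the loop, despite the indentation: runs once
        print n
    }'
}

# With braces, both statements belong to the loop.
loop_with_braces() {
    printf 'a\n' | awk '{
        n = 0
        for (i = 1; i <= 3; i++) {
            seen = i
            n++               # inside the braces: runs 3 times
        }
        print n
    }'
}

loop_without_braces   # prints 1
loop_with_braces      # prints 3
```

Indentation has no effect on how awk groups statements, so inserting a line directly after "for (...)" silently pushes the following "if" out of the loop.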

I completely agree with you. Here is what the modified script looks
like:

gawk 'NR==FNR{strings[$0]++;next} {
    for (string in strings) {
        # "||" must end the line: awk allows a newline after it, not before it
        if (index($0,">"string"<") || index($0,"\""string"\"") ||
            index($0,">"string"\n")) {
            usedStrings[string]++
            delete strings[string] # for efficiency
        } # if
    } # for loop
}
END {
    for (string in usedStrings)
        print string
}' allStrings.txt htm/*.htm > usedStringsfile

This script is saved in a file called listused. I gave it execute
permission, then ran it from the command line as shown below:

[Megh@razor] listused

I still see the same behaviour - usedStringsfile was empty.

> I gave execute
>
> What parse errors? There may be some since it's untested, but I don't
> see any.
>
> Here is the
>
> Here again you've added a line and so taken the subsequent block (the
> "if...") out of the loop.
>
>
> By that do you mean that "usedStringsfile" is empty? Well, yes, it would
> be since no-where above do you direct any output to it, but additionally
> you've broken the loop again.
>

Here is the modified script:
gawk ' NR==FNR{strings[$0]++;next}
{   for (string in strings) {
        if (index($0,">"string"<") || index($0,"\""string"\"") ||
            index($0,">"string"\n")) {
            usedStrings[string]++
            delete strings[string] # for efficiency
        }
    }
}
END {
    print "Used Strings:"
    for (string in usedStrings)
        printf "\t%s\n", string
    print "Unused Strings:"
    for (string in strings)
        printf "\t%s\n", string
}' allStrings.txt htm/*.htm

This script is saved in a file called listused1. I gave it execute
permission and ran it from the command line like this:

listused1 > output1

When I cross-check output1 manually, it contains both used and unused
strings, all listed under "Unused Strings".
Here is how the output1 file begins:
Used Strings:
Unused Strings:
... All the strings follow here
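
An empty "Used Strings" list means none of the index() tests ever succeeded. Two things may be worth checking. First, with the default record separator awk removes the newline before storing the line in $0, so a test for ">"string"\n" cannot match; an end-of-line test has to compare the tail of $0 instead. Second, and this is only an assumption because the data files are not shown in the thread, allStrings.txt may have DOS (CRLF) line endings, in which case every key carries a trailing carriage return and never appears in the htm text. The sketch below handles both; the function name list_used, the directory variable demo_dir, and the sample data are hypothetical.

```shell
# Hypothetical sample data in a temporary directory, for illustration only.
demo_dir=$(mktemp -d)
printf 'Save\r\nCancel\r\nHelp\r\n' > "$demo_dir/strings.txt"   # CRLF endings (assumed)
printf '<b>Save</b>\n<a title="Cancel">x</a>\n' > "$demo_dir/page.htm"

# list_used STRINGS_FILE HTM_FILE...
# Prints, in sorted order, each string that occurs as >string<, "string",
# or >string at the very end of a line.
list_used() {
    awk '
    { sub(/\r$/, "") }                      # drop a trailing carriage return, if any
    NR == FNR { if ($0 != "") strings[$0] = 1; next }
    {
        for (s in strings) {
            tail = ">" s
            if (index($0, ">" s "<") || index($0, "\"" s "\"") ||
                substr($0, length($0) - length(tail) + 1) == tail) {
                print s
                delete strings[s]           # each string is reported once
            }
        }
    }' "$@" | sort
}

list_used "$demo_dir/strings.txt" "$demo_dir/page.htm"
rm -rf "$demo_dir"
```

With the sample data this reports Cancel and Save but not Help. Running "od -c allStrings.txt | head" is a quick way to see whether the real file contains \r characters at all.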

Ed,

Given that the script is saved in a file, it would help if you can tell
me the correct way to run it from the command line.
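
For reference, one common way to run an awk program saved in a file is to wrap it in a small sh script, make that script executable, and invoke it by path. This is a sketch under assumptions, not the thread's confirmed answer: the temporary directory held in work, the sample files, and the shortened awk program are hypothetical, and plain awk stands in for gawk.

```shell
# Hypothetical wrapper script written to a temporary directory for illustration.
work=$(mktemp -d)
cat > "$work/listused" <<'EOF'
#!/bin/sh
# Arguments: the strings file first, then the htm files to search.
awk 'NR == FNR { strings[$0] = 1; next }
     { for (s in strings) if (index($0, ">" s "<")) { print s; delete strings[s] } }' "$@"
EOF
chmod +x "$work/listused"

printf 'Save\nHelp\n' > "$work/allStrings.txt"
printf '<b>Save</b>\n' > "$work/page.htm"

# Run by explicit path; a bare "listused" only works if its directory is in PATH.
result=$("$work/listused" "$work/allStrings.txt" "$work/page.htm")
echo "$result"    # prints Save
rm -rf "$work"
```

From the directory that holds the script, the equivalent call is "./listused allStrings.txt htm/*.htm > usedStringsfile". Passing the file names as arguments keeps the redirection on the command line, where it is easy to see.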

We need to get this working. Help, please!

Thank you again!

Regards,
Megh

> Would we
>
> No.
>
>
> See Janis' response.
>
>
> You're welcome,
>
> Ed.

