Unix Shell - sed and awk


Author sed and awk
Bo Yang

2006-12-16, 1:31 am

I have a question about when to use sed and when to use awk, and
about their strengths and weaknesses.

Now, given a C++ file, which has two forms of comments:
lines that start with '//'
and the lines between '/*' and '*/'

I want to count how many lines of comments are in the file.
Of course, I can use sed as follows:

sed -n '/\/\*/,/\*\//p;/\/\//p'

to output all the comment lines. But what if I want to output the
comment text only? (This is not the main question, hehe.)

Now I want to try a different approach, still using sed:

The '//' case is too easy to discuss here.
For the second case, I want sed to do the following:
when it sees the first /*, enter a mode A, and return to
normal mode only when it sees the */ that matches that /*.

Is there any way to express such a mode in sed? And what about
awk?

Any discussion is welcome. Thank you!
Ed Morton

2006-12-16, 1:31 am

Bo Yang wrote:
> I have a question about when to use sed or awk, and their
> merits and defects.


Use sed for simple substitutions, awk otherwise.
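
On the "mode" part of the question: in awk a mode is usually just a variable that one rule sets and another clears. The following is a minimal sketch rather than a robust solution. The function name count_comment_lines and the flag in_block are made up for illustration, and the script deliberately ignores comment markers that appear inside string literals, so it shares the weakness of any purely textual approach.

```shell
#!/bin/sh
# Hypothetical helper: count the lines of a C/C++ file that contain comment text.
# Naive by design: comment markers inside string literals are not recognised.
count_comment_lines() {
  awk '
    in_block {                            # mode A: inside a block comment
      n++
      if (index($0, "*/")) in_block = 0   # closing marker seen: back to normal mode
      next
    }
    {
      start = index($0, "/*")             # position of the first block-comment opener
      slash = index($0, "//")             # position of the first line-comment marker
      if (start && (!slash || start < slash)) {
        n++
        # enter mode A unless the comment also closes on this line
        if (!index(substr($0, start + 2), "*/")) in_block = 1
      } else if (slash) {
        n++
      }
    }
    END { print n + 0 }
  ' "$1"
}
```

It would be called as "count_comment_lines file.cpp". The same flag idea can be written in sed with a range address and a braced command group, but the bookkeeping is easier to read in awk.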

> Now, given a C++ file, which has two forms of comments:
> lines that start with '//'
> and the lines between '/*' and '*/'
>
> I want to count how many lines of comments are in the file.
> Of course, I can use sed as follows:
>
> sed -n '/\/\*/,/\*\//p;/\/\//p'
>
> to output all the comment lines.


No, you can't. That won't account for "/*" or "*/" appearing within
strings, within comments, etc.
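
To make the objection concrete, here is a small demonstration with an invented four-line sample file (the contents are an assumption for illustration only). Only lines 1 and 4 contain comments. Line 3 carries the markers inside a string literal, and there is a second effect as well: sed starts looking for the end of a range on the line after the one that opened it, so a one-line /* ... */ comment opens a range that keeps running.

```shell
#!/bin/sh
# Made-up sample: only lines 1 and 4 actually contain comments.
demo=$(mktemp)
cat > "$demo" <<'EOF'
int a; /* one-line comment */
int b;
const char *s = "/* not a comment */";
int c; // trailing comment
EOF

# The command from the question, unchanged; tr strips any padding from wc.
printed=$(sed -n '/\/\*/,/\*\//p;/\/\//p' "$demo" | wc -l | tr -d ' ')

# Line 1 opens the range and sed only looks for the closing "*/" from
# line 2 onwards, so lines 2 and 3 are printed as well.
echo "lines printed: $printed, lines with comments: 2"
rm -f "$demo"
```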

> But if I want to output the
> comments only, how to do? (this is not the main question,hehe)


For both cases, use a tool that understands the language, e.g. something
like this to wrap "gcc -E":

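_file="$1"

_sed=/whatever/bin/sed # path to a sed that supports
# the "[[:space:]]" RE (e.g. GNU)

# Hide every "#" behind a per-process marker so that gcc does not
# act on preprocessor directives, then restore it afterwards.
_hash="__HASH__$$__"
"$_sed" -e "s/#/${_hash}/g" "${_file}" |
gcc -E - |
"$_sed" -e "s/${_hash}/#/g" \
-e '/^[[:space:]][[:space:]]*$/d' \
-e '/^$/d' \
-e '1d'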

to strip the comments, then use "comm" against the original if you want
the non-commented lines.

Caveat - I posted the above to comp.lang.c a few years back (before
learning awk) so there's almost certainly a more efficient way to write it.

Regards,

Ed.
Ed Morton

2006-12-16, 1:31 am

Ed Morton wrote:

> Bo Yang wrote:
>
>
>
> Use sed for simple substitutions, awk otherwise.
>
>
>
> No, you can't. That won't account for "/*" or "*/" appearing within
> strings, within comments, etc.
>
>
>
> For both cases, use a tool that understands the language, e.g. something
> like this to wrap "gcc -E":
>
> _file="$1"
>
> _sed=/whatever/bin/sed # path to a sed that supports
> # the "[[:space:]]" RE (e.g. GNU)
>
> _hash="__HASH__$$__"
> $_sed -e "s/#/${_hash}/g" ${_file} |
> gcc -E - |
> $_sed -e "s/${_hash}/#/g" \
> -e '/^[[:space:]][[:space:]]*$/d' \
> -e '/^$/d' \
> -e '1d'
>
> to strip the comments, then use "comm" against the original if you want
> the non-commented lines.
>
> Caveat - I posted the above to comp.lang.c a few years back (before
> learning awk) so there's almost certainly a more efficient way to write it.


Here's a better version:

$ cat strip.gcc
[ $# -eq 2 ] && arg=$1 || arg=""
eval file="\$$#"
sed 's/a/aA/g;s/__/aB/g;s/b/bA/g;s/#/bB/g' "$file" |
gcc -P -E $arg - |
sed 's/bB/#/g;s/bA/b/g;s/aB/__/g;s/aA/a/g'

You can call it with an argument like "-ansi" to specify which C
standard to apply. See the discussion at:

http://tinyurl.com/y9hfk6

for more details if you care.

Ed.
Bo Yang

2006-12-16, 1:31 am

Ed Morton wrote:
> Ed Morton wrote:
>
>
> Here's a better version:
>
> $ cat strip.gcc
> [ $# -eq 2 ] && arg=$1 || arg=""
> eval file="\$$#"
> sed 's/a/aA/g;s/__/aB/g;s/b/bA/g;s/#/bB/g' "$file" |

Why substitute a, __, and b here? Do they have any side effects on
the preprocessing?
> gcc -P -E $arg - |
> sed 's/bB/#/g;s/bA/b/g;s/aB/__/g;s/aA/a/g'
>
> You can call it with an argument like "-ansi" to specify which C
> standard to apply. See the discussion at:
>
> http://tinyurl.com/y9hfk6
>
> for more details if you care.
>
> Ed.

Ed Morton

2006-12-16, 1:19 pm

Bo Yang wrote:
<snip>
>
> Why substitute a, __, and b here? Do they have any side effects on
> the preprocessing?


Yes: __FILE__, __LINE__, etc. get expanded by the preprocessor.
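
As far as I can tell, the "a" and "b" substitutions are what keep the encoding reversible: once every original "a" has become "aA" and every original "b" has become "bA", the pairs "aB" and "bB" cannot occur by accident, so they are free to stand for "__" and "#". The sketch below shows the round trip without involving gcc. The two sed scripts are copied from strip.gcc; the function names encode and decode and the sample line are only illustrative.

```shell
#!/bin/sh
# The two sed scripts are copied from strip.gcc; the function names are
# illustrative only.
encode() { sed 's/a/aA/g;s/__/aB/g;s/b/bA/g;s/#/bB/g'; }
decode() { sed 's/bB/#/g;s/bA/b/g;s/aB/__/g;s/aA/a/g'; }

line='#define tab __LINE__ /* literal aB and bB */'

# After encoding, no "#" and no "__" is left for the preprocessor to act on.
printf '%s\n' "$line" | encode

# Decoding restores the input exactly, including the literal "aB" and "bB".
printf '%s\n' "$line" | encode | decode
```

Note that decode applies the substitutions in the reverse order of encode, which is what lets each step undo exactly one earlier step.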

Ed.

Copyright 2003 - 2008 webservertalk.com