Unix Shell - Two sed questions...?


Author Two sed questions...?
Pepper

2005-02-10, 5:58 pm

I'm trying to get sed to do two things that, I suspect, are beyond even
its limits... Firstly, can sed remove all occurrences of the "<" and ">"
characters from a file? I've tried to do it many ways, and it always
breaks. It strips all other characters I want the file to be clean of,
but it won't accept these two.


Secondly, I have sed truncate lines after specified characters/strings
in various logs of mine, but I also wanted to delete all contents of a
line *up to* a certain character or string's occurrence.

For example, let's say I have this:

"Found the following matches in .XYZ: arfarf.txt; text.doc
Found the following matches in .tfiles: logs.txt; arfarf2.log"

And I would like to cut it to be just this:

"arfarf.txt; text.doc
logs.txt; arfarf2.log"


Can either of these two things be done?

2005-02-10, 5:58 pm

All can be done, with enough time and money.

Nothing is beyond the limits of sed.


can sed remove all occurrences of the "<" and ">"
[root@cube4 sbin]# echo ">>,,>>" | sed 's/\>//g'
[root@cube4 sbin]# echo ">>,,>>" | sed 's/,//g'
[root@cube4 sbin]# echo ">>,,>>" | sed 's/>//g'
,,
[root@cube4 sbin]# echo ">>,,<<" | sed 's/[><]//g'

For the latter problem, use cut instead...
> Found the following matches in .tfiles: logs.txt; arfarf2.log"
>
> And I would like to cut it to be just this:
>
> "arfarf.txt; text.doc


[root@cube4 sbin]# echo "found the following matches in xxxx:files.1;files.2"
found the following matches in xxxx:files.1;files.2
[root@cube4 sbin]# echo "found the following matches in xxxx:files.1;files.2" | cut -f 2 -d ':'
files.1;files.2
[root@cube4 sbin]#
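
Applied to the original poster's sample line, the cut approach above might look like the sketch below. The variable names and the second test string are made up for illustration. Note that the space after the colon is kept, and that `-f 2` alone would drop anything after a second colon, whereas `-f 2-` keeps the rest of the line.

```shell
line='Found the following matches in .XYZ: arfarf.txt; text.doc'

# -f 2 keeps only the second colon-separated field; the space after the colon stays.
field=$(printf '%s\n' "$line" | cut -d ':' -f 2)

# -f 2- keeps everything after the first colon, even when later colons appear.
rest=$(printf '%s\n' 'in .XYZ: a:b.txt' | cut -d ':' -f 2-)

printf '[%s]\n' "$field" "$rest"
```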

,,


Ed Morton

2005-02-10, 5:58 pm



Pepper wrote:
> I'm trying to get sed to do two things that, I suspect, are beyond even
> its limits... Firstly, can sed remove all occurrences of the "<" and ">"
> characters from a file? I've tried to do it many ways, and it always
> breaks. It strips all other characters I want the file to be clean of,
> but it won't accept these two.


You're probably not quoting the script. Look:

PS1> echo "a>>b>>c" | sed s/>//g
ksh: //g: cannot create
PS1> echo "a>>b>>c" | sed 's/>//g'
abc

Just use the appropriate quotes.
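
A minimal sketch of the same point, assuming a POSIX shell and any standard sed. The sample string and the variable name are invented for illustration.

```shell
# Without quotes the shell would read ">" as a redirection operator and try to
# create a file. Single quotes pass the whole script to sed untouched.
cleaned=$(printf '%s\n' 'a>>b<<c' | sed 's/[<>]//g')
echo "$cleaned"
```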


>
> Secondly, I have sed truncate lines after specified characters/strings
> in various logs of mine, but I also wanted to delete all contents of a
> line *up to* a certain character or string's occurrence.
>
> For example, let's say I have this:
>
> "Found the following matches in .XYZ: arfarf.txt; text.doc
> Found the following matches in .tfiles: logs.txt; arfarf2.log"
>
> And I would like to cut it to be just this:
>
> "arfarf.txt; text.doc
> logs.txt; arfarf2.log"


So, assuming you want to truncate up to ": ", it's just:

sed 's/.*: //'
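
Run against the two sample lines from the question, that command could be tried like this. It is only a sketch: the variable name is an assumption, and the input is fed with printf rather than read from a log file.

```shell
# Feed the two sample lines to sed and strip everything up to the last ": ".
trimmed=$(printf '%s\n' \
  'Found the following matches in .XYZ: arfarf.txt; text.doc' \
  'Found the following matches in .tfiles: logs.txt; arfarf2.log' |
  sed 's/.*: //')
echo "$trimmed"
```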

Regards,

Ed.

> Can either of these two things be done?

Robert Bonomi

2005-02-10, 5:58 pm

In article <Xns95F9A8A5E4416engI@193.110.122.97>,
Pepper <pep@per.doesnt.want.spm> wrote:
>I'm trying to get sed to do two things that, I suspect, are beyond even
>its limits... Firstly, can sed remove all occurrences of the "<" and ">"
>characters from a file? I've tried to do it many ways, and it always
>breaks. It strips all other characters I want the file to be clean of,
>but it won't accept these two.


This is *probably* a matter of 'quoting', or the lack thereof.

Remember, '<' and '>' are characters that have "special meaning" to the
shell. The command:

sed -e 's/[><]//g'

works just fine, when I try it, *with* the substitution command enclosed
in quotes.

However, if this is the *only* modification to the file, then the command
tr -d '<>'

is likely to be faster -- something that you'll notice *only* on a _large_
file, or if you're doing this on *lots* of files.
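
To check that the two commands give the same result, something like the following sketch can be used. The sample text and variable names are assumptions, and it makes no attempt to measure the speed difference.

```shell
sample='<b>bold</b> text'

# Same deletion done two ways: a sed substitution and a tr character delete.
with_sed=$(printf '%s\n' "$sample" | sed -e 's/[><]//g')
with_tr=$(printf '%s\n' "$sample" | tr -d '<>')

echo "$with_sed"
echo "$with_tr"
```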

>Secondly, I have sed truncate lines after specified characters/strings
>in various logs of mine, but I also wanted to delete all contents of a
>line *up to* a certain character or string's occurrence.
>
>For example, let's say I have this:
>
>"Found the following matches in .XYZ: arfarf.txt; text.doc
> Found the following matches in .tfiles: logs.txt; arfarf2.log"
>
>And I would like to cut it to be just this:
>
>"arfarf.txt; text.doc
> logs.txt; arfarf2.log"
>

This one is a little more complicated.

Different approaches may be required, depending on whether you use a
single character or a string as the 'marker' point.

The general idea is to use: (a) an anchor to the beginning of the line --
not _required_, but it makes your intention clear, and speeds up processing;
(b) a 'wildcard' that matches everything up to your delimiter/marker; and
(c) something that *uniquely* identifies the delimiter/marker at the desired
location.

Now, *if* the constructed pattern has more than one possible match against
the line (e.g. a 'match any number of anything', followed by a 'delimiter',
and there is _more_than_one_ occurrence of that 'delimiter'), the pattern will
match the *longest* possible string. (This is what is called "greedy" pattern-
matching.) *Not* the 'desired result' if your intent was the _first_
occurrence (or any other occurrence other than the 'last' one, in fact).

For your example, where you apparently want to delete everything up to and
including the _first_ ":", and presumably any whitespace after it, you
could use:

s/^[^:]*:[ ]*//

(where there are two characters between the last pair of square brackets:
a [SP] and a [HT].)

This specifies a pattern that matches:
beginning-of-line, followed by
zero or more 'not a colon' characters, followed by
a literal colon, followed by
zero or more 'whitespace' characters
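
Since a literal tab is easy to lose when copying a command, one way to try the pattern described above is to build the tab in a shell variable first. This is a sketch only: the variable names and the test line are made up, and the script is double-quoted so that the variable expands.

```shell
# Hold a literal tab in a variable so the bracket expression is unambiguous.
tab=$(printf '\t')

# Delete up to and including the first colon, plus any spaces or tabs after it.
stripped=$(printf 'key:\t value: more\n' | sed "s/^[^:]*:[ $tab]*//")
echo "$stripped"
```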

To match on, say, the _third_ colon on the line, you simply repeat the
'match everything not a colon, then match a colon' segment of the pattern
the appropriate number of times. e.g.:

s/^[^:]*:[^:]*:[^:]*:[ ]*//
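
A quick way to try the third-colon pattern is sketched below. The second command is my own shorthand rather than anything from the post: it writes the repetition as a POSIX basic-regex group with an interval, `\(...\)\{3\}`, which should behave the same. The test line and variable names are invented.

```shell
line='a:b:c: rest:x'

# The pattern spelled out three times, as above (space-only bracket here).
spelled_out=$(printf '%s\n' "$line" | sed 's/^[^:]*:[^:]*:[^:]*:[ ]*//')

# Assumed equivalent: group the "not-colons then colon" piece and repeat it 3 times.
with_interval=$(printf '%s\n' "$line" | sed 's/^\([^:]*:\)\{3\}[ ]*//')

echo "$spelled_out"
echo "$with_interval"
```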


Triggering on a specific occurrence of a *non-unique* multi-character string
is considerably more complex.
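
One way around that complexity is to step outside sed. The sketch below swaps in awk's index() and substr() to cut after the Nth occurrence of a multi-character marker. It is not a sed solution, and the function name `strip_to_nth`, its arguments, and the sample line are all hypothetical. Lines with fewer than N markers are passed through unchanged, which is a design choice of this sketch. Markers containing backslashes would need extra care, because awk interprets escape sequences in -v values.

```shell
# Hypothetical helper: delete everything up to and including the Nth marker.
#   $1 = marker string, $2 = occurrence number
strip_to_nth() {
  awk -v m="$1" -v n="$2" '{
    rest = $0
    for (k = 0; k < n; k++) {
      i = index(rest, m)                  # position of the next marker, 0 if none
      if (i == 0) { rest = $0; break }    # too few markers: keep the line as it was
      rest = substr(rest, i + length(m))  # drop text through the end of this marker
    }
    print rest
  }'
}

second=$(printf '%s\n' 'a -> b -> c -> d' | strip_to_nth ' -> ' 2)
echo "$second"
```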


>Can either of these two things be done?


Obviously, "yes". <grin>


John W. Krahn

2005-02-11, 2:51 am

Ed Morton wrote:
>
> Pepper wrote:
>
>
> So, assuming you want to truncate up to ": ", it's just:
>
> sed 's/.*: //'


Don't forget that * is greedy, so that may remove more than the OP wants.

sed 's/^[^:]*: *//'
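
The difference only shows up when a line contains more than one ": ". A small sketch with a made-up line (the variable names are assumptions too):

```shell
line='host: error: disk full'

# Greedy .* runs to the LAST ": " on the line.
greedy=$(printf '%s\n' "$line" | sed 's/.*: //')

# [^:]* cannot cross a colon, so the match stops at the FIRST one.
first=$(printf '%s\n' "$line" | sed 's/^[^:]*: *//')

echo "$greedy"
echo "$first"
```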


John
--
use Perl;
program
fulfillment

Copyright 2003 - 2008 webservertalk.com