simple sed help - delete word containing + (plus sign)
| rachelms79@hotmail.com 2006-01-20, 6:03 pm |
| How do you delete all words containing + (plus sign)? I tried sed
's/.+.//g' but that leaves the characters not adjacent to a +.
Thanks much.
| |
| Janis Papanagnou 2006-01-20, 6:03 pm |
| rachelms79@hotmail.com wrote:
> How do you delete all words containing + (plus sign)? I tried sed
> 's/.+.//g' but that leaves the characters not adjacent to a +.
> Thanks much.
Not sure what you have in mind; providing examples would be helpful.
A . stands for any character, so you delete "any character followed
by a plus followed by any character".
If you want not single characters but arbitrarily long strings you
would use the pattern .*+.* but that means you'd replace every line
containing a plus with an empty one.
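To make the two cases above concrete, here is a small sketch. The function names `op_attempt` and `greedy_attempt` are made up for illustration, and it assumes a sed where + is an ordinary character in basic regular expressions, as in the OP's case.

```shell
# The OP's attempt: "." matches any single character, so each match removes
# exactly three characters: the one before the "+", the "+", and the one after.
op_attempt() {
    sed 's/.+.//g'
}

# The greedy variant: ".*" can stretch over the whole line, so any line
# that contains a "+" is replaced by an empty line.
greedy_attempt() {
    sed 's/.*+.*//'
}

printf '%s\n' 'aaa bbbb+ cccc' | op_attempt      # prints "aaa bbbcccc"
printf '%s\n' 'aaa bbbb+ cccc' | greedy_attempt  # prints an empty line
```

In the first case the last "b", the "+" and the following blank are removed, which is why the rest of the word survives.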
If you want "words", as you write, you have to say how a word is
defined. If it is delimited by white space then you might want to use
the command
sed -e 's/[^[:space:]]*+[^[:space:]]*//g'
which produces, for example, given this input
aaa bbbb+ cccc +dddd +eee+ fff+fff ggg hhh+hh+hhh iii +++ jjj
that output
aaa  cccc    ggg  iii  jjj
If that's not what you want provide example input/output data.
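One thing worth knowing about this command: it deletes only the words, so the blanks that surrounded them stay behind and the raw result contains runs of spaces. A minimal sketch of one way to tidy that up, assuming plain spaces as separators; `delete_plus_words` and `squeeze_blanks` are illustrative names, not anything from sed itself.

```shell
# Delete every whitespace-delimited word that contains a "+".
delete_plus_words() {
    sed -e 's/[^[:space:]]*+[^[:space:]]*//g'
}

# Collapse each run of spaces into a single space.
squeeze_blanks() {
    tr -s ' '
}

input='aaa bbbb+ cccc +dddd +eee+ fff+fff ggg hhh+hh+hhh iii +++ jjj'

printf '%s\n' "$input" | delete_plus_words                   # words gone, extra blanks remain
printf '%s\n' "$input" | delete_plus_words | squeeze_blanks  # prints "aaa cccc ggg iii jjj"
```

Note that `tr -s` only squeezes; a deleted word at the very start or end of a line would still leave one leading or trailing blank.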
Janis
| |
| Chris F.A. Johnson 2006-01-20, 6:03 pm |
| On 2006-01-20, rachelms79@hotmail.com wrote:
> How do you delete all words containing + (plus sign)? I tried sed
> 's/.+.//g' but that leaves the characters not adjacent to a +.
How do you define "word"?
Your regular expression needs to match all adjacent characters that
belong in the word, e.g.:
sed 's/[a-zA-Z]*+[a-zA-Z]*//g'
Or:
sed 's/[[:alpha:]]*+[[:alpha:]]*//g'
Or, if the word can include numbers:
sed 's/[a-zA-Z0-9]*+[a-zA-Z0-9]*//g'
etc......
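To see why the definition of "word" matters, compare two of the character classes above on a word that contains digits. This is only a sketch; the function names `alpha_only` and `alnum_too` are invented for the example.

```shell
# Words are made of letters only: digits count as word boundaries.
alpha_only() {
    sed 's/[[:alpha:]]*+[[:alpha:]]*//g'
}

# Words may also contain digits.
alnum_too() {
    sed 's/[a-zA-Z0-9]*+[a-zA-Z0-9]*//g'
}

printf '%s\n' 'a1+b2 foo' | alpha_only   # prints "a12 foo"
printf '%s\n' 'a1+b2 foo' | alnum_too    # prints " foo"
```

With the letters-only class the match stops at the digits, so only "+b" is removed; with the alphanumeric class the whole of "a1+b2" goes.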
--
Chris F.A. Johnson, author | <http://cfaj.freeshell.org>
Shell Scripting Recipes: | My code in this post, if any,
A Problem-Solution Approach | is released under the
2005, Apress | GNU General Public Licence
| |
| Timothy Larson 2006-01-22, 6:10 pm |
| rachelms79@hotmail.com wrote:
> How do you delete all words containing + (plus sign)? I tried sed
> 's/.+.//g' but that leaves the characters not adjacent to a +.
> Thanks much.
>
I don't use sed very much, but + is a special character in regex
grammars I am familiar with. You might need to escape it.
Tim
| |
| Janis Papanagnou 2006-01-22, 6:10 pm |
| Timothy Larson wrote:
> rachelms79@hotmail.com wrote:
>
>
> I don't use sed very much, but + is a special character in regex
> grammars I am familiar with. You might need to escape it.
It _might_ be a special character in some RE grammars, but not in the
basic regular expressions of the sed the OP is using, as you can see
if you re-read the posting: the pattern described there behaves
exactly as it is defined, with + matched literally.
(Though the OP needs a different regular expression from the one used.)
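A quick way to check which grammar you are dealing with, as a sketch: it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do), and the function names are hypothetical.

```shell
# Basic regular expressions (the default): "+" is an ordinary character.
bre_literal() {
    sed 's/a+b/X/g'
}

# Extended regular expressions: "+" means "one or more of the preceding".
ere_repeat() {
    sed -E 's/a+b/X/g'
}

# Extended regular expressions with the "+" escaped: literal again.
ere_escaped() {
    sed -E 's/a\+b/X/g'
}

printf '%s\n' 'a+b aab' | bre_literal   # prints "X aab"
printf '%s\n' 'a+b aab' | ere_repeat    # prints "a+b X"
printf '%s\n' 'a+b aab' | ere_escaped   # prints "X aab"
```

So escaping is only needed once extended regular expressions are switched on.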
Janis
| |
| Will Renkel 2006-01-23, 6:13 pm |
The attached commands seem to do the job.
Note that it must be done TWICE to handle two successive "words with plus" that are next to each other,
e.g. kk abc+def hjy+kkk ll
In the command, the first -e expression handles internal words,
the second handles the first word in the line,
and the third handles the last word in the line.
For the first word, the word plus one space is deleted.
For the last word, a space plus the word is deleted.
For internal words, the word plus a space is deleted.
sed -e 's/ [^ ]*+[^ ]* / /g' -e 's/^[^ ]*+[^ ]* //' -e 's/ [^ ]*+[^ ]*$//' | sed -e 's/ [^ ]*+[^ ]* / /g' -e 's/^[^ ]*+[^ ]* //' -e 's/ [^ ]*+[^ ]*$//'
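The reason a single pass is not enough can be shown with a short sketch. The wrapper name `one_pass` is made up here; the sed expressions are the three from the command above.

```shell
# One pass of the three substitutions: internal word, first word, last word.
one_pass() {
    sed -e 's/ [^ ]*+[^ ]* / /g' \
        -e 's/^[^ ]*+[^ ]* //' \
        -e 's/ [^ ]*+[^ ]*$//'
}

# The match for " abc+def " consumes the space in front of "hjy+kkk",
# so the second word has no leading space left to match in the same pass.
printf '%s\n' 'kk abc+def hjy+kkk ll' | one_pass             # prints "kk hjy+kkk ll"
printf '%s\n' 'kk abc+def hjy+kkk ll' | one_pass | one_pass  # prints "kk ll"
```

Each pass removes roughly every other word in a run of adjacent plus-words, so a run of four such words still leaves one behind after two passes.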
--
---------------------------------------------------------------
Will Renkel
Wheaton, Ill.
REGISTERD Linux User: 300583
---------------------------------------------------------------
thelarsons3@cox.net wrote:
>rachelms79@hotmail.com wrote:
>
>I don't use sed very much, but + is a special character in regex
>grammars I am familiar with. You might need to escape it.
>
>Tim
| |
| Bill Marcum 2006-01-23, 6:13 pm |
| On Mon, 23 Jan 2006 15:56:01 +0000 (UTC), Will Renkel
<renkel@xnet.com> wrote:
> the attached commands seem to do the job
> note it must be done TWICE to handle two successive "words with plus" that are next to each other
> eg - kk abc+def hjy+kkk ll
> in the command the first -e sequence does internal words,
> the second does first word in line,
> and the third does last word in line.
>
> for first word, the word plus one space is deleted
> for last word space plus word is deleted
> for internal words word plus a space is deleted
>
> sed -e 's/ [^ ]*+[^ ]* / /g' -e 's/^[^ ]*+[^ ]* //' -e 's/ [^ ]*+[^ ]*$//' | sed -e 's/ [^ ]*+[^ ]* / /g' -e 's/^[^ ]*+[^ ]* //' -e 's/ [^ ]*+[^ ]*$//'
>
>
You might simplify this by adding spaces to the beginning and end of
each line:
sed -e 's/.*/ & /' -e 's/ [^ ]*+[^ ]* / /g' -e 's/^ \(.*\) $/\1/'
To handle consecutive "x+y" words, you can use a loop:
sed -e 's/.*/ & /' -e ':loop;s/ [^ ]*+[^ ]* / /g;tloop' -e 's/^ \(.*\) $/\1/'
Or as a sed script:
#!/bin/sed -f
s/.*/ & /
:loop
s/ [^ ]*+[^ ]* / /g
tloop
s/^ \(.*\) $/\1/
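For trying the loop out from a shell, here is the same script wrapped in a function, as a sketch. The name `delete_plus_words_loop` is invented, and the label and branch are given their own -e options on the assumption that some seds do not accept a semicolon after a label.

```shell
# Pad the line with blanks, delete " word+word " repeatedly until nothing
# changes, then strip the padding again.
delete_plus_words_loop() {
    sed -e 's/.*/ & /' \
        -e ':loop' \
        -e 's/ [^ ]*+[^ ]* / /g' \
        -e 'tloop' \
        -e 's/^ \(.*\) $/\1/'
}

printf '%s\n' 'kk abc+def hjy+kkk ll' | delete_plus_words_loop   # prints "kk ll"
printf '%s\n' 'a+b kk c+d' | delete_plus_words_loop              # prints "kk"
```

One corner case: a line made up only of plus-words shrinks to a single blank, which the final substitution does not strip because it expects a blank at each end.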
In most cases, use of these commands indicates that you are probably
better off programming in something like `awk' or Perl. But
occasionally one is committed to sticking with `sed', and these
commands can enable one to write quite convoluted scripts.
[from the GNU sed info file]
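In the spirit of that quote, a minimal awk sketch of the same job, assuming words are separated by blanks or tabs and that it is acceptable for the output to be rejoined with single spaces. The function name `delete_plus_words_awk` is hypothetical.

```shell
# Keep only the fields that do not contain a "+", joined by single spaces.
delete_plus_words_awk() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if (index($i, "+") == 0) {      # field has no plus sign
                if (out == "")
                    out = $i
                else
                    out = out " " $i
            }
        }
        print out
    }'
}

printf '%s\n' 'kk abc+def hjy+kkk ll' | delete_plus_words_awk   # prints "kk ll"
```

Because awk splits the line into fields first, adjacent plus-words and words at either end of the line need no special handling.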
--
'Ooohh.. "FreeBSD is faster over loopback, when compared to Linux
over the wire". Film at 11.'
-- Linus Torvalds