Unix Shell - newbie question: remove lines from text-file


Martin Jørgensen

2006-08-25, 7:33 am

Hi,

I have a 200+ KB log file from which I want to remove lines such as
these:

5950K ......... ......... ......... ......... ......... ......... 59%
5600K ......... ......... ......... ......... ......... ......... 60%
5650K ......... ......... ......... ......... ......... ......... 61%

It's called wget-log. I'm not very good with the Linux bash shell, so how
do I remove the lines containing "......... ......... ......... ........."?

I suppose I could do something like "cat wget-log |
(some-command/grep?) > newfile" but there are probably better methods,
including not creating a new file...


Best regards
Martin Jørgensen

--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk

yulangdong

2006-08-25, 1:26 pm

sed -i "/......... ......... ......... ......... ......... ......... /d" urlog

yulangdong

2006-08-25, 1:26 pm

Sorry, you need to escape each "." (write "\.").

"yulangdong" <yulangdong@hotmail.com> wrote in message
news:44eef415$1@news.starhub.net.sg...
> sed -i "/......... ......... ......... ......... ......... ......... /d" urlog



Ed Morton

2006-08-25, 1:26 pm

yulangdong wrote:

> sed -i "/......... ......... ......... ......... ......... ......... /d" urlog


Top-posting fixed. Please don't top-post.

A "." matches any character. Escape it ("\.") if you want to match
only a literal period.

You also don't need to explicitly list every character if your sed
supports RE intervals, e.g. to match 6 repetitions of 9
periods-then-a-space:

sed '/\(\.\{9\} \)\{6\}/d'
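
A minimal sketch of the difference, written with Python's re module instead of sed (so the grouping and interval brackets carry no backslashes). The sample lines and the helper name would_delete are made up for illustration:

```python
import re

# Hypothetical sample lines, modelled on the wget-log excerpt in the question.
progress = "5600K " + "......... " * 6 + "60%"
other = "abcdefghi " * 6 + "is ordinary text"

unescaped = re.compile(r"(.{9} ){6}")   # "." matches any character
escaped = re.compile(r"(\.{9} ){6}")    # "\." matches only a literal period

def would_delete(pattern, line):
    """Return True if a sed-style /pattern/d would delete this line."""
    return pattern.search(line) is not None
```

With these definitions the unescaped pattern also hits the ordinary text line, while the escaped one only hits the progress line.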

Regards,

Ed.

Martin Jørgensen

2006-08-26, 1:24 pm

Ed Morton <morton@lsupcaemnt.com> writes:

> yulangdong wrote:

That worked fine, thanks.

> Top-posting fixed. Please don't top-post.
>
> A "." matches any character. Escape it ("\.") if you want to
> specifically only match a period.


Do I escape only the first dot, or every one? I also don't really
understand whether the quotation marks (" ") are necessary.

> You also don't need to explicitly list every character if your sed
> supports RE intervals, e.g. to match 6 repetitions of 9
> periods-then-a-space:
>
> sed '/\(\.\{9\} \)\{6\}/d'


That's too advanced for me... Can you break it up and explain that to me?


Best regards
Martin Jørgensen

First Lensman

2006-08-26, 7:22 pm

I would recommend learning Perl and regular expressions. Try using the
following script:

Script Name: unclog.pl (or whatever you want to call it)
=======================
#!/usr/bin/perl -wn

print if (!/^.*\.{9} .*$/);

Make sure unclog.pl is executable (e.g. chmod 770 unclog.pl).

Then run the script as follows:

./unclog.pl wget-log > wget-unclog

Hope this helps.

Art Ramos


Xicheng Jia

2006-08-26, 7:22 pm

First Lensman wrote:
> I would recommend learning PERL and regular expressions. Try using the
> following script:
>
> Script Name: unclog.pl (or whatever you want to call it)
> =======================
> #!/usr/bin/perl -wn
>
> print if (!/^.*\.{9} .*$/);
>
> Make sure the permissions on unclog.pl are set to execute (i.e. 770)
>
> Then run the script as follows:
>
> unclog.pl wget-log > wget-unclog
>


why not issue it directly on the command line:

perl -ne 'print if not /(?:(?:\.){9} ){6}/' wget-log > wget-unclog

Xicheng

mik3

2006-08-26, 7:22 pm



If you know Python,

with open("wget-log") as src, open("newfile", "w") as out:
    for line in src:
        # skip the wget progress lines; copy everything else
        if "......... ......... ......... ........." not in line:
            out.write(line)

Janis

2006-08-27, 7:25 pm

Martin Jørgensen wrote:
> Ed Morton <morton@lsupcaemnt.com> writes:
>
>
> That's too advanced for me... Can you break it up and explain that to me?


It's easier to understand if you temporarily remove the - necessary -
backslashes...

    /       start of regexp
    (       start of grouping
    \.{9}   a literal dot (needs escaping), nine times
     )      a space, and end of grouping
    {6}     six times the whole group
    /d      end of regexp, and delete command

Now, in the real sed command, every bracket used for grouping, ( and ),
or for ranges, { and }, requires escaping with a backslash.
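
The same breakdown can be sketched as a commented pattern. This uses Python's re module with re.VERBOSE rather than sed, so no backslashes are needed on the brackets; the names PROGRESS and drop_progress_lines are only illustrative:

```python
import re

# Python spelling of sed's '/\(\.\{9\} \)\{6\}/d' pattern.
PROGRESS = re.compile(r"""
    (           # start of grouping
      \.{9}     # a literal dot, nine times
      [ ]       # a space (in a class so that re.VERBOSE keeps it)
    )           # end of grouping
    {6}         # six times the whole group
""", re.VERBOSE)

def drop_progress_lines(lines):
    """Keep only the lines that do not match, like sed's d command."""
    return [line for line in lines if not PROGRESS.search(line)]
```

A line with fewer than six dot groups is kept, because the {6} demands six full repetitions.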

Janis

Martin Jørgensen

2006-08-28, 1:42 pm

Hmm. Thanks. I think I'll have to read that a couple more times, or
even experiment with it, before I clearly understand it...


Best regards
Martin Jørgensen

Janis

2006-08-29, 1:44 am

Martin Jørgensen wrote:
> hmm. Thanks. I think I'll have to see that a couple of more times or
> even try to experiment with it before I clearly understand it...


Regexps are not easy to read, especially if you're not used to them.
I don't know exactly where you're having problems, so I'll guess...

The input text to be matched by the pattern contains only dots
'.' and spaces ' '. To avoid repeating such regexp atoms you
may want to use counting schemes. In many advanced
regexp syntaxes it is possible to define ranges like

    {2,5}   at least two times, at most five times
    {,5}    at most five times, no lower limit
    {2,}    at least two times, no upper limit
    {2}     exactly two times

These ranges apply to the preceding regexp atom, so

    \.{2}   means exactly two times the dot
    .{2}    means exactly two times any character

If you have more than a single atom to be repeated you may use
brackets to group the atoms

(whatever){5} repeat the pattern within the brackets 5 times

The brackets (, ), {, }, are meta characters in the regexp syntax,
and some regexp parsers (sed's basic regexps, for example) require them
to be escaped by a backslash to distinguish them from the literal
bracket symbols.
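
A small sketch of these counting ranges, assuming Python's re module (one of the syntaxes where the brackets need no backslash). The helper name full and the example strings are made up for illustration:

```python
import re

def full(pattern, text):
    """Return True if pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Each entry: (pattern, text, expected result)
EXAMPLES = [
    (r"\.{2,5}", "...", True),       # between two and five dots
    (r"\.{2,5}", "......", False),   # six dots is too many
    (r"\.{,5}", "", True),           # no lower limit: zero dots is fine
    (r"\.{2,}", ".", False),         # at least two dots required
    (r"\.{2}", "..", True),          # exactly two literal dots
    (r".{2}", "ab", True),           # exactly two of any character
    (r"\.{2}", "ab", False),         # "ab" are not literal dots
    (r"(ab ){5}", "ab " * 5, True),  # the whole group, five times
]

for pattern, text, expected in EXAMPLES:
    assert full(pattern, text) == expected
```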

Janis


Martin Jørgensen

2006-08-30, 1:23 am

"Janis" <janis_papanagnou@hotmail.com> writes:

> Martin Jørgensen wrote:
>
> Regexps are not easy to read, especially if you're not used to.
> I don't know where you have concrete problems, so I'll guess...


I have problems reading it all over the place, everywhere :-)

> The input text to be matched by the pattern contains only dots
> '.' and spaces ' '. To avoid repetition of such regexp atoms you
> may want to use friendly counting schemes. In many advanced
> regexp syntaxes it is possible to define ranges like
>
> {2,5} at least two times, at most 5 times
> {,5} at most five times, no lower limit
> {2,} at least 2 times, no upper limit
> {2} exactly two times


Thanks.

> These ranges apply to the preceding regexp atom, so
>
> \.{2} means exactly two times the dot
> .{2} means exactly two times any character


How about \.\{2\} ?

> If you have more than a single atom to be repeated you may use
> brackets to group a group of atoms
>
> (whatever){5} repeat the pattern within the brackets 5 times


Ok.

> The brackets (, ), {, }, are meta characters in the regexp syntax,
> and some regexp parsers require them to be escaped by a backslash
> e.g. to distinguish them from the literal bracket symbols.


Oh... What's the name of my regexp parser if I type
sed '/\(\.\{9\} \)\{6\}/d' in bash or a script?


Best regards
Martin Jørgensen

Chris F.A. Johnson

2006-08-30, 7:33 am

On 2006-08-29, Martin Jørgensen wrote:
>
> Oh... What's the name of my regexp parser if I type
> sed '/\(\.\{9\} \)\{6\}/d' in bash or a script?


sed

--
Chris F.A. Johnson, author <http://cfaj.freeshell.org>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
Martin Jørgensen

2006-08-30, 1:32 pm

"Chris F.A. Johnson" <cfajohnson@gmail.com> writes:

> On 2006-08-29, Martin Jørgensen wrote:
>
> sed


Is awk also a regexp parser? Is the shell? Like bash?


Best regards
Martin Jørgensen

Chris F.A. Johnson

2006-08-30, 7:32 pm

On 2006-08-30, Martin Jørgensen wrote:
> "Chris F.A. Johnson" <cfajohnson@gmail.com> writes:
>
>
> Is awk also a regexp parser?


Yes.

> Is the shell? Like bash?


Bash can match regular expressions with the =~ operator inside the
non-portable [[ ... ]] syntax. See the man page for more info.

Stephane CHAZELAS

2006-08-30, 7:32 pm

2006-08-30, 17:21(-04), Chris F.A. Johnson:
> On 2006-08-30, Martin Jørgensen wrote:
>
> Yes.
>
>
> Bash can use regular expressions in the non-portable [[ ... ]]
> syntax. See the man page for more info.


(Only in recent versions (3.0 and later), and those are some sort of GNU regexps.)

And every (well, most) shell has access to expr, sed, perl, grep or awk,
though they are generally not built in (but what would be the point?).

zsh (Perl compatible RE) and ksh93 (AT&T RE) also have internal
support for regular expressions.

--
Stéphane