newbie question: remove lines from text-file
|
|
| Martin Jørgensen 2006-08-25, 7:33 am |
| Hi,
I have a 200+ KB log file from which I want to remove lines such as this
one:
5950K ......... ......... ......... ......... ......... ......... 59%
5600K ......... ......... ......... ......... ......... ......... 60%
5650K ......... ......... ......... ......... ......... ......... 61%
It's called wget-log. I'm not really good with Linux bash shells, so how do
I remove those lines with "......... ......... ......... ........." ?
I suppose I could do something like "cat wget-log |
(some-command/grep?) > newfile" but there are probably better methods,
including not creating a new file...
Best regards
Martin Jørgensen
--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk
| |
| yulangdong 2006-08-25, 1:26 pm |
| sed -i "/......... ......... ......... ......... ......... ......... /d"
urlog
| |
| yulangdong 2006-08-25, 1:26 pm |
| Sorry,
you need to escape the dots.
"yulangdong" <yulangdong@hotmail.com> wrote in message
news:44eef415$1@news.starhub.net.sg...
> sed -i "/......... ......... ......... ......... ......... ......... /d"
> urlog
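A hedged sketch of the in-place edit with every dot escaped, so that only literal periods match. The temporary directory, the log contents and the .bak suffix are assumptions made for the demonstration. -i is an extension that GNU and BSD sed both accept in this attached-suffix form; it is not something POSIX sed guarantees.

```shell
# Throwaway working copy; the path and the log contents are made up.
dir=$(mktemp -d)
log="$dir/wget-log"
printf '%s\n' \
    'Resolving example.org... done.' \
    ' 5950K ......... ......... ......... ......... ......... ......... 59%' \
    'Download finished.' > "$log"

# -i.bak rewrites the file in place and keeps the original as wget-log.bak.
# Each \. matches one literal period: nine periods, a space, nine periods, a space.
sed -i.bak '/\.\.\.\.\.\.\.\.\. \.\.\.\.\.\.\.\.\. /d' "$log"

cat "$log"
```

Keeping the backup suffix is a cautious choice: if the pattern turns out to be too greedy, the untouched log is still there.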
| |
| Ed Morton 2006-08-25, 1:26 pm |
| yulangdong wrote:
> sed -i "/......... ......... ......... ......... ......... ......... /d"
> urlog
>
Top-posting fixed. Please don't top-post.
A "." matches any character. Escape it ("\.") if you want to
specifically only match a period.
You also don't need to explicitly list every character if your sed
supports RE intervals, e.g. to match 6 repetitions of 9
periods-then-a-space:
sed '/\(\.\{9\} \)\{6\}/d'
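To see why the escaping matters, here is a small sketch that runs the interval pattern twice, once with a bare dot and once with an escaped one. The sample lines (including the line of letters) are invented for the comparison, and the variable names are only illustrative.

```shell
# Sample input; the contents are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Resolving example.org... done.
aaaaaaaaa bbbbbbbbb ccccccccc ddddddddd eeeeeeeee fffffffff 12%
 5950K ......... ......... ......... ......... ......... ......... 59%
Download finished.
EOF

# Unescaped dot: any nine characters plus a space, six times over.
loose=$(sed '/\(.\{9\} \)\{6\}/d' "$sample")

# Escaped dot: nine literal periods plus a space, six times over.
strict=$(sed '/\(\.\{9\} \)\{6\}/d' "$sample")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$sample"
```

The loose pattern also throws away the line of letters, while the strict one removes only the progress line.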
Regards,
Ed.
| |
| Martin Jørgensen 2006-08-26, 1:24 pm |
| Ed Morton <morton@lsupcaemnt.com> writes:
> yulangdong wrote:
>
That worked fine, thanks.
> Top-posting fixed. Please don't top-post.
>
> A "." matches any character. Escape it ("\.") if you want to
> specifically only match a period.
Escape only the first character, right? Or every one? I also don't
really understand whether the quotation marks " " are necessary.
> You also don't need to explicitly list every character if your sed
> supports RE intervals, e.g. to match 6 repetitions of 9
> periods-then-a-space:
>
> sed '/\(\.\{9\} \)\{6\}/d'
That's too advanced for me... Can you break it up and explain that to me?
Best regards
Martin Jørgensen
--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk
| |
| First Lensman 2006-08-26, 7:22 pm |
| I would recommend learning Perl and regular expressions. Try using the
following script:
Script Name: unclog.pl (or whatever you want to call it)
=======================
#!/usr/bin/perl -wn
print if (!/^.*\.{9} .*$/);
Make sure unclog.pl is executable (e.g. chmod 770 unclog.pl).
Then run the script as follows:
./unclog.pl wget-log > wget-unclog
Hope this helps.
Art Ramos
| |
| Xicheng Jia 2006-08-26, 7:22 pm |
| First Lensman wrote:
> I would recommend learning PERL and regular expressions. Try using the
> following script:
>
> Script Name: unclog.pl (or whatever you want to call it)
> =======================
> #!/usr/bin/perl -wn
>
> print if (!/^.*\.{9} .*$/);
>
> Make sure the permissions on unclog.pl are set to execute (i.e. 770)
>
> Then run the script as follows:
>
> unclog.pl wget-log > wget-unclog
>
Why not issue it directly on the command line:
perl -ne 'print if not /(?:(?:\.){9} ){6}/' wget-log > wget-unclog
Xicheng
| |
|
|
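If you know Python:
# "a" appends, so rerunning adds to an existing newfile
out = open("newfile", "a")
for line in open("wget-log"):
    # copy every line that does not contain the dotted progress pattern
    if "......... ......... ......... ........." not in line:
        out.write(line)
out.close()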
| |
|
| Martin Jørgensen wrote:
> Ed Morton <morton@lsupcaemnt.com> writes:
>
>
> That's too advanced for me... Can you break it up and explain that to me?
It's easier to understand if you temporarily remove the - necessary -
backslashes...
/       start of regexp
(       start of grouping
\.{9}   a literal dot (needs escaping), nine times
 )      a space, and end of grouping
{6}     six times the whole group
/d      end of regexp, and delete command
Now put the backslashes back: in sed's default regexp syntax every
bracket used for ranges and grouping has to be escaped with a
backslash, i.e. \( \) \{ \}.
Janis
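The breakdown above can be tried out piece by piece. This is only a sketch: the variable names and the sample lines are invented here, and grep -E is used because it reads the pattern in the unbackslashed (extended) form, whereas plain sed wants the backslashed one.

```shell
# The pattern assembled piece by piece, without the extra backslashes.
dot9='\.{9}'          # a literal dot, nine times
group="(${dot9} )"    # nine dots and a space, grouped
ere="${group}{6}"     # the whole group, six times

line=' 5950K ......... ......... ......... ......... ......... ......... 59%'

# grep -E uses extended syntax, where ( ) { } are special as written.
ere_hits=$(printf '%s\n' "$line" 'plain text line' | grep -E -c "$ere")

# sed's default (basic) syntax wants the same brackets backslashed.
bre_left=$(printf '%s\n' "$line" 'plain text line' | sed '/\(\.\{9\} \)\{6\}/d')

printf 'pattern: %s\nmatching lines: %s\nkept by sed: %s\n' "$ere" "$ere_hits" "$bre_left"
```

Both tools agree on which line is the progress line; only the spelling of the brackets differs.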
| |
| Martin Jørgensen 2006-08-28, 1:42 pm |
| "Janis" <janis_papanagnou@hotmail.com> writes:
> Martin Jørgensen wrote:
>
> It's easier to understand if you temporarily remove the - necessary -
> backslashes...
>
> /       start of regexp
> (       start of grouping
> \.{9}   a literal dot (needs escaping), nine times
>  )      a space, and end of grouping
> {6}     six times the whole group
> /d      end of regexp, and delete command
>
> Now every bracket for ranges and grouping requires escaping with a
> backslash.
hmm. Thanks. I think I'll have to see that a couple of more times or
even try to experiment with it before I clearly understand it...
Best regards
Martin Jørgensen
--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk
| |
|
| Martin Jørgensen wrote:
> "Janis" <janis_papanagnou@hotmail.com> writes:
>
> hmm. Thanks. I think I'll have to see that a couple of more times or
> even try to experiment with it before I clearly understand it...
Regexps are not easy to read, especially if you're not used to them.
I don't know where you have concrete problems, so I'll guess...
The input text to be matched by the pattern contains only dots
'.' and spaces ' '. To avoid repetition of such regexp atoms you
may want to use friendly counting schemes. In many advanced
regexp syntaxes it is possible to define ranges like
{2,5}   at least two times, at most five times
{,5}    at most five times, no lower limit
{2,}    at least two times, no upper limit
{2}     exactly two times
These ranges apply to the preceding regexp atom, so
\.{2}   means exactly two times the dot
.{2}    means exactly two times any character
If you have more than a single atom to be repeated you may use
brackets to group several atoms
(whatever){5}   repeat the pattern within the brackets 5 times
The brackets (, ), {, }, are meta characters in the regexp syntax,
and some regexp parsers require them to be escaped by a backslash,
e.g. to distinguish them from the literal bracket symbols.
Janis
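A few of these ranges can be checked by hand with a small helper. This is a sketch under assumptions: the function name matches is made up, and it leans on grep -E -x so that the whole string has to fit the pattern. The {,5} form is left out because not every tool accepts it.

```shell
# matches RE STRING: succeed if the whole STRING matches extended regexp RE.
matches() {
    printf '%s\n' "$2" | grep -E -x -q -e "$1"
}

matches 'a{2,5}'  'aaa'     && printf '%s\n' 'a{2,5} accepts aaa'
matches 'a{2,5}'  'a'       || printf '%s\n' 'a{2,5} rejects a'
matches 'a{2,}'   'aaaaaaa' && printf '%s\n' 'a{2,} accepts aaaaaaa'
matches 'a{2}'    'aaa'     || printf '%s\n' 'a{2} rejects aaa'
matches '\.{2}'   'ab'      || printf '%s\n' '\.{2} rejects ab'
matches '.{2}'    'ab'      && printf '%s\n' '.{2} accepts ab'
matches '(ab){3}' 'ababab'  && printf '%s\n' '(ab){3} accepts ababab'
```

Without -x, grep would report a match as soon as any part of the string fits, which hides the upper limit of a range.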
| |
| Martin Jørgensen 2006-08-30, 1:23 am |
| "Janis" <janis_papanagnou@hotmail.com> writes:
> Martin Jørgensen wrote:
>
> Regexps are not easy to read, especially if you're not used to them.
> I don't know where you have concrete problems, so I'll guess...
I have problems reading it all over the place, everywhere :-)
> The input text to be matched by the pattern contains only dots
> '.' and spaces ' '. To avoid repetition of such regexp atoms you
> may want to use friendly counting schemes. In many advanced
> regexp syntaxes it is possible to define ranges like
>
> {2,5}   at least two times, at most 5 times
> {,5}    at most five times, no lower limit
> {2,}    at least 2 times, no upper limit
> {2}     exactly two times
Thanks.
> These ranges apply to the preceding regexp atom, so
>
> \.{2}   means exactly two times the dot
> .{2}    means exactly two times any character
How about \.\{2\} ?
> If you have more than a single atom to be repeated you may use
> brackets to group a group of atoms
>
> (whatever){5} repeat the pattern within the brackets 5 times
Ok.
> The brackets (, ), {, }, are meta characters in the regexp syntax,
> and some regexp parsers require them to be escaped by a backslash
> e.g. to distinguish them from the literal bracket symbols.
Oh... What's the name of my regexp parser if I type
sed '/\(\.\{9\} \)\{6\}/d' in bash or a script?
Best regards
Martin Jørgensen
--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk
| |
| Chris F.A. Johnson 2006-08-30, 7:33 am |
| On 2006-08-29, Martin Jørgensen wrote:
>
> Oh... What's the name of my regexp parser if I type
> sed '/\(\.\{9\} \)\{6\}/d' in bash or a script?
sed
--
Chris F.A. Johnson, author <http://cfaj.freeshell.org>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
| |
| Martin Jørgensen 2006-08-30, 1:32 pm |
| "Chris F.A. Johnson" <cfajohnson@gmail.com> writes:
> On 2006-08-29, Martin Jørgensen wrote:
>
> sed
Is awk also a regexp parser? Is the shell? Like bash?
Best regards
Martin Jørgensen
--
---------------------------------------------------------------------------
Home of Martin Jørgensen - http://www.martinjoergensen.dk
| |
| Chris F.A. Johnson 2006-08-30, 7:32 pm |
| On 2006-08-30, Martin Jørgensen wrote:
> "Chris F.A. Johnson" <cfajohnson@gmail.com> writes:
>
>
> Is awk also a regexp parser?
Yes.
> Is the shell? Like bash?
Bash can use regular expressions in the non-portable [[ ... ]]
syntax. See the man page for more info.
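As a rough illustration of that [[ ... ]] syntax, the sketch below tests single lines against the progress pattern. The function name is_progress_line and the sample lines are made up, and bash is called explicitly (an assumption that it is installed) so the snippet behaves the same no matter which shell starts it.

```shell
# is_progress_line LINE: succeed if LINE holds six groups of nine dots and a space.
# bash is invoked explicitly because [[ ... =~ ... ]] is not available in plain sh.
is_progress_line() {
    bash -c 're="(\.{9} ){6}"; [[ $1 =~ $re ]]' _ "$1"
}

for line in \
    ' 5950K ......... ......... ......... ......... ......... ......... 59%' \
    'Download finished.'
do
    if is_progress_line "$line"; then
        printf 'drop: %s\n' "$line"
    else
        printf 'keep: %s\n' "$line"
    fi
done
```

Storing the pattern in a variable and leaving it unquoted on the right of =~ keeps bash from treating it as a literal string.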
--
Chris F.A. Johnson, author <http://cfaj.freeshell.org>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
| |
| Stephane CHAZELAS 2006-08-30, 7:32 pm |
| 2006-08-30, 17:21(-04), Chris F.A. Johnson:
> On 2006-08-30, Martin Jørgensen wrote:
>
> Yes.
>
>
> Bash can use regular expressions in the non-portable [[ ... ]]
> syntax. See the man page for more info.
(Only recent (>3) versions, and those are some sort of GNU regexps.)
And every (well, most) shell has expr, sed, perl, grep or awk at hand,
though generally not as builtins (but what would be the point?).
zsh (Perl compatible RE) and ksh93 (AT&T RE) also have internal
support for regular expressions.
--
Stéphane
|
|
|
|
|