egrep match nine digit
| harividya@gmail.com 2007-11-22, 1:44 am |
| from a file
abc there is a nine-digit number. I need only an egrep pattern which
matches all the numbers starting with 0 or 1
except 112345678 and 112345677
eg
abcdef112345678
acvgtb112345677
werttyq112345673
weerwq233434334
yhukot02345678
abcdef123456789
my output should contain
werttyq112345673
yhukot02345678
abcdef123456789
and not
abcdef112345678
acvgtb112345677
weerwq233434334
| |
| Cyrus Kriticos 2007-11-22, 1:44 am |
| harividya@gmail.com wrote:
> from a file
>
>
> abc there is a nine-digit number. I need only an egrep pattern which
> matches all the numbers starting with 0 or 1
> except 112345678 and 112345677
>
> eg
> abcdef112345678
> acvgtb112345677
> werttyq112345673
> weerwq233434334
> yhukot02345678
> abcdef123456789
>
> my output should contain
>
>
> werttyq112345673
> yhukot02345678
> abcdef123456789
>
>
> and not
> abcdef112345678
> acvgtb112345677
> weerwq233434334
e.g.
egrep '^[^0-9]*[0-1]' FILENAME | egrep -v '^[^0-9]*(112345678|112345677)'
--
Best regards | Be nice to America or they'll bring democracy to
Cyrus | your country.
| |
| Glenn Jackman 2007-11-22, 1:44 am |
| At 2007-11-21 10:21PM, "harividya@gmail.com" wrote:
> from a file
>
>
> abc there is a nine-digit number. I need only an egrep pattern which
> matches all the numbers starting with 0 or 1
> except 112345678 and 112345677
>
> eg
> abcdef112345678
> acvgtb112345677
> werttyq112345673
> weerwq233434334
> yhukot02345678
> abcdef123456789
>
> my output should contain
>
>
> werttyq112345673
> yhukot02345678
> abcdef123456789
awk '
/11234567[78]/ {next}
/[^0-9][01][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
'
outputs
werttyq112345673
abcdef123456789
"yhukot02345678" does not contain 9 digits.
If you need a single grep pattern, this will work:
(^|[^0-9])(0[0-9]{8}|1[^1][0-9]{7}|11[^2][0-9]{6}|112[^3][0-9]{5}|1123[^4][0-9]{4}|11234[^5][0-9]{3}|112345[^6][0-9]{2}|1123456[^7][0-9]|11234567[^78])($|[^0-9])
--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry
| |
| Glenn Jackman 2007-11-22, 1:44 am |
| At 2007-11-21 10:21PM, "harividya@gmail.com" wrote:
> from a file
>
>
> abc there is a nine-digit number. I need only an egrep pattern which
> matches all the numbers starting with 0 or 1
> except 112345678 and 112345677
>
> eg
> abcdef112345678
> acvgtb112345677
> werttyq112345673
> weerwq233434334
> yhukot02345678
> abcdef123456789
>
> my output should contain
>
>
> werttyq112345673
> yhukot02345678
> abcdef123456789
awk '
/11234567[78]/ {next}
/[^0-9][01][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
'
outputs
werttyq112345673
abcdef123456789
"yhukot02345678" does not contain 9 digits.
If you need a single grep pattern, this will work:
(^|[^0-9])(0[0-9]{8}|1[02-9][0-9]{7}|11[013-9][0-9]{6}|112[0-24-9][0-9]{5}|1123[0-35-9][0-9]{4}|11234[0-46-9][0-9]{3}|112345[0-57-9][0-9]{2}|1123456[0-689][0-9]|11234567[0-69])($|[^0-9])
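A minimal sketch of how the pattern above behaves on the sample lines from the question. It assumes a POSIX shell and uses grep -E, which is equivalent to egrep; the variable names (sample, p1, p2, p3, pattern, matched) are only for illustration.

```shell
# Sample lines copied from the original question.
sample='abcdef112345678
acvgtb112345677
werttyq112345673
weerwq233434334
yhukot02345678
abcdef123456789'

# The single pattern, split into three single-quoted pieces for readability.
p1='(^|[^0-9])(0[0-9]{8}|1[02-9][0-9]{7}|11[013-9][0-9]{6}'
p2='|112[0-24-9][0-9]{5}|1123[0-35-9][0-9]{4}|11234[0-46-9][0-9]{3}'
p3='|112345[0-57-9][0-9]{2}|1123456[0-689][0-9]|11234567[0-69])($|[^0-9])'
pattern="$p1$p2$p3"

# Quote the pattern so the shell does not interpret ( ) | or $.
matched=$(printf '%s\n' "$sample" | grep -E "$pattern")
printf '%s\n' "$matched"
```

Only werttyq112345673 and abcdef123456789 are printed. yhukot02345678 is dropped because it holds eight digits, not nine.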
--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry
| |
| harividya@gmail.com 2007-11-22, 1:44 am |
| On Nov 22, 1:22 pm, Cyrus Kriticos <cyrus.kriti...@googlemail.com>
wrote:
> harivi...@gmail.com wrote:
>
>
>
>
>
>
> e.g.
>
> egrep '^[^0-9]*[0-1]' FILENAME | egrep -v '^[^0-9]*(112345678|112345677)'
>
> --
> Best regards | Be nice to America or they'll bring democracy to
> Cyrus | your country.
Thanks, but I should not use a pipe; it should all be done in one egrep.
| |
| harividya@gmail.com 2007-11-22, 1:44 am |
| On Nov 22, 1:41 pm, Glenn Jackman <gle...@ncf.ca> wrote:
> At 2007-11-21 10:21PM, "harivi...@gmail.com" wrote:
>
>
>
>
>
>
>
>
>
>
> awk '
> /11234567[78]/ {next}
> /[^0-9][01][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/
> '
>
> outputs
> werttyq112345673
> abcdef123456789
>
> "yhukot02345678" does not contain 9 digits.
>
> If you need a single grep pattern, this will work:
>
> (^|[^0-9])(0[0-9]{8}|1[02-9][0-9]{7}|11[013-9][0-9]{6}|112[0-24-9][0-9]{5}|-1123[0-35-9][0-9]{4}|11234[0-46-9][0-9]{3}|112345[0-57-9][0-9]{2}|1123456[0--689][0-9]|11234567[0-69])($|[^0-9])
>
> --
> Glenn Jackman
> "You can only be young once. But you can always be immature." -- Dave Barry
This gives me a syntax error.
| |
| Michael Tosch 2007-11-22, 7:33 am |
| harividya@gmail.com wrote:
> from a file
>
>
> abc there is a nine digit number i need only a egrep pattern which
> matches all the numbers starting with 0 or 1
> except 112345678 and 112345677
>
> eg
> abcdef112345678
> acvgtb112345677
> werttyq112345673
> weerwq233434334
> yhukot02345678
> abcdef123456789
>
> my output should contain
>
>
> werttyq112345673
> yhukot02345678
> abcdef123456789
>
>
> and not
> abcdef112345678
> acvgtb112345677
> weerwq233434334
Your input file contains an eight digit number, too.
What is the minimum length then?
--
Michael Tosch @ hp : com
| |
| Ed Morton 2007-11-22, 1:23 pm |
|
On 11/22/2007 7:11 AM, Michael Tosch wrote:
> harividya@gmail.com wrote:
>
>
>
>
> Your input file contains an eight digit number, too.
ITYM "Your OUTPUT file contains an eight digit number, too.".
I also don't see why abcdef112345678 and acvgtb112345677 are NOT part of his
expected output since they both contain 9-digit numbers starting with 1.
> What is the minimum length then?
and what is the pattern you're REALLY trying to match?
Ed.
| |
| Barry Margolin 2007-11-22, 7:22 pm |
| In article
<77844f5d-29ff-4396-adbd-9e1cb55a3c5a@d4g2000prg.googlegroups.com>,
harividya@gmail.com wrote:
> On Nov 22, 1:41 pm, Glenn Jackman <gle...@ncf.ca> wrote:
>
> This gives me a syntax error.
There's a one-character typo. It should be pretty easy to see and fix,
so that's left as an exercise for the reader.
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
| |
| Michael Tosch 2007-11-22, 7:22 pm |
| Ed Morton wrote:
>
> On 11/22/2007 7:11 AM, Michael Tosch wrote:
....
>
> I also don't see why abcdef112345678 and acvgtb112345677 are NOT part of his
> expected output since they both contain 9-digit numbers starting with 1.
>
Look again, Ed.
--
Michael Tosch @ hp : com
| |
| Ed Morton 2007-11-22, 7:22 pm |
|
On 11/22/2007 1:46 PM, Michael Tosch wrote:
> Ed Morton wrote:
>
> ...
>
>
>
> Look again, Ed.
>
>
Amazing how easy it is to see when someone points it out...
Thanks,
Ed.
| |
| unixques 2007-11-23, 7:31 am |
| On Nov 22, 10:17 pm, Ed Morton <mor...@lsupcaemnt.com> wrote:
> On 11/22/2007 7:11 AM, Michael Tosch wrote:
>
>
>
>
>
>
>
>
>
>
>
>
>
> ITYM "Your OUTPUT file contains an eight digit number, too.".
>
> I also don't see why abcdef112345678 and acvgtb112345677 are NOT part of his
> expected output since they both contain 9-digit numbers starting with 1.
>
>
> and what is the pattern you're REALLY trying to match?
>
> Ed.
Thanks all for helping me.
I have a huge file in which there is a 9-digit number.
This 9-digit number follows a decimal number
with 3 decimal places,
i.e.
12.345, then my 9-digit number starts.
My requirement is as below:
I need to split the main file into many files using only one egrep, no
pipe.
Say the 9-digit number is an account number.
Then I need all the account numbers starting with 1,
but say 123456789 should be in one file
and say 134567827 should be in another file,
and the rest of the account numbers starting with 1 should be in one
file.
Accounts starting with 2 go in another file; again 231456789 and 298765431 in another file,
etc.
Like that:
all 0 accounts in 1 file
all 1 accounts in 1 file except 123456789 and 134567827
134567827 in 1 file
123456789 in 1 file
all 2 accounts in one file
again exclude some accounts
and so on for leading digits 0-9.
Please help.
Thanks in advance.
| |
| Cyrus Kriticos 2007-11-23, 1:22 pm |
|
harividya@gmail.com wrote:
> Cyrus Kriticos wrote:
>
> Thanks, but I should not use a pipe; it should all be done in one egrep.
Without a pipe, using process substitution in bash 3.x:
egrep '^[^0-9]*[0-1]' <(egrep -v '^[^0-9]*(112345678|112345677)' FILENAME)
With one egrep:
egrep -v '^[^0-9]*(112345678|112345677|[2-9])' FILENAME
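A quick check of the one-egrep version on the sample lines, as a sketch only. It assumes a POSIX shell and uses grep -E (equivalent to egrep); the variable names are illustrative. Because the command works by exclusion, it does not count digits, and a line with no digits at all also passes, which the second call shows.

```shell
# Lines whose first digit run starts with an excluded number, or with 2-9,
# match this pattern and are removed by -v.
exclude='^[^0-9]*(112345678|112345677|[2-9])'

sample='abcdef112345678
acvgtb112345677
werttyq112345673
weerwq233434334
yhukot02345678
abcdef123456789'

kept=$(printf '%s\n' "$sample" | grep -Ev "$exclude")
printf '%s\n' "$kept"

# A line without any digit cannot match the pattern, so -v keeps it too.
stray=$(printf 'no digits here\n' | grep -Ev "$exclude")
printf '%s\n' "$stray"
```

The first call prints the three lines the original question asked for, including the eight-digit yhukot02345678.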
--
Best regards | Be nice to America or they'll bring democracy to
Cyrus | your country.
| |
| unixques 2007-11-26, 1:37 am |
| On Nov 23, 3:46 am, Michael Tosch <eed...@NO.eed.SPAM.ericsson.PLS.se>
wrote:
> Ed Morton wrote:
>
>
>
>
> ...
>
>
> Look again, Ed.
>
> --
> Michael Tosch @ hp : com
file_1: accounts starting with 1, except 2 account numbers (e.g.
123456789)
file_2: accounts starting with 2, except 3 account numbers (e.g. 213456789)
file_3
...
file_9: accounts starting with 9, except 987612345
Thanks in advance.
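A single egrep invocation writes to one output stream, so it cannot produce several files by itself. The sketch below swaps in awk for the splitting step. It rests on assumptions that are not confirmed in the thread: the account number is the last nine characters of each line, every listed exception gets a file of its own, and the function name split_accounts and the output names acct_<number>.txt and lead_<digit>.txt are hypothetical.

```shell
split_accounts() {
    # $1 = input file, $2 = output directory,
    # $3 = space-separated account numbers that each get their own file.
    awk -v outdir="$2" -v special="$3" '
        BEGIN {
            n = split(special, list, " ")
            for (i = 1; i <= n; i++) own[list[i]] = 1
        }
        {
            acct = substr($0, length($0) - 8)   # last nine characters
            if (length(acct) != 9 || acct !~ /^[0-9]+$/) next
            if (acct in own)
                out = outdir "/acct_" acct ".txt"
            else
                out = outdir "/lead_" substr(acct, 1, 1) ".txt"
            print > out
        }
    ' "$1"
}

# Usage with made-up lines in the "12.345 followed by the account" layout.
workdir=$(mktemp -d)
printf '%s\n' 'a 12.345123456789' 'b 12.345134567827' \
    'c 12.345112345673' 'd 12.345233434334' 'e 12.345023456789' \
    > "$workdir/input.txt"
split_accounts "$workdir/input.txt" "$workdir" '123456789 134567827'
ls "$workdir"
```

Lines whose last nine characters are not all digits are skipped silently; whether that is the right behaviour depends on the real file layout.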
| |
| Glenn Jackman 2007-11-28, 1:24 pm |
| At 2007-11-22 12:41AM, "Glenn Jackman" wrote:
> At 2007-11-21 10:21PM, "harividya@gmail.com" wrote:
[...]
> If you need a single grep pattern, this will work:
>
> (^|[^0-9])(0[0-9]{8}|1[02-9][0-9]{7}|11[013-9][0-9]{6}|112[0-24-9][0-9]{5}|1123[0-35-9][0-9]{4}|11234[0-46-9][0-9]{3}|112345[0-57-9][0-9]{2}|1123456[0-689][0-9]|11234567[0-69])($|[^0-9])
>
The OP emailed me privately asking about that pattern. I'll break it
down for his and others' benefit.
The requirement (as I interpreted it) was for "any 9 digit number
starting with 0 or 1, except 112345678 and 112345677".
Basic and extended regular expressions (unlike the advanced regular expressions many
scripting languages have implemented, notably Perl) don't have the
feature to say "look ahead of my current position in the string for
pattern xyz". Using Perl, I could write:
perl -ne 'print if /(?!11234567[78])[01]\d{8}/' filename
With egrep, one has to be "positive" about what you're looking for (no
"negative lookahead"). Breaking the above long pattern into pieces, we
can see how it meets the requirement:
(^|[^0-9]) -- the beginning of the string, or a non-digit
(to mark the start of digits)
0[0-9]{8} -- a 9-digit number beginning with 0,
1[02-9][0-9]{7} -- 1 then any digit except 1 then any 7 digits,
11[013-9][0-9]{6} -- 11 then any digit except 2 then any 6 digits,
112[0-24-9][0-9]{5} -- 112 then any digit except 3 then any 5 digits,
1123[0-35-9][0-9]{4} -- 1123 then any digit except 4 then any 4 digits,
11234[0-46-9][0-9]{3} -- 11234 then any digit except 5 then any 3 digits,
112345[0-57-9][0-9]{2} -- 112345 then any digit except 6 then any 2 digits,
1123456[0-689][0-9] -- 1123456 then any digit except 7 then a digit,
11234567[0-69] -- 11234567 then any digit except 7 or 8
($|[^0-9]) -- the end of the string, or a non-digit
(to mark the end of digits)
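The same prefix-by-prefix construction can be generated instead of typed by hand. This is a rough sketch, not part of the original answer: the function name all_but and its two arguments are hypothetical, it handles a single excluded number only, and it assumes a POSIX shell with cut, tr and grep -E available.

```shell
# all_but NUMBER [LEADING_DIGITS]
# Print an ERE matching every digit string as long as NUMBER, whose first
# digit is in LEADING_DIGITS (default: any digit), except NUMBER itself.
# Assumes the first digit of NUMBER is one of LEADING_DIGITS.
all_but() {
    num=$1
    lead=${2:-0123456789}
    len=${#num}
    prefix=''
    alts=''
    i=1
    while [ "$i" -le "$len" ]; do
        digit=$(printf '%s' "$num" | cut -c "$i")
        # Position 1 is limited to the allowed leading digits.
        if [ "$i" -eq 1 ]; then pool=$lead; else pool=0123456789; fi
        others=$(printf '%s' "$pool" | tr -d "$digit")
        if [ -n "$others" ]; then
            # Same prefix as NUMBER, then a different digit, then anything.
            alt="${prefix}[${others}]"
            rest=$((len - i))
            if [ "$rest" -gt 0 ]; then alt="${alt}[0-9]{${rest}}"; fi
            alts="${alts:+$alts|}$alt"
        fi
        prefix="${prefix}${digit}"
        i=$((i + 1))
    done
    printf '(^|[^0-9])(%s)($|[^0-9])\n' "$alts"
}

pat=$(all_but 112345678 01)
got=$(printf '%s\n' abcdef112345678 acvgtb112345677 werttyq112345673 \
    weerwq233434334 yhukot02345678 abcdef123456789 | grep -E "$pat")
printf '%s\n' "$got"
```

The generated pattern spells out each class in full (for example [023456789] rather than [02-9]). Since only 112345678 is excluded here, acvgtb112345677 still matches; excluding both numbers means also dropping 7 from the final class, as in the hand-written pattern.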
--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry