ignoring non [0-9] chars when searching in a file
| John Bagins 2007-11-23, 7:31 am |
| I want to search for telephone numbers in my text database where the
entries are in the form:
Tel: 99 99 999 9999 999
or
Tel: +99-999/ 9999 999
or any combination of spaces and punctuation chars.
I want to feed as input numbers in the form of
99999999999
What is the most efficient way to do this without first creating
an index of space and punctuation removed numbers?
Thanks
Eric
| |
| Janis Papanagnou 2007-11-23, 1:22 pm |
| John Bagins wrote:
> I want to search for telephone numbers in my text database where the
> entries are in the form:
> Tel: 99 99 999 9999 999
> or
> Tel: +99-999/ 9999 999
> or any combination of spaces and punctuation chars.
So is the number of digits in the file arbitrary, or are there just
the two possibilities of 14 and 12 digits, respectively?
>
> I want to feed as input numbers in the form of
> 99999999999
And you want to search for 11-digit numbers? How are the semantics of
a match defined? (Match a prefix, a suffix, or arbitrary elisions in the
middle of the number?)
What should the output be: all closely matching numbers, or only exact
matches? Just found/not found, or the number as stored in
the file?
>
> What is the most efficient way to do this without first creating
> an index of space and punctuation removed numbers?
Depends on what you need. Clarify your requirements first. Provide
a few samples of real (or at least meaningful) data and the desired
output.
Janis
>
> Thanks
>
> Eric
| |
| Cyrus Kriticos 2007-11-23, 7:20 pm |
| John Bagins wrote:
> I want to search for telephone numbers in my text database where the
> entries are in the form:
> Tel: 99 99 999 9999 999
> or
> Tel: +99-999/ 9999 999
> or any combination of spaces and punctuation chars.
>
> I want to feed as input numbers in the form of
> 99999999999
>
> What is the most efficient way to do this without first creating
> an index of space and punctuation removed numbers?
sed 's/[^0-9]//g' FILENAME | grep TELEPHONE_NUMBER
--
Best regards | Be nice to America or they'll bring democracy to
Cyrus | your country.
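
A minimal sketch of the sed/grep pipeline above, run against a made-up two-line file (the temporary file, the sample numbers, and the search string are assumptions for illustration only). Note that this approach prints the squeezed digit string, not the entry as it is stored in the file:

```shell
# Hypothetical sample data; the numbers are invented for this example.
db=$(mktemp)
printf '%s\n' 'Tel: 12 34 567 8901 234' 'Tel: +98-765/ 4321 098' > "$db"

# Delete every non-digit on each line, then search the squeezed text.
found=$(sed 's/[^0-9]//g' "$db" | grep '34567890')
printf '%s\n' "$found"

rm -f "$db"
```

Because sed strips the whole line down to digits, any other digits on the same line (dates, house numbers) end up in the searched string as well.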
| |
| --==[ bman ]==-- 2007-11-23, 7:20 pm |
| On Nov 23, 7:52 am, John Bagins <to.eric.sm...@gmail.com> wrote:
> I want to search for telephone numbers in my text database where the
> entries are in the form:
> Tel: 99 99 999 9999 999
> or
> Tel: +99-999/ 9999 999
> or any combination of spaces and punctuation chars.
>
> I want to feed as input numbers in the form of
> 99999999999
>
> What is the most efficient way to do this without first creating
> an index of space and punctuation removed numbers?
>
> Thanks
>
> Eric
In ksh93 or bash:
var=<your telephone number>
var=${var//[^[:digit:]]/}
This removes every non-digit character (spaces and punctuation included),
squeezing everything together into the format you want. Simple and very
efficient. No need to use external tools like 'sed', 'awk', 'tr', and
others.
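
As a rough sketch, the same expansion could also be applied to each stored entry so the search itself needs no external tool. This assumes bash or ksh93 (the `${var//pattern/}` form is not POSIX sh); the sample entries and the query are made up for illustration:

```shell
#!/usr/bin/env bash
# Requires bash or ksh93: ${var//pattern/} is not available in plain POSIX sh.
query='12 345 6789 012'            # hypothetical user input
query=${query//[^[:digit:]]/}      # keep digits only

matches=''
while IFS= read -r line; do
    digits=${line//[^[:digit:]]/}  # squeeze the stored entry the same way
    case $digits in
        *"$query"*) matches+="$line"$'\n' ;;   # digits contain the query
    esac
done <<'EOF'
Tel: 99 88 777 6666 555
Tel: +12-345/ 6789 012
EOF

printf '%s' "$matches"
```

A shell read loop avoids forking, but on a large file it may well be slower than a single sed or awk pass.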
| |
| Ed Morton 2007-11-25, 1:26 pm |
|
On 11/23/2007 6:52 AM, John Bagins wrote:
> I want to search for telephone numbers in my text database where the
> entries are in the form:
> Tel: 99 99 999 9999 999
> or
> Tel: +99-999/ 9999 999
> or any combination of spaces and punctuation chars.
>
> I want to feed as input numbers in the form of
> 99999999999
>
> What is the most efficient way to do this without first creating
> an index of space and punctuation removed numbers?
>
> Thanks
>
> Eric
This may or may not be more efficient than other solutions posted, but it just
uses one command and creates a solid foundation that'll be easy to enhance and
maintain in future in case your requirements change:
awk '{s=$0;gsub(/[^[:digit:]]/,"",s)}s~p' p="99999999999" file
Ed.
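
A hedged variant of the awk approach, assuming a match means "the digits of the stored line contain the digits typed in". It uses index() for a literal substring test and prints the entry as stored; the temporary file, the sample lines, and the value of p are invented for illustration:

```shell
# Hypothetical sample data; the entries are made up for this example.
db=$(mktemp)
printf '%s\n' 'Name: A. Example' \
              'Tel: 99 88 777 6666 555' \
              'Tel: +12-345/ 6789 012' > "$db"

hits=$(awk -v p='123456789012' '
    { s = $0; gsub(/[^0-9]/, "", s) }   # s = digits of the current line
    s != "" && index(s, p)              # print the line if its digits contain p
' "$db")
printf '%s\n' "$hits"

rm -f "$db"
```

Lines with no digits at all are skipped explicitly, and the bracket expression `[^0-9]` is used instead of `[^[:digit:]]` in case the local awk lacks POSIX character classes.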