Unix Shell - Re: The precise behaviour of the | operator in POSIX extended


Author: Icarus Sparry

2006-12-20, 1:39 am

On Tue, 19 Dec 2006 16:07:56 -0800, Spiros Bousbouras wrote:

> Assume we have a regexp of the form 'E1|E2' where E1 , E2 are also
> regexps and we attempt to match it against a string where both E1
> and E2 match. Does the POSIX standard (or some man page) determine
> which of E1 , E2 is considered the match ? This could be important
> in case the whole of E1|E2 is inside parentheses and we refer to it
> later.
>
> I'm interested in the answer in the general case where we have regexps
> E1 , E2 , ... , En and we form a regexp by taking their disjunction ie
> E1|E2|...|En


The usual matching is first "leftmost", then "longest" of successful
matches.

Using GNU sed, which supports extended regexps of this form via -r, one sees

I="I am the friend of fred and joe today"
echo "$I" | sed -r 's/joe|fred/bill/'
outputting
I am the friend of bill and joe today

So here fred matched, because it is leftmost (earliest) in the input.

echo "$I" | sed -r 's/of f[der]*|of fr[^t]*/peter /'
outputs
I am the friend peter today

Here both patterns match at the same place, so the longer one, matching
"of fred and joe " rather than "of fred" wins.

This may be spelled out in your online manual for "regexp". If not, the
O'Reilly book "Mastering Regular Expressions" is well worth reading.
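
For the parenthesised case raised in the question, here is a small sketch
that can be run to check the leftmost-longest rule. It assumes GNU sed and
uses -E, which GNU sed accepts as a synonym for -r. The variable names a, b,
c, d and the [\1] replacement are only for illustration. Each pair of
commands swaps the order of the alternatives, which should make no
difference if the match really is chosen leftmost first and longest second.

```shell
I="I am the friend of fred and joe today"

# Leftmost wins: "fred" occurs before "joe" in the input, so it is the
# one replaced, whichever alternative is written first in the pattern.
a=$(printf '%s\n' "$I" | sed -E 's/joe|fred/bill/')
b=$(printf '%s\n' "$I" | sed -E 's/fred|joe/bill/')

# Longest wins when both alternatives match at the same position.
# The group wraps the whole alternation, so \1 shows what it captured.
c=$(printf '%s\n' "$I" | sed -E 's/(of f[der]*|of fr[^t]*)/[\1]/')
d=$(printf '%s\n' "$I" | sed -E 's/(of fr[^t]*|of f[der]*)/[\1]/')

printf '%s\n' "$a" "$b" "$c" "$d"
```

With GNU sed the first two lines printed should be identical, and so should
the last two, with the brackets enclosing "of fred and joe " in both.
Engines that are not POSIX leftmost-longest (Perl-style ones, for example)
try the alternatives in the order written, so there the order can matter.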