AWK

    AWK  
friend.05@gmail.com




 
01-17-06 11:05 PM

I have some data in an Excel sheet in the following format.

----------------------------------------------------------------------------

Argonne Natl Lab        CNM    CNM-AR-3    2004
BioDelivery Sciences Intl Inc
Argonne Natl Lab        CNM    CNM-AR-3    2004
BioDelivery Sciences Intl Inc
Argonne Natl Lab        CNM    CNM-AR-3    2004
BioDelivery Sciences Intl Inc
Argonne Natl Lab        CNM    CNM-AR-3    2004
BioDelivery Sciences Intl Inc
Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
Illinois
Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
Illinois
Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
Chicago
Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
Chicago

----------------------------------------------------------------------------

I want the output in the following format.

----------------------------------------------------------------------------

CNM-AR-3 CNM
CNM-AR-3 BioDelivery Sciences Intl Inc
CNM-AR-5 CNM
CNM-AR-5 Univ Illinois
CNM-AR-9 CNM
CNM-AR-9 Univ Chicago


----------------------------------------------------------------------------

Can I use AWK for this, or should I use a database like MS Access?









    Re: AWK  
Ed Morton




 
01-17-06 11:05 PM



friend.05@gmail.com wrote:
> I am having some data in excel sheet in following manner.
>
> ----------------------------------------------------------------------------
>
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
> Illinois
> Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
> Illinois
> Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
> Chicago
> Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
> Chicago
>
> ----------------------------------------------------------------------------
>
> I want the output in following manner.
>
> ----------------------------------------------------------------------------
>
> CNM-AR-3 CNM
> CNM-AR-3 BioDelivery Sciences Intl Inc
> CNM-AR-5 CNM
> CNM-AR-5 Univ Illinois
> CNM-AR-9 CNM
> CNM-AR-9 Univ Chicago
>
> ----------------------------------------------------------------------------
>
> Can I use AWK for this. Or I should use any database like MS-Access
>

You can use awk, but your line-wrapping makes it unclear whether or not
your data's all on one line. Please post a small example of your problem
that doesn't wrap around lines, and also come up with a better subject
line than "AWK" which you've used repeatedly and so mixes up threads in
newsreaders.

Ed.








    Re: AWK  
friend.05@gmail.com




 
01-17-06 11:05 PM

My data is on one line, as follows:


Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-5  2004  Illinois
Argonne Natl   CNM  CNM-AR-5  2004  Illinois
Argonne Natl   CNM  CNM-AR-9  2004  Chicago
Argonne Natl   CNM  CNM-AR-9  2004  Chicago









    Re: AWK  
Stephane CHAZELAS




 
01-17-06 11:05 PM

2006-01-17, 10:58(-08), friend.05@gmail.com:
> I am having some data in excel sheet in following manner.
>
> ----------------------------------------------------------------------------
>
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-3    2004
> BioDelivery Sciences Intl Inc
> Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
> Illinois
> Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ
> Illinois
> Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
> Chicago
> Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ
> Chicago
>
> ----------------------------------------------------------------------------
>
> I want the output in following manner.
>
> ----------------------------------------------------------------------------
>
> CNM-AR-3 CNM
> CNM-AR-3 BioDelivery Sciences Intl Inc
> CNM-AR-5 CNM
> CNM-AR-5 Univ Illinois
> CNM-AR-9 CNM
> CNM-AR-9 Univ Chicago
>
>
> ----------------------------------------------------------------------------
>
> Can I use AWK for this. Or I should use any database like MS-Access

You can use awk, not AWK (remember that Unix is case-sensitive),
but it may not be the best tool, as awk has no built-in way to say
"what's after field 6".

POSIXLY_CORRECT=1 awk '
  match(/^[[:blank:]]*([^[:blank:]]+[[:blank:]]+){6}/) {
    print $5, $4
    print $5, substr($0, RLENGTH+1)
  }' < file.in > file.out

--
Stéphane








    Re: AWK  
Ed Morton




 
01-17-06 11:05 PM



friend.05@gmail.com wrote:
> My data is in  one line as follows:
>
>
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-5  2004  Illinois
> Argonne Natl   CNM  CNM-AR-5  2004  Illinois
> Argonne Natl   CNM  CNM-AR-9  2004  Chicago
> Argonne Natl   CNM  CNM-AR-9  2004  Chicago
>

Please read http://cfaj.freeshell.org/google before posting again, as
you're falling foul of Google.

Now, from your previous post, I think what you want to output from the
above is this:

CNM-AR-3 CNM
CNM-AR-3 BioDelivery
CNM-AR-5 CNM
CNM-AR-5 Illinois
CNM-AR-9 CNM
CNM-AR-9 Chicago

so that might just be (untested):

awk '$4!=prev{print $4,$3; print $4,$NF; prev=$4}' file

depending on what your more general requirements are.

Regards,

Ed.
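
For anyone who wants to try the one-liner above without creating a file,
here is a minimal sketch that feeds it the sample rows from this thread.
The variable name "sample" and the function name "id_pairs" are made up
for illustration; the awk program itself is the one posted above. It
assumes that rows with the same ID are adjacent, since only the previous
ID is remembered.

```shell
# Sample rows copied from the thread (one-word institution names).
sample='Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
Argonne Natl   CNM  CNM-AR-5  2004  Illinois
Argonne Natl   CNM  CNM-AR-5  2004  Illinois
Argonne Natl   CNM  CNM-AR-9  2004  Chicago
Argonne Natl   CNM  CNM-AR-9  2004  Chicago'

# id_pairs is a hypothetical wrapper around the posted one-liner.
# It prints two lines each time the ID in field 4 changes.
id_pairs() {
    awk '$4 != prev { print $4, $3; print $4, $NF; prev = $4 }'
}

printf '%s\n' "$sample" | id_pairs
```

If the same ID can reappear later in the file, the pair would be printed
again, so this only suits input that is already grouped by ID.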








    Re: AWK  
Chris F.A. Johnson




 
01-17-06 11:05 PM

On 2006-01-17, friend.05@gmail.com wrote:
> My data is in  one line as follows:

What do you want to do with the data?

Please read: <http://cfaj.freeshell.org/google>

And please use a more specific subject than "AWK".

> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-3  2004  BioDelivery
> Argonne Natl   CNM  CNM-AR-5  2004  Illinois
> Argonne Natl   CNM  CNM-AR-5  2004  Illinois
> Argonne Natl   CNM  CNM-AR-9  2004  Chicago
> Argonne Natl   CNM  CNM-AR-9  2004  Chicago
>


--
Chris F.A. Johnson, author   |    <http://cfaj.freeshell.org>
Shell Scripting Recipes:     |  My code in this post, if any,
A Problem-Solution Approach  |          is released under the
2005, Apress                 |     GNU General Public Licence








    Re: AWK  
Stephane CHAZELAS




 
01-17-06 11:05 PM

2006-01-17, 19:49(+00), Stephane CHAZELAS:
[...]
> POSIXLY_CORRECT=1 awk '
>   match(/^[[:blank:]]*([^[:blank:]]+[[:blank:]]+){6}/) {

match($0, /...

sorry.

>     print $5, $4
>     print $5, substr($0, RLENGTH+1)
>   }' < file.in > file.out
>

Note that the POSIXLY_CORRECT is in case your awk is gawk which
doesn't handle the {} POSIX ERE intervals without that
environment variable.

--
Stéphane
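
If the awk at hand does not handle the {6} interval at all, one possible
workaround is to strip the six leading fields one at a time with sub()
in a loop. This is not the method posted above, only a sketch of the
same idea without intervals. The names "rows" and "split_rest" are
hypothetical, and the sample rows assume the institution name sits on
the same line as the rest of the record, which the original post leaves
unclear.

```shell
# Assumed layout: six leading fields, then a free-text name to end of line.
rows='Argonne Natl Lab        CNM    CNM-AR-3    2004    BioDelivery Sciences Intl Inc
Argonne Natl Lab        CNM    CNM-AR-5    2004              Univ Illinois
Argonne Natl Lab        CNM    CNM-AR-9    2004              Univ Chicago'

# split_rest is a hypothetical helper name.
split_rest() {
    awk '{
        rest = $0
        # Remove one leading field (and surrounding blanks) per pass,
        # six passes in total, instead of relying on a {6} interval.
        for (i = 1; i <= 6; i++)
            sub(/^[ \t]*[^ \t]+[ \t]*/, "", rest)
        print $5, $4
        print $5, rest
    }'
}

printf '%s\n' "$rows" | split_rest
```

Like the match() version, this prints a pair of lines for every input
row and does nothing about repeated rows.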








    Re: AWK  
Ed Morton




 
01-17-06 11:05 PM



Stephane CHAZELAS wrote:
> 2006-01-17, 19:49(+00), Stephane CHAZELAS:
> [...]
>
> match($0, /...
>
> sorry.
>
> Note that the POSIXLY_CORRECT is in case your awk is gawk which
> doesn't handle the {} POSIX ERE intervals without that
> environment variable.
>

If you're using gawk and you just want to use RE intervals, then as a
rule you should use "gawk --re-interval" rather than setting
POSIXLY_CORRECT or "gawk --posix" because if you do the former you
retain the gawk extensions (e.g. gensub()) that aren't supported in
POSIX compliance mode:

$ echo "hello" | gawk '{print gensub(/l{2}/,"LL","")}'
hello

$ echo "hello" | POSIXLY_CORRECT=1 gawk '{print gensub(/l{2}/,"LL","")}'
gawk: warning: regexp constant for parameter #1 yields boolean value
gawk: (FILENAME=- FNR=1) fatal: function `gensub' not defined

$ echo "hello" | gawk --posix '{print gensub(/l{2}/,"LL","")}'
gawk: warning: regexp constant for parameter #1 yields boolean value
gawk: (FILENAME=- FNR=1) fatal: function `gensub' not defined

$ echo "hello" | gawk --re-interval '{print gensub(/l{2}/,"LL","")}'
heLLo

Regards,

Ed
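
A quick way to find out which case applies on a given machine is to
probe the local awk directly. The sketch below uses a made-up helper
name, "has_intervals", and relies only on the behaviour shown above:
/l{2}/ matches "hello" when intervals are honoured and does not match
when the braces are taken literally. As far as I know, gawk 4.0 and
later turn intervals on by default, so the results shown above may
differ on a current gawk.

```shell
# has_intervals is a hypothetical helper, not part of any awk.
# Usage: has_intervals [awk-command]   (defaults to plain "awk")
has_intervals() {
    # Exit status 0 if /l{2}/ matched the "ll" in "hello", 1 otherwise.
    echo "hello" | "${1:-awk}" '{ exit !($0 ~ /l{2}/) }'
}

if has_intervals; then
    echo "awk: interval expressions supported"
else
    echo "awk: interval expressions not supported"
fi
```

Passing another command name, for example has_intervals gawk, checks
that implementation instead, provided it is installed.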








    Re: AWK  
Stephane CHAZELAS




 
01-18-06 07:50 AM

2006-01-17, 17:47(-06), Ed Morton:
[...] 
>
> If you're using gawk and you just want to use RE intervals, then as a
> rule you should use "gawk --re-interval" rather than setting
> POSIXLY_CORRECT or "gawk --posix" because if you do the former you
> retain the gawk extensions (e.g. gensub()) that aren't supported in
> POSIX compliance mode:
[...]

The point was to use awk and be portable. POSIXLY_CORRECT fixes
gawk in that case and is harmless for other awks. That makes my
awk script portable to systems where awk is a GNU awk and to
other systems that have a UNIX/POSIX compliant awk.

Now it's true that gensub could be useful for that particular
problem, so you could write a gawk solution for it. But UNIX
systems generally don't have gawk. It's more likely that they
have perl, so I would prefer giving a PERL solution over a gawk
solution and an awk solution over a PERL one (in c.u.*).

--
Stéphane








    Re: AWK  
Ed Morton




 
01-18-06 10:55 PM

Stephane CHAZELAS wrote:

> 2006-01-17, 17:47(-06), Ed Morton:
> [...]
>
> The point was to use awk and be portable.

That's fine, but not everyone cares about portability to other awks.
Many of us are just fine being dependent on gawk's many useful features
over other awks, not least of which is gensub().

> POSIXLY_CORRECT fixes
> gawk in that case and is harmless for other awks. That makes my
> awk script portable to systems where awk is a GNU awk and to
> other systems that have a UNIX/POSIX compliant awk.

Yes, I know, but again that wasn't my point. My point was about the way
to get RE intervals in gawk without sacrificing non-POSIX functionality.

> Now it's true that gensub could be useful for that particular
> problem, so you could write a gawk solution for it. But UNIX
> systems generally don't have gawk. It's more likely that they
> have perl, so I would prefer giving a PERL solution over a gawk
> solution and an awk solution over a PERL one (in c.u.*).
>

If gawk isn't available on your machine, you can always install it. I'm
sure PERL is a fine tool. It's not available on many of the UNIX
machines I use daily at work, while gawk is, but that's really beside the
point which, once again, is:

"If you're using gawk and you just want to use RE intervals, then as a
rule you should use "gawk --re-interval""

Ed.




