|
Home > Archive > Unix Shell > October 2004 > AIX Script to Summarize By First Column By Adding Values in Numerical Columns
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
AIX Script to Summarize By First Column By Adding Values in Numerical Columns
|
|
| SAP BASIS Consultant 2004-09-29, 8:09 pm |
| Hello,
I would greatly appreciate help for the following: (I am using ksh on
AIX).
I have a file which looks in style as follows (It is the output of ps
aux, though that it not important):
(The COLs are not part of the file..For description)
COL1 COL2 COL3 COL4 COL5 COL6
name1 xxx xxx xxx 1000 11
name1 xxx xxx xxx 50 40
name2 xxx xxx xxx 30 5
name1 xxx xxx xxx 50 10
name2 xxx xxx xxx 100 4
I would like to summarize the data by the first column, ignoring
columns #2, #3 and #4, and adding the values in col#5 and #6.
The output would be as follows: (For name1, 1100=1000+50+50,
etc..)
name1 1100 61
name2 130 9
Any ideas would be greatly appreciated.
Thanks,
BASIS Consultant
| |
| William Park 2004-09-29, 8:09 pm |
| SAP BASIS Consultant <basis_consultant@hotmail.com> wrote:
> Hello,
>
> I would greatly appreciate help for the following: (I am using ksh on
> AIX).
>
> I have a file which looks in style as follows (It is the output of ps
> aux, though that it not important):
>
> (The COLs are not part of the file..For description)
>
> COL1 COL2 COL3 COL4 COL5 COL6
> name1 xxx xxx xxx 1000 11
> name1 xxx xxx xxx 50 40
> name2 xxx xxx xxx 30 5
> name1 xxx xxx xxx 50 10
> name2 xxx xxx xxx 100 4
>
>
> I would like to summarize the data by the first column, ignoring
> columns #2, #3 and #4, and adding the values in col#5 and #6.
> The output would be as follows: (For name1, 1100=1000+50+50,
> etc..)
>
> name1 1100 61
> name2 130 9
>
>
> Any ideas would be greatly appreciated.
man awk
man ksh (associative array)
--
William Park <opengeometry@yahoo.ca>
Open Geometry Consulting, Toronto, Canada
| |
|
| On 2004-09-29, SAP BASIS Consultant <basis_consultant@hotmail.com> wrote:
> COL1 COL2 COL3 COL4 COL5 COL6
> name1 xxx xxx xxx 1000 11
> name1 xxx xxx xxx 50 40
> name2 xxx xxx xxx 30 5
> name1 xxx xxx xxx 50 10
> name2 xxx xxx xxx 100 4
>
>
> I would like to summarize the data by the first column, ignoring
> columns #2, #3 and #4, and adding the values in col#5 and #6.
> The output would be as follows: (For name1, 1100=1000+50+50,
> etc..)
>
> name1 1100 61
> name2 130 9
sort -u foo | awk '{print $1}' | while read name ; do # find unique names
awk -v name=$name '$0 ~ name {s5+=$5 ; s6+=$6} END {print name, s5, s6}' foo
done
There is most definitely a pure-awk solution, but I haven't figured it
out yet 
--Seb
| |
|
| On 2004-09-29, Seb <seb@foo.com> wrote:
> sort -u foo | awk '{print $1}' | while read name ; do # find unique names
> awk -v name=$name '$0 ~ name {s5+=$5 ; s6+=$6} END {print name, s5, s6}' foo
> done
hrm, sorry:
awk '{print $1}' foo | sort -u | while read name ; do # find unique names
awk -v name=$name '$0 ~ name {s5+=$5 ; s6+=$6} END {print name, s5, s6}' foo
done
--Seb
| |
| Chris F.A. Johnson 2004-09-29, 8:09 pm |
| On 2004-09-29, SAP BASIS Consultant wrote:
> Hello,
>
> I would greatly appreciate help for the following: (I am using ksh on
> AIX).
>
> I have a file which looks in style as follows (It is the output of ps
> aux, though that it not important):
>
> (The COLs are not part of the file..For description)
>
> COL1 COL2 COL3 COL4 COL5 COL6
> name1 xxx xxx xxx 1000 11
> name1 xxx xxx xxx 50 40
> name2 xxx xxx xxx 30 5
> name1 xxx xxx xxx 50 10
> name2 xxx xxx xxx 100 4
>
>
> I would like to summarize the data by the first column, ignoring
> columns #2, #3 and #4, and adding the values in col#5 and #6.
> The output would be as follows: (For name1, 1100=1000+50+50,
> etc..)
>
> name1 1100 61
> name2 130 9
It's a one-liner:
awk '{x[$1]+=$5;y[$1]+=$6}END{for(n in x) printf "%s\t%d\t%d\n",n,x[n],y[n]}'
Which, presented more sanely, is:
awk '{ x[$1] += $5
y[$1] += $6 }
END { for ( n in x )
printf "%s\t%s\t%s\n", n, x[n], y[n]
}'
--
Chris F.A. Johnson http://cfaj.freeshell.org/shell
========================================
===========================
My code (if any) in this post is copyright 2004, Chris F.A. Johnson
and may be copied under the terms of the GNU General Public License
| |
| Dana French 2004-10-02, 9:12 pm |
| "Chris F.A. Johnson" <cfajohnson@gmail.com> wrote in message news:<2s0kqqF1fkkavU1@uni-berlin.de>...
> On 2004-09-29, SAP BASIS Consultant wrote:
>
> It's a one-liner:
>
> awk '{x[$1]+=$5;y[$1]+=$6}END{for(n in x) printf "%s\t%d\t%d\n",n,x[n],y[n]}'
>
> Which, presented more sanely, is:
>
> awk '{ x[$1] += $5
> y[$1] += $6 }
> END { for ( n in x )
> printf "%s\t%s\t%s\n", n, x[n], y[n]
> }'
Another execellent one-liner, I'll add it to the one-liners page:
http://www.mtxia.com/fancyIndex/Too.../oneliners.html
The following is same concept presented by Chris, only in ksh93:
typeset -A x y
while read -- VAL1 VAL2 VAL3 VAL4 VAL5 VAL6
do
x[${VAL1}]=$(( ${x[${VAL1}]:=0} + ${VAL5} ))
y[${VAL1}]=$(( ${y[${VAL1}]:=0} + ${VAL6} ))
done < data.in
for n in "${!x[@]}"
do
print "${n}\t${x[${n}]}\t${y[${n}]}"
done
--------------------------------------------------------
Dana French dfrench@mtxia.com
Mt Xia Technical Consulting Group http://www.mtxia.com
100% Spam Free Email http://www.ridmail.com
MicroEmacs http://uemacs.tripod.com
Korn Shell Web http://dfrench.tripod.com/kshweb.html
| |
| Dana French 2004-10-03, 2:48 am |
| "Chris F.A. Johnson" <cfajohnson@gmail.com> wrote in message news:<2s0kqqF1fkkavU1@uni-berlin.de>...
> On 2004-09-29, SAP BASIS Consultant wrote:
>
> It's a one-liner:
>
> awk '{x[$1]+=$5;y[$1]+=$6}END{for(n in x) printf "%s\t%d\t%d\n",n,x[n],y[n]}'
>
> Which, presented more sanely, is:
>
> awk '{ x[$1] += $5
> y[$1] += $6 }
> END { for ( n in x )
> printf "%s\t%s\t%s\n", n, x[n], y[n]
> }'
Another execellent one-liner, I'll add it to the one-liners page:
http://www.mtxia.com/fancyIndex/Too.../oneliners.html
The following is same concept presented by Chris, only in ksh93:
typeset -A x y
while read -- VAL1 VAL2 VAL3 VAL4 VAL5 VAL6
do
x[${VAL1}]=$(( ${x[${VAL1}]:=0} + ${VAL5} ))
y[${VAL1}]=$(( ${y[${VAL1}]:=0} + ${VAL6} ))
done < data.in
for n in "${!x[@]}"
do
print "${n}\t${x[${n}]}\t${y[${n}]}"
done
--------------------------------------------------------
Dana French dfrench@mtxia.com
Mt Xia Technical Consulting Group http://www.mtxia.com
100% Spam Free Email http://www.ridmail.com
MicroEmacs http://uemacs.tripod.com
Korn Shell Web http://dfrench.tripod.com/kshweb.html
| |
| SAP BASIS Consultant 2004-10-04, 6:01 pm |
| Seb <seb@foo.com> wrote in message news:<XND6d.21508$QJ3.5548@newssvr21.news.prodigy.com>...
> On 2004-09-29, Seb <seb@foo.com> wrote:
>
> hrm, sorry:
>
> awk '{print $1}' foo | sort -u | while read name ; do # find unique names
> awk -v name=$name '$0 ~ name {s5+=$5 ; s6+=$6} END {print name, s5, s6}' foo
> done
>
> --Seb
Thanks to Seb and all others who took the time to respond.
I used Seb's solution, and it works well.
SAP BASIS Consultant
| |
|
| On 2004-10-04, SAP BASIS Consultant <basis_consultant@hotmail.com> wrote:
>
>
> Thanks to Seb and all others who took the time to respond.
> I used Seb's solution, and it works well.
Hehe, except that my solution is ugly, while Chris's one is nicer and
shorter...
--Seb
|
|
|
|
|