Unix Shell - Lookup between 2 files ( korn shell )


Author Lookup between 2 files ( korn shell )
pawan_test

2006-01-29, 9:31 pm

Hello

Thanks guys for your suggestions;

FILE1=/home/pavi/DS.txt
FILE2=/home/pavi/chain.txt

awk -F ":" '{print $1}' $FILE1 > f1.txt
awk -F " " '{print $2}' $FILE2 > f2.txt


DS.txt contains:

ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt

chain.txt contains:

ECK: ECK: 331
ECK: ECK: 336
FED :FED :427
FIE :FIE: 212
FLN :FLN: 140
FLN :FLN :141
FMF: FMF: 183
FR1 :FR1: 303
FTT :FTT :423
GAT :GAT: 185
GEN :GEN: 396
GFY :GFY: 268
GIA :GIA: 228
GMD :GMD: 264
MIN: MIN: 158
MRO :MRO: 354
MY0 :MY0: 315
NCS :NCS: 390
OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
OSC ALBD 380
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247

result.txt has to look like this:


OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247

The script has to match the 1st column of DS.txt against the 2nd column
of chain.txt and, for the records that match, write the corresponding
1st, 2nd and 3rd columns of chain.txt to an output file, say result.txt.

This is what I am trying to accomplish.

Can anyone please help me?

tx
pavi

base60

2006-01-29, 9:31 pm

pawan_test wrote:
> Hello
>
> Thanks guys for your suggestions;


Read the part concerning associative arrays.
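
As a minimal sketch of the associative-array idea (the sample rows are copied from the question; the scratch directory and the names workdir, seen and result are assumptions made for illustration), awk can remember the DS.txt keys in an array and then test each chain.txt record against it:

```shell
# Work in a scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question (abbreviated).
cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
ECK: ECK: 331
FLN :FLN: 140
OSC :ALBD: 377
MAT: WLG :247
EOF

# seen[] is an associative array indexed by the first field of DS.txt.
# NR == FNR is true only while the first file is being read.
result=$(awk -F':' '
    NR == FNR { seen[$1] = 1; next }   # first file: remember each key
    {
        key = $2
        gsub(/ /, "", key)             # strip blanks around the key
        if (key in seen) print         # second file: look the key up
    }' DS.txt chain.txt)

printf '%s\n' "$result"
```

The records come out in chain.txt order, and the blanks are stripped from a copy of the field so the original record is printed unchanged.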


pawan_test

2006-01-29, 9:31 pm

Hello.,

I tried with
awk -F ":" '{print $1}' $FILE1 > f1.txt
awk -F " " '{print $2}' $FILE2 > f2.txt

but I did not know how to get the entire record from chain.txt for
the lines that match. This is where I am having trouble.

thanks
pavi

Janis Papanagnou

2006-01-29, 9:31 pm

pawan_test wrote:
> Hello
>
> Thanks guys for your suggestions;
>
> FILE1=/home/pavi/DS.txt
> FILE2=/home/pavi/chain.txt
>
> awk -F ":" '{print $1}' $FILE1 > f1.txt
> awk -F " " '{print $2}' $FILE2 > f2.txt


Modulo the order of your output data this program will do...

awk -F ":" '
NR==FNR{k[$1]}
NR!=FNR{x=$0;gsub(/ /,"",$2);if($2 in k)print x}' DS.txt chain.txt

...though I'd rather suggest fixing the formatting of the original
data in the first place; it would make a solution much simpler.
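
To illustrate the "fix the data first" idea, here is a minimal sketch. It assumes it is acceptable for result.txt to hold the normalized records rather than the original spacing; the file name chain_clean.txt and the scratch directory are made up for this example.

```shell
# Scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
FED :FED :427
FLN :FLN :141
OSC: ALBD: 379
WLG :WLG: 245
EOF

# Step 1: collapse "blanks, colon, blanks" into a bare colon.
sed 's/ *: */:/g' chain.txt > chain_clean.txt

# Step 2: with clean data a plain -F: lookup is enough.
awk -F: 'NR == FNR { k[$1]; next } $2 in k' DS.txt chain_clean.txt > result.txt

cat result.txt
```

Once the separators are uniform, the lookup no longer needs any per-record cleanup.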

Janis



Ed Morton

2006-01-29, 9:31 pm

pawan_test wrote:
> Hello
>
> Thanks guys for your suggestions;
>
> FILE1=/home/pavi/DS.txt
> FILE2=/home/pavi/chain.txt
>
> awk -F ":" '{print $1}' $FILE1 > f1.txt
> awk -F " " '{print $2}' $FILE2 > f2.txt


This will output the records you want but not in the order you show.
It'll use the same order the records appear in chain.txt:

awk -F' *: *' 'NR==FNR{k[$1];next}$2 in k' DS.txt chain.txt

It's not obvious how you're ordering your output - if it's important,
let us know the sort criteria.

By the way, did you deliberately have one record with no colons
(OSC ALBD 380)?
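
On the ordering question: the output shown in the original post looks like DS.txt order, with chain.txt order kept inside each key. That is a guess, not something the poster confirmed. Under that assumption, a sketch that reads chain.txt first and lets DS.txt drive the output could look like this (the array name rows and the scratch directory are illustrative):

```shell
# Scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
FLN :FLN: 140
GAT :GAT: 185
OSC :ALBD: 377
OSC: ALBD: 379
WLG :WLG :243
MAT: WLG :247
EOF

# Note the file order: chain.txt is read first, DS.txt second.
awk -F' *: *' '
    NR == FNR {
        # Collect every chain.txt record under its 2nd-column key.
        if ($2 in rows) { rows[$2] = rows[$2] "\n" $0 }
        else            { rows[$2] = $0 }
        next
    }
    # DS.txt: print the collected records in the order the keys appear here.
    $1 in rows { print rows[$1] }
' chain.txt DS.txt > result.txt

cat result.txt
```

A key listed twice in DS.txt would print its records twice, so this assumes the DS.txt keys are unique.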

Ed.

Xicheng

2006-01-29, 9:31 pm

Ed Morton wrote:
> pawan_test wrote:
> This will output the records you want but not in the order you show.
> It'll use the same order the records appear in chain.txt:
>
> awk -F' *: *' 'NR==FNR{k[$1];next}$2 in k' DS.txt chain.txt


perl -F: -ane '$h{$F[0]}=1 and next if @ARGV;print if exists
$h{"@{[$F[1]=~/(\S+)/]}"}' DS.txt chain.txt

Xicheng

> It's not obvious how you're ordering your output - if it's important,
> let us know the sort criteria.
>
> By the way, did you deliberately have one record with no colons
> (OSC ALBD 380)?
> Ed.


pawan_test

2006-01-29, 9:31 pm

Hello Ed.,

1) All the records are colon (:) delimited (OSC :ALBD: 380). I think I
forgot to add the colons to this record when I posted my question.

2) Whenever there is a match, all the corresponding records
from chain.txt have to be moved to the result.txt file.

The result.txt will look like this:

OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
OSC ALBD 380
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247

thanks ed.,
pavi

hq00e

2006-01-29, 9:31 pm

pawan_test wrote:
> Hello
>
> Thanks guys for your suggestions;
>
> FILE1=/home/pavi/DS.txt
> FILE2=/home/pavi/chain.txt
>
> awk -F ":" '{print $1}' $FILE1 > f1.txt
> awk -F " " '{print $2}' $FILE2 > f2.txt


2-pass sed script,

$ sed 's@\(.*\):.*@/: *\1 *:/p@' $FILE1 > f1
$ sed -nf f1 $FILE2 >result.txt

Here is a one-liner (without creating the temp file 'f1'); it relies on
process substitution, so it needs a shell that supports <(...), e.g.
bash or ksh93:

$ sed -nf <(sed 's@\(.*\):.*@/: *\1 *:/p@' $FILE1) $FILE2
FLN :FLN: 140
FLN :FLN :141
OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247
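
To make the two passes easier to follow, this sketch prints the intermediate file f1 before using it. The sample rows are an abbreviated copy of the data in the question and the scratch directory is an assumption for the example.

```shell
# Scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
ECK: ECK: 331
FLN :FLN :141
OSC: ALBD: 379
MAT: WLG :247
EOF

# Pass 1: turn each "KEY:rest" line of DS.txt into the sed command
#         /: *KEY *:/p  (print records where KEY sits between two colons).
sed 's@\(.*\):.*@/: *\1 *:/p@' DS.txt > f1
cat f1

# Pass 2: run the generated commands against chain.txt; -n suppresses
#         the default output so only the matching records are printed.
sed -n -f f1 chain.txt > result.txt
cat result.txt
```

Because the key has to sit between two colons, only the 2nd column can match. Keys containing regular-expression metacharacters would need escaping first, which this sketch does not attempt.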

--
Regards,
hq00e

Ed Morton

2006-01-29, 9:31 pm

pawan_test wrote:
> Hello Ed.,
>
> 1) all the records are colon (:) delimited. (OSC :ALBD: 380) i think i
> missed to add a colon for this record when i posted my question.


They're more than colon-delimited; they're colon plus white-space
delimited, otherwise " ALBD" would have to be treated as different from
"ALBD" or "ALBD ".

> 2) whenever there is a match, then all those corresponding records
> from chain.txt have to be moved to the result.txt file
>
> the result.txt will look like this
>
> OSC :ALBD: 377
> OSC :ALBD :378
> OSC: ALBD: 379
> OSC ALBD 380


Ahem. Still no colons above

> FLN :FLN: 140
> FLN :FLN :141
> WLG :WLG :243
> WLG: WLG: 244
> WLG :WLG: 245
> WLG :WLG :246
> MAT: WLG :247


So take what I posted previously and add > result.txt to redirect the
output.
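
If the colon-less record (OSC ALBD 380) really is meant to match as well, one option is to treat any run of blanks and colons as the field separator. This is only a sketch under that assumption, and it also assumes that no record starts with a blank; the scratch directory is made up for the example.

```shell
# Scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
GIA :GIA: 228
OSC :ALBD: 377
OSC ALBD 380
MAT: WLG :247
EOF

# FS is a regular expression: one or more blanks and/or colons.
# The same separator also splits "ALBD:xxxxx" in DS.txt correctly.
awk -F'[ :]+' 'NR == FNR { k[$1]; next } $2 in k' DS.txt chain.txt > result.txt

cat result.txt
```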

Ed.

pawan_test

2006-01-30, 8:42 am

Hi ALL.,

THANKS for all your responses.
it seems to be working if the DS.txt is in this format
ALBD:xxxxx
FLN:yyyyyyy
WLG:ttttttt

i am getting the desired output i.e
OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247


but when i modified the DS.txt as;

ALBD:xxxxx
FLN:yyyyyyy
GIE:ttttttt ( here there is no match)
WLG:ttttttt

then the desired output is repeated twice. result.txt is:

OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247
OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247

the code that i tried is;
for item in `cat $FILE1 | awk -F: '{print $1}'`
do
grep -w $item $FILE2 >>result_file.txt
done

can anyone please give me a suggestion;

thanks
pavi

Ed Morton

2006-01-30, 5:56 pm

pawan_test wrote:

<snip>
> the code that i tried is;
> for item in `cat $FILE1 | awk -F: '{print $1}'`
> do
> grep -w $item $FILE2 >>result_file.txt
> done
>
> can anyone please give me a suggestion;


Yes, pick a different solution. The above is bad for many reasons,
including UUOC (useless use of cat), missing quotes, and not
initializing the output file.
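
For comparison only, here is a sketch of the same loop with those three points addressed. A likely cause of the doubled output is that ">>" appends to a result file left over from an earlier run, though the thread does not confirm it. The function name lookup and the scratch directory are hypothetical, and grep -w still matches the key anywhere on the line, not just in the 2nd column.

```shell
# Scratch directory so the example is self-contained (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > DS.txt <<'EOF'
ALBD:xxxxx
FLN:yyyyyyy
GIE:ttttttt
WLG:ttttttt
EOF

cat > chain.txt <<'EOF'
FLN :FLN: 140
GIA :GIA: 228
OSC :ALBD: 377
WLG :WLG :243
EOF

FILE1=DS.txt
FILE2=chain.txt

lookup() {
    # "while read" replaces the cat | awk pipeline; IFS=: splits off the key.
    while IFS=: read -r item _rest
    do
        # Quoted expansions; "|| true" because a key without a match is not an error.
        grep -w -- "$item" "$FILE2" || true
    done < "$FILE1" > result_file.txt   # ">" truncates the file on every run
}

lookup
lookup    # a second run overwrites the file instead of appending to it
cat result_file.txt
```

This keeps the DS.txt order of the keys, but it still runs one grep per key, so the single-pass awk solutions above remain the better choice for large files.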

Ed.

Andreas Ferrari

2006-01-30, 5:56 pm

Do it with a Perl one-liner...

regards


romy

2006-01-31, 2:48 am

awk -F: '{ print $1 }' DS.txt > f1.txt

Now write a small script:
for strings in `cat f1.txt`
do
grep "$strings" chain.txt
done

The result is :
OSC :ALBD: 377
OSC :ALBD :378
OSC: ALBD: 379
OSC ALBD 380
FLN :FLN: 140
FLN :FLN :141
WLG :WLG :243
WLG: WLG: 244
WLG :WLG: 245
WLG :WLG :246
MAT: WLG :247


Copyright 2003 - 2008 webservertalk.com