WebSphere Edge Server - Edge v5.0 NAT Problem


Author Edge v5.0 NAT Problem
Salman Moghal

2004-01-19, 3:03 pm

I have set up Edge Server (v5.0) to load balance two LDAP servers. The LDAP
servers are on a different subnet than the load balancer (LB), so I use the
LB's NAT feature. The cluster IP is also properly configured, i.e. there is
a separate return address for each LDAP server. These return addresses
belong to the same VLAN as the load balancer. I can ping the cluster IP,
which shows that the LB is adding the IP to the ARP table (doing an
arp -a lists the IP for the correct VLAN IFC).

Now the problem is that if I make consecutive connections to the cluster IP
on port 389, only alternate requests go through. By that I mean that out of
4 connect requests, only 2 will work, and the failures alternate with the
successes. So a connect to 389 on the cluster IP will succeed for the first
request, fail for the second, succeed for the third, and so on.

Has anyone experienced this before? Any pointers or help will be
appreciated.

TIA.
Salman Moghal


JLee

2004-01-19, 3:03 pm

Salman,

Sounds like the connection to one of the LDAP servers isn't working
(isn't getting there or the server isn't responding correctly). Can you
connect directly to both of the LDAP servers? Can each of those
boxes ping the return address on the LB machine? If you don't have
the manager running, then LB is doing straight roundrobin (default) so
every other connection failing means one of the 2 servers is failing.
You could run one connection, check 'server report :389:' and see which
server it went to, then run a second connection, check 'server report'
again and see where that one went. Which server is it failing on?

Things to check...

Jeff
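
The round-robin reasoning above can be illustrated with a small sketch. This is only a toy model, not Load Balancer code: the server names ldap1/ldap2 and the function simulate_round_robin are hypothetical, and "reachable" simply stands in for "the connection and its return path work".

```python
from itertools import cycle


def simulate_round_robin(servers, reachable, n_requests):
    """Return (server, succeeded) for each request under plain round-robin."""
    order = cycle(servers)
    results = []
    for _ in range(n_requests):
        server = next(order)
        # A request succeeds only if the chosen server is reachable.
        results.append((server, server in reachable))
    return results


# Hypothetical names; ldap2 stands in for the server that is failing.
outcome = simulate_round_robin(["ldap1", "ldap2"], {"ldap1"}, 4)
for server, ok in outcome:
    print(server, "ok" if ok else "FAILED")
```

With two servers and one of them failing, every second request fails, which matches the alternating pattern described in the original post.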

Salman Moghal

2004-01-19, 3:03 pm

Hi Jeff. Thanks for your response. I was able to fix the problem: only one
of the two return addresses was in the ARP table. Apparently, when Edge
Server starts, it calls the goActive.cmd script. That script, along with
the other go*.cmd scripts, had only one return address listed in it. So I
added the second LDAP return address, and all the requests are now being
routed properly. Thanks for the hint.
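
A check like the one that uncovered this could be scripted. The sketch below is a hedged example only: the addresses and the sample arp -a text are invented for illustration, the exact arp -a layout varies by platform, and missing_addresses is a hypothetical helper, not part of Edge Server.

```python
import re

# Matches dotted-quad IPv4 addresses; MAC addresses contain no dots,
# so they are not picked up.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def missing_addresses(arp_output, expected):
    """Return the expected IPs that do not appear in the ARP listing."""
    present = set(IPV4.findall(arp_output))
    return [ip for ip in expected if ip not in present]


# Invented sample output: one cluster IP and only one return address present.
sample = """\
Interface: 10.0.0.5 --- 0x2
  Internet Address      Physical Address      Type
  10.0.0.10             00-11-22-33-44-55     static
  10.0.0.21             00-11-22-33-44-55     static
"""
expected = ["10.0.0.10", "10.0.0.21", "10.0.0.22"]
print(missing_addresses(sample, expected))
```

In this made-up listing the second return address (10.0.0.22) is reported as missing, which is the situation described above.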

Now I have another question regarding the go*.cmd scripts. Is there a
reference manual for how the Load Balancer calls the go*.cmd scripts and the
command line parameters that get passed to them? The reason for asking is
that I have two Edge Servers configured in HA mode. The sequence in which
the 2 LBs are brought up is crucial (or so I found). The standby LB should
be brought up first, and then the primary. If it's done the other way
around, the return addresses of the 2 LDAP servers get assigned to the
standby MAC address (on both primary and standby machines). However, when
the proper start-up sequence is followed, both servers work well in HA mode.
I emulated a failure of the primary by unplugging the network cable, and the
standby takes over when the next request is received. So that part is
working great. The start-up sequence problem could be fixed with some
conditional logic inside the .cmd script. That's why I wanted to know
whether Edge Server passes any arguments to this script. Based on such an
argument, the goActive.cmd script could be made more intelligent, so that
ARP entries are only re-assigned when the primary host is unreachable.

Any comments will be appreciated.

TIA
Salman Moghal




JLee

2004-01-19, 3:03 pm

Salman,

In LB 5.0, the single parameter passed to the go* scripts is the
"primaryhost" IP that is changing status. So "goActive IP1" would be
called when the clusters assigned to IP1 are going active. The mutual
HA scripts just need to check which IP was passed in and then have the
correct cluster/return IPs in that conditional block that correctly
match that input IP. The sample scripts should be a good starting point
for that.
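
The conditional-block idea can be sketched as a lookup keyed on the IP passed in. This is a Python illustration of the logic only, not a go* script: all addresses are hypothetical, and addresses_to_alias is an invented helper. The shipped sample scripts remain the reference for the real commands.

```python
# Hypothetical addresses: each primaryhost IP maps to the cluster and return
# addresses that belong in that IP's conditional block.
ADDRESSES_BY_PRIMARY = {
    "10.0.0.5": {"clusters": ["10.0.0.10"],
                 "returns": ["10.0.0.21", "10.0.0.22"]},
    "10.0.0.6": {"clusters": ["10.0.0.11"],
                 "returns": ["10.0.0.23", "10.0.0.24"]},
}


def addresses_to_alias(primaryhost):
    """Return the addresses to alias when this primaryhost goes active."""
    block = ADDRESSES_BY_PRIMARY.get(primaryhost)
    if block is None:
        # An unexpected IP should fail loudly rather than alias nothing.
        raise ValueError(f"unknown primaryhost {primaryhost!r}")
    return block["clusters"] + block["returns"]


print(addresses_to_alias("10.0.0.5"))
```

Keeping every cluster and return address for a given primaryhost in one place makes it harder to leave one out of a block, which is the kind of mistake described earlier in the thread.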

The order of startup SHOULD not matter -- not to say it won't ever, but
that it shouldn't. The only time problems might occur is when your
router or servers don't accept the gratuitous arps back to back (i.e.
both LBs grat arp the cluster IP within 10 seconds of each other and the
router always keeps the first one in the table -- and doesn't process
the second "final" arp). Other than that, the only problem would be
with incorrect scripts that are aliasing IPs at the wrong time (i.e.
putting them in the wrong conditional block).

Jeff
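
The back-to-back gratuitous ARP case can be pictured with a toy model. This is an assumption-laden sketch, not a description of any particular router: HoldDownArpCache is hypothetical, and it simply ignores a second update for the same IP that arrives within a hold window (10 seconds here, taken from the figure mentioned above).

```python
class HoldDownArpCache:
    """Toy ARP cache that ignores updates arriving too soon after the last."""

    def __init__(self, hold_seconds=10.0):
        self.hold_seconds = hold_seconds
        self.entries = {}  # ip -> (mac, time the entry was accepted)

    def gratuitous_arp(self, ip, mac, now):
        """Apply a gratuitous ARP; return True if the cache accepted it."""
        current = self.entries.get(ip)
        if current is not None and now - current[1] < self.hold_seconds:
            return False  # still inside the hold window: update dropped
        self.entries[ip] = (mac, now)
        return True

    def lookup(self, ip):
        entry = self.entries.get(ip)
        return entry[0] if entry else None


cache = HoldDownArpCache()
cache.gratuitous_arp("10.0.0.10", "first-lb-mac", now=0.0)
cache.gratuitous_arp("10.0.0.10", "final-lb-mac", now=3.0)  # dropped
print(cache.lookup("10.0.0.10"))
```

In this model the cache still points at the first MAC after both announcements, so traffic would go to the wrong LB until a later ARP is accepted.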

Salman Moghal

2004-01-19, 3:03 pm

Thanks Jeff.




Copyright 2003 - 2008 webservertalk.com