Unix administration - Weird behaviour of traceroute; please help diagnose

Joe D.

2007-09-18, 1:27 pm

Hi all;

I have an issue that has cropped up in the past couple of weeks on one
of our UNIX servers running Solaris 2.8. I suspect one of my network
config files is messed up, but not sure which one. Would appreciate
any troubleshooting assistance. The issue is I cannot resolve a
machine name when fully qualified. Non-fully qualified name resolves
OK (via nslookup, traceroute, but not ping).

Some background:

In our environment, we are providing DNS services in 2 domains: cgnt
and cgh. cgnt is provided by Windows DNS, cgh is provided by n2h and
named on a UNIX server.

My resolv.conf file looked like this (IPs changed to protect the
innocent). The servers listed are indeed running named and the 1st on
the list is the 'master'. There is a blank line at the end, in case
that matters.

domain cgh.org
nameserver 10.1.1.209
nameserver 1xx.2.12.134
nameserver 1xx.2.0.70
nameserver 1xx.2.12.130
{blank line}

/etc/nsswitch.conf contains the following:

# consult /etc "files" only if nis is down.
hosts: files nis dns
ipnodes: files

/etc/hosts contains only the local host, the loghost, and the master
and slave YP servers. None other.

So, if I understand the above correctly, ALL my name service
resolution should be entirely within the cgh domain, correct?
However, a traceroute to the fully qualified domain name of 'rocky'
appends BOTH the cgnt and cgh domains to the lookup, resulting in
going out into the wild for the lookup:

gopher (/) # traceroute rocky.cgh.org
traceroute to rocky.cgh.org.cgnt.org (209.62.20.188), 30 hops max, 40 byte packets
1 10.1.16.3 (10.1.16.3) 0.474 ms 0.385 ms 0.242 ms
2 ^C
......and so on......

I've flushed the name service cache by bouncing nscd, fiddled around
with resolv.conf (copied the production version into place; see
below), and it still doesn't resolve correctly. A reboot clears this up,
but it eventually returns. Fortunately this is a test server, and we
can reboot at will, but if I have something amiss, then I want to make
sure I get it straightened out and double-check production.
Unfortunately, I don't understand what's wrong, and therefore can't
fix it.

Here is the production resolv.conf that didn't work:

domain cgnt.org
nameserver 10.1.1.209
nameserver 1xx.2.12.134
nameserver 1xx.2.0.70
nameserver 1xx.2.12.130
search cgh.org

Any assistance appreciated; I can't figure out why it is not able to
resolve in the cgh domain, or why cgnt.org is being appended to the
name during nslookup.

Thanks in advance....

Joe D.

barville@hotmail.co.uk

2007-09-18, 1:27 pm

On 18 Sep, 16:33, "Joe D." <newbie_from_new...@yahoo.com> wrote:
> Hi all;
>
> I have an issue that has cropped up in the past couple of weeks on one
> of our UNIX servers running Solaris 2.8. I suspect one of my network
> config files is messed up, but not sure which one. Would appreciate
> any troubleshooting assistance. The issue is I cannot resolve a
> machine name when fully qualified. Non-fully qualified name resolves
> OK (via nslookup, traceroute, but not ping).

<snip>
> My resolv.conf file looked like this (IPs changed to protect the
> innocent). The servers listed are indeed running named and the 1st on
> the list is the 'master'. There is a blank line at the end, in case
> that matters.
>
> domain cgh.org
> nameserver 10.1.1.209
> nameserver 1xx.2.12.134
> nameserver 1xx.2.0.70
> nameserver 1xx.2.12.130
> {blank line}
>
> /etc/nsswitch.conf contains the following:
>
> # consult /etc "files" only if nis is down.
> hosts: files nis dns
> ipnodes: files

<snip>

Given that the hosts line reads 'files nis dns' is it possible that
one of your YP servers is getting in the way somewhere? They'll be
queried before the DNS servers with that setup.


Joe D.

2007-09-18, 7:19 pm


>
> Given that the hosts line reads 'files nis dns' is it possible that
> one of your YP servers is getting in the way somewhere? They'll be
> queried before the DNS servers with that setup.
>


Thanks;

I changed that this AM; have not experienced the issue as yet, but
don't know for sure if that is the ticket. Does anyone else have any
other suggestions/tips?


Moe Trin

2007-09-20, 1:27 am

On Tue, 18 Sep 2007, in the Usenet newsgroup comp.unix.admin, in article
<1190129603.121448.247310@q3g2000prf.googlegroups.com>, Joe D. wrote:

>I have an issue that has cropped up in the past couple of weeks on one
>of our UNIX servers running Solaris 2.8. I suspect one of my network
>config files is messed up, but not sure which one.


<quote> What did you change? </quote>

>The issue is I cannot resolve a machine name when fully qualified.
>Non-fully qualified name resolves OK (via nslookup, traceroute,
>but not ping).


I think there is also a definition problem here. What exactly do you
define as fully qualified names versus non-fully qualified names? A
statement below indicates you don't mean what ISC means by the term.

>My resolv.conf file looked like this (IPs changed to protect the
>innocent). The servers listed are indeed running named and the 1st on
>the list is the 'master'. There is a blank line at the end, in case
>that matters.
>
>domain cgh.org
>nameserver 10.1.1.209
>nameserver 1xx.2.12.134
>nameserver 1xx.2.0.70
>nameserver 1xx.2.12.130
>{blank line}


Caveat: I'm not running Solaris at the moment, and don't have access to
a Solaris set of man pages.

man 5 resolver

There are two items here. On the resolver code I use, only MAXNS
(see <resolv.h>) nameserver lines are honoured; any beyond that are
ignored. Typically, MAXNS is three, and you list four. Second, have
you looked at the function of the 'domain' (and 'search') directives?
What happens (other than having to use the full name) when you remove
them?
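
To make the MAXNS point concrete, here is a minimal Python sketch that reads resolv.conf-style text and reports which nameserver lines would be used and which would be dropped. It assumes MAXNS is 3, as is typical; the function name effective_nameservers is made up for illustration, and the real limit on Solaris 2.8 should be checked in <resolv.h>.

```python
MAXNS = 3  # assumed typical value; check <resolv.h> on the box in question


def effective_nameservers(resolv_conf_text, maxns=MAXNS):
    """Return (used, ignored) nameserver lists from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    # Only the first maxns entries are kept; later ones are silently dropped.
    return servers[:maxns], servers[maxns:]


# The test-server file from the original post (addresses masked as posted).
sample = """\
domain cgh.org
nameserver 10.1.1.209
nameserver 1xx.2.12.134
nameserver 1xx.2.0.70
nameserver 1xx.2.12.130
"""

used, ignored = effective_nameservers(sample)
print("used:   ", used)
print("ignored:", ignored)
```

With the file as posted, the fourth nameserver line ends up in the ignored list.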

>/etc/nsswitch.conf contains the following:
>
># consult /etc "files" only if nis is down.
>hosts: files nis dns


And that comment isn't how that line works. See if you have an
nsswitch.conf man page. For '"files" only if NIS is down', that line
needs to read

hosts: nis files dns
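
A quick way to see the order the sources are tried in is to read the hosts line left to right. The Python sketch below does just that; hosts_lookup_order is a made-up helper name, and it simply skips any bracketed [STATUS=action] criteria rather than interpreting them, so treat it as an illustration and not a full nsswitch.conf parser.

```python
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources on the 'hosts:' line in the order they are consulted."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            spec = line[len("hosts:"):]
            # Skip [STATUS=action] criteria; this sketch does not interpret them.
            spec = re.sub(r"\[[^\]]*\]", " ", spec)
            return spec.split()
    return []


original = '# consult /etc "files" only if nis is down.\nhosts: files nis dns\nipnodes: files\n'
suggested = "hosts: nis files dns\n"

print(hosts_lookup_order(original))   # files is consulted first here
print(hosts_lookup_order(suggested))  # nis is consulted first here
```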

>SO, if I understand the above correctly, ALL my name service
>resolution should be entirely within the cgh domain, correct?


No

>However, a traceroute to the fully qualified domain name of 'rocky'
>appends BOTH the cgnt and cgh domains to the lookup, resulting in
>going out into the wild for the lookup:


'rocky' is not a _fully_qualified_domain_name_ at all. It's a short
name. An FQDN might be 'rocky.cgh.org' - that is, it contains both
the host and DNS Domain parts. "qualification" has nothing to do with
your definition in a local domain. When you try to use the short name,
those search and domain lines in /etc/resolv.conf alter the name that
is searched for - possibly in a way that is not desirable. Certainly
there are times when those directives are a massive security problem.
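
As an illustration of how a search list alters the name that is actually queried, here is a Python sketch of the candidate ordering documented for BIND-derived resolvers (the 'ndots' option, default 1). Whether the Solaris 2.8 resolver follows exactly this ordering is an assumption, and candidate_names is a made-up helper, not a library call.

```python
def candidate_names(name, search_list, ndots=1):
    """List the names a BIND-style resolver would try, in order.

    Assumed behaviour: a trailing dot means "absolute, do not append
    anything"; otherwise the name is tried as-is first when it has at
    least ndots dots, and after the search-list expansions when it has fewer.
    """
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}." for domain in search_list]
    as_is = name + "."
    if name.count(".") >= ndots:
        return [as_is] + expanded
    return expanded + [as_is]


print(candidate_names("rocky", ["cgh.org"]))
print(candidate_names("rocky.cgh.org", ["cgnt.org"]))
print(candidate_names("rocky.cgh.org.", ["cgnt.org"]))
```

With cgnt.org on the search list, the second call includes rocky.cgh.org.cgnt.org. among its candidates, which is the name that shows up in the traceroute output. The third call shows that a trailing dot keeps the search list out of the lookup altogether.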

>gopher (/) # traceroute rocky.cgh.org
>traceroute to rocky.cgh.org.cgnt.org (209.62.20.188), 30 hops max, 40
>byte packets


Wowser!

[compton ~]$ host 209.62.20.188
188.20.62.209.IN-ADDR.ARPA domain name pointer ev1s-209-62-20-188.ev1servers.net
[compton ~]$ host rocky.cgh.org
rocky.cgh.org has address 1xx.2.1.250
[compton ~]$

Well, that is quite obviously fscked up beyond belief. I have _no_ idea
how your resolver decided that we need to go to ev1servers.net. At least
the nameserver accessible to me has clue. I can't guess what the "real"
hostname might be to that ev1servers host (typically, A and PTR records
should match, but all bets are off at a hosting service), but do you
have some hosts there? Could this be caused by a wild-card record in
your DNS zonefiles?

>I've flushed the name service cache by bouncing nscd, fiddled around
>with resolv.conf (copied the production version into place; see
>below), and still doesn't resolve correctly. A reboot clears this up,
>but it eventually returns.


Without knowing what your NIS looks like, it's hard to say. Also, how
the cache may be contaminated by those 'search' and 'domain' options
is open to question.

>Here is the production resolv.conf that didn't work:
>
>domain cgnt.org
>nameserver 10.1.1.209
>nameserver 1xx.2.12.134
>nameserver 1xx.2.0.70
>nameserver 1xx.2.12.130
>search cgh.org


There's one answer - man 5 resolver - and lose either the search or
domain term.

The domain and search keywords are mutually exclusive. If more than
one instance of these keywords is present, the last instance wins.

Those options are meant to help your lusers when they are too lazy to
type the first dot and beyond in a hostname in a network application.
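
The "last instance wins" rule quoted above can be sketched in a few lines of Python. This is only an illustration of that rule as stated in the man page text; effective_search_list is a made-up name, and the actual Solaris 2.8 resolver may have other inputs that this ignores.

```python
def effective_search_list(resolv_conf_text):
    """Apply the 'last domain/search instance wins' rule to resolv.conf-style text."""
    search = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "domain" and len(fields) >= 2:
            search = [fields[1]]   # 'domain' replaces whatever came before
        elif fields[0] == "search":
            search = fields[1:]    # so does 'search'
    return search


test_conf = "domain cgh.org\nnameserver 10.1.1.209\n"
production_conf = "domain cgnt.org\nnameserver 10.1.1.209\nsearch cgh.org\n"

print(effective_search_list(test_conf))
print(effective_search_list(production_conf))
```

Under that rule alone, the 'domain cgnt.org' line in the production file is overridden by the later 'search cgh.org' line.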

>Any assistance appreciated; I can't figure out how it is not able to
>resolve in the cgh domain, or why cgnt.org is being appended to the
>resolve during nslookup.


The other thing to look at is 'nslookup' (which may have its own
configuration file).

[compton ~]$ whatis nslookup
nslookup (8) - query Internet name servers interactively
[compton ~]$

Doesn't say a damn thing about hosts files or NIS servers - which it
doesn't know about anyway. Tools like 'dig', 'dnsquery', 'host' and
'nslookup' query name servers and name servers only. They are not even
aware of /etc/host.conf or /etc/nsswitch.conf. That's not how the rest
of your applications look up names: 'ping', 'traceroute' and friends go
through the system resolver library, which does follow nsswitch.conf.
Another tool that might come in handy is a common packet sniffer.
snoop, Ethereal (now called Wireshark) or even the classic 'tcpdump'
should show what's in the packets going back and forth between this
box, the NIS and DNS servers.
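
If there is a Python interpreter on the box, a small sketch like the one below asks the C library's resolver for a name, which is the same path ordinary applications take, so the answer can be compared with what nslookup reports. system_lookup is a made-up helper; which sources the library actually consults depends on the local name service configuration, and that part is an assumption here.

```python
import socket


def system_lookup(name):
    """Resolve a name through the C library's resolver and return its IPv4 addresses."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); keep unique addresses.
    return sorted({info[4][0] for info in infos})


# A literal address needs no lookup at all, so this runs anywhere.
print(system_lookup("127.0.0.1"))

# On the affected host one would compare, for example:
#   system_lookup("rocky.cgh.org")    # hostname taken from the thread
#   system_lookup("rocky.cgh.org.")   # trailing dot, no search list applied
```

If the two commented-out calls return different addresses, the search list is being applied to the dotted name before the lookup goes out.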

Old guy