Closed Bug 64857 Opened 24 years ago Closed 22 years ago

Conn: Mozilla does not recognize DNS server changes (DHCP)

Categories

(Core :: Networking, defect, P1)

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: brian, Assigned: gordon)

References

Details

(Keywords: relnote, Whiteboard: patch [adt2])

Mozilla does not seem to recognize when DHCP changes the computer's DNS server.

I've got a laptop running Mandrake Linux 7.2. Working in the office and I'm
running Mozilla (01-06-01 build). Networking (including DNS server) is
configured using DHCP. Time to go home, so I suspend my laptop and head out
(leaving Mozilla open).

When I get home, I plug into my home network, and do a release/renew DHCP to get
a home IP address. This changes resolv.conf and sets up DNS to use my home DNS
server. I'm able to check email, access telnet/ssh, etc. fine using the home
DNS, but Mozilla doesn't seem to recognize that the DNS server has changed, so
name lookups fail. I have to quit and restart Mozilla in order to use it.
Looked for a roaming bug but couldn't find one to match this bug, so I am going to
go ahead and mark it NEW.
Status: UNCONFIRMED → NEW
Ever confirmed: true
--> dns 
Assignee: neeti → darin
what version of glibc are you using?  thx!
ls /lib/libc*
glibc-2.1.3 is the version.

blanders@newvaio:~/tmp/mozilla-01.13.00/package> ls -l /lib/libc*
lrwxrwxrwx    1 root     root           13 Nov  4 18:33 /lib/libc.so.6 ->
libc-2.1.3.so*  
-rwxr-xr-x    1 root     root         910k Oct  4 12:26 /lib/libc-2.1.3.so* 
Target Milestone: --- → Future
Also reproduced on FreeBSD 4-STABLE with mozilla-0.8 built from source.
FreeBSD doesn't use glibc.  Must exit and restart mozilla to enable
DNS queries.

  uname -a
  FreeBSD scott.renfro.org 4.3-BETA FreeBSD 4.3-BETA #5:
  Sat Mar 10 16:45:03 PST 2001
  scott@renfro.org:/usr/obj/usr/src/sys/SCOTT-Z505LS  i386

I looked, but didn't see any sethostint()/endhostent() calls, which may
have explained the behavior.
argh.  s/sethostint/sethostent/
Scott: Thank you for pointing me to sethostent/endhostent... I was not familiar
with these methods.  I just assumed that the underlying netdb implementation
would have to handle changes to /etc/resolv.conf automatically.  But, now with
a way to tell DNS to restart, we should be set!
Status: NEW → ASSIGNED
Keywords: nsbeta1
Target Milestone: Future → mozilla0.9
Looks like NSPR doesn't export any equivalent to sethostent/endhostent...
Sorry about the confusion; I meant that sethostent()/endhostent()
could *cause* this kind of problem, not resolve it.

I've since confirmed that this behavior is a 'feature' of the
FreeBSD libc, and apparently glibc as well.  Neither stat(2)'s
/etc/resolv.conf to detect changes, so the list of nameservers
is fixed upon the first resolver call by the application.

There is a thread on libc-alpha that discusses
this issue[1] and concludes that libc shouldn't punish all callers
with a stat(2) on every resolver call.

The only options, then, are to a) retain the current
behavior (i.e., restart the application on changes
to resolv.conf) or b) have mozilla detect
resolv.conf changes and trigger a rescan.

The initial scan of resolv.conf is performed by res_init(3).
It appears that unsetting the RES_INIT bit in
_res.options, then calling res_init(3) would perform a rescan,
and that we could perform this either on a detected change
to resolv.conf (or just periodically).  I have not yet
had a chance to actually test this, however.

When I get some time, I'll look at the res_init(3) code and
test the rescan behavior, then post an update.

We'd also, obviously, have to ensure that any such code was
portable or only used where supported.


[1] http://sources.redhat.com/ml/libc-alpha/2001-01/msg00077.html
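
Option b) above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the Mozilla tree: the helper names resolv_conf_changed() and rescan_if_changed() are hypothetical, the path to resolv.conf is passed in by the caller because its location is platform-dependent, and a change in the file's mtime is treated as "the file changed".

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <time.h>

/* Hypothetical helper: compare the file's mtime against *last_mtime.
 * Returns 1 if it differs (and stores the new value), 0 if unchanged,
 * -1 if the file cannot be stat'ed (*last_mtime is left untouched). */
int resolv_conf_changed(const char *path, time_t *last_mtime)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (st.st_mtime == *last_mtime)
        return 0;
    *last_mtime = st.st_mtime;
    return 1;
}

/* Hypothetical helper: re-read the resolver configuration only when the
 * file's mtime has changed.
 * Returns 1 if a rescan was performed, 0 if nothing changed,
 * -1 if stat() or res_init() failed. */
int rescan_if_changed(const char *path, time_t *last_mtime)
{
    int changed = resolv_conf_changed(path, last_mtime);

    if (changed != 1)
        return changed;
    return res_init() == 0 ? 1 : -1;
}
```

A caller would keep one time_t (initialised to a value such as (time_t)-1 meaning "never checked") and call rescan_if_changed("/etc/resolv.conf", &mtime) before a lookup or on a timer. Some platforms need -lresolv at link time.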

I think we might solve 99% of people's concerns by doing the rescan on a toggle
of the offline button.
I verified that simply calling res_init() is sufficient to update the ns address
table from resolv.conf.

Tested on FreeBSD 4-STABLE, RedHat 6.2 and 7.0, and Solaris 2.7.  (On Solaris,
must use -lresolv).  Of course needs some configure bits as well.
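
For anyone who wants to repeat this check, a small sketch along these lines may help. It assumes a BIND-style resolver that exposes the global _res with nscount and nsaddr_list (as glibc and the BSDs do); the function name reload_and_count_nameservers() is made up for illustration, and only IPv4 entries are printed.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <stdio.h>

/* Hypothetical helper: re-read the resolver configuration, print the
 * IPv4 nameservers now held in _res, and return the resolver's
 * nameserver count (or -1 if res_init() fails). */
int reload_and_count_nameservers(void)
{
    int i;

    if (res_init() != 0)
        return -1;

    for (i = 0; i < _res.nscount; i++) {
        char buf[INET_ADDRSTRLEN];

        /* Entries that are not IPv4 are skipped in this sketch. */
        if (_res.nsaddr_list[i].sin_family != AF_INET)
            continue;
        if (inet_ntop(AF_INET, &_res.nsaddr_list[i].sin_addr,
                      buf, sizeof buf) != NULL)
            printf("nameserver %s\n", buf);
    }
    return _res.nscount;
}
```

Calling it once, editing resolv.conf, and calling it again should show whether the table is refreshed on a given platform.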

Agree that doing this in the Offline->Online transition is sufficient
(once you know to toggle it ;-)
Excellent!  Thanks for investigating this ;-)
(Too bad we don't have a reliable XP solution for auto-detecting
when our network connection goes away and comes back.)
Keywords: nsbeta1 → nsbeta1+
Target Milestone: mozilla0.9 → mozilla0.9.1
cc'ing gordon and wtc
Gagan: I've read the description and comments in this bug
report and I have no comments to add.
->taking over to ease his pain... :-)
Assignee: darin → gagan
Status: ASSIGNED → NEW
from mtg w/gagan: move target milestone to 0.9.2
Target Milestone: mozilla0.9.1 → mozilla0.9.2
mass move, v2.
qa to me.
QA Contact: tever → benc
->gordon
Okay...I'll take it.
Assignee: gagan → gordon
Status: NEW → ASSIGNED
Priority: -- → P3
I could use some x-head help on this.  I've pasted a first take at the proposed
patch below, but I need help verifying it's going to build and run on the
various flavors of Unix we target.

I also think the typedefs shouldn't be needed, but I wasn't able to locate a
header file that defined them.

Index: nsDnsService.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/dns/src/nsDnsService.cpp,v
retrieving revision 1.85
diff -r1.85 nsDnsService.cpp
45a46,53
> #if defined(XP_UNIX)
> typedef unsigned long   u_long;
> typedef unsigned int    u_int;
> typedef unsigned short  u_short;
> typedef unsigned char   u_char;
> #include <resolv.h>
> #endif
> 
928a937,941
> #if defined(XP_UNIX)
>     _res.options &= ~RES_INIT;
>     int error = res_init();
>     NS_ASSERTION(error == 0, "res_init() failed");
> #endif
Whiteboard: patch
Gordon,

According to the resolver man page on Solaris 8, you are
supposed to include the following headers:
    #include <sys/types.h>
    #include <netinet/in.h>
    #include <arpa/nameser.h>
    #include <resolv.h>

I think to get those typedefs, you just need to include
<sys/types.h>.

Note that modifying the global variable _res and calling
res_init() is not thread-safe.  The Solaris man page has
this to say:
    These interfaces are unsafe in multithreaded applications.
    Unsafe interfaces should be called only from the main
    thread.
Only one thread can be executing nsDNSService::Init() at a time.  Wouldn't that
be sufficient?  It's highly likely that thread is the main UI thread anyway.
Keywords: patch
It must be called by the main thread only if libresolv.so was
compiled without -D_REENTRANT.  It only matters to a couple of
global variables like 'errno' that are thread-local storage if
compiled with -D_REENTRANT.  Without reading the libresolv
source code, it is safest to follow what the man page says.
Since this has a potential of breaking some Unix platforms, how about we land 
this on the trunk and bake and then bring it in for the 0.9.2 branch if needed? 
Keywords: nsBranch
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Is this a fix for bug 63564, by putting a rescan into the toggle for Linux-only?
What about other platforms?
Summary: Mozilla does not recognize DNS server changes (DHCP) → Conn: Mozilla does not recognize DNS server changes (DHCP)
*** Bug 88144 has been marked as a duplicate of this bug. ***
i will take a look.
Assignee: gordon → dougt
Status: ASSIGNED → NEW
I suck.  back to gordon.
Assignee: dougt → gordon
Target Milestone: mozilla0.9.3 → mozilla0.9.4
*** Bug 89501 has been marked as a duplicate of this bug. ***
*** Bug 93000 has been marked as a duplicate of this bug. ***
+ RELNOTE for NS 6.1:
"
Mozilla does not recognize changes in DNS servers while running. PPP and VPN
users should restart the browser after connecting to a different network.
"
Keywords: relnote
how about if we tried running this code on DNS failure?  we could then retry the
DNS lookup once.  this way, we'd get closer to solving the problem of detecting
changes to /etc/resolv.conf in at least the false error case.  how does this sound?
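
A rough sketch of that retry-once idea, for discussion only. It uses getaddrinfo() as a stand-in for whatever lookup call the DNS service actually makes (that substitution is an assumption), and the function name resolve_with_rescan() is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <string.h>

/* Hypothetical helper: look up `host`; if the lookup fails, re-read the
 * resolver configuration with res_init() and retry exactly once.
 * Returns 0 on success (caller must freeaddrinfo(*out)), otherwise the
 * getaddrinfo() error code of the last attempt. */
int resolve_with_rescan(const char *host, struct addrinfo **out)
{
    struct addrinfo hints;
    int rv;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    rv = getaddrinfo(host, NULL, &hints, out);
    if (rv == 0)
        return 0;

    /* First attempt failed: the nameserver list may be stale. */
    if (res_init() != 0)
        return rv;

    return getaddrinfo(host, NULL, &hints, out);
}
```

Note that the first failed attempt still has to time out before the rescan happens, so this only helps with the second and later lookups.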
Sounds plausible.  How expensive would that be?  Probably not much is my guess.
This would solve the case that users find most annoying and would be a very
welcome change.

It's not too expensive; certainly minuscule compared to the time that the DNS
query took to time out.  The bind code is in lib/resolv/res_init.c in the
function __res_vinit() and just parses the contents of the environment variable
LOCALDOMAIN and the contents of your resolv.conf file (typically just a couple
of lines) into internal data structures.

The general case (including preventing packets from traversing the whole
Internet to get to your old nameserver) is much harder to fix and probably
involves watching resolv.conf (and knowing where it is to watch it and ...)


*** Bug 95218 has been marked as a duplicate of this bug. ***
*** Bug 90913 has been marked as a duplicate of this bug. ***
+mostfreq - this has been a popular one...
Keywords: mostfreq
*** Bug 48094 has been marked as a duplicate of this bug. ***
Why not also stat resolv.conf if we're doing a lookup after no lookups have been
done for a while (say 5 or 10 minutes)?  This would be a minuscule hit, and
would catch the common case of somebody changing networks (which usually takes
most people longer than 10 minutes), etc, before the first failed lookup (and
resulting long timeout period) happens..
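
The idle check suggested here could look something like the sketch below. The struct, the function name dns_note_lookup(), and the 5-minute threshold are all illustrative assumptions; the caller would follow a return value of 1 with a stat of resolv.conf and/or a res_init() call.

```c
#include <time.h>

/* Assumed threshold: recheck after 5 minutes without any lookup. */
#define DNS_IDLE_RECHECK_SECS (5 * 60)

/* Hypothetical bookkeeping: time of the most recent lookup,
 * 0 meaning "no lookup yet". */
struct dns_idle_state {
    time_t last_lookup;
};

/* Record a lookup happening at time `now`.
 * Returns 1 if the previous lookup was at least DNS_IDLE_RECHECK_SECS
 * ago (the resolver configuration should be rechecked first),
 * 0 otherwise, including on the very first lookup. */
int dns_note_lookup(struct dns_idle_state *s, time_t now)
{
    int recheck = (s->last_lookup != 0 &&
                   now - s->last_lookup >= DNS_IDLE_RECHECK_SECS);

    s->last_lookup = now;
    return recheck;
}
```

Because the check only runs after an idle gap, back-to-back lookups during normal browsing pay nothing extra.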
Well, speaking for myself, I work on a Notebook, using dhcp. so if I change my
location in our Area (wired to 802.11, to the other building etc.) my
resolv.conf is rewritten every 3 minutes. Not that there need to be changes, but
dhclient or pump or whatever recreates it with the 'new' data it retrieves.
Target Milestone: mozilla0.9.4 → mozilla0.9.5
Blocks: 99142
Gordon/Gagan - Where are we on this one? This looks like a nice to have, but not
a stop ship. If this is the case, please mark it as nsbranch- for this round.
Keywords: nsbranch → nsbranch+
We don't have a current patch proposal, so I'm changing this to nsbranch-.  It
shouldn't be too hard to get something that will work, but we'll need more bake
time.  This should be doable in the 0.9.5 timeframe.
Keywords: nsbranch+, patch → nsbranch-
er, maybe we can get to it in 0.9.6.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
I have:
500 mhz Indigo iMac, 384 mb ram, OS 9.2.1 and OSX 10.1, 56K internal modem,
CD-RW, Lexmark Z53, Earthlink ISP, dial up service, Mozilla OSX 0.9.4 Build ID:
2001091313

I'm not sure if this is a duplicate or not but Brad Baetz says it is.

The problem on initial launch is reproducible by double clicking Mozilla OSX
0.9.4 with the phone connection in a disconnected state. Mozilla brings the
start page URL into the window, it dials the ISP, connects, then sits there
trying to resolve the host, and eventually says that it can't find it. The
reload button doesn't work at this point. You can click on the URL window and
then return and Mozilla will go to the URL site. I have
http://start.earthlink.net set as my home page. It doesn't seem to be site
dependent however. This can happen on any site that you try to go to from a
state of your phone connection being disconnected.

I have just downloaded and tested Mozilla OSX 0.9.5 ( Fizzilla? ) and the
problem is still there.

Blocks: 107067
Keywords: nsbranch-
This is arguably a performance enhancement, so I'm setting the target for 0.9.7.
Target Milestone: mozilla0.9.6 → mozilla0.9.7
isn't this a duplicate of bug 26718?
Target Milestone: mozilla0.9.7 → mozilla0.9.8
*** Bug 117242 has been marked as a duplicate of this bug. ***
Blocks: 61683
DNS change problem still exists in Mozilla (windows version, build
ID:2001112009), when reconnecting  PPP session from one to another ISP with
different DNS server
Mozilla must be restarted to solve the problem
*** Bug 117613 has been marked as a duplicate of this bug. ***
*** Bug 117628 has been marked as a duplicate of this bug. ***
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Looks like lots of dupes on this one, nominating for nsbeta1
Keywords: nsbeta1
No longer blocks: 107067
Priority: P3 → P1
Keywords: nsbeta1+
Removing nsbeta1 nomination because this bug has been plussed.
Keywords: nsbeta1
Target Milestone: mozilla0.9.9 → mozilla1.0
*** Bug 115603 has been marked as a duplicate of this bug. ***
*** Bug 130505 has been marked as a duplicate of this bug. ***
*** Bug 88144 has been marked as a duplicate of this bug. ***
*** Bug 26718 has been marked as a duplicate of this bug. ***
I change Platform/OS to All/All because bug 26718 was marked so (this bug does
occur on Windows and MacOS too)
OS: Linux → All
Hardware: PC → All
if we are turning this bug into an all-plat bug, do we have an idea of how to
implement this on each platform?

UNIX uses /etc/resolv.conf, I don't know how the Mac or Windows stacks update
their resolver lists, or how mechanisms change the setting (DHCP, dialup, etc.)
Hi,

I can confirm the bug running FreeBSD 4.5 stable with multiple dialup isps.

What about that idea:

Whenever a dns lookup fails, /etc/resolv.conf is re-read.

This will minimize extra cost for the update and solve the problem entirely (at
least to my mind).

Regards,
 Simon
I think we need both the DNS failure -> rescan AND the Offline|Online button
solutions.

The auto-rescan on DNS failure would not solve problems where the old DNS server
is visible, but does not serve all domains needed for the new network (VPN
users or companies that use shadow DNS domains would have this problem).
Frankly, while a rescan after a failed lookup makes the problem slightly better,
this does not, IMO, actually qualify as a "solution" because the initial failure
can take a fairly long time to timeout, and unlike other timeouts, one can't
even do other web accesses in other windows, etc, because they too require DNS
lookups which haven't yet been rescanned so have to timeout also, and so on.

If this bug is actually to be considered fixed, we need to address it in such a
way that it becomes invisible to most if not all users, and that solution simply
doesn't come close.

Should we rescan after a DNS failure?  Yes, definitely.  This should really be a
no-brainer..

Should we (continue to) rescan after an offline-online switch?  Yes, again,
definitely.  We can never be assured of doing things perfectly, so there should
always be a way for the user to force a rescan, and this seems an obvious way to
do it.

However, IMO, we should also (as I recommended many months ago) automatically
rescan if we have not done any DNS lookups in a while (a few minutes or so). 
The extra hit of checking the DNS configuration in this case is minuscule and
will not be noticed by anyone, and it would fix most cases of DNS server
changes, which are often accompanied by some system idle time (moving a computer
around, overnight configuration updates, etc)
I'd be interested in finding out how other applications use res_init. I
remembered reading about it in DNS & Bind, 2nd edition (a long time ago), but it
did not give any practical discussions about how to use it. Not many programs
run as applications that persist through multiple network changes. I
suppose someone could change the code so it scans every time, then run a page
load test to see how it affects performance.
*** Bug 132970 has been marked as a duplicate of this bug. ***
*** Bug 133359 has been marked as a duplicate of this bug. ***
Whiteboard: patch → patch [adt2]
It seems that on linux debian testing, using the 0.9.9 mozilla version, the
workaround (going offline/online) doesn't work anymore. Version 0.9.8 was
running OK.
*** Bug 134145 has been marked as a duplicate of this bug. ***
*** Bug 143585 has been marked as a duplicate of this bug. ***
I have fixed this problem.  See 117628.  
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
*** Bug 145953 has been marked as a duplicate of this bug. ***
*** Bug 141654 has been marked as a duplicate of this bug. ***
*** Bug 156661 has been marked as a duplicate of this bug. ***
Verified per comment #69. Please reopen bug 117628 if there is still a problem.
Status: RESOLVED → VERIFIED
QA Contact: benc → junruh
*** Bug 199929 has been marked as a duplicate of this bug. ***