Firefox 3.0 freezes waiting to resolve safebrowsing-cache.google.com in DNS

My current daytime setup is, for various reasons, a Windows XP installation with Ubuntu Jaunty running inside VirtualBox. I use Windows for Outlook, SQL Navigator and some web browsing, and the Linux installation for development. This morning I started Firefox in Windows XP, changed focus to VirtualBox or some other window, and when I returned to Firefox it was frozen. I followed the standard Windows troubleshooting procedure: reboot and get a coffee. Once I was logged in again in both Windows and Ubuntu, I got the same issue with Firefox in Linux. WTF?

At least I have the tools in Ubuntu to debug this issue. What follows is a simplified version, in approximate order, of what I did.

First, create ~/.gdbinit to make GDB a tad more user-friendly:

set pagination off
set radix 16
set print pretty
set history save on

Second, add ddebs.ubuntu.com to /etc/apt/sources.list:

deb http://ddebs.ubuntu.com/ jaunty main restricted universe multiverse
deb http://ddebs.ubuntu.com/ jaunty-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com/ jaunty-security main restricted universe multiverse
deb http://ddebs.ubuntu.com/ jaunty-proposed main restricted universe multiverse

Install some debug symbols:

sudo apt-get install firefox-3.0-dbgsym libnspr4-0d-dbgsym xulrunner-1.9-dbgsym

Debugging time!

$ gdb `which firefox` `pidof firefox`

(gdb) thread apply all bt

Thread 2 (Thread 0xb08eab90 (LWP 4253)):

#9  0xb7e16c7f in getaddrinfo () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7c8d739 in PR_GetAddrInfoByName (hostname=0xbc01ff4 "safebrowsing-cache.google.com", af=0x0, flags=0x8020) at prnetdb.c:2026
#11 0xb7267940 in nsHostResolver::ThreadFunc (arg=0x92d9fd8) at nsHostResolver.cpp:697

Thread 1 (Thread 0xb7d4b6d0 (LWP 4243)):
#0  0xb8003422 in __kernel_vsyscall ()
#1  0xb7fe30e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/i686/cmov/libpthread.so.0
#2  0xb7c94ed9 in PR_WaitCondVar (cvar=0xcd1ebf8, timeout=0xffffffff) at ptsynch.c:405
#3  0xb7c94f57 in PR_Wait (mon=0xd47d178, timeout=0xffffffff) at ptsynch.c:584
#4  0xb726621b in nsDNSService::Resolve (this=0x92d4b00, hostname=@0xabaf730, flags=<value optimized out>, result=0xbff19ac0) at nsDNSService2.cpp:49

So, we have a thread that is resolving “safebrowsing-cache.google.com” and another thread waiting for this hostname to be resolved. Could this be an issue?
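The backtrace shows the main thread in PR_Wait with timeout=0xffffffff, which looks like an unbounded wait on the resolver thread. As a rough illustration of the alternative (not how Firefox does it), here is a minimal Python sketch of the same two-thread pattern with a bounded wait. The function name resolve_with_timeout and its resolver parameter are my own inventions for this example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def resolve_with_timeout(hostname, timeout_s, resolver=socket.getaddrinfo):
    """Resolve hostname in a worker thread, waiting at most timeout_s seconds.

    Returns a sorted list of address strings, or None if the lookup did not
    finish in time. The worker thread may keep running in the background.
    The resolver argument is a hypothetical hook so the lookup can be faked.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(resolver, hostname, None)
    try:
        infos = future.result(timeout=timeout_s)
    except FutureTimeout:
        return None
    finally:
        # Do not wait for the worker here: that would bring the hang back.
        pool.shutdown(wait=False)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return sorted({info[4][0] for info in infos})
```

Calling resolve_with_timeout("safebrowsing-cache.google.com", 2.0) on a network with a broken resolver should return None after about two seconds instead of blocking the caller indefinitely.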

Back at the command line, is there an issue with this domain name? Checking on my local computer:

$ host safebrowsing-cache.google.com
;; connection timed out; no servers could be reached

Trouble at Google? I must confirm that, so I log in to one of my servers and run the same command:

$ host safebrowsing-cache.google.com
;; Truncated, retrying in TCP mode.
safebrowsing-cache.google.com is an alias for safebrowsing.cache.l.google.com.
safebrowsing.cache.l.google.com has address 74.125.10.92

Works fine, but what does “Truncated, retrying in TCP mode” mean? I will investigate that later.

Apparently the company firewall is unable to resolve this domain name, at least for the time being. Google Safe Browsing is built into Firefox 3, so how do I disable it? I looked in about:config and yes, there was a setting called browser.safebrowsing.enabled set to true. I set it to false and… Firefox still froze. Looking at about:config again, I found browser.safebrowsing.malware.enabled and set that one to false as well. Now I am able to write this blog post!

Disabling these configuration options is only curing the symptoms, not the disease. But can I cure an enterprise DNS server that fails to handle truncated responses? I doubt it.
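A footnote on the “Truncated” message: as far as I understand it, a DNS server that cannot fit its answer into a UDP reply (classically limited to 512 bytes by RFC 1035) sets the TC bit in the message header, and the client is then expected to retry over TCP. If TCP to port 53 is blocked somewhere along the way, that retry never completes. The sketch below decodes just the 12-byte DNS header to show where that bit lives. The function name parse_dns_header and the sample flag value are assumptions made up for illustration, not taken from a real capture.

```python
import struct

# TC (truncation) bit within the 16-bit DNS flags field, per RFC 1035.
TC_MASK = 0x0200


def parse_dns_header(packet):
    """Decode the fixed 12-byte DNS header at the start of packet."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    ident, flags, questions, answers, authority, additional = struct.unpack(
        "!6H", packet[:12]
    )
    return {
        "id": ident,
        "truncated": bool(flags & TC_MASK),
        "questions": questions,
        "answers": answers,
        "authority": authority,
        "additional": additional,
    }


# Hypothetical header: a response with QR, TC, RD and RA set (0x8380),
# 1 question, 25 answers, 2 authority records and 0 additional records.
sample = struct.pack("!6H", 40815, 0x8380, 1, 25, 2, 0)
print(parse_dns_header(sample)["truncated"])
```

A client that sees truncated set to true discards the partial UDP answer and repeats the query over TCP, which is the step that appears to fail behind the company firewall.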

4 comments

  1. I saw this problem too, very annoying. Trying to get to the bottom of the ‘truncated’ problem – all ways to look up that domain seem to fail on my Linux box. I suppose the server I use ignores TCP on port 53. In Wireshark, the packet returned over UDP seems to be OK. Maybe it’s a bug that the packet is flagged as truncated when it’s really OK.

  2. Add “minimal-responses yes;” in your bind9 configuration or ask your ISP to do so.

    /etc/bind/named.conf.options

    options {

    // …

    // only add records to the authority and additional data sections when required
    minimal-responses yes;

    };

    With this option, the response for safebrowsing-cache.google.com
    fits in a standard UDP DNS packet. Without it, the extra authority and additional records make the response too large, and the lookup falls back to TCP.

    Compare the results with and without minimal-responses using:
    dig safebrowsing-cache.google.com

    With minimal-responses no (the default in BIND 9)

    IP (tos 0x0, ttl 64, id 40627, offset 0, flags [none], proto UDP (17), length 75) 127.0.0.1.49553 > 127.0.0.1.53: [bad udp cksum 6429!] 40815+ A? safebrowsing-cache.google.com. (47)
    IP (tos 0x0, ttl 64, id 40628, offset 0, flags [none], proto UDP (17), length 526) 127.0.0.1.53 > 127.0.0.1.49553: 40815| q: A? safebrowsing-cache.google.com. 25/2/0 safebrowsing-cache.google.com.[|domain]
    IP (tos 0x0, ttl 64, id 4337, offset 0, flags [DF], proto TCP (6), length 60) 127.0.0.1.57552 > 127.0.0.1.53: S, cksum 0x30e4 (correct), 272739230:272739230(0) win 32792
    IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) 127.0.0.1.53 > 127.0.0.1.57552: S, cksum 0x6453 (correct), 281541131:281541131(0) ack 272739231 win 32768
    IP (tos 0x0, ttl 64, id 4338, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.57552 > 127.0.0.1.53: ., cksum 0x4b76 (correct), 1:1(0) ack 1 win 513
    IP (tos 0x0, ttl 64, id 4339, offset 0, flags [DF], proto TCP (6), length 101) 127.0.0.1.57552 > 127.0.0.1.53: P 1:50(49) ack 1 win 513 5198+[|domain]
    IP (tos 0x0, ttl 64, id 16739, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.53 > 127.0.0.1.57552: ., cksum 0x4b46 (correct), 1:1(0) ack 50 win 512
    14:44:32.883449 IP (tos 0x0, ttl 64, id 16740, offset 0, flags [DF], proto TCP (6), length 632) 127.0.0.1.53 > 127.0.0.1.57552: P 1:581(580) ack 50 win 512 5198 q:[|domain]
    IP (tos 0x0, ttl 64, id 4340, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.57552 > 127.0.0.1.53: ., cksum 0x48ef (correct), 50:50(0) ack 581 win 531
    IP (tos 0x0, ttl 64, id 4341, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.57552 > 127.0.0.1.53: F, cksum 0x48ee (correct), 50:50(0) ack 581 win 531
    IP (tos 0x0, ttl 64, id 16741, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.53 > 127.0.0.1.57552: F, cksum 0x4900 (correct), 581:581(0) ack 51 win 512
    IP (tos 0x0, ttl 64, id 4342, offset 0, flags [DF], proto TCP (6), length 52) 127.0.0.1.57552 > 127.0.0.1.53: ., cksum 0x48ed (correct), 51:51(0) ack 582 win 531

    With minimal-responses yes

    IP (tos 0x0, ttl 64, id 40623, offset 0, flags [none], proto UDP (17), length 75) 127.0.0.1.40215 > 127.0.0.1.53: [bad udp cksum 8a13!] 55747+ A? safebrowsing-cache.google.com. (47)
    IP (tos 0x0, ttl 64, id 40624, offset 0, flags [none], proto UDP (17), length 494) 127.0.0.1.53 > 127.0.0.1.40215: 55747 q: A? safebrowsing-cache.google.com. 25/0/0 safebrowsing-cache.google.com.[|domain]

    Best Regards,
    Guy Baconniere

  3. Thanks a lot Guy, I’ll see if I can get the network people to set the “minimal-responses yes” option… 🙂
