[Dnsmasq-discuss] 2.79 SOMETIMES fails with SERVFAIL
Karim Scheik
2018-04-28 13:18:23 UTC
Hi!

So I've found the other threads with 2.79 SERVFAIL issues, but cannot
resolve the issue for us.
We use dnsmasq on many different machines (~100) in different environments
and I also use it at home. All the setups only use very simple caching and
forwarding to our LAN DNS, which has the same config but forwards to the
respective ISP's DNS. Some also do DHCP, but no complex zones/advanced
features, except DNSSEC which we added at least 2 years ago without issues.
We've used all these setups for many years with the exact same config,
starting with v2.39 in 2007. We have updated to every single version since
on our Gentoo Linux boxes and never had any issues.
Whenever we update any machine from 2.78 to 2.79 (regardless of whether it
is a dnsmasq forwarding to the LAN dnsmasq or one of the LAN dnsmasqs that
forward to the ISP) we get random (!) SERVFAILs. So sometimes clients cannot
resolve e.g. google.com, mellanox.com, whatever, but most of the time it
works. Whenever a lookup fails, the failure seems to get cached, and
restarting dnsmasq mostly resolves it (not 100% sure).

As suggested I did a dig with 2.78 and 2.79 and had to try multiple domains
until it failed. Attached output below.

Please help!

Thanks!

Regards,
Karim

--------------

2.78
====
# nslookup computerbase.de
Server: ::1
Address: ::1#53

Non-authoritative answer:
Name: computerbase.de
Address: 87.230.75.2
Name: computerbase.de
Address: 2a01:488:2000:201::a2

# dig @127.0.0.1 computerbase.de

; <<>> DiG 9.12.1 <<>> @127.0.0.1 computerbase.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36522
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;computerbase.de. IN A

;; ANSWER SECTION:
computerbase.de. 76335 IN A 87.230.75.2

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 27 19:05:22 CEST 2018
;; MSG SIZE rcvd: 60

2.79
====

//// GOOGLE, MELLANOX ETC. WORKED UNTIL THIS FAILED

# nslookup computerbase.de
Server: ::1
Address: ::1#53

** server can't find computerbase.de: SERVFAIL

# dig @127.0.0.1 computerbase.de

; <<>> DiG 9.12.1 <<>> @127.0.0.1 computerbase.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 61606
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;computerbase.de. IN A

;; Query time: 7 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 27 19:04:42 CEST 2018
;; MSG SIZE rcvd: 44

////// RESTART DNSMASQ 2.79 AND IT WORKS, BUT NOW ANANDTECH.COM FAILS....

# nslookup computerbase.de
Server: ::1
Address: ::1#53

Non-authoritative answer:
Name: computerbase.de
Address: 87.230.75.2
Name: computerbase.de
Address: 2a01:488:2000:201::a2

# dig @127.0.0.1 computerbase.de

; <<>> DiG 9.12.1 <<>> @127.0.0.1 computerbase.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 65413
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;computerbase.de. IN A

;; ANSWER SECTION:
computerbase.de. 75678 IN A 87.230.75.2

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 27 19:16:19 CEST 2018
;; MSG SIZE rcvd: 60

///////////// NOW ANANDTECH.COM FAILS

# dig @127.0.0.1 anandtech.com

; <<>> DiG 9.12.1 <<>> @127.0.0.1 anandtech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 55015
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;anandtech.com. IN A

;; Query time: 29 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 27 19:17:11 CEST 2018
;; MSG SIZE rcvd: 42

/////////////// RESTART AGAIN, NOW IT WORKS

# dig @127.0.0.1 anandtech.com

; <<>> DiG 9.12.1 <<>> @127.0.0.1 anandtech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17400
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;anandtech.com. IN A

;; ANSWER SECTION:
anandtech.com. 270 IN A 192.65.241.100

;; Query time: 45 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 27 19:17:41 CEST 2018
;; MSG SIZE rcvd: 58
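
Hunting for a failing name by hand, as in the transcript above, could also be scripted. What follows is only a rough sketch, assuming Python 3.7+ and `dig` on the PATH. The helper names `parse_status` and `probe` and the list of domains are made up for illustration; they are not part of dnsmasq or dig.

```python
import re
import subprocess

# Matches the header line dig prints, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 61606"
STATUS_RE = re.compile(r"->>HEADER<<-.*?status: (\w+)")


def parse_status(dig_output):
    """Return the response code (e.g. 'NOERROR', 'SERVFAIL'), or None if absent."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else None


def probe(domains, server="127.0.0.1"):
    """Query each domain once through `server` and map it to its status."""
    results = {}
    for domain in domains:
        proc = subprocess.run(
            ["dig", f"@{server}", domain],
            capture_output=True, text=True, timeout=30,
        )
        results[domain] = parse_status(proc.stdout)
    return results


if __name__ == "__main__":
    # Hypothetical test list; substitute the names that matter to you.
    names = ["computerbase.de", "anandtech.com", "google.com", "mellanox.com"]
    for name, status in probe(names).items():
        print(f"{name}: {status}")
```

Running this before and after a dnsmasq restart would show whether the set of failing names moves around, as it did in the manual test.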
Simon Kelley
2018-05-02 12:15:33 UTC
First question: Are you setting the

dnssec-check-unsigned

option in your configuration?


There's a bug in 2.79 which means that you're using that option even if
you don't explicitly set it, so that would be an immediate large change
from the upgrade.

Second question: what happens if you forward to Google (8.8.8.8) or
Cloudflare (1.1.1.1) instead of your ISP's nameservers?


Explanation:

Without dnssec-check-unsigned, dnsmasq will work fine when forwarding to
a nameserver which doesn't support DNSSEC or supports it badly: With no
upstream support, no DNSSEC validation will happen, but the answers will
still get through.

With dnssec-check-unsigned the upstream has to support DNSSEC. Even if
the domain being queried is not DNSSEC signed, dnsmasq has to find
signed proof that the domain is not signed, so DNSSEC must work upstream.
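
One quick way to see whether a given upstream hands back DNSSEC data at all is to send it a DS query with the DO bit set (`dig +dnssec`) and look for RRSIG, NSEC or NSEC3 records in the reply. The sketch below assumes Python 3.7+ and `dig` on the PATH. The function names `dnssec_records` and `check_upstream` are hypothetical, and this is only a probe, not a description of how dnsmasq itself validates.

```python
import subprocess

DNSSEC_TYPES = {"RRSIG", "NSEC", "NSEC3"}


def dnssec_records(dig_output):
    """Collect the DNSSEC record types that appear in dig's record lines."""
    found = set()
    for line in dig_output.splitlines():
        if line.startswith(";") or not line.strip():
            continue  # skip comments, section headers and blank lines
        fields = line.split()
        # Record lines have the shape: name TTL class type rdata...
        if len(fields) >= 4 and fields[3] in DNSSEC_TYPES:
            found.add(fields[3])
    return found


def check_upstream(server, domain):
    """Ask `server` for the DS record of `domain` with the DO bit set."""
    proc = subprocess.run(
        ["dig", f"@{server}", "+dnssec", domain, "DS"],
        capture_output=True, text=True, timeout=30,
    )
    return dnssec_records(proc.stdout)


if __name__ == "__main__":
    # The two public resolvers named in this thread; add your ISP's
    # nameserver address to compare.
    for server in ["8.8.8.8", "1.1.1.1"]:
        print(server, sorted(check_upstream(server, "computerbase.de")))
```

An empty result for a server hints that it strips or never returns DNSSEC records. A non-empty result does not prove validation will always succeed, since the failures reported so far are intermittent.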

There are interesting reports of upstream nameservers which fail to do
DNSSEC intermittently, but the only one I've actually tracked down so
far is behind dnscrypt-proxy, so it's difficult to nail down exactly
what is breaking.

Here at dnsmasq, inc, we eat our own dogfood using Google public DNS or
Cloudflare's DNS, and don't see these problems.


To nail this down better, it would be really useful to turn on

log-queries

in your dnsmasq config.

Cheers,

Simon.