Discussion:
[Dnsmasq-discuss] Bug forward upstream SERVFAIL
Dave Taht
2017-01-23 03:31:35 UTC
Permalink
From a brief conversation with the bind9 maintainer:

D: if bind gets a servfail, and has two forwarders, will it try the
other forwarder?
E: Yes.

D: Even in the case of a dnssec query?
E:

Bind9 retries an authoritative answer because it might have been
spoofed or one of the servers might be out of date or misconfigured.
It uses the function fctx_nextaddress() to get the next address to try
when a query fails. fctx_nextaddress() searches through both
forwarders and auth servers, depending on what kind of query it is.

D: So I believe it is correct for dnsmasq to try all upstreams on a
servfail response, which restores the prior dnsmasq behavior, and is
more robust.
E: Yes.

D: This seems to look like the right thing:

https://github.com/MartinWetterwald/dnsmasq/pull/1/files
--
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
Eric Luehrsen
2017-01-23 05:17:49 UTC
Permalink
If you are a customer of some "we build or host your website" companies, then you may also suffer the other end of this. That is, your registrar does a horrible job of pushing your DNSKEY to the correct next-level server and getting a valid DS record ... and doing that for all redundant server chains. So one chain of trust may pass, and another chain of trust may fail. Then you lose customer contacts because of single-fail implementations like this.

ERIC




_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-***@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss


Kurt H Maier
2017-01-23 06:57:22 UTC
Permalink
BIND is far from being a normative DNS reference, and I certainly do
not believe that "BIND does it" is a good reason for anything. Quite
the contrary.

However, this discussion has been happening for a while now; last thing
Simon Kelley said about it was that SERVFAIL in a DNSSEC context meant
that the upstream server cannot validate the record's chain of trust --
meaning that this particular SERVFAIL is not recoverable. In that case
you don't want to waste time spamming other resolvers just to get the
same failure.

Where are you getting SERVFAIL in this case? Is it a DNSSEC failure?

khm
Martin Wetterwald
2017-01-23 09:40:17 UTC
Permalink
Hi,
I agree with khm that just because one piece of software does something,
that doesn't make it right, or mean that another should do it too.

I do think however, like Dave (independently of what BIND does), that the
aim of having several upstreams is to provide robustness. The upstreams
in our case are the customer's ISP DNS servers (we use several ISPs at the
same time). We cannot control those DNS servers. If one of them is
misconfigured and has an internal failure, it will return SERVFAIL. This
should not affect dnsmasq's robustness. The end DNS client wants to get an
answer. If it gets a SERVFAIL answer, it's terrible: usually it means no
Internet at all.
Our case is not DNSSEC related. DNSSEC is disabled, but upstream can
still return SERVFAIL.

We've already patched our dnsmasq internally and it's already in
production. We are happy with this behaviour. Our clients only need to
have at least one working ISP DNS server, and dnsmasq will make sure
they get an answer.
Of course if all upstreams return SERVFAIL, dnsmasq will forward
SERVFAIL.
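
The behaviour described above can be illustrated with a minimal sketch. It is not the actual patch and not dnsmasq code: the function name, the upstream callables, and the example address are all hypothetical, and real forwarding also has to deal with timeouts and parallel queries, which are left out here.

```python
from typing import Callable, Iterable, Optional, Tuple

# RCODE values as defined in RFC 1035.
NOERROR = 0
SERVFAIL = 2
NXDOMAIN = 3

# Hypothetical upstream type: takes a query name, returns (rcode, answer).
Upstream = Callable[[str], Tuple[int, Optional[str]]]


def forward_with_failover(name: str,
                          upstreams: Iterable[Upstream]) -> Tuple[int, Optional[str]]:
    """Try each upstream in order and return the first non-SERVFAIL reply.

    If every upstream answers SERVFAIL (or none is configured),
    SERVFAIL is passed back to the client.
    """
    for query in upstreams:
        rcode, answer = query(name)
        if rcode != SERVFAIL:
            # Any other rcode (NOERROR, NXDOMAIN, ...) is treated as final.
            return rcode, answer
    return SERVFAIL, None


# Two stand-in upstreams, for illustration only.
def broken_isp(name: str) -> Tuple[int, Optional[str]]:
    return SERVFAIL, None


def working_isp(name: str) -> Tuple[int, Optional[str]]:
    return NOERROR, "192.0.2.1"  # documentation address, not a real answer


print(forward_with_failover("example.com", [broken_isp, working_isp]))
print(forward_with_failover("example.com", [broken_isp, broken_isp]))
```

In this sketch a single working upstream is enough for the client to get an answer, and SERVFAIL only reaches the client when every upstream has returned it.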

I just wanted to share it here to help in case dnsmasq maintainers also
think it's a good behaviour. :)


Martin
Eric Luehrsen
2017-01-24 08:02:52 UTC
Permalink
As dnsmasq is a stub resolver, I believe it _IS_ important to consider what popular recursive resolvers do. Bind, Unbound, and NSD do need to be references because they do most of the heavy lifting. Bind was already discussed. Unbound not only checks for multiple response paths but also caches all kinds of infrastructure info (response times and such). For DNSSEC, this means servers with lame DSKEY are passed over for a while and servers with clean databases are preferred. Obviously dnsmasq wouldn't do this, but the design concepts need to account for common lessons in robustness: some servers are lame.

- Eric



Kurt H Maier
2017-01-24 16:22:04 UTC
Permalink
Post by Eric Luehrsen
As dnsmasq is a stub resolver, I believe it _IS_ important to consider
what popular recursive resolvers do. Bind, Unbound, and NSD do need
to be references because they do most of the heavy lifting.
This really just reinforces my point -- if you need specific behavior to
recover from unreliable DNS servers (wholly aside from the question 'why
are you using unreliable DNS servers') then the better solution is to
install the 'heavy-weight' resolver of your choice and let it do all the
Rube Goldberg machinations, rather than trying to patch dnsmasq to be
something it's not.

If the answer to every question is 'what does BIND do?' then before too
long you've just got a bug-compatible reimplementation of BIND, and I
think dnsmasq is better off staying dnsmasq.

But none of this is germane. The question boils down to:

Is getting a SERVFAIL rcode from an upstream server sufficient cause to
stop querying and return SERVFAIL?

khm
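
The question can be written down as a small decision function, offered only as a sketch of the positions taken in this thread. Nothing here is dnsmasq code: the function and parameter names are hypothetical, and the `retry_dnssec_failures` switch exists only so that both views on DNSSEC-related SERVFAILs can be expressed.

```python
SERVFAIL = 2  # RCODE value from RFC 1035


def should_try_next_upstream(rcode: int,
                             dnssec_in_use: bool,
                             upstreams_left: int,
                             retry_dnssec_failures: bool = False) -> bool:
    """Decide whether a reply from one upstream should trigger another try.

    retry_dnssec_failures=False models the view that a SERVFAIL in a
    DNSSEC context is not recoverable, so asking other upstreams is wasted
    effort. Setting it to True models the view that another upstream may
    still hold a valid chain of trust.
    """
    if rcode != SERVFAIL:
        return False  # a usable (or definitive) reply: stop here
    if upstreams_left == 0:
        return False  # nobody left to ask: SERVFAIL goes to the client
    if dnssec_in_use and not retry_dnssec_failures:
        return False  # treat the validation failure as final
    return True


# DNSSEC disabled, one upstream left: try it.
print(should_try_next_upstream(SERVFAIL, dnssec_in_use=False, upstreams_left=1))
# DNSSEC in use, default policy: stop and return SERVFAIL.
print(should_try_next_upstream(SERVFAIL, dnssec_in_use=True, upstreams_left=1))
```

With DNSSEC disabled, as in the case reported earlier in the thread, the switch is never consulted and the only open question is whether SERVFAIL alone should end the query.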
