Discussion:
[Dnsmasq-discuss] [PATCH] --dont-mirror-queries option
Chris Novakovic
2016-01-29 13:38:22 UTC
I have a (rather odd, and perhaps ill-advised) network setup in which
names in a particular domain (e.g. example.com) are split across three
sites, and I need three dnsmasq servers to be mutually dependent in the
following hierarchy to resolve names for that domain:

        master
        /    \
       /      \
   area1      area2

If a client sends a query for x.example.com to area1 that area1 can't
answer, or if another client sends a query for y.example.com to area2
that area2 can't answer, both servers will forward the query to master,
which is configured (with --server) to be the sole upstream DNS server
for example.com on both area1 and area2. If master can't answer a query
for example.com, it is configured to forward the query to area1 and
area2. Clearly, master shouldn't forward queries that originate from
area1 back to area1: this would lead to an infinite forwarding loop.

The attached patch implements a new option, --dont-mirror-queries. When
enabled, this option prevents dnsmasq from forwarding a request to an
upstream server if its IP address matches that of the sender of the
query. I suppose this could be considered a dynamic, per-query version
of the --dns-loop-detect option that is only capable of detecting 1-hop
loops.
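
As a rough illustration of the check described above (a sketch of the idea only, not the actual change to forward.c; the function name and the addresses are hypothetical), the filtering step might look like this:

```python
import ipaddress

def eligible_upstreams(source_ip, upstreams):
    """Return the upstream servers a query may be forwarded to.

    Any upstream whose address equals the address the query arrived from
    is skipped, which is enough to break 1-hop forwarding loops.
    """
    src = ipaddress.ip_address(source_ip)
    return [u for u in upstreams if ipaddress.ip_address(u) != src]

# Hypothetical addresses: master forwards example.com to area1 and area2.
AREA1 = "192.0.2.1"
AREA2 = "192.0.2.65"

# A query that arrived from area1 is forwarded to area2 only.
print(eligible_upstreams(AREA1, [AREA1, AREA2]))
```

Comparing parsed addresses rather than strings avoids mismatches between different textual spellings of the same IPv6 address.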

Kurt H Maier <***@sciops.net> was the brains of this operation, helping
me figure out the part of forward.c that needed patching.

Cheers,
Chris
Simon Kelley
2016-02-05 22:22:59 UTC
That's very ingenious!

Your post begs the question "Will you merge the patch?"

I'm not sure: it's a pretty niche application, and there are lots of
cases where it does the wrong thing. For instance when a query arrives
from area1, there's nothing to check (and no way of checking) that the
query comes from dnsmasq, and not a stub resolver running on the same
machine.

Is there not a conceptually simpler fix for this by splitting master
in two, maybe listening on different ports?

area1 is configured to forward to master1, which is configured to
forward again to area2. area2 forwards to master2, which forwards to
area1 when it can't answer.


Cheers,

Simon.
_______________________________________________
Dnsmasq-discuss mailing list
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
Chris Novakovic
2016-02-06 00:01:37 UTC
Post by Simon Kelley
That's very ingenious!
Thanks --- Kurt (repeatedly) described it in far less flattering terms :)
Post by Simon Kelley
Your post begs the question "Will you merge the patch?"
I'm not sure: it's a pretty niche application, and there are lots of
cases where it does the wrong thing. For instance when a query arrives
from area1, there's nothing to check (and no way of checking) that the
query comes from dnsmasq, and not a stub resolver running on the same
machine.
Indeed: sadly, there's no way around this (unless you know there's no
stub resolver running on the other host; since I have full control
over all three hosts, I know there isn't one).
Post by Simon Kelley
Is there not a conceptually simpler fix for this by splitting master
in two, maybe listening on different ports?
area1 is configured to forward to master1, which is configured to
forward again to area2. area2 forwards to master2, which forwards to
area1 when it can't answer.
Before writing this patch I tried to get similar functionality by
setting up secondary DNS-only servers on each of the hosts and having
them refuse queries that couldn't be answered locally, then configuring
the primary dnsmasq servers in the way you suggested. I decided that it
wasn't an ideal solution because I'm also using the DHCP server
functionality of dnsmasq on all three hosts, and I wanted the names of
their DHCP clients to be resolved correctly too. In that scenario, the
definition of "locally" is murky: the secondary dnsmasqs technically
don't have DHCP lease databases of their own, and would have to share
the dnsmasq.leases file (or, at least, the leases contained within it)
with the primary dnsmasq. I wrote a --dhcp-script script to work around
this, but it didn't give me the results I was looking for (I'm hazy on
the details now, but I recall that the secondary dnsmasq wasn't always
notified of static /and/ dynamic DHCP lease events and the two dnsmasqs
would get out of sync, which sort of defeated the point). I tried to
solve the problem in a number of other ways that wouldn't have required
patching the code (including using --leasefile-ro and maintaining my
"own" leases database elsewhere), but again there were strange corner
cases that would lead to each dnsmasq giving a different response to the
same query.

I agree with you: the patch adds a weird option that's only useful if
you have full control over the DNS-resolving hosts on a network, but for
those who do and want this kind of behaviour, it's really the only way
to be sure that the results of a particular DNS query are correct.

Cheers,
Chris
--
Chris Novakovic
CPAN: CHRISN (http://search.cpan.org/~chrisn)
Simon Kelley
2016-02-13 13:09:00 UTC
Post by Chris Novakovic
Before writing this patch I tried to get similar functionality by
setting up secondary DNS-only servers on each of the hosts and having
them refuse queries that couldn't be answered locally, then configuring
the primary dnsmasq servers in the way you suggested. I decided that it
wasn't an ideal solution because I'm also using the DHCP server
functionality of dnsmasq on all three hosts, and I wanted the names of
their DHCP clients to be resolved correctly too. In that scenario, the
definition of "locally" is murky: the secondary dnsmasqs technically
don't have DHCP lease databases of their own, and would have to share
the dnsmasq.leases file (or, at least, the leases contained within it)
with the primary dnsmasq. I wrote a --dhcp-script script to work around
this, but it didn't give me the results I was looking for (I'm hazy on
the details now, but I recall that the secondary dnsmasq wasn't always
notified of static /and/ dynamic DHCP lease events and the two dnsmasqs
would get out of sync, which sort of defeated the point). I tried to
solve the problem in a number of other ways that wouldn't have required
patching the code (including using --leasefile-ro and maintaining my
"own" leases database elsewhere), but again there were strange corner
cases that would lead to each dnsmasq giving a different response to the
same query.
Will try and remember to reply to your other points, but on this one,
the way I'd do it (assuming you don't have problems with slow or
intermittent connectivity) is to have one (primary) dnsmasq which is the
DHCP server for all three networks. You declare all the address ranges
in the config of the primary, and tell the secondaries to do dhcp-relay
to the primary.

That keeps all the DHCP address information in the primary, so as long
as the secondaries forward to the primary, all names should be resolvable.


Cheers,


Simon
Chris Novakovic
2016-02-13 14:21:23 UTC
Post by Simon Kelley
Will try and remember to reply to your other points, but on this one,
the way I'd do it (assuming you don't have problems with slow or
intermittent connectivity) is to have one (primary) dnsmasq which is the
DHCP server for all three networks. You declare all the address ranges
in the config of the primary, and tell the secondaries to do dhcp-relay
to the primary.
That keeps all the DHCP address information in the primary, so as long
as the secondaries forward to the primary, all names should be resolvable.
Ideally this is what I would have done, but the three sites (which each
use their own /26 subnet inside a common /24) are geographically
distant, connected to each other via a layer-3 VPN over somewhat
unreliable links --- this means that each site really has to have an
authoritative DHCP server for its own /26 subnet, and the only thing
suitable for splitting across all three sites is DNS service (that way,
if area1 gets cut off from the rest of the /24, area1's dnsmasq can
still assign DHCP leases for its own /26, and it doesn't matter that it
can't resolve a name that's active on area2 because it wouldn't be able
to communicate with that host anyway).
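
To make the addressing concrete, here is a small sketch of how a /24 divides into /26 subnets. The prefix 192.0.2.0/24 and the assignment of subnets to sites are assumptions for illustration; the thread does not give the real values.

```python
import ipaddress

# Hypothetical /24; the real prefix is not given in the thread.
common = ipaddress.ip_network("192.0.2.0/24")

# A /24 contains four /26 subnets of 64 addresses each.
site_subnets = list(common.subnets(new_prefix=26))

# Assumed mapping of sites to subnets, for illustration only.
for name, net in zip(["master", "area1", "area2"], site_subnets):
    print(name, net, net.num_addresses)
```

Three of the four /26 blocks are used here, one per site, leaving one spare.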

Cheers,
Chris
Simon Kelley
2016-02-24 17:20:14 UTC
Post by Chris Novakovic
Post by Simon Kelley
Will try and remember to reply to your other points, but on this
one, the way I'd do it (assuming you don't have problems with
slow or intermittent connectivity) is to have one (primary)
dnsmasq which is the DHCP server for all three networks. You
declare all the address ranges in the config of the primary, and
tell the secondaries to do dhcp-relay to the primary.
That keeps all the DHCP address information in the primary, so as
long as the secondaries forward to the primary, all names should
be resolvable.
Ideally this is what I would have done, but the three sites (which
each use their own /26 subnet inside a common /24) are
geographically distant, connected to each other via a layer-3 VPN
over somewhat unreliable links --- this means that each site really
has to have an authoritative DHCP server for its own /26 subnet,
and the only thing suitable for splitting across all three sites is
DNS service (that way, if area1 gets cut off from the rest of the
/24, area1's dnsmasq can still assign DHCP leases for its own /26,
and it doesn't matter that it can't resolve a name that's active on
area2 because it wouldn't be able to communicate with that host
anyway).
I can see that this sort of setup is a problem in search of a
solution, and I quite like the distributed flooding arrangement.

I wonder if a better solution to the loop-detection is to mark queries
with a UID of all the servers they've been forwarded by, in an EDNS0
option. That would avoid the false detection of queries coming from
master, but not from the dnsmasq instance on master. It would also
detect arbitrary loops. Dnsmasq has the relevant code to examine and
add EDNS0, so it wouldn't be too difficult to add.

That really would be a dynamic version of the loop detect mode, and
could be configured and documented as such.
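
A minimal sketch of the trace idea, assuming each server has some unique identifier and that the list of identifiers travels with the query (in practice it would be carried in an EDNS0 option; here it is just a Python list, and all names are hypothetical):

```python
def forward_with_trace(server_id, trace):
    """Return the trace to attach to the forwarded query, or None if the
    query has already passed through this server (a loop)."""
    if server_id in trace:
        return None  # loop detected: do not forward again
    return trace + [server_id]

# Hypothetical path: area1 -> master -> area2 -> master.
trace = []
for hop in ["area1", "master", "area2", "master"]:
    result = forward_with_trace(hop, trace)
    if result is None:
        print("loop detected at", hop, "after", trace)
        break
    trace = result
```

Because every forwarder is recorded, this catches loops of any length, not only the 1-hop case handled by comparing source and upstream addresses.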


Cheers,

Simon.
Kurt H Maier
2016-02-24 23:38:16 UTC
Post by Simon Kelley
I wonder if a better solution to the loop-detection is to mark queries
with a UID of all the servers they've been forwarded by, in an EDNS0
option. That would avoid the false detection of queries coming from
master, but not from the dnsmasq instance on master. It would also
detect arbitrary loops. Dnsmasq has the relevant code to examine and
add EDNS0, so it wouldn't be too difficult to add.
What guarantees does dnsmasq have that other servers won't strip the
EDNS0 field or otherwise modify the query? All it takes is one
misconfigured OPT RR and you risk losing the 'chain of custody' data.

I'm not against exploring this approach to loop-detection in general,
since I haven't had trouble working with EDNS0 for some years, but
it doesn't solve the immediate reflection problem we're facing now.

Thanks for giving this some thought, I'm interested to see what you
decide!


khm
Simon Kelley
2016-03-01 18:50:14 UTC
Post by Kurt H Maier
Post by Simon Kelley
I wonder if a better solution to the loop-detection is to mark queries
with a UID of all the servers they've been forwarded by, in an EDNS0
option. That would avoid the false detection of queries coming from
master, but not from the dnsmasq instance on master. It would also
detect arbitrary loops. Dnsmasq has the relevant code to examine and
add EDNS0, so it wouldn't be too difficult to add.
What guarantees does dnsmasq have that other servers won't strip the
EDNS0 field or otherwise modify the query? All it takes is one
misconfigured OPT RR and you risk losing the 'chain of custody' data.
I'm not against exploring this approach to loop-detection in general,
since I haven't had trouble working with EDNS0 for some years, but
it doesn't solve the immediate reflection problem we're facing now.
Thanks for giving this some thought, I'm interested to see what you
decide!
This approach assumes that all the servers are dnsmasq, and running the
loop-detection code, which is a reasonable assumption. Once a query
escapes from the "cloud" of interconnected dnsmasq servers towards an
upstream server, the EDNS0 options are no longer required and can be
stripped without problem. (They will be stripped from replies.)


Cheers,

Simon.
Kurt H Maier
2016-03-01 21:23:59 UTC
Post by Simon Kelley
This approach assumes that all the servers are dnsmasq, and running the
loop-detection code, which is a reasonable assumption. Once a query
escapes from the "cloud" of interconnected dnsmasq servers towards an
upstream server, the EDNS0 options are no longer required and can be
stripped without problem. (They will be stripped from replies.)
Part of the concern here was that in some of these deployments we have
'enclaves' of devices with dnsmasq on the edge nodes. I'm concerned
about the interaction on those edges, because EDNS0 data suddenly
disappearing has caused problems for me in the past. I'm also concerned
about whether we'll have to re-architect our DNS infrastructure to avoid
EDNS0 data growing too large. Do you have draft code for this solution
anywhere?

Thanks,
khm
Simon Kelley
2016-03-04 21:35:36 UTC
Post by Kurt H Maier
Post by Simon Kelley
This approach assumes that all the servers are dnsmasq, and running the
loop-detection code, which is a reasonable assumption. Once a query
escapes from the "cloud" of interconnected dnsmasq servers towards an
upstream server, the EDNS0 options are no longer required and can be
stripped without problem. (They will be stripped from replies.)
Part of the concern here was that in some of these deployments we have
'enclaves' of devices with dnsmasq on the edge nodes. I'm concerned
about the interaction on those edges, because EDNS0 data suddenly
disappearing has caused problems for me in the past. I'm also concerned
about whether we'll have to re-architect our DNS infrastructure to avoid
EDNS0 data growing too large. Do you have draft code for this solution
anywhere?
Thanks,
khm
No draft code yet. No version of dnsmasq has ever removed EDNS0 from
queries, and note that queries are all we're concerned about here. The
EDNS0 options should not be included in replies. Packet size of queries
is not generally a problem.
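
For a rough sense of the sizes involved, the sketch below encodes a trace as a single EDNS0 option (2-byte option code, 2-byte length, then data). The option code and the 8-byte identifier length are assumptions chosen for illustration; no such option is defined by dnsmasq.

```python
import struct

OPTION_CODE = 65001  # assumption: a code from the local/experimental range
ID_LEN = 8           # assumption: 8-byte identifier per dnsmasq instance

def encode_trace_option(ids):
    """Encode identifiers as one EDNS0 option: code, length, then data."""
    data = b"".join(ids)
    return struct.pack("!HH", OPTION_CODE, len(data)) + data

# Size of the option after 1, 3 and 10 forwarding hops.
for hops in (1, 3, 10):
    ids = [bytes([i]) * ID_LEN for i in range(hops)]
    print(hops, "hops ->", len(encode_trace_option(ids)), "bytes")
```

Under these assumptions the option grows by 8 bytes per hop on top of a 4-byte header, so even a ten-hop trace adds well under 100 bytes to a query.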


Cheers,

Simon.
Chris Novakovic
2017-06-07 22:46:33 UTC
Here's the "don't mirror queries" patch from last year updated for 2.77,
partially as a courtesy to anyone who was using it and partially to
stimulate discussion about including functionality like this in a future
version of dnsmasq.

To quickly recap:

- The patch prevents dnsmasq from forwarding an incoming DNS query to an
upstream server if its IP address matches the IP address from which the
query originated. This makes it possible for two dnsmasq DNS servers to
be mutually dependent on each other, without the risk of inducing
forwarding loops.

- There seemed to be agreement that this would also be a useful thing to
have outside of my mad network setup.

- Simon expressed the possibility of genericising the patch by including
a trace of dnsmasq instances that a query had been processed by in an
EDNS0 record, which could then be checked to prevent all forwarding
loops, not just those of length 2.

- Concerns raised about Simon's idea included the possibility of
firewalls interfering with the larger DNS packets this would create, and
the privacy implications of the details of internal DNS setups leaking
out over public networks.

My patch is certainly a hack, and I'd like to see something like this
genericised in the way Simon described. Here's a proposal that ought to
keep everyone happy:

- Every dnsmasq instance stamps the EDNS0 record with some identifier (I
don't know dnsmasq well enough to know what the source of this could be,
sorry) for each request it forwards to upstream servers.

- If an incoming request's EDNS0 record contains the dnsmasq instance's
identifier, dnsmasq replies with REFUSED.

- This behaviour becomes the default. A new option --no-dns-loop-detect
avoids stamping the identifier into any forwarded queries, for people
who don't want this behaviour (an optional list of upstream servers can
be given as an argument to --no-dns-loop-detect to specify that queries
should be stamped except for those forwarded to the listed servers).
Another option --only-dns-loop-detect only stamps the identifier into
queries forwarded to the specified servers. Both options are allowed to
be specified in --servers-file files, so the list of upstream servers
affected by this can be changed as the upstream servers change.

The extra options would be useful for people who don't want information
about their internal DNS infrastructure to leak publicly, or know that
upstream servers are protected by firewalls that may drop large DNS
query packets with EDNS0 records.
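
The stamping decision in the proposal above could be sketched as follows. This is only an illustration: --no-dns-loop-detect and --only-dns-loop-detect are proposed options that do not exist in dnsmasq, the function name and addresses are hypothetical, and giving --only-dns-loop-detect precedence when both are set is an assumption.

```python
def should_stamp(upstream, no_detect=None, only_detect=None):
    """Decide whether to stamp our identifier into a query for `upstream`.

    no_detect:   None if --no-dns-loop-detect is absent; an empty set if it
                 was given without servers (never stamp); otherwise the
                 set of servers that are exempt from stamping.
    only_detect: None if --only-dns-loop-detect is absent; otherwise the
                 only servers whose queries are stamped.
    """
    if only_detect is not None:
        return upstream in only_detect
    if no_detect is not None:
        return bool(no_detect) and upstream not in no_detect
    return True  # proposed default: always stamp

INTERNAL = "192.0.2.1"    # hypothetical internal dnsmasq peer
PUBLIC = "198.51.100.53"  # hypothetical public resolver

print(should_stamp(PUBLIC))                          # default behaviour
print(should_stamp(PUBLIC, no_detect={PUBLIC}))      # exempt the public server
print(should_stamp(INTERNAL, only_detect={INTERNAL}))
```

Exempting public upstreams this way keeps the identifiers inside the set of cooperating dnsmasq servers, which addresses both the privacy and the packet-size concerns.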

I'm afraid I don't know anywhere near enough about dnsmasq to implement
either Simon's original idea or my proposed refinement, but hopefully
someone does...
Kurt H Maier
2016-02-14 00:51:52 UTC
Post by Simon Kelley
Will try and remember to reply to your other points, but on this one,
the way I'd do it (assuming you don't have problems with slow or
intermittent connectivity) is to have one (primary) dnsmasq which is the
DHCP server for all three networks. You declare all the address ranges
in the config of the primary, and tell the secondaries to do dhcp-relay
to the primary.
For what it's worth, I've been running this patch for some time, and
it's greatly simplified some tasks wherein I've got
low-power-consumption machines managing a relatively distributed network
of IoT-style sensors and loggers. The default behavior of reflecting
queries back at the requestor was a little egregious in these
circumstances.

khm