[Dnsmasq-discuss] Dnsmasq doesn't reply to queries made over (link-local) IPv6
Toke Høiland-Jørgensen
2016-09-01 21:29:40 UTC
Hi

I have this weird problem where my dnsmasq instance won't reply to
queries made over (link-local) IPv6. I can see the query coming in, it
shows up in the logs (with log-queries enabled) and gets resolved, but
no reply ever goes back out. I don't see any IPv6 DNS packets going out at
all on that interface. Queries made over IPv4 work fine.

I am stumped as to how to debug this. This is dnsmasq 2.76 running on
LEDE nightlies.

-Toke
Simon Kelley
2016-09-02 07:55:06 UTC
My first thought is that it's probably replying to the wrong
interface: link local addresses can't be routed: you have to specify
the interface they're connected to. This insight came late to me, and
there's a chance that the dnsmasq code is still messing it up. I'll
take a closer look in the next day or two.

Cheers,

Simon.
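
A minimal sketch of the scoping point above (illustrative only, not code from dnsmasq): a link-local destination is only meaningful together with an interface, which the sockets API carries as sin6_scope_id. The helper names split_zone, needs_scope and make_sockaddr are hypothetical, and the interface name is whatever exists on the reader's system.

```python
import ipaddress
import socket


def split_zone(text):
    """Split 'fe80::1%eth0' into ('fe80::1', 'eth0'); zone is '' if absent."""
    addr, _, zone = text.partition("%")
    return addr, zone


def needs_scope(addr):
    """True for fe80::/10 addresses, which are ambiguous without an interface."""
    return ipaddress.IPv6Address(addr).is_link_local


def make_sockaddr(text, port):
    """Build Python's AF_INET6 address tuple: (host, port, flowinfo, scope_id)."""
    addr, zone = split_zone(text)
    if needs_scope(addr) and not zone:
        raise ValueError("link-local address needs %interface")
    if zone.isdigit():
        # A numeric zone is already an interface index.
        scope_id = int(zone)
    elif zone:
        # A named zone is resolved to its index, as strace shows symbolically.
        scope_id = socket.if_nametoindex(zone)
    else:
        scope_id = 0
    return (addr, port, 0, scope_id)
```

A tuple built this way can be passed to sendto() on an AF_INET6 datagram socket; with scope_id left at 0 the kernel has no way to pick the outgoing link for an fe80:: destination.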
Toke Høiland-Jørgensen
2016-09-02 09:14:41 UTC
Post by Simon Kelley
My first thought is that it's probably replying to the wrong
interface: link local addresses can't be routed: you have to specify
the interface they're connected to. This insight came late to me, and
there's a chance that the dnsmasq code is still messing it up. I'll
take a closer look in the next day or two.
Awesome, thanks! :)

-Toke
Simon Kelley
2016-09-03 21:28:41 UTC
OK, naive attempts to reproduce this have failed entirely, it just works
for me :-)

Can you run dnsmasq under strace -e trace=network and see what syscalls
it makes, specifically, if it's calling sendmsg() with the reply?

This is what I see; note that sin6_scope_id is correct in both calls.

recvmsg(6, {msg_name(28)={sa_family=AF_INET6, sin6_port=htons(40524),
inet_pton(AF_INET6, "fe80::224:d6ff:feb0:75a2", &sin6_addr),
sin6_flowinfo=0, sin6_scope_id=if_nametoindex("wlan0")},
msg_iov(1)=[{"?9\1 \0\1\0\0\0\0\0\1\3mit\3edu\0\0\1\0\1\0\0)\20\0\0\0"..., 4096}],
msg_controllen=40, {cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=, ...},
msg_flags=0}, 0) = 36
dnsmasq: query[A] mit.edu from fe80::224:d6ff:feb0:75a2
dnsmasq: cached mit.edu is 104.64.165.212
sendmsg(6, {msg_name(28)={sa_family=AF_INET6, sin6_port=htons(40524),
inet_pton(AF_INET6, "fe80::224:d6ff:feb0:75a2", &sin6_addr),
sin6_flowinfo=0, sin6_scope_id=if_nametoindex("wlan0")},
msg_iov(1)=[{"?9\201\200\0\1\0\1\0\0\0\1\3mit\3edu\0\0\1\0\1\300\f\0\1\0\1\0"...,
52}], msg_controllen=36, {cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=,
...}, msg_flags=0}, 0) = 52


Cheers,

Simon.
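
To read the payloads in a trace like the one above, it can help to decode the fixed 12-byte DNS header and check that the sendmsg() buffer really is a response. This is a minimal sketch; parse_dns_header is a hypothetical helper, and the bytes are the first twelve from the sendmsg() call in the trace.

```python
import struct


def parse_dns_header(payload):
    """Decode the fixed 12-byte DNS header (RFC 1035, section 4.1.1)."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", payload[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }


# strace prints non-printable bytes as octal escapes, which Python bytes
# literals accept unchanged.
reply = b"?9\201\200\0\1\0\1\0\0\0\1"
header = parse_dns_header(reply)
print(header)
```

The QR bit is set and the answer count is 1, so the buffer handed to sendmsg() is a well-formed reply carrying the same ID as the query.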
Toke Høiland-Jørgensen
2016-09-04 11:14:30 UTC
Post by Simon Kelley
OK, naive attempts to reproduce this have failed entirely, it just works
for me :-)
Can you run dnsmasq under strace -e trace=network and see what syscalls
it makes, specifically, if it's calling sendmsg() with the reply?
This is what I see; note that sin6_scope_id is correct in both calls.
recvmsg(6, {msg_name(28)={sa_family=AF_INET6, sin6_port=htons(40524),
inet_pton(AF_INET6, "fe80::224:d6ff:feb0:75a2", &sin6_addr),
sin6_flowinfo=0, sin6_scope_id=if_nametoindex("wlan0")},
msg_iov(1)=[{"?9\1 \0\1\0\0\0\0\0\1\3mit\3edu\0\0\1\0\1\0\0)\20\0\0\0"..., 4096}],
msg_controllen=40, {cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=, ...},
msg_flags=0}, 0) = 36
dnsmasq: query[A] mit.edu from fe80::224:d6ff:feb0:75a2
dnsmasq: cached mit.edu is 104.64.165.212
sendmsg(6, {msg_name(28)={sa_family=AF_INET6, sin6_port=htons(40524),
inet_pton(AF_INET6, "fe80::224:d6ff:feb0:75a2", &sin6_addr),
sin6_flowinfo=0, sin6_scope_id=if_nametoindex("wlan0")},
msg_iov(1)=[{"?9\201\200\0\1\0\1\0\0\0\1\3mit\3edu\0\0\1\0\1\300\f\0\1\0\1\0"...,
52}], msg_controllen=36, {cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=,
...}, msg_flags=0}, 0) = 52
I see something similar:

recvmsg(10, {msg_name={sa_family=AF_INET6, sin6_port=htons(50214), inet_pton(AF_INET6, "fe80::c23f:d5ff:fe62:22ac", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("eth1.1")}, msg_namelen=28, msg_iov=[{iov_base="\243\307\1\0\0\1\0\0\0\0\0\0\6google\3com\0\0\1\0\1", iov_len=4096}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_IPV6, cmsg_type=0x32}], msg_controllen=32, msg_flags=0}, 0) = 28
dnsmasq: query[A] google.com from fe80::c23f:d5ff:fe62:22ac
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 16
bind(16, {sa_family=AF_INET6, sin6_port=htons(25784), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
sendto(16, "%\316\1\0\0\1\0\0\0\0\0\1\6google\3com\0\0\1\0\1\0\0)\20"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 39
dnsmasq: forwarded google.com to ::1
recvfrom(16, "%\316\201\200\0\1\0\6\0\0\0\1\6google\3com\0\0\1\0\1\300\f\0\1"..., 5131, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 135
dnsmasq: dnssec-query[DS] com to ::1
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 17
bind(17, {sa_family=AF_INET6, sin6_port=htons(18533), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
sendto(17, "B0\1\0\0\1\0\0\0\0\0\1\3com\0\0+\0\1\0\0)\20\0\0\0\200\0\0\0", 32, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 32
recvfrom(17, "B0\201\200\0\1\0\2\0\0\0\1\3com\0\0+\0\1\300\f\0+\0\1\0\1Q\200\0"..., 5131, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 239
dnsmasq: reply com is DS keytag 30909, algo 8, digest 2
dnsmasq: dnssec-query[DS] google.com to ::1
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 17
bind(17, {sa_family=AF_INET6, sin6_port=htons(60387), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
sendto(17, "\6s\1\0\0\1\0\0\0\0\0\1\6google\3com\0\0+\0\1\0\0)\20"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 39
recvfrom(17, "\6s\201\200\0\1\0\0\0\6\0\1\6google\3com\0\0+\0\1 CK0"..., 5131, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 760
dnsmasq: dnssec-query[DNSKEY] com to ::1
socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP) = 18
bind(18, {sa_family=AF_INET6, sin6_port=htons(19389), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
sendto(18, "V\252\1\0\0\1\0\0\0\0\0\1\3com\0\0000\0\1\0\0)\20\0\0\0\200\0\0\0", 32, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 32
recvfrom(18, "V\252\201\200\0\1\0\3\0\0\0\1\3com\0\0000\0\1\300\f\0000\0\1\0\1P>\0"..., 5131, 0, {sa_family=AF_INET6, sin6_port=htons(5333), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, [28]) = 743
dnsmasq: reply com is DNSKEY keytag 27452, algo 8
dnsmasq: reply com is DNSKEY keytag 30909, algo 8
dnsmasq: reply google.com is no DS
dnsmasq: validation result is INSECURE
dnsmasq: reply google.com is 173.194.222.139
dnsmasq: reply google.com is 173.194.222.138
dnsmasq: reply google.com is 173.194.222.113
dnsmasq: reply google.com is 173.194.222.100
dnsmasq: reply google.com is 173.194.222.101
dnsmasq: reply google.com is 173.194.222.102
sendmsg(10, {msg_name={sa_family=AF_INET6, sin6_port=htons(50214), inet_pton(AF_INET6, "fe80::c23f:d5ff:fe62:22ac", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("eth1.1")}, msg_namelen=28, msg_iov=[{iov_base="\243\307\201\200\0\1\0\6\0\0\0\0\6google\3com\0\0\1\0\1\300\f\0\1"..., iov_len=124}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_IPV6, cmsg_type=0x32}], msg_controllen=32, msg_flags=0}, 0) = 124


... but nothing shows up on eth1.1, even when running tcpdump on the
same box as dnsmasq is on.

-Toke
Kevin Darbyshire-Bryant
2016-09-04 15:11:27 UTC
Post by Simon Kelley
OK, naive attempts to reproduce this have failed entirely, it just works
for me :-)
recvmsg(10, {msg_name={sa_family=AF_INET6, sin6_port=htons(50214), inet_pton(AF_INET6, "fe80::c23f:d5ff:fe62:22ac", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("eth1.1")}, msg_namelen=28, msg_iov=[{iov_base="\243\307\1\0\0\1\0\0\0\0\0\0\6google\3com\0\0\1\0\1", iov_len=4096}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_IPV6, cmsg_type=0x32}], msg_controllen=32, msg_flags=0}, 0) = 28
dnsmasq: query[A] google.com from fe80::c23f:d5ff:fe62:22ac
So I have LEDE r1504 (+8 special-sauce local tweaks) plus bleeding-edge
dnsmasq commit 16800ea072dd0cdf14d951c4bb8d2808b3dfe53d on an Archer C7
router. Using a Linux Mint 18 client:

***@Animal ~/git/github/lede (exp) $ dig -6 @fe80::62e3:27ff:feaf:9e50%wlan0 google.com AAAA

; <<>> DiG 9.10.3-P4-Ubuntu <<>> -6 @fe80::62e3:27ff:feaf:9e50%wlan0 google.com AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54808
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com. IN AAAA

;; ANSWER SECTION:
google.com. 33 IN AAAA 2a00:1450:4009:80a::200e

;; Query time: 10 msec
;; SERVER: fe80::62e3:27ff:feaf:9e50%3#53(fe80::62e3:27ff:feaf:9e50%3)
;; WHEN: Sun Sep 04 16:01:01 BST 2016
;; MSG SIZE rcvd: 67


strace running on the router:

clock_gettime(CLOCK_REALTIME, {1473001369, 678036759}) = 0
recvmsg(10, {msg_name={sa_family=AF_INET6, sin6_port=htons(43191),
inet_pton(AF_INET6, "fe80::2677:3ff:fe47:8fec", &sin6_addr),
sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("br-lan")},
msg_namelen=28, msg_iov=[{iov_base="\356;\1 \0\1\0\0\0\0\0\1\6google\3com\0\0\34\0\1\0\0)\20"..., iov_len=4096}],
msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_IPV6,
cmsg_type=0x32}], msg_controllen=32, msg_flags=0}, 0) = 39
ioctl(10, SIOCGIFNAME, {ifr_index=17, ifr_name="br-lan"}) = 0
sendmsg(10, {msg_name={sa_family=AF_INET6, sin6_port=htons(43191),
inet_pton(AF_INET6, "fe80::2677:3ff:fe47:8fec", &sin6_addr),
sin6_flowinfo=htonl(0), sin6_scope_id=if_nametoindex("br-lan")},
msg_namelen=28,
msg_iov=[{iov_base="\356;\201\200\0\1\0\1\0\0\0\1\6google\3com\0\0\34\0\1\300\f\0\34"...,
iov_len=67}], msg_iovlen=1, msg_control=[{cmsg_len=32,
cmsg_level=SOL_IPV6, cmsg_type=0x32}], msg_controllen=32, msg_flags=0},
0) = 67
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6,
events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12,
events=POLLIN}, {fd=13, events=POLLIN}], 10, -1

I can confirm the client box has a link-local inet6 addr:
fe80::2677:3ff:fe47:8fec/64 Scope:Link. I think it 'just works' for me too.

However, I'm sure I recently saw some discussion of IPv6 link-local
'unresponsive' type issues in the LEDE chat room. Maybe worth asking
in there.

Kevin
Simon Kelley
2016-09-04 20:47:04 UTC
The traces you've both posted look good to me: dnsmasq is providing
the correct value in the sin6_scope_id field of the destination
address when sending the reply.

The obvious difference between the failing case and the working one is
that Toke is using an interface to a VLAN, eth1.1, whilst I used a
standard physical interface and Kevin used a bridge interface. Toke,
could you try on an interface other than a tagged VLAN, and see if you
get the same effect?

Also, try doing a packet capture on the physical interface underlying
the VLAN interface, to see if that gives you the packets that you're
not seeing by capturing the VLAN interface.

Cheers,

Simon.
Toke Høiland-Jørgensen
2016-09-05 14:42:54 UTC
Post by Simon Kelley
The traces you've both posted look good to me: dnsmasq is providing
the correct value in the sin6_scope_id field of the destination
address when sending the reply.
The obvious difference between the failing case and the working one is
that Toke is using an interface to a VLAN, eth1.1, whilst I used a
standard physical interface and Kevin used a bridge interface. Toke,
could you try on an interface other than a tagged VLAN, and see if you
get the same effect?
Also, try doing a packet capture on the physical interface underlying
the VLAN interface, to see if that gives you the packets that you're
not seeing by capturing the VLAN interface.
Tried all that, doesn't help. However, I have another box where things
work fine; "only" difference being the hardware. So I guess it's not a
bug in dnsmasq at least. Thanks for the help in debugging... :)

-Toke
Simon Kelley
2016-09-05 19:55:44 UTC
Duplicate MAC addresses, leading to duplicated link-local addresses?

Cheers,

Simon.
Toke Høiland-Jørgensen
2016-09-05 21:53:05 UTC
Post by Simon Kelley
Duplicate MAC addresses, leading to duplicated link-local addresses?
Yeah, there seems to be plenty of those. But changing the MAC address of
the affected interface (i.e. eth1.1) doesn't help. And the box that
works has even more duplicate addresses.

Guess I'll try to cook up a small test program and see if I can
reproduce the effect outside of dnsmasq. When I get some more time to
burn on this, that is.

In the meantime, I guess I'll just disable DHCPv6 on the affected hosts...

-Toke
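
One possible shape for such a test program, as a rough sketch rather than anything posted in the thread: a UDP echo server that replies to whatever sockaddr recvfrom() returned, scope ID included. The function names and port 5300 are arbitrary assumptions. Unlike the dnsmasq traces above, it uses plain sendto() and passes no ancillary data, so it exercises only the scope_id path.

```python
import socket


def format_peer(peer):
    """Render an AF_INET6 peer tuple (host, port, flowinfo, scope_id)."""
    host, port, _flowinfo, scope_id = peer
    # Some Python versions already append '%zone' to the host string.
    zone = f"%{scope_id}" if scope_id and "%" not in host else ""
    return f"[{host}{zone}]:{port}"


def echo_once(sock):
    """Receive one datagram and send it straight back to its sender.

    The peer tuple from recvfrom() already carries scope_id, so passing it
    unchanged to sendto() asks the kernel to reply on the arrival interface.
    """
    data, peer = sock.recvfrom(4096)
    sent = sock.sendto(data, peer)
    return format_peer(peer), sent


def serve(port=5300):
    """Bind to all IPv6 addresses and echo datagrams until interrupted."""
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        sock.bind(("::", port))
        while True:
            peer, sent = echo_once(sock)
            print(f"echoed {sent} bytes to {peer}")
```

Running serve() on the affected box and sending a datagram to its link-local address from a client, while capturing on the interface, would show whether replies go missing without dnsmasq involved.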
