Mark Fermor, HolidayExtras.com
2018-06-08 15:20:06 UTC
Hello,
Running dnsmasq with these options:
/usr/sbin/dnsmasq -k --cache-size=50 --log-facility=- --user=nobody
--group=nobody --no-hosts --neg-ttl=60 --max-ttl=240 --max-cache-ttl=300
No local dnsmasq config file so that's literally all the config other than
defaults applied by dnsmasq
dnsmasq -v
Dnsmasq version 2.78 Copyright (c) 2000-2017 Simon Kelley
Compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6
no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
This is something running running in Kubernetes, they run as sidekick
containers to the main application container. I have 5x of the same
deployment running in the cluster, so they're all at the same versions and
receiving equal amounts of traffic via load balancing. They all talk to the
same endpoints outbound and do the same work load etc. I've sent sigusr1
signal to all of the 5x pods individually with a 60 second interval between
sending the signal and this was the output (all pods have been running for
approx 48 hours bar pod4 which has been running less hours):
pod1
I0608 15:10:30.222091 1 nanny.go:116] dnsmasq[19]: time 1528470630
I0608 15:10:30.222843 1 nanny.go:116] dnsmasq[19]: cache size 50,
0/2267390 cache insertions re-used unexpired cache entries.
I0608 15:10:30.222862 1 nanny.go:116] dnsmasq[19]: queries forwarded
3218532, queries answered locally 3179484
I0608 15:10:30.222868 1 nanny.go:116] dnsmasq[19]: queries for
authoritative zones 0
I0608 15:10:30.222874 1 nanny.go:116] dnsmasq[19]: server
10.227.240.10#53: queries sent 3218532, retried or failed 16
I0608 15:11:34.998127 1 nanny.go:116] dnsmasq[19]: time 1528470694
I0608 15:11:34.998169 1 nanny.go:116] dnsmasq[19]: cache size 50,
0/2267416 cache insertions re-used unexpired cache entries.
I0608 15:11:34.998175 1 nanny.go:116] dnsmasq[19]: queries forwarded
3218560, queries answered locally 3182486
I0608 15:11:34.998180 1 nanny.go:116] dnsmasq[19]: queries for
authoritative zones 0
I0608 15:11:34.998184 1 nanny.go:116] dnsmasq[19]: server
10.227.240.10#53: queries sent 3218560, retried or failed 16
pod2
I0608 15:10:30.888610 1 nanny.go:116] dnsmasq[18]: time 1528470630
I0608 15:10:30.888674 1 nanny.go:116] dnsmasq[18]: cache size 50,
0/197436 cache insertions re-used unexpired cache entries.
I0608 15:10:30.888705 1 nanny.go:116] dnsmasq[18]: queries forwarded
240814, queries answered locally 6156846
I0608 15:10:30.888739 1 nanny.go:116] dnsmasq[18]: queries for
authoritative zones 0
I0608 15:10:30.888768 1 nanny.go:116] dnsmasq[18]: server
10.227.240.10#53: queries sent 240814, retried or failed 4
I0608 15:11:35.909168 1 nanny.go:116] dnsmasq[18]: time 1528470695
I0608 15:11:35.909206 1 nanny.go:116] dnsmasq[18]: cache size 50,
0/197465 cache insertions re-used unexpired cache entries.
I0608 15:11:35.909211 1 nanny.go:116] dnsmasq[18]: queries forwarded
240843, queries answered locally 6159789
I0608 15:11:35.909216 1 nanny.go:116] dnsmasq[18]: queries for
authoritative zones 0
I0608 15:11:35.909219 1 nanny.go:116] dnsmasq[18]: server
10.227.240.10#53: queries sent 240843, retried or failed 4
pod3
I0608 15:10:31.492089 1 nanny.go:116] dnsmasq[20]: time 1528470631
I0608 15:10:31.492127 1 nanny.go:116] dnsmasq[20]: cache size 50,
0/63615 cache insertions re-used unexpired cache entries.
I0608 15:10:31.492132 1 nanny.go:116] dnsmasq[20]: queries forwarded
45974, queries answered locally 6344157
I0608 15:10:31.492137 1 nanny.go:116] dnsmasq[20]: queries for
authoritative zones 0
I0608 15:10:31.492143 1 nanny.go:116] dnsmasq[20]: server
10.227.240.10#53: queries sent 45974, retried or failed 1
I0608 15:11:36.948015 1 nanny.go:116] dnsmasq[20]: time 1528470696
I0608 15:11:36.948083 1 nanny.go:116] dnsmasq[20]: cache size 50,
0/63648 cache insertions re-used unexpired cache entries.
I0608 15:11:36.948138 1 nanny.go:116] dnsmasq[20]: queries forwarded
46004, queries answered locally 6347223
I0608 15:11:36.948188 1 nanny.go:116] dnsmasq[20]: queries for
authoritative zones 0
I0608 15:11:36.948219 1 nanny.go:116] dnsmasq[20]: server
10.227.240.10#53: queries sent 46004, retried or failed 1
pod4
I0608 15:10:32.290488 1 nanny.go:116] dnsmasq[24]: time 1528470632
I0608 15:10:32.290564 1 nanny.go:116] dnsmasq[24]: cache size 50,
0/1358169 cache insertions re-used unexpired cache entries.
I0608 15:10:32.290597 1 nanny.go:116] dnsmasq[24]: queries forwarded
1937179, queries answered locally 741609
I0608 15:10:32.290629 1 nanny.go:116] dnsmasq[24]: queries for
authoritative zones 0
I0608 15:10:32.290652 1 nanny.go:116] dnsmasq[24]: server
10.227.240.10#53: queries sent 1937179, retried or failed 7
I0608 15:11:38.032330 1 nanny.go:116] dnsmasq[24]: time 1528470698
I0608 15:11:38.032374 1 nanny.go:116] dnsmasq[24]: cache size 50,
0/1359727 cache insertions re-used unexpired cache entries.
I0608 15:11:38.032382 1 nanny.go:116] dnsmasq[24]: queries forwarded
1939395, queries answered locally 742411
I0608 15:11:38.032388 1 nanny.go:116] dnsmasq[24]: queries for
authoritative zones 0
I0608 15:11:38.032394 1 nanny.go:116] dnsmasq[24]: server
10.227.240.10#53: queries sent 1939395, retried or failed 7
pod5
I0608 15:10:32.941318 1 nanny.go:116] dnsmasq[22]: time 1528470632
I0608 15:10:32.941364 1 nanny.go:116] dnsmasq[22]: cache size 50,
0/62517 cache insertions re-used unexpired cache entries.
I0608 15:10:32.941371 1 nanny.go:116] dnsmasq[22]: queries forwarded
45651, queries answered locally 6328926
I0608 15:10:32.941411 1 nanny.go:116] dnsmasq[22]: queries for
authoritative zones 0
I0608 15:10:32.941415 1 nanny.go:116] dnsmasq[22]: server
10.227.240.10#53: queries sent 45651, retried or failed 1
I0608 15:11:38.965572 1 nanny.go:116] dnsmasq[22]: time 1528470698
I0608 15:11:38.965702 1 nanny.go:116] dnsmasq[22]: cache size 50,
0/62538 cache insertions re-used unexpired cache entries.
I0608 15:11:38.965771 1 nanny.go:116] dnsmasq[22]: queries forwarded
45673, queries answered locally 6331954
I0608 15:11:38.965822 1 nanny.go:116] dnsmasq[22]: queries for
authoritative zones 0
I0608 15:11:38.965875 1 nanny.go:116] dnsmasq[22]: server
10.227.240.10#53: queries sent 45673, retried or failed 1
The problem I notice, is some pods (pod1, pod2, pod4) are forwarding far
more requests than the others. I'm not sure what's caused this seeing as
the application is doing the same across all 5 pods. The only thing I
notice here, is that pods 1/2/4 all have a number of "retried or failed",
which the other pods don't. Therefore I wonder if the reason that those
pods have sent so many more requests upstream instead of hitting the cache,
is because of something to do with a "retried or failed", which then stops
the cache from working for a decent period of time? I've not been able to
find anything (google) relating to this scenario but it's the only thing
that makes sense to me right now. I can accept a couple of failures for
lookup here and there, but one failure (if i'm onto something that is),
seems to then cause no cache hits for a large period of time?
Thanks and Best,
Mark
Running dnsmasq with these options:
/usr/sbin/dnsmasq -k --cache-size=50 --log-facility=- --user=nobody
--group=nobody --no-hosts --neg-ttl=60 --max-ttl=240 --max-cache-ttl=300
No local dnsmasq config file so that's literally all the config other than
defaults applied by dnsmasq
dnsmasq -v
Dnsmasq version 2.78 Copyright (c) 2000-2017 Simon Kelley
Compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6
no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
This is something running running in Kubernetes, they run as sidekick
containers to the main application container. I have 5x of the same
deployment running in the cluster, so they're all at the same versions and
receiving equal amounts of traffic via load balancing. They all talk to the
same endpoints outbound and do the same work load etc. I've sent sigusr1
signal to all of the 5x pods individually with a 60 second interval between
sending the signal and this was the output (all pods have been running for
approx 48 hours bar pod4 which has been running less hours):
pod1
I0608 15:10:30.222091 1 nanny.go:116] dnsmasq[19]: time 1528470630
I0608 15:10:30.222843 1 nanny.go:116] dnsmasq[19]: cache size 50,
0/2267390 cache insertions re-used unexpired cache entries.
I0608 15:10:30.222862 1 nanny.go:116] dnsmasq[19]: queries forwarded
3218532, queries answered locally 3179484
I0608 15:10:30.222868 1 nanny.go:116] dnsmasq[19]: queries for
authoritative zones 0
I0608 15:10:30.222874 1 nanny.go:116] dnsmasq[19]: server
10.227.240.10#53: queries sent 3218532, retried or failed 16
I0608 15:11:34.998127 1 nanny.go:116] dnsmasq[19]: time 1528470694
I0608 15:11:34.998169 1 nanny.go:116] dnsmasq[19]: cache size 50,
0/2267416 cache insertions re-used unexpired cache entries.
I0608 15:11:34.998175 1 nanny.go:116] dnsmasq[19]: queries forwarded
3218560, queries answered locally 3182486
I0608 15:11:34.998180 1 nanny.go:116] dnsmasq[19]: queries for
authoritative zones 0
I0608 15:11:34.998184 1 nanny.go:116] dnsmasq[19]: server
10.227.240.10#53: queries sent 3218560, retried or failed 16
pod2
I0608 15:10:30.888610 1 nanny.go:116] dnsmasq[18]: time 1528470630
I0608 15:10:30.888674 1 nanny.go:116] dnsmasq[18]: cache size 50,
0/197436 cache insertions re-used unexpired cache entries.
I0608 15:10:30.888705 1 nanny.go:116] dnsmasq[18]: queries forwarded
240814, queries answered locally 6156846
I0608 15:10:30.888739 1 nanny.go:116] dnsmasq[18]: queries for
authoritative zones 0
I0608 15:10:30.888768 1 nanny.go:116] dnsmasq[18]: server
10.227.240.10#53: queries sent 240814, retried or failed 4
I0608 15:11:35.909168 1 nanny.go:116] dnsmasq[18]: time 1528470695
I0608 15:11:35.909206 1 nanny.go:116] dnsmasq[18]: cache size 50,
0/197465 cache insertions re-used unexpired cache entries.
I0608 15:11:35.909211 1 nanny.go:116] dnsmasq[18]: queries forwarded
240843, queries answered locally 6159789
I0608 15:11:35.909216 1 nanny.go:116] dnsmasq[18]: queries for
authoritative zones 0
I0608 15:11:35.909219 1 nanny.go:116] dnsmasq[18]: server
10.227.240.10#53: queries sent 240843, retried or failed 4
pod3
I0608 15:10:31.492089 1 nanny.go:116] dnsmasq[20]: time 1528470631
I0608 15:10:31.492127 1 nanny.go:116] dnsmasq[20]: cache size 50,
0/63615 cache insertions re-used unexpired cache entries.
I0608 15:10:31.492132 1 nanny.go:116] dnsmasq[20]: queries forwarded
45974, queries answered locally 6344157
I0608 15:10:31.492137 1 nanny.go:116] dnsmasq[20]: queries for
authoritative zones 0
I0608 15:10:31.492143 1 nanny.go:116] dnsmasq[20]: server
10.227.240.10#53: queries sent 45974, retried or failed 1
I0608 15:11:36.948015 1 nanny.go:116] dnsmasq[20]: time 1528470696
I0608 15:11:36.948083 1 nanny.go:116] dnsmasq[20]: cache size 50,
0/63648 cache insertions re-used unexpired cache entries.
I0608 15:11:36.948138 1 nanny.go:116] dnsmasq[20]: queries forwarded
46004, queries answered locally 6347223
I0608 15:11:36.948188 1 nanny.go:116] dnsmasq[20]: queries for
authoritative zones 0
I0608 15:11:36.948219 1 nanny.go:116] dnsmasq[20]: server
10.227.240.10#53: queries sent 46004, retried or failed 1
pod4
I0608 15:10:32.290488 1 nanny.go:116] dnsmasq[24]: time 1528470632
I0608 15:10:32.290564 1 nanny.go:116] dnsmasq[24]: cache size 50,
0/1358169 cache insertions re-used unexpired cache entries.
I0608 15:10:32.290597 1 nanny.go:116] dnsmasq[24]: queries forwarded
1937179, queries answered locally 741609
I0608 15:10:32.290629 1 nanny.go:116] dnsmasq[24]: queries for
authoritative zones 0
I0608 15:10:32.290652 1 nanny.go:116] dnsmasq[24]: server
10.227.240.10#53: queries sent 1937179, retried or failed 7
I0608 15:11:38.032330 1 nanny.go:116] dnsmasq[24]: time 1528470698
I0608 15:11:38.032374 1 nanny.go:116] dnsmasq[24]: cache size 50,
0/1359727 cache insertions re-used unexpired cache entries.
I0608 15:11:38.032382 1 nanny.go:116] dnsmasq[24]: queries forwarded
1939395, queries answered locally 742411
I0608 15:11:38.032388 1 nanny.go:116] dnsmasq[24]: queries for
authoritative zones 0
I0608 15:11:38.032394 1 nanny.go:116] dnsmasq[24]: server
10.227.240.10#53: queries sent 1939395, retried or failed 7
pod5
I0608 15:10:32.941318 1 nanny.go:116] dnsmasq[22]: time 1528470632
I0608 15:10:32.941364 1 nanny.go:116] dnsmasq[22]: cache size 50,
0/62517 cache insertions re-used unexpired cache entries.
I0608 15:10:32.941371 1 nanny.go:116] dnsmasq[22]: queries forwarded
45651, queries answered locally 6328926
I0608 15:10:32.941411 1 nanny.go:116] dnsmasq[22]: queries for
authoritative zones 0
I0608 15:10:32.941415 1 nanny.go:116] dnsmasq[22]: server
10.227.240.10#53: queries sent 45651, retried or failed 1
I0608 15:11:38.965572 1 nanny.go:116] dnsmasq[22]: time 1528470698
I0608 15:11:38.965702 1 nanny.go:116] dnsmasq[22]: cache size 50,
0/62538 cache insertions re-used unexpired cache entries.
I0608 15:11:38.965771 1 nanny.go:116] dnsmasq[22]: queries forwarded
45673, queries answered locally 6331954
I0608 15:11:38.965822 1 nanny.go:116] dnsmasq[22]: queries for
authoritative zones 0
I0608 15:11:38.965875 1 nanny.go:116] dnsmasq[22]: server
10.227.240.10#53: queries sent 45673, retried or failed 1
The problem I notice, is some pods (pod1, pod2, pod4) are forwarding far
more requests than the others. I'm not sure what's caused this seeing as
the application is doing the same across all 5 pods. The only thing I
notice here, is that pods 1/2/4 all have a number of "retried or failed",
which the other pods don't. Therefore I wonder if the reason that those
pods have sent so many more requests upstream instead of hitting the cache,
is because of something to do with a "retried or failed", which then stops
the cache from working for a decent period of time? I've not been able to
find anything (google) relating to this scenario but it's the only thing
that makes sense to me right now. I can accept a couple of failures for
lookup here and there, but one failure (if i'm onto something that is),
seems to then cause no cache hits for a large period of time?
Thanks and Best,
Mark