Discussion:
[Dnsmasq-discuss] coredump in dnsmasq on InfiniBand network with OpenStack
Moshe Levi
2015-10-27 11:12:04 UTC
Hi,

I am seeing dnsmasq dump core in an OpenStack environment.
This is my setup:

1. RH 7.1

2. OpenStack Liberty release

3. dnsmasq-utils-2.66-14.el7_1.x86_64

4. We are using DHCP on an InfiniBand network (using the client ID; there is no MAC address)

Sometimes when spawning a VM, dnsmasq crashes; see [1] and [2].
Note that when a VM is spawned, the neutron-dhcp-agent (which manages the dnsmasq instances for OpenStack) sends SIGHUP to reload the config files.
After that I see:
Oct 27 12:11:18 r-smg37 neutron-dhcp-agent: 2015-10-27 12:11:18.213 44868 ERROR neutron.agent.linux.external_process [-] respawning dnsmasq for uuid 82acf0a3-ec07-4009-84b5-74f750c89dc6
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: started, version 2.66 cachesize 150
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: warning: no upstream servers configured
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: DHCP, static leases only on 192.168.111.0, lease time 1d
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/addn_hosts - 6 addresses
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/host
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/opts
Oct 27 12:11:18 r-smg37 kernel: dnsmasq[41374]: segfault at 7a ip 00007f886e5501e8 sp 00007fff9c540b80 error 4 in dnsmasq[7f886e51e000+43000]

This is how the neutron-dhcp-agent spawns dnsmasq:
dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file=/var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/host --addn-hosts=/var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/opts --dhcp-leasefile=/var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/leases --dhcp-match=set:ipxe,175 --bind-interfaces --interface=tap04c60fe7-62 --dhcp-range=set:tag0,192.168.111.0,static,86400s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal --dhcp-broadcast

I have also attached the config files: opts, leases, addn_hosts, host

Please note that this does not happen on the Ethernet network, only on InfiniBand, so I guess the crash is related to the client ID (as the logs also seem to suggest).
It would be great if you could help me debug this issue.
Thanks,
Moshe Levi.



[1] - /var/log/messages
Oct 27 12:11:18 r-smg37 neutron-dhcp-agent: 2015-10-27 12:11:18.213 44868 ERROR neutron.agent.linux.external_process [-] respawning dnsmasq for uuid 82acf0a3-ec07-4009-84b5-74f750c89dc6
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: started, version 2.66 cachesize 150
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: warning: no upstream servers configured
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: DHCP, static leases only on 192.168.111.0, lease time 1d
Oct 27 12:11:18 r-smg37 dnsmasq[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/addn_hosts - 6 addresses
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/host
Oct 27 12:11:18 r-smg37 dnsmasq-dhcp[41374]: read /var/lib/neutron/dhcp/82acf0a3-ec07-4009-84b5-74f750c89dc6/opts
Oct 27 12:11:18 r-smg37 kernel: dnsmasq[41374]: segfault at 7a ip 00007f886e5501e8 sp 00007fff9c540b80 error 4 in dnsmasq[7f886e51e000+43000]


[2]: CoreDump
[***@r-smg37 ~(keystone_admin)]# gdb /usr/sbin/dnsmasq /root/core-dnsmasq-11-99-40-46692-1445940738
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/dnsmasq...Reading symbols from /usr/lib/debug/usr/sbin/dnsmasq.debug...done.
done.
[New LWP 46692]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file='.
Program terminated with signal 11, Segmentation fault.
#0 find_config (configs=0x7f4b5b014350, context=context@entry=0x0, clid=0x7f4b5b00bc00 "\377", clid_len=20, hwaddr=hwaddr@entry=0x7f4b5b00bb90 "", hw_len=0, hw_type=32, hostname=hostname@entry=0x0)
at dhcp-common.c:319
319 if (!(context->flags & CONTEXT_V6) && *clid == 0 && config->clid_len == clid_len-1 &&
(gdb) bt
#0 find_config (configs=0x7f4b5b014350, context=context@entry=0x0, clid=0x7f4b5b00bc00 "\377", clid_len=20, hwaddr=hwaddr@entry=0x7f4b5b00bb90 "", hw_len=0, hw_type=32, hostname=hostname@entry=0x0)
at dhcp-common.c:319
#1 0x00007f4b5952575b in lease_update_from_configs () at lease.c:193
#2 0x00007f4b59521c2c in clear_cache_and_reload (now=1445940738) at dnsmasq.c:1236
#3 0x00007f4b5950c334 in async_event (now=1445940738, pipe=10) at dnsmasq.c:1049
#4 main (argc=<optimized out>, argv=<optimized out>) at dnsmasq.c:852
Simon Kelley
2015-11-21 22:14:13 UTC
This bug was fixed in the 2.67 release. The fix is here:

http://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=53c4c5c85942d4733f4723531c4d325235448326

the patch should apply fine to the version you're using, if that suits
you best.


Cheers,

Simon.
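
For readers following the backtrace: frame #0 shows find_config() entered with context=0x0, while the line at dhcp-common.c:319 reads context->flags, so the reload path dereferences a NULL pointer whenever the client-id comparison is reached. The sketch below illustrates a guard of that general shape. It is not the upstream patch (see the commit linked above for that). The cut-down structs, the function name find_config_by_clid, and the choice to treat a NULL context as DHCPv4 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for dnsmasq's real structures. */
#define CONTEXT_V6 (1u << 0)

struct dhcp_context {
    unsigned int flags;
};

struct dhcp_config {
    const unsigned char *clid;   /* configured client-id bytes */
    int clid_len;
    struct dhcp_config *next;
};

/* Look up a static config by client-id. 'context' may be NULL, as it is
 * in the backtrace above; this sketch assumes NULL means "not DHCPv6". */
struct dhcp_config *find_config_by_clid(struct dhcp_config *configs,
                                        const struct dhcp_context *context,
                                        const unsigned char *clid,
                                        int clid_len)
{
    assert(clid != NULL && clid_len > 0);

    /* Test the pointer before touching context->flags. */
    int is_v4 = (context == NULL) || !(context->flags & CONTEXT_V6);

    for (struct dhcp_config *config = configs; config; config = config->next) {
        /* Exact client-id match. */
        if (config->clid_len == clid_len &&
            memcmp(config->clid, clid, (size_t)clid_len) == 0)
            return config;

        /* DHCPv4 only: also accept a client-id carrying one extra
         * leading zero byte, mirroring the condition on line 319. */
        if (is_v4 && *clid == 0 && config->clid_len == clid_len - 1 &&
            memcmp(config->clid, clid + 1, (size_t)(clid_len - 1)) == 0)
            return config;
    }
    return NULL;
}
```

With a guard like this, a reload that calls the lookup without a DHCP context falls through to the client-id comparison instead of faulting.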