deCONZ upgrade from 2.11.5 to 2.12.6 crashes

Please send it to support@dresden-elektronik.de and include “for manup” in the mail.

sent

/R

You should be able to get the backtrace even without enabling core dumps by running deCONZ through gdb directly, I guess.

Headless
gdb --args /usr/bin/deCONZ -platform minimal

With GUI
gdb --args /usr/bin/deCONZ

You might need to add further deCONZ parameters according to your requirements. Once gdb has started, type r and press Enter to let it run, then wait for the crash (the assumption is that it occurs rather quickly).

I’ve tested your database file with v2.12.6 on my PC with a ConBee II and on a Raspberry Pi 3B with a RaspBee II, no issues here. So my assumption that this could be device-based might be wrong, at least for the database loading stage.

Hm …
I am running Pi3+ with RaspBee II with all packages up-to-date.
uname -a :
Linux fhem 5.10.52-v8+ #1441 SMP PREEMPT Tue Aug 3 18:14:03 BST 2021 aarch64 GNU/Linux

Haven’t installed anything special. This setup has been running stable for 2 years now.

Are there any Linux/Raspbian OS related dependencies for this 2.12.6 version?

At the end, I can continue with V2.11.5 …

My testing setup is the 32-bit version of Raspbian, not sure if that makes any difference.

Linux phoscon 5.10.52-v7+ #1441 SMP Tue Aug 3 18:10:09 BST 2021 armv7l GNU/Linux

There are no new dependencies, so this shouldn’t be a problem.

Can you try the gdb command which Swoop suggested?

gdb --args /usr/bin/deCONZ -platform minimal

And once gdb runs, type r and press Enter.

fhem@fhem(2.10):~ $ uname -a
Linux fhem 5.10.52-v8+ #1441 SMP PREEMPT Tue Aug 3 18:14:03 BST 2021 aarch64 GNU/Linux
fhem@fhem(2.10):~ $ gdb --args /usr/bin/deCONZ -platform minimal
GNU gdb (Raspbian 8.2.1-2) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later …
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type “show copying” and “show warranty” for details.
This GDB was configured as “arm-linux-gnueabihf”.
Type “show configuration” for configuration details.
For bug reporting instructions, please see:

Find the GDB manual and other documentation resources online at:

For help, type “help”.
Type “apropos word” to search for commands related to “word”…
Reading symbols from /usr/bin/deCONZ…(no debugging symbols found)…done.
(gdb) r
Starting program: /usr/bin/deCONZ -platform minimal
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/arm-linux-gnueabihf/libthread_db.so.1”.
libpng warning: iCCP: known incorrect sRGB profile
[New Thread 0xf3c7a430 (LWP 6220)]
[New Thread 0xf307a430 (LWP 6262)]
[New Thread 0xf2448430 (LWP 6268)]
This plugin does not support propagateSizeHints()
This plugin does not support propagateSizeHints()
[Detaching after fork from child process 6281]
[New Thread 0xf140f430 (LWP 6306)]
[New Thread 0xf0aff430 (LWP 6307)]
[Thread 0xf0aff430 (LWP 6307) exited]

Note - I’ve restored my backup, so 2.11.5 is running in my house - otherwise Natalie would cry …

I can reproduce the scenario tomorrow with apt-get upgrade & reboot if you want to see the gdb output of 2.12.6.

let me know

Same here… the crash occurs only if Conbee II is connected via USB…

@jihlenburg Thanks for sharing the screenshot. Unfortunately, this does not tell us as much as expected. Could you maybe repeat this, but next time, when execution stops and the source of the crash is displayed, enter the command bt and hit Enter?

That should provide some background around this.

OK, I went back to 2.12.6 just to test and get some more detail for you. Output from gdb below. Hope this helps solve the issue:

gdb --args /usr/bin/deCONZ -platform minimal

GNU gdb (Raspbian 8.2.1-2) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type “show copying” and “show warranty” for details.
This GDB was configured as “arm-linux-gnueabihf”.
Type “show configuration” for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type “help”.
Type “apropos word” to search for commands related to “word”…
Reading symbols from /usr/bin/deCONZ…(no debugging symbols found)…done.

(gdb) r
Starting program: /usr/bin/deCONZ -platform minimal
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/arm-linux-gnueabihf/libthread_db.so.1”.
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to ‘/tmp/runtime-root’
libpng warning: iCCP: known incorrect sRGB profile
[New Thread 0xf3d50430 (LWP 31037)]
[New Thread 0xf3117430 (LWP 31038)]
This plugin does not support propagateSizeHints()
This plugin does not support propagateSizeHints()
[Detaching after fork from child process 31040]
This plugin does not support propagateSizeHints()

Thread 1 “deCONZ” received signal SIGBUS, Bus error.
0xf7f5924c in zmNeighbor::zmNeighbor(char const*, unsigned int) () from /usr/lib/libdeCONZ.so.1

(gdb) bt
#0 0xf7f5924c in zmNeighbor::zmNeighbor(char const*, unsigned int) () from /usr/lib/libdeCONZ.so.1
#1 0x0008da9c in ?? ()
#2 0x000668c8 in ?? ()
#3 0x000673d4 in ?? ()
#4 0x000ce854 in ?? ()
#5 0xf7073244 in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib/arm-linux-gnueabihf/libQt5Core.so.5
#6 0x0006a4ec in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Oddly, at the moment, deconz is running: -

systemctl status deconz
● deconz.service - deCONZ: ZigBee gateway – REST API
Loaded: loaded (/lib/systemd/system/deconz.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2021-09-01 10:20:22 BST; 4min 10s ago
Main PID: 31528 (deCONZ)
Tasks: 4 (limit: 3949)
CGroup: /system.slice/deconz.service
└─31528 /usr/bin/deCONZ -platform minimal --http-port=80

Sep 01 10:20:22 raspberrypi4 systemd[1]: Started deCONZ: ZigBee gateway – REST API.
Sep 01 10:20:22 raspberrypi4 deCONZ[31528]: QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to ‘/tmp/runtime-pi’
Sep 01 10:20:22 raspberrypi4 deCONZ[31528]: libpng warning: iCCP: known incorrect sRGB profile
Sep 01 10:20:23 raspberrypi4 deCONZ[31528]: This plugin does not support propagateSizeHints()
Sep 01 10:20:23 raspberrypi4 deCONZ[31528]: This plugin does not support propagateSizeHints()

But doesn’t see my Conbee II


Ah, on checking, it keeps restarting: it stops after 2 seconds, then restarts in ~30 seconds.
I’ve reverted back to 2.11.5 now.

@manup I’ve played around a little bit… I can confirm this bug while running the 64-bit kernel. Apparently, there’s a memory alignment issue.

The output of the 2 upper terminals, however, is with the 32-bit kernel after having changed /proc/cpu/alignment to 3 on deCONZ version 2.12.06. The lower terminal is deCONZ running for 2 mins without any bus error on version 2.11.05.

No clue what has changed, but it apparently must be in the deCONZ core. Hope that helps nail the issue down. If you need anything, just let me know.
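
For readers wondering what an alignment issue looks like in code: a 64-bit load through a typed pointer is only guaranteed to work when the address is a multiple of that type's alignment. Below is a minimal C++ sketch of such a check. `isAlignedFor` is a hypothetical helper name made up for illustration; it is not part of deCONZ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from deCONZ): returns true when the address p
// is a multiple of alignof(T), i.e. when a T may legally live at p.
template <typename T>
inline bool isAlignedFor(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A buffer that starts on an 8-byte boundary passes this check for `std::uint64_t`, while the same buffer plus one byte does not, which is the kind of address that can show up when a 64-bit field is read out of the middle of a byte stream.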


Spot on! Alignment can badly bite ARM controllers.

The younger me wrote evil code like:

    inline quint64 toU64(const void *p)
    {
        return *reinterpret_cast<const quint64*>(p);
    }

This is only used in the zmNeighbor() constructor. Fix will be in the next version.
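
A common way to make such a read alignment-safe is to copy the bytes with `std::memcpy` instead of dereferencing a cast pointer. The sketch below shows that pattern under a few assumptions: it is not necessarily the fix that went into deCONZ, `loadU64` is a hypothetical name, and `std::uint64_t` stands in for Qt's `quint64` so the snippet compiles without Qt.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical alignment-safe variant of toU64() (an assumption, not the
// actual deCONZ fix): copies 8 bytes from p into a properly aligned local
// variable, so p itself may have any alignment.
inline std::uint64_t loadU64(const void *p)
{
    std::uint64_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}
```

Like the original cast, this reads the bytes in host byte order, so the result is the same wherever the cast happened to work. Compilers typically turn a fixed-size `memcpy` like this into a plain load where the target allows it, so the safe version usually costs nothing.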


Hi, and thanks for this information!
Any estimate when the next version will be available? I am eagerly waiting for support of a 4-way Zigbee switch which is supposed to be in 2.12.06, but I cannot use 2.12.06 currently because of exactly this bug.
If this is probably months away, would a 2.12.07 interim release which only fixes this bug be possible?

Thank you!

Any solution available to this problem?
I have the same problem and am still waiting for a suitable solution.

regards

I have since reinstalled my RPi due to a defective SD card, and switched to the 64-bit Raspberry Pi OS (not just the kernel, the whole system). It seems this also solves the problem; the error does not occur any more.