Bulbs become uncontrollable after a period of time, only groups work

Hi
I’m struggling with some issues on my (bigger) network which consists of at least 80 bulbs (from IKEA, Philips, and no-name) + over 100 sensors and switches.
System is up to date, running a ConBee 2 in Docker with 2.17.01.
Some bulbs can’t be controlled after some time, even though they claim to be online.
The only way to control those bulbs regardless is to use them in a group. That works for whatever reason.
Power cycling these bulbs sometimes helps. The next step is a reset and re-connect, which works in 99% of cases, but unfortunately not forever. After some minutes they might become uncontrollable again (but still connected).
I spent a lot of time figuring out a pattern but it is hard to find one.
It seems that one single (still connected) bulb becomes fishy at some point and infects the whole system. If I find that bulb and reset and reconnect it, the network works fine for a longer time. But it is just a matter of time until the problem starts again.
Any idea how I can debug this issue?
I don’t have source routing enabled. Should I?
NWK counter = 3, shall I adjust?
Thanks,
Thomas

Can you share some logs :)?

In #deconz you can find out how and what log levels.

Yes, sure. I collected 4 secs of logs in a text file while switching some non-reacting but connected bulbs… How shall I share?

Pastebin :)

Logfile

Here we go.
One of the faulty bulbs is this here: 00124b0022d365e1
After I switched it on and off again it finally turned into “unreachable”. But there were others that didn’t turn “unreachable”, but are still not controllable. (Like 00124b0022d36940). All of the bulbs can be controlled still via groups.

I don’t see anything obviously wrong. @de_employees can you check?

I’ll investigate and get back to you ASAP.

The log contains the error codes E9, A7 and D0. I would check that a USB extension cable is in use, keep the stick at a distance from interference sources like SSDs, and use the original power supply if you are running a Raspberry Pi.

You can also update the firmware to the newest 26780700 on a native system (no Docker, VM …) like Raspbian.
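
For reference, the three codes can be read as status values from the IEEE 802.15.4 / Zigbee status enumerations. The table below is a minimal sketch under that assumption; the names come from the specifications, not from the deCONZ source, and the one-line hints are my own paraphrase.

```python
# Assumed meanings of the status codes named above, taken from the
# IEEE 802.15.4 (MAC) and Zigbee (NWK/APS) status enumerations.
# The hints are paraphrases, not text from deCONZ.
ZIGBEE_STATUS = {
    0xE9: ("MAC", "NO_ACK", "no link-level acknowledgement from the next hop"),
    0xA7: ("APS", "NO_ACK", "no end-to-end acknowledgement from the target"),
    0xD0: ("NWK", "ROUTE_DISCOVERY_FAILED", "no route to the target was found"),
}


def describe_status(code: int) -> str:
    """Return a one-line description for a status byte, or UNKNOWN."""
    layer, name, hint = ZIGBEE_STATUS.get(
        code, ("?", "UNKNOWN", "not in this table")
    )
    return f"0x{code:02X} {layer} {name}: {hint}"


if __name__ == "__main__":
    for code in (0xE9, 0xA7, 0xD0):
        print(describe_status(code))
```

All three point at frames not getting through (missing acknowledgements or a missing route), which fits a radio or routing problem rather than a single broken bulb.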

Those probably show up when the bulbs drop out.

The firmware was already updated, or did you see anything in the log that doesn’t match?

I’m already using an extension, but I ordered a newer and longer one. Let’s see if this helps.
Thomas

The new extension is installed, but bulbs are still dropping out occasionally.
I dug into the logs and found one bulb that is still connected and reacts to group commands, but is not controllable individually.
Log throws:
Delay APS request id: 83 to 0xEDED, profile: 0x0104 cluster: 0x0006 node already has busy 1
→ What does this mean?
Thanks
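
One way to dig further is to count how often each node shows up in such lines. The sketch below parses only the exact line format quoted above; reading "node already has busy 1" as "one request to this node is still pending, so the new one is held back" is my assumption, not something confirmed in this thread. Profile 0x0104 is Zigbee Home Automation and cluster 0x0006 is On/Off, so the delayed request here is a plain on/off command.

```python
import re
from collections import Counter

# Matches only the line format quoted in the post above.
DELAY_RE = re.compile(
    r"Delay APS request id: (?P<id>\d+) to (?P<dest>0x[0-9A-Fa-f]{4}), "
    r"profile: (?P<profile>0x[0-9A-Fa-f]{4}) "
    r"cluster: (?P<cluster>0x[0-9A-Fa-f]{4}) "
    r"node already has busy (?P<busy>\d+)"
)


def count_delayed(lines):
    """Count delayed APS requests per (destination address, cluster)."""
    counts = Counter()
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            counts[(match["dest"], match["cluster"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Delay APS request id: 83 to 0xEDED, profile: 0x0104 "
        "cluster: 0x0006 node already has busy 1",
    ]
    for (dest, cluster), n in count_delayed(sample).most_common():
        print(dest, cluster, n)
```

Feeding a longer log through `count_delayed` shows whether the delays cluster on a few short addresses, which would narrow down which bulbs to look at first.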

The respective bulb is now marked as “unreachable” and has no visible connections, but still reacts to group commands!

Have you upgraded the Ikea bulbs to the latest FW?
I used to have the same problem when they were on 1.xx.yy something.
Not anymore …

All Ikea bulbs have been updated recently and don’t seem to have an issue.

Still - Any idea why these issues occur? Why does a bulb still react to group commands, but not to individual commands, when it is already marked as “not connected”. And the other way around: The bulb shows reachable, but is not controllable. That’s so weird!

Erik said: https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1261#issuecomment-463722469

Apparently group commands can still work even if unicast does not.
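
To compare the two paths side by side, the same on/off command can be sent once to a single light and once to a group through the REST API. This is a minimal sketch: the gateway address, API key and ids are hypothetical placeholders, and the explanation that group commands go out as a broadcast without per-node acknowledgement, while unicast needs a working route and an acknowledgement, is general Zigbee behaviour as I understand it, not a statement taken from the linked comment.

```python
import json
import urllib.request


def build_request(base_url, api_key, kind, target_id, on):
    """Build (but do not send) a PUT request for one light or one group."""
    if kind == "light":
        path = f"/api/{api_key}/lights/{target_id}/state"
    elif kind == "group":
        path = f"/api/{api_key}/groups/{target_id}/action"
    else:
        raise ValueError("kind must be 'light' or 'group'")
    body = json.dumps({"on": on}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request, timeout=5.0):
    """Send a prepared request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage with placeholder values (hypothetical host, key and ids):
#   unicast = build_request("http://192.0.2.10:80", "APIKEY", "light", "12", True)
#   group = build_request("http://192.0.2.10:80", "APIKEY", "group", "3", True)
#   print(send(unicast)); print(send(group))
```

If the group request switches the bulb while the light request does not, the bulb is still receiving frames from the network and the problem is more likely on the unicast path to that node.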

Hi all:
Quick and interesting update:
After trying multiple times to re-connect “bad devices”, I ordered some new bulbs and installed them. And guess what: those new devices are showing the same behaviour! This sounds totally weird to me.
My only thought is that there is some Wi-Fi congestion in exactly those places (outside in the garden).
What do you think?
My setup:
Zigbee channel: 15
My own 2.4 GHz WiFi channel: 7-11 (40 MHz bandwidth)
But there are a couple of other WiFis from my neighbours sending on:
channel 6, 5, 1

Would it make sense to move the Zigbee channel, and what happens if I do that? Do I lose all my connections? That would be a nightmare!

Thomas
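
The channel numbers above can be turned into frequency ranges to see which Wi-Fi networks actually sit on top of the Zigbee channel. A rough sketch, with these assumptions: Zigbee channels are about 2 MHz wide, the neighbours' networks are 20 MHz wide, "7-11 (40 MHz)" is treated as one 40 MHz block centred on Wi-Fi channel 9, and 11, 15, 20 and 25 are taken as the candidate Zigbee channels. Real signals spill beyond these nominal edges, so this is only a first approximation.

```python
def zigbee_band(channel, width_mhz=2.0):
    """Return (low, high) in MHz for a 2.4 GHz Zigbee channel (11..26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    centre = 2405 + 5 * (channel - 11)
    return (centre - width_mhz / 2, centre + width_mhz / 2)


def wifi_band(centre_channel, width_mhz=20.0):
    """Return (low, high) in MHz for a 2.4 GHz Wi-Fi channel (1..13)."""
    if not 1 <= centre_channel <= 13:
        raise ValueError("Wi-Fi 2.4 GHz channels handled here are 1..13")
    centre = 2407 + 5 * centre_channel
    return (centre - width_mhz / 2, centre + width_mhz / 2)


def overlaps(a, b):
    """True if two (low, high) ranges share any spectrum."""
    return a[0] < b[1] and b[0] < a[1]


# Networks from the post; widths and the centre of the 40 MHz block are assumed.
WIFI_NETWORKS = {
    "own 7-11 (40 MHz)": wifi_band(9, 40.0),
    "neighbour ch 6": wifi_band(6),
    "neighbour ch 5": wifi_band(5),
    "neighbour ch 1": wifi_band(1),
}


def interferers(zigbee_channel, networks=WIFI_NETWORKS):
    """List the Wi-Fi networks whose nominal band overlaps a Zigbee channel."""
    zb = zigbee_band(zigbee_channel)
    return [name for name, band in networks.items() if overlaps(zb, band)]


if __name__ == "__main__":
    for ch in (11, 15, 20, 25):  # assumed candidate channels
        print(ch, interferers(ch))
```

Under these assumptions, Zigbee channel 15 (2425 MHz) overlaps only the neighbour on Wi-Fi channel 5, channel 20 falls inside the own 40 MHz block, channel 11 overlaps the neighbour on channel 1, and channel 25 is the only candidate clear of all four networks.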

Here is an update and some tips for anyone facing similar issues:
I tried so many things, added many more repeaters but finally came to the conclusion that there is an issue getting the signals out of my house and that connections outside of my house are simply bad because of the housings of my bulbs. So I decided to split the mesh into 2 networks. One for inside and one for outside.
So I ordered a second ConBee 2 and set up a Raspberry Pi for the garden. I selected a different channel to make sure that there isn’t any interference between the 2 networks. I moved all garden bulbs to the new mesh and adjusted all my scripts in ioBroker. That finally did the trick and both networks are pretty stable now.
Despite the fact that I still have bad connections outside, both networks work well. Devices haven’t dropped for days, but I’ll keep an eye on that.
Attaching 2 pictures of the current state of the networks…