I’ve just captured logs for another occurrence of what I think may be a related, but not necessarily identical, problem. We have one Philips Hue downlight and a Hue motion sensor in our toilet room, and in the instance below the light failed to turn on when the motion sensor detected motion. The motion sensor is not directly linked to the light in deCONZ; I’m using Node-RED and Home Assistant for additional logic.
I know that the logic in HA and Node-RED is not the problem, as the light changed state to “on” in Home Assistant (as you can see in the screenshot below), but the actual light stayed off - similar to my scene problem described above.
I realised that I didn’t capture the APS and APS_L2 logs above. Below is another capture with those included and I hope that can shed a bit more light on what is going on.
In this case a downlight should’ve turned on at 21:51:36:270 but it didn’t.
Below are a few handpicked log entries from the above that look suspicious to me. It seems to be retrying an APS-DATA.request 5 times without success, and we’re seeing status: BUSY and failed-status: APP_BUSY.
Just bumping this. Is there anything in the logs that would indicate where the problem lies? Is it something in my setup or configuration, is it potentially a bug in deCONZ or even faulty lights?
Also, seeing that a request is being retried and then given up on after 5 attempts, would it be possible to increase the number of retries to improve the chances of eventual success?
@Mimiix Sorry to be a pest, but I really need to get to the bottom of this. Is there anything I can do or provide that would help/entice your staff or members of the community to have a look at this and help me debug?
I updated to v2.16.1 today and unfortunately it just happened again: lights not turning on when they should, but being reported as on.
Too bad I didn’t have all the debug logs enabled when it happened. However, my logs are full of error messages anyway.
I’m seeing a lot of these:
10:49:19:354 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
10:49:19:355 delay sending request 26 dt 1 ms to 0x60A423FFFE8A38EA, ep: 0x0B cluster: 0x0008 onAir: 1
10:49:19:355 delay sending request 27 dt 1 ms to 0x60A423FFFE8A38EA, ep: 0x0B cluster: 0x0300 onAir: 1
10:49:19:355 delay sending request 35 dt 0 ms to 0x001788010BF80DB9, ep: 0x0B cluster: 0x0300 onAir: 1
...
10:52:18:559 0x0000000000000000 error APSDE-DATA.confirm: 0xE1 on task
10:52:18:560 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:560 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
10:52:18:562 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:562 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
...
And also a few like this:
17:25:18:567 5 running tasks, wait
17:25:18:572 5 running tasks, wait
17:25:18:578 failed to add task 156560 type: 11, too many tasks
17:25:18:578 failed to add task 156561 type: 6, too many tasks
17:25:18:578 5 running tasks, wait
17:25:18:612 5 running tasks, wait
These all hint at there being too much going on in the Zigbee network. I guess I have quite a few downlights (~50) and motion sensors (~15), but I need them all.
Do you guys have any advice on how to address this?
The 0xE1 (MAC channel access failure), if seen multiple times, hints at radio interference. This usually happens when no USB extension cable is used, or when other nearby devices such as USB3/SSD drives, Bluetooth or WiFi are jamming the 2.4 GHz band. For example, audio/video streaming or a nearby WiFi router can lead to this. In a WiFi scanner app for Android you can check whether the occupied WiFi channels collide with Zigbee (but note that Zigbee and WiFi use very different channel numbers).
The 0xD2 means “broadcast table full”; this happens if too many broadcasts/groupcasts are in flight in a short interval.
But what is the “automation”? Is it still your motion sensor in the toilet? How many receivers does it trigger? Are you using Zigbee groups (and not third-party app groups)?
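To get a feel for how often each of these two errors occurs, the log can be tallied by status code. The following is only a minimal sketch: it assumes the log lines look like the `APSDE-DATA.confirm` excerpts posted in this thread, and the function name `count_confirm_errors` is made up for illustration. Only the two codes explained above are given names; anything else is reported as unknown.

```python
import re
from collections import Counter

# Meanings as explained in this thread; any other code is shown as "unknown".
STATUS_NAMES = {
    "0xD2": "broadcast table full",
    "0xE1": "MAC channel access failure",
}

# Matches lines such as:
# 10:49:19:354 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
CONFIRM_RE = re.compile(r"error APSDE-DATA\.confirm: 0x([0-9A-Fa-f]{2})")


def count_confirm_errors(lines):
    """Count APSDE-DATA.confirm errors per status code (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = CONFIRM_RE.search(line)
        if match:
            counts["0x" + match.group(1).upper()] += 1
    return counts


sample = [
    "10:49:19:354 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task",
    "10:49:19:355 delay sending request 26 dt 1 ms to 0x60A423FFFE8A38EA, ep: 0x0B cluster: 0x0008 onAir: 1",
    "10:52:18:559 0x0000000000000000 error APSDE-DATA.confirm: 0xE1 on task",
]

for code, n in count_confirm_errors(sample).most_common():
    print(code, STATUS_NAMES.get(code, "unknown"), n)
```

Running this over a full day of logs instead of `sample` should show whether 0xE1 (interference) or 0xD2 (too many broadcasts) dominates.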
I haven’t seen this error in the logs since the update
The 0xE1 (MAC channel access failure), if seen multiple times, hints at radio interference. This usually happens when no USB extension cable is used, or when other nearby devices such as USB3/SSD drives, Bluetooth or WiFi are jamming the 2.4 GHz band.
No USB3 devices nearby and I’m using a USB extension cable. Probably have to throw out my Google Nest WiFi mesh which doesn’t allow changing channels though…
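If the WiFi channel cannot be changed, the other option is to see which Zigbee channels stay clear of it. The sketch below uses the standard 2.4 GHz centre frequencies (Zigbee channels 11 to 26 at 2405 + 5 × (ch − 11) MHz, WiFi channels 1 to 13 at 2407 + 5 × ch MHz). The channel widths are assumptions: about 2 MHz for Zigbee and 22 MHz for WiFi, which is on the conservative side. The function names are made up for illustration, and real-world interference also depends on signal strength and distance, so treat the result as a rough guide only.

```python
ZIGBEE_WIDTH_MHZ = 2   # assumed occupied bandwidth of a Zigbee channel
WIFI_WIDTH_MHZ = 22    # assumed (conservative) width of a 2.4 GHz WiFi channel


def zigbee_centre_mhz(channel):
    """Centre frequency of Zigbee channel 11..26."""
    return 2405 + 5 * (channel - 11)


def wifi_centre_mhz(channel):
    """Centre frequency of 2.4 GHz WiFi channel 1..13."""
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel):
    """True if the two channels' assumed frequency ranges intersect."""
    distance = abs(zigbee_centre_mhz(zigbee_channel) - wifi_centre_mhz(wifi_channel))
    return distance < (ZIGBEE_WIDTH_MHZ + WIFI_WIDTH_MHZ) / 2


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given WiFi channels."""
    return [
        z for z in range(11, 27)
        if not any(overlaps(z, w) for w in wifi_channels)
    ]


# WiFi channels reported by a scanner app (example values).
print(clear_zigbee_channels([1, 6, 11]))
```

Replace `[1, 6, 11]` with whatever channels the WiFi scanner app shows for the mesh nodes.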
The 0xD2 means “broadcast table full”; this happens if too many broadcasts/groupcasts are in flight in a short interval.
Wondering what can be done about these 0xD2 errors. I suppose either the broadcast table size needs to be increased (if at all possible) or the number of broadcasts reduced. What influences the number/frequency of broadcasts? Would reducing the number of light groups help for example?
But what is the “automation”? Is it still your motion sensor in the toilet?
Yes, still the same downlight in the toilet. Have the same problem in other rooms too though. Using Zigbee groups everywhere.
I really don’t see how you can fill a broadcast table with a simple sensor and a group of 3 or 4 bulbs (I don’t think you have more in your toilet).
Are you sure it’s not from your automation scripts?
I don’t think the table being full is due to the toilet light specifically, but rather the toilet downlight (and other lights in the house) sometimes fail to turn on because the broadcast table is full from other stuff happening in the house. I have more than 60 Zigbee lights and 27 light groups.
I’m 100% sure that the automation is working fine because the lights always change state to “on” in deCONZ / Home Assistant. Just physically they remain off and then cannot even be turned on manually (without toggling off and then on again) because deCONZ thinks they’re already on.
10:52:18:560 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:560 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
10:52:18:562 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
It could be, for example, the device 0x5C0272FFFE227008.
The tables are empty most of the time, but when you make too many requests they fill faster than they are emptied (it’s a queue), especially if you have one device that delays requests.
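The "filled faster than emptied" effect can be illustrated with a toy model of a bounded queue. This is not how deCONZ or the firmware is actually implemented; the function `simulate`, the capacity and the rates are all made-up values, chosen only to show that a small excess of incoming requests over processed requests is enough to fill the queue and start dropping requests.

```python
from collections import deque


def simulate(arrivals_per_tick, served_per_tick, capacity, ticks):
    """Toy bounded queue: returns (final queue length, dropped requests)."""
    queue = deque()
    dropped = 0
    for _ in range(ticks):
        # New requests arrive; any that do not fit are dropped.
        for _ in range(arrivals_per_tick):
            if len(queue) < capacity:
                queue.append(1)
            else:
                dropped += 1
        # The queue is drained at a fixed rate.
        for _ in range(min(served_per_tick, len(queue))):
            queue.popleft()
    return len(queue), dropped


# Requests are processed faster than they arrive: the queue stays empty.
print(simulate(arrivals_per_tick=2, served_per_tick=3, capacity=10, ticks=100))

# Requests arrive slightly faster than they are processed: the queue fills
# within a few ticks and requests are dropped from then on.
print(simulate(arrivals_per_tick=3, served_per_tick=2, capacity=10, ticks=100))
```

In this model a single slow device corresponds to a lower `served_per_tick`, which has the same effect as sending more requests.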
You have 4 groups, with different devices in each one? And the same issue on all 4? What request are you making on the group?
But you are “lucky”: there is a dev with the same issue as you, so there will probably be some news about this issue.