Lights not responding sporadically (failed-status: APP_BUSY (0x02))

I’ve just captured logs for another occurrence of what I think may be a related, but not necessarily the same, problem. We have one Philips Hue downlight and a Hue motion sensor in our toilet room, and in the instance below the light failed to turn on when the motion sensor detected motion. The motion sensor is not directly linked to the light in deCONZ, but I’m using Node-RED and Home Assistant for additional logic.

I know that the logic in HA and Node-RED is not the problem, as the light changed state to “on” in Home Assistant (as you can see in the screenshot below), but the actual light stayed off - similar to my scene problem described above.

It looks like deCONZ actually reports the light turned on via the websocket, but then there is also an error, which I don’t know how to interpret:

16:39:37:222 Websocket 172.30.32.1:45774 send message: {"e":"changed","id":"47","r":"lights","state":{"alert":null,"bri":255,"colormode":"ct","ct":274,"on":true,"reachable":true},"t":"event","uniqueid":"00:17:88:01:09:b6:17:66-0b"} (ret = 176)
16:39:37:223 Websocket 172.30.32.1:45774 send message: {"e":"changed","id":"38","r":"groups","state":{"all_on":true,"any_on":true},"t":"event"} (ret = 88)
16:39:37:224 0x0000000000000000 error APSDE-DATA.confirm: 0xAE on task

That uniqueid 00:17:88:01:09:b6:17:66-0b is in fact the light that didn’t turn on.

Again, any clues about what might be causing this sporadic problem are highly appreciated.

We can see some “delayed” entries, but I don’t see the same number of requests as in your previous logs.

16:39:37:224 0x0000000000000000 error APSDE-DATA.confirm: 0xAE on task

This error means table full (Zigbee table):

An APSME-BIND.request or APSME.ADD-GROUP.request issued when the binding or group tables, respectively, were full.

But I don’t see requests being spammed in your logs …

If someone with more knowledge has an idea …

I realised that I didn’t capture the APS and APS_L2 logs above. Below is another capture with those included and I hope that can shed a bit more light on what is going on.

In this case a downlight should’ve turned on at 21:51:36:270 but it didn’t.

Below are a few handpicked log entries from the above that look suspicious to me. It seems to be retrying an APS-DATA.request 5 times without success, and we’re seeing status: BUSY and failed-status: APP_BUSY.

21:51:36:272 APS-DATA.request id: 178, status: BUSY (counter: 2)
21:51:36:272 APS-DATA.request id: 178, set state: 0x05
21:51:36:272 0x0000000000000000 error APSDE-DATA.confirm: 0xAE on task
21:51:36:273 APS-DATA.request id: 178, failed-status: APP_BUSY (0x02)
21:51:36:276 aps request id: 175 failed, erase from queue
21:51:36:334 APS-DATA.request id: 183, status: BUSY (counter: 3)
21:51:36:335 APS-DATA.request id: 183, set state: 0x05
21:51:36:335 emit artificial APSDE-DATA.confirm id: 183
21:51:36:335 0x0000000000000000 error APSDE-DATA.confirm: 0xAE on task
21:51:36:337 APS-DATA.request id: 183, failed-status: APP_BUSY (0x02)
21:51:36:356 aps request id: 178 failed, erase from queue
...
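To see at a glance which requests were retried and which ended in a failure, the entries can be grouped by request id. This is only a rough sketch: the regular expression is written against the lines quoted above, and `summarize_requests` is a hypothetical helper name, not part of deCONZ. Other deCONZ versions may format these lines differently.

```python
import re
from collections import defaultdict

# Pattern based on the log lines quoted in this post (an assumption about the format).
REQ_RE = re.compile(r"APS-DATA\.request id: (\d+), (status|failed-status): (\w+)")

def summarize_requests(log_text):
    """Return {request id: [(kind, value), ...]} in log order."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        match = REQ_RE.search(line)
        if match:
            req_id, kind, value = match.groups()
            events[int(req_id)].append((kind, value))
    return dict(events)

sample = """\
21:51:36:272 APS-DATA.request id: 178, status: BUSY (counter: 2)
21:51:36:272 APS-DATA.request id: 178, set state: 0x05
21:51:36:273 APS-DATA.request id: 178, failed-status: APP_BUSY (0x02)
21:51:36:334 APS-DATA.request id: 183, status: BUSY (counter: 3)
21:51:36:337 APS-DATA.request id: 183, failed-status: APP_BUSY (0x02)
"""

summary = summarize_requests(sample)
# Ids of requests that have at least one failed-status entry
failed = sorted(
    req_id
    for req_id, evs in summary.items()
    if any(kind == "failed-status" for kind, _ in evs)
)
print(failed)  # [178, 183]
```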

Any idea what’s happening here and why? Would the “busy” refer to deCONZ being busy or the light?

@de_employees

Just bumping this. Is there anything in the logs that would indicate where the problem lies? Is it something in my setup or configuration, is it potentially a bug in deCONZ or even faulty lights?

Also, seeing that there is a request being retried and then given up on after 5x, would it be possible to increase the number of retries to better the chances of eventual success?

@Mimiix Sorry to be a pest, but I really need to get to the bottom of this. Is there anything I can do or provide that would help/entice your staff or members of the community to have a look at this and help me debug?

There’s nothing I can do if @de_employees fail to reply.

Small update: The upcoming release v2.16.1 contains a fix which addresses this very issue to prevent the queue being too stressed.


Great news, thanks for the update @manup!

I updated to v2.16.1 today and unfortunately it just happened again: lights not turning on when they should, but reporting to be on.

Too bad I didn’t have all the debug logs enabled when it happened. However my logs are full of error messages anyway.

I’m seeing a lot of these:

10:49:19:354 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
10:49:19:355 delay sending request 26 dt 1 ms to 0x60A423FFFE8A38EA, ep: 0x0B cluster: 0x0008 onAir: 1
10:49:19:355 delay sending request 27 dt 1 ms to 0x60A423FFFE8A38EA, ep: 0x0B cluster: 0x0300 onAir: 1
10:49:19:355 delay sending request 35 dt 0 ms to 0x001788010BF80DB9, ep: 0x0B cluster: 0x0300 onAir: 1
...
10:52:18:559 0x0000000000000000 error APSDE-DATA.confirm: 0xE1 on task
10:52:18:560 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:560 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
10:52:18:562 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:562 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
...

And also a few like this:

17:25:18:567 5 running tasks, wait
17:25:18:572 5 running tasks, wait
17:25:18:578 failed to add task 156560 type: 11, too many tasks
17:25:18:578 failed to add task 156561 type: 6, too many tasks
17:25:18:578 5 running tasks, wait
17:25:18:612 5 running tasks, wait

All hinting at there being too much going on in the Zigbee network. I guess I have a few downlights (~50) and motion sensors (~15), but I need them all :smiley:

Do you guys have any advice on how to address this?

The APP_BUSY message doesn’t show up anymore?

The 0xD2 and 0xE1 are two particularly nasty errors.

The 0xE1 (MAC channel access failure), if seen multiple times, hints at radio interference. This usually happens when no USB extension cable is used, or when other nearby devices such as USB3/SSD or Bluetooth / WiFi are jamming the 2.4 GHz channel. For example, audio/video streaming or a nearby WiFi router can lead to this. In a WiFi scanner app for Android you can check whether the occupied WiFi channels collide with Zigbee (but note that Zigbee and WiFi use very different channel numbers).
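Because the channel numbers differ, comparing centre frequencies is the easiest way to check for a collision. The sketch below uses the standard 2.4 GHz centre frequencies (Zigbee channels 11 to 26, WiFi channels 1 to 13). The channel widths of 22 MHz for WiFi and 2 MHz for Zigbee are assumptions for illustration, and the function names are made up for this example.

```python
def zigbee_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Zigbee (IEEE 802.15.4) channel, 11..26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels are 11..26")
    return 2405 + 5 * (channel - 11)

def wifi_center_mhz(channel):
    """Centre frequency of a 2.4 GHz WiFi channel, 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1..13")
    return 2407 + 5 * channel

def overlapping_zigbee_channels(wifi_channel, wifi_width_mhz=22.0, zigbee_width_mhz=2.0):
    """Zigbee channels whose band overlaps the given WiFi channel.

    The widths are assumed values; real signals also leak beyond them.
    """
    wifi_center = wifi_center_mhz(wifi_channel)
    half_span = (wifi_width_mhz + zigbee_width_mhz) / 2
    return [
        z for z in range(11, 27)
        if abs(zigbee_center_mhz(z) - wifi_center) < half_span
    ]

for wifi in (1, 6, 11):
    print(wifi, overlapping_zigbee_channels(wifi))
# 1 [11, 12, 13, 14]
# 6 [16, 17, 18, 19]
# 11 [21, 22, 23, 24]
```

Under these assumptions, a WiFi network on channels 1, 6 and 11 leaves Zigbee channels 15, 20, 25 and 26 outside the overlap.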

The 0xD2 means “broadcast table full”. This happens if too many broadcasts/groupcasts are in flight in a short interval.

Having a lot of 0xE1 is never good, as it means not all requests are coming through.

But what is the “automation”? Is it still your motion sensor in the toilet? How many receivers does it trigger? Are you using Zigbee groups (and not groups from a third-party app)?
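To get a feel for how often each error occurs, the `APSDE-DATA.confirm` lines in a saved log can be tallied. This is a minimal sketch: the meanings in the table are the ones given in this thread, any other code is reported as unknown, and `tally_confirm_errors` and `describe` are hypothetical helper names.

```python
import re
from collections import Counter

# Meanings as described in this thread; codes not discussed here map to "unknown".
MEANINGS = {
    0xAE: "table full",
    0xD2: "broadcast table full",
    0xE1: "MAC channel access failure",
}

CONFIRM_RE = re.compile(r"error APSDE-DATA\.confirm: (0x[0-9A-Fa-f]{2})")

def tally_confirm_errors(log_text):
    """Count APSDE-DATA.confirm error codes found in the log text."""
    counts = Counter()
    for match in CONFIRM_RE.finditer(log_text):
        counts[int(match.group(1), 16)] += 1
    return counts

def describe(counts):
    """Human-readable lines, most frequent code first."""
    return [
        f"0x{code:02X} x{n}: {MEANINGS.get(code, 'unknown')}"
        for code, n in counts.most_common()
    ]

sample = """\
10:49:19:354 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
10:52:18:559 0x0000000000000000 error APSDE-DATA.confirm: 0xE1 on task
09:40:57:066 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
"""

counts = tally_confirm_errors(sample)
for line in describe(counts):
    print(line)
```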


The APP_BUSY message doesn’t show up anymore?

I haven’t seen this error in the logs since the update :+1:

The 0xE1 (MAC channel access failure), if seen multiple times, hints at radio interference. This usually happens when no USB extension cable is used, or when other nearby devices such as USB3/SSD or Bluetooth / WiFi are jamming the 2.4 GHz channel.

No USB3 devices nearby and I’m using a USB extension cable. Probably have to throw out my Google Nest WiFi mesh which doesn’t allow changing channels though… :roll_eyes:

The 0xD2 means “broadcast table full”. This happens if too many broadcasts/groupcasts are in flight in a short interval.

Wondering what can be done about these 0xD2 errors. I suppose either the broadcast table size needs to be increased (if at all possible) or the number of broadcasts reduced. What influences the number/frequency of broadcasts? Would reducing the number of light groups help for example?

But what is the “automation”? Is it still your motion sensor in the toilet?

Yes, still the same downlight in the toilet. Have the same problem in other rooms too though. Using Zigbee groups everywhere.

I really don’t see how you can fill a broadcast table with a simple sensor and a group of 3 or 4 bulbs (I don’t think you have more in your toilet).
Are you sure it’s not from your automation scripts?

I don’t think the table being full is due to the toilet light specifically, but rather the toilet downlight (and other lights in the house) sometimes fail to turn on because the broadcast table is full from other stuff happening in the house. I have more than 60 Zigbee lights and 27 light groups.

I’m 100% sure that the automation is working fine because the lights always change state to “on” in deCONZ / Home Assistant. Just physically they remain off and then cannot even be turned on manually (without toggling off and then on again) because deCONZ thinks they’re already on.

Did you ever find a solution to those pesky 0xD2? I’m seeing the same symptoms in my setup.

I have had the same since a power outage yesterday: none of my deCONZ groups are working. Addressing the devices individually still works.

I also have the table full error. I didn’t find a way to clear the tables.

A log when I press a button to control a group of 1 light:

09:40:56:876 rule event /sensors/25/state/lastupdated: 0 -> 0
09:40:56:992 rule event /sensors/25/state/lastupdated: 0 -> 0
09:40:56:993 trigger rule 32 - Rule TOOGLE_ON
09:40:57:066 0x0000000000000000 error APSDE-DATA.confirm: 0xD2 on task
09:40:57:839 Set sensor check interval to 1000 milliseconds

If it helps, here is also the HTTP request and APS logging:

Does anybody know how to clear those tables, or how to identify on which router the table is full?
I only have 4 groups, and they never work anymore.

Before my power outage (10 days ago) everything worked smoothly. Since then I have not been able to control any group.

I tried recreating the groups, no difference.

10:52:18:560 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1
10:52:18:560 delay sending request 133 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0300 onAir: 1
10:52:18:562 delay sending request 132 dt 0 ms to 0x5C0272FFFE227008, ep: 0x0B cluster: 0x0008 onAir: 1

It could be, for example, the device 0x5C0272FFFE227008.

The tables are normally empty, but when you make too many requests they fill faster than they are emptied (it’s a queue), especially if you have one device that delays requests.

You have 4 groups, each with different devices in it? And the same issue on all 4? What request do you make on the group?

But you are “lucky”: there is a dev with the same issue as you, so there will probably be some news about it.
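The "filled faster than emptied" effect can be illustrated with a toy model of a fixed-size table whose entries expire after a while. This is not how deCONZ or the firmware is implemented. The capacity of 4 and the 9 second lifetime are made-up numbers, and `simulate_table` is a hypothetical name.

```python
def simulate_table(arrival_times_s, capacity, entry_lifetime_s):
    """Return the arrival times rejected because the table was full.

    Toy model: every accepted request occupies one slot until it expires.
    """
    expiry_times = []  # expiry time of each entry currently in the table
    rejected = []
    for t in sorted(arrival_times_s):
        # Free the slots of entries that have expired by time t
        expiry_times = [e for e in expiry_times if e > t]
        if len(expiry_times) >= capacity:
            rejected.append(t)
        else:
            expiry_times.append(t + entry_lifetime_s)
    return rejected

# Six requests in half a second: the table (4 slots, assumed) overflows.
burst = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
print(simulate_table(burst, capacity=4, entry_lifetime_s=9.0))   # [0.4, 0.5]

# The same six requests spread over 15 seconds all fit.
spaced = [0, 3, 6, 9, 12, 15]
print(simulate_table(spaced, capacity=4, entry_lifetime_s=9.0))  # []
```

In this model the number of requests is the same in both cases. Only the burstiness decides whether the table overflows.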