
Situation

The ZHA network had trouble keeping battery-powered nodes online for a while. The various Xiaomi sensors in particular had issues staying connected and required periodic re-pairing. These are known to have trouble with certain routers, although not all Xiaomi sensors had issues. They also occasionally had issues on the deCONZ network. So initially it seemed to be mainly related to those motion sensors.

I had also been seeing the following log message for a while, but it disappeared over time.

Zigbee channel 24 utilization is 99.52%!

Up to now I was getting away with re-pairing a sensor when needed. It didn’t happen often enough to warrant completely overhauling the network (a migration to Zigbee2MQTT, for example), and I had monitoring set up for devices going offline, so I was usually the first in the family to know when a sensor had issues.

Meanwhile, I had been working for 2 years on migrating from deCONZ to ZHA and I really wanted to finish that at some point. But to do that, the ZHA network needed to be more stable first.

There’s plenty written on the internet about getting a zigbee network stable. A forum post I found particularly useful was this one by Hedda. Reading through it, I thought I had tried everything already. USB2 extension cable anyone? Or the channel overlap diagram between Zigbee and Wifi from Metageek? Over the years the ZHA integration documentation page has been significantly expanded and contains lots of useful information.

[Figure: Zigbee channel overlap]
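The overlap in that diagram can also be worked out from the channel center frequencies. Below is a minimal Python sketch, assuming a 20 MHz wide Wi-Fi channel and a 2 MHz wide Zigbee channel (both widths are simplifications; older 802.11b gear uses 22 MHz, and 40 MHz Wi-Fi covers even more). The function names are my own, not from any library.

```python
# Zigbee (IEEE 802.15.4, 2.4 GHz band): channels 11-26, spaced 5 MHz apart.
# Wi-Fi (2.4 GHz band): channels 1-13, spaced 5 MHz apart.
ZIGBEE_WIDTH_MHZ = 2.0   # assumed occupied bandwidth of a Zigbee channel
WIFI_WIDTH_MHZ = 20.0    # assumed Wi-Fi channel width


def zigbee_center_mhz(channel: int) -> int:
    """Center frequency of a Zigbee channel (11-26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee channel must be between 11 and 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz Wi-Fi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("Wi-Fi channel must be between 1 and 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel: int, wifi_channel: int) -> bool:
    """True if the two channels share spectrum under the assumed widths."""
    distance = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return distance < (ZIGBEE_WIDTH_MHZ + WIFI_WIDTH_MHZ) / 2


def clear_zigbee_channels(wifi_channels):
    """Zigbee channels that overlap none of the given Wi-Fi channels."""
    return [
        z for z in range(11, 27)
        if not any(overlaps(z, w) for w in wifi_channels)
    ]


print(clear_zigbee_channels([1, 6, 11]))
```

This only models your own access points. Neighbors and other 2.4 GHz sources are not included, which is why a measurement on the coordinator itself is more useful.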

The next step would be considering a migration to Zigbee2MQTT, which meant going down a rabbit hole in the world of zigbee firmwares and chipsets. I wasn’t really in the mood for that yet. In fact, in recent years I’ve been trying to reduce complexity. Think for example of phasing out influxdb and grafana because Home Assistant’s energy dashboard and statistics storage have improved so much. So if I have to choose between deCONZ, ZHA and Zigbee2MQTT, ZHA would be my preference.

Moving the Coordinator

In ZHA you can download “Diagnostics”. In the JSON you get, you’ll find an energy scan per zigbee channel. This works much better than setting your own wifi and zigbee channels to not overlap using the diagram above. After all, you also have neighbors and other sources of interference, as was evident from my energy scan. Of course I had already tried putting the USB stick higher in the meter cabinet, as far away as possible from other equipment. But the meter cabinet remained a problem, so it was time to move things from the meter cabinet to the attic.

    "energy_scan": {
      ...
      "20": 99.244260188723507,
      "21": 92.0598007161209,
      "22": 94.48255331375627,
      "23": 92.0598007161209,
      "24": 78.25348754651363,
      "25": 95.26028270288712,
      ...
    },
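To compare channels quickly, the scan can be sorted by energy. The following Python sketch assumes the downloaded diagnostics JSON has the scan under `data.energy_scan` (the same path the REST sensor further down uses). The sample dictionary only contains the six values shown above; the elided channels are left out rather than guessed.

```python
def rank_channels(diagnostics: dict) -> list:
    """Return (channel, energy) pairs, quietest channel first."""
    scan = diagnostics["data"]["energy_scan"]
    pairs = [(int(channel), float(energy)) for channel, energy in scan.items()]
    # Sort on energy only; channels with equal energy keep their original order.
    return sorted(pairs, key=lambda pair: pair[1])


# Sample built from the excerpt above (channels 20-25 only).
sample = {
    "data": {
        "energy_scan": {
            "20": 99.244260188723507,
            "21": 92.0598007161209,
            "22": 94.48255331375627,
            "23": 92.0598007161209,
            "24": 78.25348754651363,
            "25": 95.26028270288712,
        }
    }
}

for channel, energy in rank_channels(sample):
    print(f"channel {channel}: {energy:.1f}%")
```

For a real file you would load it with `json.load()` first and pass the result to `rank_channels`.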

I moved the server to the attic, and the energy scan already looks much better.

I hadn’t used ZHA over-the-air firmware upgrades for my zigbee devices before, since the UI shows a big warning that this requires a stable network. At this point I decided to try it, and it worked really well for 6-7 devices (battery powered)! 1 device needed a couple attempts, but otherwise no issues. It seems we’re making progress!

Side quest: P1 over Ethernet

The only reason why my small server needs to be in the meter cabinet is because it’s connected to the P1 port of the smart meter. Nowadays there are ethernet solutions available for this. For example this one from Marcel Zuidwijk.

It’s 5 euros cheaper than the ESPHome-based version, and in hindsight I probably should have gone for that second one. The first one contains some Chinese software, and I personally prefer open firmware. Anyway, I put it on its own VLAN, which I also use for Chinese robot vacuum cleaners for example, and it works perfectly :).

Measuring Channel Energy Over a Longer Period

After moving the server to the attic, the channel energy values were lower, but they fluctuated. Because of this, it wasn’t immediately clear to me which channel to select. To decide properly, we need to know these energy values over a longer period of time.

With the following YAML configuration you create a REST sensor in Home Assistant that calls its own API to make these energy levels available in a sensor. You can find your config_entry ID in the URL in your web browser when visiting your ZHA integration. You can create the Bearer token in your Home Assistant user profile under “Security” and then “Long-lived access tokens”.

In my environment it seemed like Home Assistant, ZHA and/or the coordinator briefly block while the diagnostics are collected. The following configuration does this every minute, so while it is active it can cause some delay on the network each time it retrieves the diagnostics.


  sensor:
    - platform: rest
      unique_id: zha_energy
      name: zha_energy
      unit_of_measurement: "%"
      timeout: 30
      resource_template: http://ha.home.netk.nl:8123/api/diagnostics/config_entry/b17ea026bbb888d77f3e1ebbb9189cf9
      headers:
        Authorization: >
          Bearer eyJh...[snip]...7dPg
      value_template: >-
        {% set channel = value_json.data.application_state.network_info.channel %}
        {% set energy_scan = value_json.data.energy_scan %}
        {{ energy_scan[channel|string] }}
      json_attributes_path: "$.data.energy_scan"
      json_attributes: ["11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26"]

Then you can display this in Grafana for example. This makes it easy to determine which channel number works well.
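If you prefer numbers over a graph, the collected samples can also be summarized per channel. This is a rough Python sketch, assuming you have exported the sensor attributes as a list of dictionaries (one per measurement, channel number to energy percentage). The sample values are made up for illustration and are not from my network.

```python
from statistics import mean


def summarize(samples):
    """Collect mean and max energy per channel over all samples."""
    per_channel = {}
    for sample in samples:
        for channel, energy in sample.items():
            per_channel.setdefault(int(channel), []).append(float(energy))
    return {
        channel: {"mean": mean(values), "max": max(values)}
        for channel, values in per_channel.items()
    }


def quietest(samples, key="max"):
    """Channel with the lowest mean or max energy (ties: lowest channel number)."""
    summary = summarize(samples)
    return min(summary, key=lambda channel: (summary[channel][key], channel))


# Hypothetical measurements: channel 20 is usually quiet but has one burst.
samples = [
    {"15": 20.0, "20": 5.0},
    {"15": 22.0, "20": 40.0},
    {"15": 21.0, "20": 3.0},
]

print("lowest mean:", quietest(samples, key="mean"))
print("lowest max:", quietest(samples, key="max"))
```

Looking at both values is deliberate: a channel with a low average can still have short bursts of interference, and those bursts are exactly what a single energy scan misses.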

Adding a Router

Oh, but I have enough routers… Right? Every mains-powered node acts as a repeater for your mesh. Yet one router is not the same as another. In the network overview in ZHA, many red lines were visible. This was especially the case with the IKEA Trådfri LED strip drivers.

By now, a very popular device has entered the market from SMLight, the SLZB-06M. My main reason for using it is the signal strength it is capable of. If this YouTuber can cover 80 meters with it, it should work well in my home too, right?

I’m a huge fan of this device. It can be powered by USB-C or POE, is manufactured in Ukraine, costs only 35 euros on Aliexpress, and contains very good firmware (Home Assistant support out-of-the-box).

On the hardware side, it uses an ESP32, has optional support for ESPHome if you want, and the Zigbee chip is the same one used in the Sky Connect / ZBT-1 (Silicon Labs EFR32MG21). You can flash your own firmware of choice on the Zigbee chip, or choose from the latest router or coordinator firmware from Silicon Labs. Even Matter-over-Thread support is available. It’s also easy to migrate from the ZBT-1 to the SLZB-06M with full support from Home Assistant.

This device comes in several other variations, like the SLZB-06 (without the M), which contains a Texas Instruments CC2652P chipset that has been well supported in Zigbee2MQTT for years. Since early 2025, Zigbee2MQTT has also promoted support for the Silicon Labs chipset from experimental to officially supported.

Enable Source Routing

ZHA uses something called Table Routing to determine routes in the network. This relies on router devices in your network to maintain a table of nodes they can see. Since it’s generally known that not every manufacturer implements this the same way and some simply don’t work well, it’s basically asking for trouble. Combining old Hue lamps with Xiaomi battery-powered sensors? Guaranteed issues.

The alternative is Source Routing (this is also used by default in Zigbee2MQTT). Here, the coordinator determines the route through the network. This also results in less broadcast traffic.
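The difference between the two approaches can be illustrated with a toy model in Python. This is only a sketch of the idea, not how ZHA, zigpy or any Zigbee stack implements routing: the node names and link map are hypothetical, and a real coordinator learns routes from the network rather than from a ready-made graph.

```python
from collections import deque


def table_route(tables, start, target, max_hops=10):
    """Table routing: every node only knows the next hop toward the target."""
    path = [start]
    node = start
    while node != target:
        node = tables.get(node, {}).get(target)
        if node is None or len(path) > max_hops:
            return None  # missing or looping table entry: delivery fails
        path.append(node)
    return path


def source_route(links, start, target):
    """Source routing: the sender works out the full path up front (BFS here)."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in sorted(links.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical network: the sensor is only reachable through router_a.
links = {
    "coordinator": {"router_a", "router_b"},
    "router_a": {"coordinator", "sensor"},
    "router_b": {"coordinator"},
    "sensor": {"router_a"},
}

# A stale table: the coordinator still points at router_b, which has no entry.
stale_tables = {"coordinator": {"sensor": "router_b"}, "router_b": {}}

print(table_route(stale_tables, "coordinator", "sensor"))
print(source_route(links, "coordinator", "sensor"))
```

In the toy model, one router with a bad table is enough to break delivery, while the source-routed path does not depend on what the intermediate routers remember.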

Cem Basoglu wrote a fantastic article about this, and enabling it is super simple. Also, little can go wrong: If the network doesn’t stabilize after a few hours and re-pairing difficult sensors doesn’t work, you just turn it off again.

  zha:
    zigpy_config:
      source_routing: true

After enabling it, it can take several hours before routers stop updating their tables and the network really starts relying on Source Routing. So expect nodes to become unreachable for a while, or to sometimes require re-pairing before they work properly on the network again.

Zigbee traffic sniffing

Sensors still kept falling off the network occasionally. Sometimes it was just an empty battery :facepalm:, but often it was the same problematic nodes that kept dropping off.

That same Cem wrote another fantastic article about how to sniff network traffic, allowing you to see exactly where things go wrong. By purchasing another SMLight SLZB-06M, installing ember-zli (see Cem’s article and the project’s Readme), entering some encryption keys from my network in Wireshark, I quickly had Wireshark seeing zigbee traffic on my network.

I took a Xiaomi sensor that Home Assistant had marked as unavailable. Instead of re-pairing it as usual, I checked with Wireshark what was going on. (The node’s address can be found in Home Assistant under the device as NWK address). First, I wanted to see if I could spot anything on the network by waking up the sensor. It was a vibration sensor, but tapping it did nothing. Briefly pressing the button did spawn some traffic on the network.
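To narrow the capture down to one node, a Wireshark display filter on the NWK address helps. The small Python helper below builds such a filter. It assumes the `zbee_nwk.src` and `zbee_nwk.dst` field names of Wireshark's Zigbee network layer dissector, and the address `0x1A2B` is a made-up example, not one of my nodes.

```python
def nwk_filter(nwk_address) -> str:
    """Build a Wireshark display filter for traffic from or to one NWK address.

    Accepts an int or a hex string such as "0x1A2B" (the way Home Assistant
    shows the NWK address on the device page).
    """
    if isinstance(nwk_address, str):
        address = int(nwk_address, 16)
    else:
        address = int(nwk_address)
    if not 0 <= address <= 0xFFFF:
        raise ValueError("NWK addresses are 16-bit values")
    return f"zbee_nwk.src == 0x{address:04x} || zbee_nwk.dst == 0x{address:04x}"


# Hypothetical address, replace with the NWK address of your own device.
print(nwk_filter("0x1A2B"))
```

Paste the printed string into Wireshark's display filter bar. Note that this matches the network-layer source and destination only, so a reply sent by another node to a third party will not show up.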

Ok, first packet is a Rejoin request. Cool. However, there’s another node on the network that clearly isn’t friends with that sensor anymore… Leave??? Come on, be nice…

That other node turns out to be an Innr E14 bulb… on the other side of the house?? Uhm.. that node did have issues before, and restarting it (disconnected power to the bulb) has helped more than once to get other sensors back. So, I restarted the bulb, pressed the button on the Xiaomi vibration sensor again (so, no resetting), and lo and behold!

This time the other SLZB-06M (which acts as a router in the network) responds to the Rejoin request, and the sensor manages to rejoin the network without a reset / re-pair!

Nice, so this network sniffing is useful in finding problematic routers in the network. I made a mental note to replace that light bulb in the future.

Current situation and next steps

I now have 2 SLZB-06Ms in my house. One as a router, and the other as a sniffer. Eventually I want to migrate the Sky Connect / ZBT-1 that currently functions as coordinator to one of these, because they’re so much easier to place around the house with POE. I’m also going to replace problematic routers with better routers (that Innr lamp), because I currently think this is the main cause of sensors disappearing now and then. Additionally, there are sometimes sensors that Home Assistant marks as unavailable, but are actually still working: These are moisture and motion sensors that reach Home Assistant fine when detecting something and come back online. I’m not sure yet what to do with these.

At this point, however, I’ve learned something interesting, the network is stable enough, and there are clear diagnostic steps to take when problems occur. For now, this is a solved problem.
