The core of the system on the software side is a (Python) library that makes it easy to publish and find attributes by name and type.  If you have a widget with a couple of switches, a light, and a dial, you simply publish four values to the home-automation network: two read-only booleans (the switches), a write-only boolean (the light), and a read-only real (the dial).  By convention, these are named in a path hierarchy, like "widget/switch1", "widget/dial", and so on.
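To make the naming and typing idea concrete, here is a toy sketch of such a registry.  It is not the library's actual API: the Attribute class, the find() helper, and the names "widget/switch2" and "widget/light" are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    path: str        # hierarchical name, e.g. "widget/dial"
    kind: type       # bool for the switches and the light, float for the dial
    readable: bool
    writable: bool


# The four values the example widget would publish (names partly hypothetical).
published = [
    Attribute("widget/switch1", bool, readable=True, writable=False),
    Attribute("widget/switch2", bool, readable=True, writable=False),
    Attribute("widget/light", bool, readable=False, writable=True),
    Attribute("widget/dial", float, readable=True, writable=False),
]


def find(prefix="", kind=None):
    """Return published paths under `prefix`, optionally filtered by value type."""
    return [
        a.path
        for a in published
        if a.path.startswith(prefix) and (kind is None or a.kind is kind)
    ]


print(find("widget/", kind=bool))
# ['widget/switch1', 'widget/switch2', 'widget/light']
```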

The library takes care of all the dynamic aspects.  When someone turns the dial, your (driver) code just calls dial.set_value(v), and that's it.  If nobody is currently listening to the dial, nothing happens.  If one or more tasks anywhere on the network are listening (whether in the same application, or on a different machine entirely), they are notified (once!) that the value has changed.  When and if they care to know, they ask for the latest value and get it.  This costs a little extra latency (on a scale irrelevant to humans) but is wildly more efficient than the usual approaches in terms of total bandwidth, flow management, and all that practical crud that usually makes this stuff a headache.  And again, all of this happens in the library--neither end has to worry about it.
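The "notified once, then ask for the latest" behavior can be sketched in a few lines.  This is only a single-process toy under my own assumptions, not the library's implementation: the PublishedValue and Listener classes, subscribe(), read(), and the notifications_sent counter (standing in for network messages) are all hypothetical names.

```python
class Listener:
    """One subscriber's view of a value: just a 'something changed' flag."""

    def __init__(self):
        self.pending = False


class PublishedValue:
    """Toy stand-in for a published attribute; not the real library's class."""

    def __init__(self, initial=None):
        self._value = initial
        self._listeners = []
        self.notifications_sent = 0  # counts simulated network messages

    def subscribe(self):
        listener = Listener()
        self._listeners.append(listener)
        return listener

    def set_value(self, value):
        # Always remember the newest value; older ones are simply overwritten.
        self._value = value
        for listener in self._listeners:
            if not listener.pending:
                # First change since this listener last read: notify it once.
                listener.pending = True
                self.notifications_sent += 1

    def read(self, listener):
        # The listener asks for the latest value whenever it cares to.
        listener.pending = False
        return self._value


dial = PublishedValue(0)
dial.set_value(1)             # nobody listening: no notification at all
listener = dial.subscribe()
for v in (2, 3, 4):
    dial.set_value(v)         # three changes, but only one notification
print(dial.notifications_sent, dial.read(listener))  # prints: 1 4
```

The point of the pending flag is that the sender never queues values: a slow listener costs one notification per read, no matter how fast the dial is turning.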

If you wanted that widget to control your bedroom lamp brightness, the code would look like this--and you could run this code on any machine on the network, regardless of where the code managing the widget or lamp is running (but it could all be in the same process just as easily):

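# (runs inside a coroutine, since it uses "async for" and "await")
lamp = ha.find("master_bedroom/north_lamp")
dial = ha.find("widget/dial")
async for val in dial:             # wait for the next dial change
    await lamp.send_value(val)     # pass the latest value on to the lamp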

That for loop runs forever, quietly waiting for the dial to be turned, and then sets the lamp accordingly.  If the dial generates five values in the time it takes the lamp to receive and acknowledge the value, four of them are never even sent over the network (because they went stale before anyone needed them!).  And to be clear, the "async for" automatically sets up a subscription to the dial's value stream in the background, and if you exit the for loop, the subscription is closed.
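Here is a minimal, single-process sketch of how an async iterator can drop stale values like that.  It is an illustration under my own assumptions, not the library's code: the Dial class and the slow_lamp() consumer are hypothetical, there is no network involved, only one consumer is supported, and closing the subscription when the loop exits is not modeled.

```python
import asyncio


class Dial:
    """Toy value source whose async iterator only ever yields the latest value."""

    def __init__(self):
        self._value = None
        self._changed = None  # asyncio.Event, created when iteration starts

    def set_value(self, value):
        self._value = value   # overwrite: older unread values are simply lost
        if self._changed is not None:
            self._changed.set()

    async def __aiter__(self):
        # Starting the "async for" sets up the (single) subscription.
        self._changed = asyncio.Event()
        while True:
            await self._changed.wait()
            self._changed.clear()
            yield self._value


async def demo():
    dial = Dial()
    received = []

    async def slow_lamp():
        async for val in dial:
            received.append(val)
            await asyncio.sleep(0.05)  # pretend the lamp is slow to acknowledge
            if val == 10:
                break

    task = asyncio.create_task(slow_lamp())
    await asyncio.sleep(0)             # let the consumer start and subscribe
    dial.set_value(1)
    await asyncio.sleep(0.01)          # lamp picks up 1 and starts its slow write
    for v in (2, 3, 4, 5, 10):
        dial.set_value(v)              # burst of five values while the lamp is busy
    await task
    return received


received = asyncio.run(demo())
print(received)  # [1, 10]
```

Of the five values in the burst, only the last one ever reaches the lamp; the other four were overwritten before the consumer came back for more.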

Note it doesn't matter here whether the lamp or dial are on Z-Wave, Shelly, custom, or whatever.  The above works even if the lamp or dial doesn't exist yet at the time the code is run: the async for will wait for the dial to come online.  If you smashed the (Z-Wave, say) widget and replaced it with a new custom device with new code on the driver side, the above loop doesn't even need to be restarted -- it will simply wait while you're changing the devices, and resume when the (new) widget/dial comes online.
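The "look it up by name, wait until it exists" idea can be sketched like this.  Again, this is a toy under stated assumptions rather than the library's mechanism: the Registry class and its publish(), withdraw(), and wait_for() methods are hypothetical, and the "drivers" are just strings.

```python
import asyncio


class Registry:
    """Toy name registry: lookups by path work even before the device exists."""

    def __init__(self):
        self._online = {}  # path -> asyncio.Event, set while a provider is online
        self._values = {}  # path -> current provider object

    def _event(self, path):
        return self._online.setdefault(path, asyncio.Event())

    def publish(self, path, provider):
        self._values[path] = provider
        self._event(path).set()

    def withdraw(self, path):
        self._values.pop(path, None)
        self._event(path).clear()

    async def wait_for(self, path):
        # Blocks until something is published under this name.
        await self._event(path).wait()
        return self._values[path]


async def demo():
    registry = Registry()
    # The consumer starts first, before any "widget/dial" exists.
    waiter = asyncio.create_task(registry.wait_for("widget/dial"))
    await asyncio.sleep(0)
    still_waiting = not waiter.done()
    registry.publish("widget/dial", "Z-Wave dial driver")
    first = await waiter
    # Swap the hardware: the old driver goes away, a new one takes over the name.
    registry.withdraw("widget/dial")
    registry.publish("widget/dial", "custom dial driver")
    second = await registry.wait_for("widget/dial")
    return still_waiting, first, second


result = asyncio.run(demo())
print(result)  # (True, 'Z-Wave dial driver', 'custom dial driver')
```

Because the consumer only ever holds the name, never a reference to a specific device, replacing the device is invisible to it apart from the pause.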

Of course the interface to the dynamic values is much richer than just async iterators, but it's all pretty straightforward and obvious like that.

It would be very easy to add a nice GUI layer for a code-free experience, such as being able to simply visually connect the "widget/dial" port to the "master_bedroom/north_lamp" port via a "wire" or whatever.  And that could run as a separate process with no need to modify any of the existing code (it wouldn't even require restarting any processes besides the GUI itself).  But so far it's been so easy to code anything I've wanted that I haven't needed that personally.

I did write a "monitor" which is like a hierarchical file browser for the entire HA network.  It's less than 1000 lines of Python code and provides a curses-based interface which updates in real time, as well as letting you set/change any settable values, graph historical plots, and so on.  This is a totally generic browser, not specific to my particular network or devices.  Here's a snapshot of mine right now with a few things randomly opened (the ">" marks on the left show where the hierarchy can be opened).  Note this is just an administrative tool, not really intended to be a primary UI,...
