You manage what you measure

If I think about what I like about Wi-Fi as a networking discipline, I would say it’s how interesting layers 1 and 2 are in our domain. 802.11 is a fascinating protocol to study, and we also get to practice RF engineering. I have been known, from time to time, to tease my datacenter teammates about how “cute” it is that their signals go through these copper wires and how it’s all deterministic and stuff. Must be nice…but I digress.

One of the more contentious areas of debate (see what I did there?) is how we manage our RF space. There’s a contingent that argues static channel and power assignment is the way to go. And then there are vendors with their Radio Resource Management (RRM) algorithms, and some of us do use those. I use RRM, even if sometimes it needs to be slapped upside the head.

(Side note: Because I started my Wi-Fi journey in an Aruba environment, I thought that “RRM” was a Cisco-specific name, especially since Aruba called it “ARM” at the time (and now AirMatch), but it turns out that RRM is really a generic term. I mean, there’s even a Wikipedia entry about it.)

Neither side is wrong – both approaches have their benefits. At the end of the day we’re all trying to accomplish the same thing – we’re trying to provide a great user experience and that requires your RF to be clean. But what does clean really mean? And how clean is clean enough?

When I started my current job I was tasked with choosing 3 Wi-Fi Key Performance Indicators (KPIs – a very “enterprise” sort of thing) to have on a dashboard. What were the 3 metrics that I thought would best represent the Wi-Fi user experience? That was quite a challenge, and one that I don’t feel I’ve fully resolved even 2+ years later. I knew one that I wanted was average client MCS Index, but I didn’t have a way to get it from my mostly-Windows fleet. And I still don’t. I do track and graph average client SNR. It’s not perfect, but it is readily available. (The other two metrics are AP Uptime and average clients per AP, by the way.) So one way I’m managing and judging my RF performance is based on the metrics I have access to, even if they aren’t the right ones.
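To make those three dashboard numbers concrete, here is a minimal sketch of how they could be computed. Everything about the data layout is an assumption for illustration: the `ClientSample` record, the AP names, and the sample values are made up, and a real controller or monitoring platform will expose this data in its own way. The only fixed fact used is that SNR in dB is the received signal level minus the noise floor.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ClientSample:
    """Hypothetical per-client reading, as an AP or controller might report it."""
    ap_name: str
    rssi_dbm: float
    noise_floor_dbm: float


def snr_db(sample: ClientSample) -> float:
    # SNR in dB is signal level minus noise floor (both in dBm).
    return sample.rssi_dbm - sample.noise_floor_dbm


def average_client_snr(samples: list[ClientSample]) -> float:
    return mean(snr_db(s) for s in samples)


def average_clients_per_ap(samples: list[ClientSample], ap_names: list[str]) -> float:
    # Divide by all known APs, so APs with zero clients still count.
    return len(samples) / len(ap_names)


def ap_uptime_percent(ap_is_up: dict[str, bool]) -> float:
    # Share of APs currently up; a real dashboard would likely average over time.
    return 100.0 * sum(ap_is_up.values()) / len(ap_is_up)


# Made-up sample data, purely for illustration.
samples = [
    ClientSample("ap-1", rssi_dbm=-60.0, noise_floor_dbm=-95.0),
    ClientSample("ap-1", rssi_dbm=-70.0, noise_floor_dbm=-95.0),
    ClientSample("ap-2", rssi_dbm=-62.0, noise_floor_dbm=-92.0),
]
ap_is_up = {"ap-1": True, "ap-2": True, "ap-3": True, "ap-4": False}

print("Average client SNR (dB):", average_client_snr(samples))
print("Average clients per AP:", average_clients_per_ap(samples, list(ap_is_up)))
print("AP uptime (%):", ap_uptime_percent(ap_is_up))
```

Note that none of these numbers says anything directly about what a user experienced, which is exactly the gap I keep coming back to.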

So we come to the meat of something that’s been in my head a while: with all the time we spend worrying about, managing, and designing RF, how do we correlate that RF performance to user experience? Are we focusing on the right things? We have a lot of metrics about RF performance, but do they really help us improve the user experience?

If you were expecting an answer to the question, I’m going to have to disappoint you. I don’t have one. I’m really proposing a topic for debate. But here’s why I’m thinking about this out loud: I think we spend a lot of time focusing on RF stats because we can get them, look at them, and understand what they mean. We assume the impact they have on user experience based on our understanding of the protocol and our own experiences, but I don’t think there’s enough data out there to prove those assumptions.

Let’s take channel utilization as an example. It’s a GREAT set of RF metrics. You can look at AP duty cycle and at how much time the channel is in use by other APs and their clients to see the impact of co-channel interference (CCI), and yet no one can give me a data-derived value for what an acceptable level of channel utilization is. I understand that so much of Wi-Fi is more art than science, which is to say that it’s experience based, and so there may be no way to have a universal value.

I don’t want to sound like I “don’t believe in RF tuning” or something crazy like that. RF performance absolutely matters. If you let anyone’s RRM run with out-of-the-box settings you’re going to have a bad day. All the radio stats will look bad, your users will be frustrated, and yes, you absolutely need to adjust things. All those great RF stats will guide you and help you understand what you need to fix. As those numbers get better your users will be happier.

And let’s keep in mind that everyone’s RRM algorithms are designed for the “common case” scenario. The less standard your physical space is – the further you move away from drop-ceiling office land – the more help those algorithms are going to need to achieve a good result. If your environment is completely insane (I’ve got a building like that) even the strongest advocate of RRM might say “I’m just gonna turn that off…”.

But it does make me wonder something. We can spend a lot of time tuning and dialing our RF in to be as close to perfect as we can get, but what is the ROI on that effort? Where is the point of diminishing returns? What does “good enough” mean? And can we define “good enough” in a way that reduces (but perhaps doesn’t eliminate) the need for hand-tuning? Because I’m pretty sure that for a lot of engineers responsible for Wi-Fi in enterprise settings, that sort of manual tuning just doesn’t scale.

This may be mostly a data science problem for the various Wi-Fi vendors. Can they extract enough data from the systems we have to infer what the user experience is and then tie that to the RF metrics they already have? I know that it’s what just about everyone is working on, given the number of analytics platforms I’m seeing these days.

Right now, whether it’s done by an algorithm or by hand, we’re all managing our radios the same way – based on RF parameters whose impact on user experience is difficult to quantify. Sure, I can say “I changed AP foo from channel bar to channel baz and channel utilization decreased by X%”, but what can you tell me about how that change impacted the users? Can you tell me how that improved their experience? Was it disruptive?
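Here is a rough sketch of the comparison I wish I could make routinely. It assumes something I don’t actually have: a per-client “experience score” collected before and after a channel change. That score, the helper names, and all the numbers below are hypothetical and only illustrate the shape of the question, not a real measurement method.

```python
from statistics import mean


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    if before == 0:
        raise ValueError("percent change is undefined when the baseline is 0")
    return 100.0 * (after - before) / before


def summarize_change(before_samples: list[float], after_samples: list[float]) -> dict[str, float]:
    # Compare the averages of two sample windows around a change.
    before_avg = mean(before_samples)
    after_avg = mean(after_samples)
    return {
        "before": before_avg,
        "after": after_avg,
        "percent_change": percent_change(before_avg, after_avg),
    }


# Made-up readings around a channel change on one AP.
channel_utilization_before = [40.0, 50.0, 60.0]   # percent busy
channel_utilization_after = [30.0, 40.0, 50.0]

# Hypothetical user-experience proxy (0-100); no such metric is assumed to exist today.
experience_before = [70.0, 72.0, 74.0]
experience_after = [71.0, 72.0, 73.0]

rf = summarize_change(channel_utilization_before, channel_utilization_after)
ux = summarize_change(experience_before, experience_after)

print("Channel utilization change (%):", rf["percent_change"])
print("Experience score change (%):", ux["percent_change"])
```

In this invented example the RF metric improves while the experience score does not move, which is the kind of result that would make me ask whether the change was worth making.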

I know these are very hard, perhaps almost impossible, questions to answer. But that doesn’t mean they aren’t the right questions to ask. Right now we manage what we can measure, but does that lead to the best results? If we focused more on measuring the user experience, should that data influence how we manage our radios? And if we did that, what would happen? Feel free to share your thoughts on this!
