Beyond the Phy - Best Practices for Testing Wi-Fi Routers, APs, and Mesh Systems

A guide for building best-in-class Wi-Fi products in a highly competitive market

Watch our webinar with Matt Langlois covering this same topic here!

As new 802.11 technologies address coverage, efficiency, and security issues, Wi-Fi routers and APs must support a web of configuration and feature options while handling complex applications and networking technologies. Testing and validating performance, stability, and interoperability greatly improves the odds of success for your products and services. In this guide, learn what to test, how to test it, and how to fit it into your development and deployment processes.

The Wi-Fi Situation Today

Wi-Fi is now critical infrastructure

The pandemic of 2020 radically changed the way we live, work, and seek entertainment. Wi-Fi is now considered critical infrastructure for consumers, enterprises, and public services. For end-users, working from home or attending school via web conference has imposed new challenges - and expectations - for the security, performance, and reliability of wireless networks.

Moreover, the number of connected devices in the home has increased significantly in the last 10 years. Parks Associates estimates that the average home will have 20 devices connected over Wi-Fi by 2025. With an explosion in the number of computing platforms, smart devices, and the IoT, consumers are dependent on their Wi-Fi network. They have high expectations for performance, stability, and availability. They want Wi-Fi to be available all the time on any device.

These expectations put additional pressure on operators deploying Wi-Fi gateways, access points, repeaters, and mesh systems. To the end-user, the Wi-Fi connection is the Internet, and when Wi-Fi connectivity suffers, the perception of the end-user is that they have bad broadband service from their provider. Indeed, for smart home devices alone, 40% of consumers reported Wi-Fi problems, with most of them preferring intensive and expensive phone support.

In a time when this connectivity is critical, Wi-Fi performance affects everyone - schools, businesses, providers, and consumers.

Ensuring quality Wi-Fi is difficult

The IEEE 802.11 specification and the series of advancements that have improved upon it over the years are among the most revolutionary technologies of our time. Despite this good work by the networking industry, Wi-Fi connectivity remains a challenge to this day.

It’s technologically challenging

Wi-Fi deployments are challenging for several reasons. Airwaves are a messy, inconsistent, difficult environment. Homes, apartments, and businesses present radically different environments when it comes to walls, coverage size, and neighbor interference. Different countries have different building designs and regulations that require different Wi-Fi solutions to be provided by each supplier, compounding the problem.

In addition, IEEE 802.11 has been around for decades: the working group formed in 1990 and the first standard was published in 1997. It has gone through many changes and amendments, but legacy devices still exist all over the world. Old routers and APs that don’t support new technology are still deployed. Consumer electronics from decades ago, especially gaming consoles, are still in use. Consumers expect Wi-Fi to ‘just work’ with everything they want to use, creating big interoperability challenges for chipset vendors, system integrators, and operators.

Consumers aren’t experts

The Wi-Fi consumers themselves present their own complications to successful Wi-Fi. Without a dedicated IT team or managed service provider serving an enterprise, end-users are on their own to set up, troubleshoot, and maintain their Wi-Fi networks. Network consumers are not experts, and they shouldn’t have to be.

But with access to devices and user interfaces, end-users may make changes to Wi-Fi that are non-ideal in an attempt to maximize service or fix perceived issues. Wi-Fi access products are usually placed in the home for aesthetics rather than function. This is in addition to consumer demand for powerful, fully-featured products at low cost.

The Wi-Fi universe is vast

To meet the most complex environments’ demands, Wi-Fi is designed to be extremely flexible and adaptable. Because of this, however, the number of options and features available to any Wi-Fi deployment is vast, requiring devices to support almost any mode of operation to provide seamless, interoperable connectivity. These options are non-trivial and include:

  • Security features - Wi-Fi supports many security suites - WEP, WPA/2/3, etc. Even WEP (Wired Equivalent Privacy), a thoroughly weak form of Wi-Fi security, can still be found in legacy deployments. This also includes mixed-mode support, creating a complex interoperability scenario.
  • Frequency bands - With today’s advanced wireless technology, Wi-Fi routers and APs can operate in the 2.4, 5, and even 6 GHz (with Wi-Fi 6E) spectrum range. With band steering, 802.11 radios will dynamically select which frequency band will provide the best connectivity, which can be a source of interoperability issues.
  • Channel configuration - Within a given frequency band, Wi-Fi radios will select a given channel for transmission. This involves many options, including channel width in MHz (20, 40, 80, 160, 80+80, etc.) and the number of spatial streams used. This also depends on regional regulations, spectrum availability, and allowed transmission power, making Wi-Fi products very complex devices.

But it doesn’t stop at the radio

Even with all of the complexity of wireless technology itself, Wi-Fi products must correctly implement and handle dozens of different protocols, standards, and advanced features that consumers demand. It is the higher layers that define the networking functionality - and performance - of Wi-Fi products. If any of them do not work correctly or fail in the long term, the end-user is the one who feels the pain.

Only testing the physical layer is not enough

Much emphasis is placed on Wi-Fi physical layer testing as a key indicator of how good a product is. This is because physical layer performance is the metric that many find interesting and use as a differentiator. Wi-Fi performance is the thing that gets published and talked about.

Wi-Fi physical layer testing is very important and must be done. It ensures that all 802.11 physical layer features work as expected and that the hardware can achieve the expected performance numbers in controlled environments. However, so much more goes into a system that simply testing for physical layer performance will not give you a complete picture of the end-user experience. Validation of core device features and functionality, which have just as much impact on end-user experience, is often an afterthought.

Ultimately, Wi-Fi products must be treated holistically. Black box testing of the complete system, as a finished product, is critical to understanding the user experience. Repeatable, consistent, fully automated testing of Wi-Fi products, from the phy to the app layer, will transform that complexity into happy end-users.

What should you test in Wi-Fi products?

The functionality of protocols and features

Functional testing is a term used to describe tests that exercise the features, applications, and networking protocols that a product supports. This includes code and interfaces that are intended to do something specific. Functional testing for Wi-Fi routers and APs should include:

  • Wi-Fi specific connectivity tests - It’s important to test the basic functions of 802.11 connectivity itself and test it in ways that simulate both end-user behavior and high-stress edge cases. Testing Wi-Fi beacon validity, testing band-steering, stress testing frequent association and disassociation, and validating the operation of management protocols like TR-069 or USP will help build a solid product.
  • Wi-Fi Security Modes - As mentioned above, Wi-Fi technology has a lot of features and options. This includes security modes like WPA/WPA2/WPA3-Personal and Enterprise, WEP, mixed modes, various key management and cipher suites, protected management frames (PMF), etc. Test and verify that devices can successfully connect, correctly negotiate modes, and take advantage of the security provided by these protocols. You should also test even very old technologies like WEP, as they still exist in legacy devices and deployments.
  • Core connectivity protocols and applications - Every networking device that operates above layer 2 must implement a plethora of protocols and applications to stay connected to the WAN, maintain the LAN, and allow endpoints to connect and operate as expected. This includes fundamentals like IPv4 and IPv6 on the LAN and WAN; their associated auto-configuration protocols (DHCP, DHCPv6, SLAAC); WAN modes like PPPoE; IPv6 transition and tunneling technologies such as 6rd, DS-Lite, MAP-T, and MAP-E; core protocols like ARP, DNS, and IP Multicast; and application protocols like HTTP/HTTPS, SIP, and IPSec.
  • Advanced features and services - Testing should also focus on the core differentiators of a product that operators and end-users demand. This includes security features like fingerprinting, parental controls, and content filtering, as well as advanced Wi-Fi specific features like band steering and roaming.

Strategies for Functional Testing

Know your device(s)

Catalog and understand the features of the products you are testing. What are the most important features? Which protocols are fundamental to their successful operation or product differentiation?

Select a baseline set of tests

Based on the core functionality of your product, prioritize the things that are most important to you. Choose a set of tests that will give you a good balance of test coverage and the duration of the overall test run. Focus on the basics. While it is important that all features and protocols supported by the device work as expected, basic connectivity protocols are used most frequently and have the biggest impact on user experience. These have to be solid and reliable, all the time, in all configurations.

Investigate & understand failures

After your initial test run, use your tools to investigate and understand any failures that arise. A bug in your product might not cause them; instead, they may be due to a configuration error, a bad state from a previous test, or random packet loss. Networking is complex, so don’t hesitate to ask questions and lean on the experts behind your testing tools for help!

Repeat failed tests to normalize

Automate your testing to run through your test cases multiple times. This will help normalize random failures and give you a better understanding of your product’s stability with respect to feature behavior and performance.
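As a rough illustration of repeating a test to separate random failures from real ones, here is a minimal Python sketch. It assumes a hypothetical `run_test` callable that returns True on a pass; the function names and the 80% threshold are illustrative assumptions, not part of any test tool.

```python
from typing import Callable


def pass_rate(run_test: Callable[[], bool], repeats: int = 10) -> float:
    """Run one test case several times and return the fraction that passed."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    passes = sum(1 for _ in range(repeats) if run_test())
    return passes / repeats


def classify(rate: float, intermittent_floor: float = 0.8) -> str:
    """Label a test by its pass rate (the threshold is an assumption)."""
    if rate == 1.0:
        return "stable"
    if rate >= intermittent_floor:
        # Mostly passing: possibly random loss or leftover state; investigate.
        return "intermittent"
    # Fails often enough to suspect a real defect or a configuration error.
    return "failing"
```

A test that lands in the "intermittent" bucket is a good candidate for a closer look at logs and captures before anyone files a bug.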

Look for deltas

Run your baseline tests over all of the options in the Wi-Fi universe. Run them over different connection types, channels, and bands. Look for differences. While, in theory, the different layers of the networking stack should be independent of each other, this isn’t quite so clear-cut when it comes to actual implementations.
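One way to cover the options systematically is to generate the configuration matrix rather than write it by hand. The sketch below is an assumption-laden example in Python: the option lists are illustrative and should be replaced with what your product actually supports, and `is_valid` encodes only two well-known constraints.

```python
from itertools import product
from typing import List, Tuple

# Illustrative option lists; substitute what your product actually supports.
BANDS = ["2.4GHz", "5GHz", "6GHz"]
WIDTHS_MHZ = [20, 40, 80, 160]
SECURITY = ["WPA2-Personal", "WPA3-Personal", "WPA2/WPA3-mixed"]


def is_valid(band: str, width: int, security: str) -> bool:
    """Drop combinations that should not occur in practice."""
    if band == "2.4GHz" and width > 40:
        return False  # 2.4 GHz has no 80 or 160 MHz channels
    if band == "6GHz" and security != "WPA3-Personal":
        # Wi-Fi 6E certification requires WPA3 in 6 GHz, without WPA2 or mixed mode.
        return False
    return True


def build_matrix() -> List[Tuple[str, int, str]]:
    """Return every valid (band, width, security) tuple for the baseline run."""
    return [c for c in product(BANDS, WIDTHS_MHZ, SECURITY) if is_valid(*c)]
```

Running the same baseline over each tuple from `build_matrix()` makes it easier to see which combination is the odd one out.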

Scale up the number of clients

Lastly, when doing this testing, features that work well with a single client connected through your product may not work correctly with multiple clients. Run your association and disassociation tests, or DHCP assignment, etc., with two, four, sixteen, or even more clients making use of your product’s features all at once. This will illustrate edge cases that may not come to the surface until deployment - causing huge headaches later on.
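A simple way to picture scaling up is to run the same per-client test concurrently at increasing client counts and count the failures at each step. This Python sketch assumes a hypothetical `client_test(client_id)` hook that returns True on a pass; how a client actually associates or requests a DHCP lease depends on your test tool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_scaled(
    client_test: Callable[[int], bool],
    client_counts: Iterable[int] = (1, 2, 4, 16),
) -> Dict[int, int]:
    """For each client count, run client_test concurrently and count failures."""
    failures = {}
    for count in client_counts:
        # One worker per simulated client so that all of them run at once.
        with ThreadPoolExecutor(max_workers=count) as pool:
            results = list(pool.map(client_test, range(count)))
        failures[count] = results.count(False)
    return failures
```

A result such as zero failures at one and two clients but several at sixteen points to an edge case that only appears under load.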

Device performance at the application level

Performance numbers are important test metrics for a variety of reasons. They make it possible to quickly identify regressions and functional issues within a device and deltas across devices, software versions, or configurations. Good performance testing involves validating the entire system, and throughput exercises can be used to do this with the right combinations and procedures.

As with other testing, performance testing works best when a baseline is determined for comparison. Finding a baseline and then using relative numbers will help you determine whether performance suffers under different configurations, or whether repeated, long-term user behavior degrades performance or reveals memory leak or fragmentation problems.

If the environment is noisy, an isolation chamber can help determine your theoretical maximum, but baseline numbers can be determined in any environment. Fixed-rate performance tests (e.g., specific rates like 100, 250, or 500 Mbps) allow you to achieve a known baseline for your comparisons in conditions that are more realistic than a clean environment. They are also good for scaling up throughput step-by-step to discover edge cases in device performance.
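Stepping through fixed rates can be sketched in a few lines of Python. This example assumes a hypothetical `send_at_rate(rate)` hook that offers traffic at the given rate in Mbps and returns the measured loss as a percentage; the 0.1% loss budget is an arbitrary assumption.

```python
from typing import Callable, Iterable, Optional


def highest_clean_rate(
    send_at_rate: Callable[[int], float],
    rates_mbps: Iterable[int] = (100, 250, 500),
    max_loss_pct: float = 0.1,
) -> Optional[int]:
    """Step up through fixed rates; return the last one within the loss budget."""
    best = None
    for rate in sorted(rates_mbps):
        if send_at_rate(rate) > max_loss_pct:
            break  # first rate the device could not sustain; stop stepping up
        best = rate
    return best
```

The returned rate is a usable baseline for later comparisons, and the rate at which the loop stops marks where to look for edge cases.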

Lastly, testing different traffic profiles is extremely important. Often there are interesting deltas between types of traffic that will go unnoticed if you only ever test one traffic type, such as IPv4 or TCP. Test combinations of IPv6, IPv4, TCP, UDP, multicast, etc., and test them over your product’s different lower-layer connections and configurations in all directions (i.e., upstream, downstream, and bi-directionally).
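To make the combinations concrete, the Python sketch below enumerates traffic profiles and picks out the weakest one from a set of results. The profile lists and the shape of `results_mbps` are assumptions for illustration, and comparisons are most meaningful between profiles that share a direction.

```python
from itertools import product
from typing import Dict, List, Tuple

IP_VERSIONS = ("IPv4", "IPv6")
TRANSPORTS = ("TCP", "UDP")
DIRECTIONS = ("downstream", "upstream", "bidirectional")

Profile = Tuple[str, str, str]


def traffic_profiles() -> List[Profile]:
    """Every (IP version, transport, direction) combination to test."""
    return list(product(IP_VERSIONS, TRANSPORTS, DIRECTIONS))


def weakest_profile(results_mbps: Dict[Profile, float]) -> Tuple[Profile, float]:
    """Return the slowest profile and its gap to the fastest, in percent."""
    best = max(results_mbps.values())
    profile = min(results_mbps, key=results_mbps.get)
    gap_pct = (best - results_mbps[profile]) / best * 100.0
    return profile, round(gap_pct, 1)
```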

Strategies for performance testing

Understand your environment

If you have an isolation chamber, use it. If you don’t, it’s still easy to test without one by setting reasonable pass/fail thresholds. Use fixed-rate throughput tests to find a baseline more easily.

Focus on relative numbers and look for deltas

Focus on relative performance using your baseline. While finding max throughput is useful, you want to know if performance changes over time or when running over different configurations.
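Comparing against a baseline comes down to simple arithmetic. Here is a minimal Python sketch, assuming throughput results are stored per configuration in plain dictionaries; the configuration labels and the 10% tolerance are illustrative assumptions.

```python
from typing import Dict


def percent_delta(baseline_mbps: float, measured_mbps: float) -> float:
    """Change relative to the baseline, in percent (negative means slower)."""
    if baseline_mbps <= 0:
        raise ValueError("baseline must be positive")
    return (measured_mbps - baseline_mbps) / baseline_mbps * 100.0


def regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance_pct: float = 10.0,
) -> Dict[str, float]:
    """Configurations whose throughput fell more than tolerance_pct below baseline."""
    flagged = {}
    for config, base in baseline.items():
        if config not in current:
            continue  # nothing measured for this configuration in this run
        delta = percent_delta(base, current[config])
        if delta < -tolerance_pct:
            flagged[config] = round(delta, 1)
    return flagged
```

For example, a baseline of 800 Mbps and a new result of 640 Mbps is a delta of -20%, which this sketch would flag.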

Loop & repeat tests

Run performance testing many times over each configuration. You want to normalize your results to eliminate outliers and make sure all configurations and connection types are covered well.
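When summarizing repeated runs, the median is one reasonable choice because a single bad run cannot pull it far. The Python sketch below is an assumption about how you might summarize results, not a prescribed method.

```python
from statistics import median
from typing import Dict, Sequence


def summarize(samples_mbps: Sequence[float]) -> Dict[str, float]:
    """Median and spread for repeated throughput runs of one configuration."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    mid = median(samples_mbps)
    # Median absolute deviation: a spread measure that one outlier cannot dominate.
    mad = median(abs(s - mid) for s in samples_mbps)
    return {"median": mid, "mad": mad, "runs": len(samples_mbps)}
```

With runs of 500, 510, 495, 120, and 505 Mbps, the median stays at 500 even though one run collapsed to 120.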

Scale up the number of clients

Start with throughput for a single client and then run throughput tests divided among many different clients, all pushing traffic simultaneously. Scale up slowly and find out where your product’s choke-points might be.
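Finding a choke-point can be sketched as looking for the first client count where aggregate throughput falls well below the best seen so far. This Python example assumes results keyed by client count; the 15% drop threshold is an arbitrary assumption.

```python
from typing import Dict, Optional


def find_choke_point(
    aggregate_mbps: Dict[int, float], max_drop_pct: float = 15.0
) -> Optional[int]:
    """Smallest client count whose aggregate throughput drops too far, else None."""
    best = None
    for clients in sorted(aggregate_mbps):
        total = aggregate_mbps[clients]
        if best is not None and total < best * (1 - max_drop_pct / 100.0):
            return clients
        # Track the best aggregate seen at smaller client counts.
        best = total if best is None else max(best, total)
    return None
```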

Device stability over time

The need to test protocols, features, and performance is fairly obvious. What may not be obvious is how protocols and features interact and how those interactions can lead to unexpected or poor behavior and performance in other areas.

Stability testing is where functional and performance testing meet. This testing will expose protocol interactions that lead to performance issues and vice-versa, making it extremely valuable for validating what the end-user will actually experience. This is why it is important to test protocols, features, and performance together and not just in isolation.

Stability testing of a Wi-Fi gateway is done by combining functional and performance tests into a single test run and looping or repeating these tests over long periods of time, looking at both the relative performance numbers described above and changes in pass/fail results for protocol and application tests.

Matt's notes:

"We've seen many instances where performance testing alone is perfect and seemingly stable over long durations - however, if you mix in a few DHCP or association tests, things go to hell after a day or two. Performance testing is the heartbeat or the canary in the coal mine - it is what alerts you to a problem. Functional testing is the trigger for bad behavior. They go together perfectly."

Doing this frequently will expose interesting issues and is critical for regression testing. Memory leaks or memory fragmentation may cause severe performance degradation over time, and stability testing will reveal bugs in design that would be missed without monitoring performance in the presence of long-term user behavior.
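A slow decline of this kind is easier to spot as a trend than by eye. The Python sketch below fits a least-squares line to throughput samples taken once per loop; treating a negative slope as a warning sign is an assumption, and the acceptable slope is something you would choose yourself.

```python
from typing import Sequence


def slope_per_iteration(samples_mbps: Sequence[float]) -> float:
    """Least-squares slope of throughput against loop index (Mbps per loop)."""
    n = len(samples_mbps)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples_mbps) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples_mbps))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

A steady slope of around -10 Mbps per loop across a weekend run is the kind of pattern a leak or fragmentation problem might produce.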

Strategies for stability testing

Mix functional and performance tests

Create test runs that combine your performance tests with the core functional testing that you perform daily. Choose a series of tests that will alternate between protocol interactions and throughput testing throughout the test cycle, simulating rigorous user behavior.
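The alternation described above might be sketched as follows in Python. The test names are hypothetical placeholders, and the ordering policy (one throughput test after each functional test) is just one possible choice.

```python
from itertools import cycle
from typing import List, Sequence


def build_stability_run(
    functional: Sequence[str], performance: Sequence[str], loops: int
) -> List[str]:
    """Alternate functional and performance tests, repeated `loops` times."""
    if not performance:
        raise ValueError("need at least one performance test")
    perf = cycle(performance)
    one_pass = []
    for test in functional:
        one_pass.append(test)
        one_pass.append(next(perf))  # throughput check after each functional step
    return one_pass * loops
```

For example, `build_stability_run(["dhcp_renew", "assoc_cycle"], ["tcp_down"], 100)` yields a 400-step run that checks throughput after every protocol interaction.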

Run at night or over the weekend

Stability testing, by definition, is made for long-term test cycles. Prepare test runs that will loop many times and run over the course of a night or an entire weekend. Remember: you are looking for stability issues that a user might not experience for several days or more but that you can reveal in less time with more rigorous automated testing.

Record full packet captures and test logs

Stability testing reveals issues that are often hard to catch. Make sure you can easily look through logs and packet captures for these tests. When looping these test runs many times, look for deltas in the data. When does performance drop? When do tests fail? What was happening right before it occurred?
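Answering "what was happening right before it occurred?" can be partly automated. This Python sketch assumes the run is recorded as an ordered list of (test name, throughput) pairs, with None as the throughput for functional tests; the record format and the 20% threshold are assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Event = Tuple[str, Optional[float]]


def first_drop(
    events: Sequence[Event],
    baseline_mbps: float,
    max_drop_pct: float = 20.0,
    context: int = 3,
) -> Optional[Tuple[int, List[str]]]:
    """Index of the first throughput sample below the floor, plus the tests before it."""
    floor = baseline_mbps * (1 - max_drop_pct / 100.0)
    for i, (_name, mbps) in enumerate(events):
        if mbps is not None and mbps < floor:
            before = [n for n, _ in events[max(0, i - context):i]]
            return i, before
    return None
```

The returned list of preceding tests tells you which logs and packet captures to open first.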

Do not reboot the Device Under Test (DUT)

Stability tests are all about showing what might happen in the real world. While you might reboot the DUT during your more frequent functional testing to provide a clean state, you want to “shake things up” for stability testing! The purpose of stability testing is to see how stable a device is over longer periods of time. Rebooting defeats this purpose.

Adding these practices into your product cycle

While this may seem like a lot of work for one product, developing modern, high-quality Wi-Fi products is not done by a single team or a single company. There are many people involved in the entire supply chain. Hardware and software developers, QA, system integrators, and operators all touch different parts of the end product, and different teams should focus on different kinds of testing.

During daily product development

Your development teams are relying on quick results as they write, change, and update their codebase. Have development focus on unit testing for protocol and application functionality, with test runs that resolve on a per-commit or per-build basis. Use tools that make it easy to integrate with CI/CD environments with clear pass/fail results for notification and issue tracking.
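Most CI/CD systems decide pass or fail from a process exit code, so a thin wrapper is often all the integration needs. The Python sketch below assumes results arrive as a dictionary of test name to pass/fail; how you obtain them from your test tool is outside its scope.

```python
import sys
from typing import Dict


def exit_code(results: Dict[str, bool]) -> int:
    """Return 0 when every test passed and 1 otherwise, listing failures on stderr."""
    failed = sorted(name for name, passed in results.items() if not passed)
    for name in failed:
        print(f"FAIL: {name}", file=sys.stderr)
    return 1 if failed else 0
```

A build script would typically end with `sys.exit(exit_code(results))` so that the pipeline stops on any failure.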

During quality assurance

After establishing baseline numbers, QA can rely on nightly builds to run performance testing and normalize result data with repeated functional tests. These results help narrow down specific issues and produce logs and captures to share with product development.

During system integration

System integration is interested in longer-term time blocks for testing with complete test coverage. It’s here that performance and stability testing can be run most effectively, adding loops that scale up the number of test clients and combining performance with functional testing to stress the product in the presence of user behavior. Including hardware logs, software logs, and packet captures in results will help others in the supply chain identify issues more quickly and enhance the overall test cycle.

You can find a more detailed overview of how automation and testing fit into the product lifecycle in our companion article.

Learning more about Wi-Fi testing

QA Cafe’s CDRouter is designed specifically for functional, performance, and stability testing all in one solution. Talk to our experts for more information about developing test packages, best practices, and how CDRouter can help you build better Wi-Fi products, faster!