For many telco operators, testing can seem like an onerous requirement. It’s often costly and time consuming, and as telecom networks grow more complex and customer use cases and devices become increasingly fragmented, verifying service with any level of confidence is harder than ever.
Because of this high degree of complexity, testers need to achieve higher test coverage than ever before in order to maintain network quality, which has led to the relatively widespread adoption of end-to-end testing in the industry. Rather than testing voice protocols and Wi-Fi connectivity from a handful of user devices, testers are walking through entire systems and subsystems in the ways that users are likely to do. This has the potential to improve testing quality and coverage, resulting in improved quality of service, but it also presents some questions: in an era of increasing use case complexity, how can end-to-end testing be performed in an efficient and scalable manner?
Is end-to-end testing enough to meet service verification needs in the modern era, or are there additional steps that businesses could be taking to better address latent service issues?
The Case for End-to-End Testing
Let’s back up a step: end-to-end testing is a fairly common buzzword, but what does it actually mean? In software development, end-to-end refers to verifying not just the functionality within the app or program itself, but also how it works with the apps, programs, and interfaces that it ultimately interacts with.
From a telecom perspective, this makes an intuitive kind of sense: an end user on his or her smartphone will variously access 4G, for instance, not in an abstract way, but using particular applications. Perceived service quality will depend on how well those apps function.
Even if your 4G service, or your VoIP, or mobile broadband, or anything else that you might be offering as a service provider is ostensibly “strong,” a bug or malfunction that occurs somewhere between the cell tower and Venmo can still convince a user that your network is low quality.
So, in one sense, end-to-end is a no-brainer for telcos. On the other hand, it does necessarily increase the scope of your functional testing. Your typical test engineer can only get through half a dozen use cases per day on average, which means that a test framework designed around developing a large number of use cases for verification is going to be so time and labor intensive as to seem utterly overwhelming.
This tends to lead testers away from true end-to-end testing and into more efficient forms of service verification that can be scaled up more easily. Sure, it’s possible to automate an end-to-end framework in such a way as to maintain integrity while improving your test coverage.
But in many areas it’s more common to see simulated tests, or tests on rooted devices standing in for real end-user devices, resulting in a disconnect between lab conditions and actual network usage.
If end-to-end testing stretches from the first moment of network usage for a given use case (e.g. logging into one’s email app while roaming on a 3G network) to the last (sending the email and getting confirmation of the send), it’s worth asking what, if anything, remains to be tested.
If the potential limitation of end-to-end testing that we sketched out above (i.e. that it’s too labor intensive) is the only issue that testers are likely to come across, then it seems like end-to-end per se, as opposed to the difficulty of truly implementing end-to-end frameworks, isn’t the problem. That’s certainly true to some extent; at the same time, however, it is possible to improve test coverage beyond end-to-end and gain value in the process.
Let’s look at a real-life example: several years ago, Google experimented with ways to reduce latency times for its users. It began with an end-to-end approach, redirecting users to whichever server was offering the lowest latency times at any given moment.
Strangely enough, they found that this approach didn’t yield the lowest possible latencies—in part because the queueing process for server allocation was causing slowdowns and in part because inflated latency times turned out to correlate more closely with particular interactions between the nodes than the conditions at the end points in particular.
Here, we see an example of what going beyond end-to-end could look like. Sure, this example isn’t squarely within the telecom domain, but it’s easy enough to imagine an equivalent case for a telco operator: in verifying voice functionality, for instance, you might look beyond the end-points (i.e. the two users having a phone call) and analyze signaling patterns, etc.
Many systems will continue to function even when one or more elements fail or run inefficiently, and this type of testing enables you to catch the kinds of failures or bugs that aren’t immediately impacting functionality (but carry some latent risk).
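To make the distinction concrete, here is a minimal sketch of what checking beyond the end points might look like for a single voice test. Everything in it is an assumption made for illustration: the `CallTestResult` record, the node names, the thresholds, and the alarm text are hypothetical and not taken from any particular test tool. The idea is simply that the end-to-end verdict and the latent findings are computed separately, so a call can pass for the two users while still flagging a slow intermediate hop.

```python
from dataclasses import dataclass, field


@dataclass
class CallTestResult:
    """Outcome of one hypothetical voice test between two end points."""
    call_connected: bool          # what the two users experience
    setup_time_ms: float          # call setup time measured at the end points
    hop_latency_ms: dict = field(default_factory=dict)  # per-hop signaling delay
    alarms: list = field(default_factory=list)          # alarms raised during the test


def end_to_end_verdict(result, max_setup_ms=3000.0):
    """Pass/fail based only on end-point observations (threshold is assumed)."""
    return result.call_connected and result.setup_time_ms <= max_setup_ms


def latent_findings(result, max_hop_ms=150.0):
    """Issues that do not fail the call but may carry latent risk."""
    findings = []
    for hop, latency in sorted(result.hop_latency_ms.items()):
        if latency > max_hop_ms:
            findings.append(f"slow hop {hop}: {latency:.0f} ms")
    for alarm in result.alarms:
        findings.append(f"alarm: {alarm}")
    return findings


# Hypothetical data: the call succeeds, but one intermediate hop is slow.
result = CallTestResult(
    call_connected=True,
    setup_time_ms=2100.0,
    hop_latency_ms={
        "node_a->node_b": 40.0,
        "node_b->node_c": 310.0,
        "node_c->node_d": 55.0,
    },
    alarms=["link-degraded"],
)

print(end_to_end_verdict(result))  # the end-to-end check passes
print(latent_findings(result))     # yet two latent issues are reported
```

In this made-up run the end-to-end check passes, while the per-hop view surfaces a slow hop and an alarm that the end points alone would never have revealed.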
At this point you might be thinking: “That’s all well and good, but test budgets are stretched thin as they are, and going beyond end-to-end hardly seems practical under current market conditions. Verifying service with current test coverage levels really has to cut it for now if I want a positive ROI.”
This is a reasonable objection. But let’s take it one step further: you can’t really stick with manual testing at all if you want a positive ROI, because reaching a critical mass of test coverage is becoming increasingly time consuming. If you want end-to-end testing, you need to automate your testing framework.
By automating your tests (using out-of-the-box devices), you can increase the number of use cases verified each day from dozens to hundreds, resulting in better test coverage and reduced costs in the form of reduced person-hours. In this way, moving beyond end-to-end starts to become feasible: since tests can be performed quickly, you can essentially assume that your ROI goes up as your coverage improves (since better coverage should continue to improve your quality-of-service).
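The throughput argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the suite size of 1,200 use cases and the automated rate of 300 per day are assumed figures (the latter standing in for "hundreds"), while the manual rate of 6 per day reflects the "half a dozen" per engineer mentioned earlier.

```python
import math


def days_to_cover(use_cases, cases_per_day):
    """Working days needed to run every use case once, rounded up."""
    return math.ceil(use_cases / cases_per_day)


SUITE_SIZE = 1200      # assumed number of use cases in the test suite
MANUAL_RATE = 6        # use cases per day for one engineer ("half a dozen")
AUTOMATED_RATE = 300   # assumed automated rate, standing in for "hundreds"

manual_days = days_to_cover(SUITE_SIZE, MANUAL_RATE)
automated_days = days_to_cover(SUITE_SIZE, AUTOMATED_RATE)

print(f"manual: {manual_days} days, automated: {automated_days} days")
```

Under these assumptions a single pass over the suite drops from 200 working days to 4, which is the kind of headroom that makes additional checks on signaling, alarms, and logs affordable.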
Thus, you put yourself in a position in which it’s feasible to analyze not just end-point results but signaling, alarms, logs, etc. in order to gain an even better command of what’s happening on your network.
Not only does this improve your odds of uncovering any latent bugs, it puts you in a position to create granular documentation that will cover not just test cases but mission-critical data about the health of your services. In this way, testing beyond end-to-end becomes a strategic investment in short- and long-term quality-of-service improvements.