We’ve blogged a lot over the past few weeks about the technical requirements for Carrier Grade reliability in telecom networks. One recent post discussed the differences between “High Availability” and “Carrier Grade”, while a subsequent article outlined some of the key challenges that make it so hard to implement true six-nines (99.9999%) reliability in a telecom platform.


Given both the technical challenges and the licensing costs associated with delivering Carrier Grade reliability, sooner or later a smart Finance person is going to ask whether the benefits actually outweigh the expense. So it’s interesting to look at some numbers that unequivocally tell us the real question is not “Can you afford to implement Carrier Grade reliability?” but rather “How could you possibly afford not to?”

 

Let’s start with the numbers that no one can dispute. The traditional standard for telecom network reliability, measured at the service level, is six-nines or 99.9999%, which translates to an average of about 32 seconds of downtime per year, per service.


As service providers plan for the progressive deployment of Network Functions Virtualization (NFV) in their networks, they inevitably consider standard enterprise-class virtualization software for their NFV infrastructure. The standard reliability guarantee for such enterprise solutions is three-nines (99.9%), which implies about 526 minutes of downtime per year.
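
For readers who like to check the arithmetic, here’s a quick back-of-envelope sketch (illustrative Python, nothing more) showing how those downtime figures fall out of the availability percentages:

```python
# Convert an availability percentage into expected downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def annual_downtime_minutes(availability_pct):
    """Expected minutes of downtime per year for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

three_nines = annual_downtime_minutes(99.9)     # ~525.6 minutes, i.e. the "526 minutes" above
six_nines = annual_downtime_minutes(99.9999)    # ~0.53 minutes, i.e. roughly 32 seconds

print(f"Three-nines: {three_nines:.0f} minutes of downtime per year")
print(f"Six-nines:   {six_nines * 60:.0f} seconds of downtime per year")
```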


So as a service provider, what will 526 minutes actually cost you, compared to 32 seconds?


I saw a report recently that estimated the cost of downtime as $11,000 per minute, per server. This represents the revenue that’s lost under Service Level Agreements with customers (mostly high-value enterprise users). If each server is down for 526 minutes per year, that’s an annual cost of roughly $5.8 million per server. We’ll call that $6M per server to keep the math easy.


Now, how many servers do you have in your data centers, delivering the services that your customers depend on? If you have 1,000 servers then your total annual cost of downtime is 1,000 x $6M, or 6 Billion Dollars.
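
If you’d like to see that total laid out in one place, here’s the same back-of-envelope math, taking the report’s $11,000-per-minute estimate and our hypothetical 1,000-server fleet as given (assumptions, not measured data):

```python
# Annual SLA-related revenue loss at three-nines availability.
# Assumptions from the post: $11,000 lost per minute of downtime per server,
# ~526 minutes of downtime per year, and a fleet of 1,000 servers.
COST_PER_MINUTE = 11_000        # dollars, per server
DOWNTIME_MINUTES = 525.6        # three-nines: ~526 minutes per year
SERVERS = 1_000

per_server = COST_PER_MINUTE * DOWNTIME_MINUTES   # ~$5.8M per server per year
fleet_total = per_server * SERVERS                # ~$5.8B per year (call it $6B)

print(f"Per server:    ${per_server:,.0f} per year")
print(f"1,000 servers: ${fleet_total:,.0f} per year")
```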


And just to make it worse, that $6B represents only the revenue that’s lost as a result of customer SLAs. It doesn’t include the long-term revenue impact caused by some of those customers switching to other service providers who are promising them the service uptime that they need.


To complete the analysis, the corresponding revenue loss for service outages in a network with true six-nines Carrier Grade infrastructure will be $5,780 per server, for a total of 6 Million Dollars, if we again assume 1,000 servers in the data center. That’s not negligible, but it’s only one-thousandth the cost of the first scenario, and customers are much less likely to switch providers if they experience such low average downtime.
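
And here is the six-nines side of the same calculation, under the same illustrative assumptions:

```python
# Annual SLA-related revenue loss at six-nines availability.
# Same illustrative assumptions as above: $11,000 per minute of downtime per server
# and a fleet of 1,000 servers. Cost scales linearly with unavailability, so these
# figures are exactly one-thousandth of the three-nines numbers.
COST_PER_MINUTE = 11_000
DOWNTIME_MINUTES = 525_600 * 1e-6   # six-nines: ~0.53 minutes (about 32 seconds) per year
SERVERS = 1_000

per_server = COST_PER_MINUTE * DOWNTIME_MINUTES   # ~$5,780 per server per year
fleet_total = per_server * SERVERS                # ~$5.8M per year (call it $6M)

print(f"Per server:    ${per_server:,.0f} per year")
print(f"1,000 servers: ${fleet_total:,.0f} per year")
```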


So, putting ourselves in a service provider’s shoes and assuming our hypothetical 1,000-server installation, the tradeoff is clear. We can base our NFV infrastructure on enterprise-class software designed for IT applications, incurring 6 Billion Dollars in lost revenue because of our SLAs while risking defections from customers who can’t tolerate the downtime. Or we can implement Carrier Grade infrastructure that delivers the reliability customers have been conditioned to expect, in which case downtime should only cost us 6 Million Dollars and we can expect to retain our high-value customers.


What do you think? Are these numbers realistic? We’d be delighted to hear from readers with more visibility into the true cost of downtime.