So Google’s Gmail service was down yesterday, apparently due to architectural issues that failed to isolate failed routers correctly. That follows major outages in February and May. Just ask a telco: a few hours of outages per month can very quickly knock a few “nines” off your “five-nines” uptime guarantees.
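To put rough numbers on the "nines," here is a minimal sketch of the arithmetic. It assumes a 30-day month, and the function names are purely illustrative; real SLAs define their measurement windows and exclusions in their own ways.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def downtime_budget_minutes(nines: int, period_hours: float = HOURS_PER_MONTH) -> float:
    """Downtime allowed (in minutes) for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return period_hours * 60 * unavailability


def achieved_availability(outage_hours: float, period_hours: float = HOURS_PER_MONTH) -> float:
    """Fraction of the period the service was up, given total outage hours."""
    return 1 - outage_hours / period_hours


if __name__ == "__main__":
    # Monthly downtime budget for two through five nines.
    for n in range(2, 6):
        print(f"{n} nines: {downtime_budget_minutes(n):.3f} minutes per month")

    # A hypothetical three-hour outage in a single month.
    print(f"3h outage: {achieved_availability(3) * 100:.2f}% available")
```

Under these assumptions, five nines leaves well under a minute of downtime per month, while a single three-hour outage drops that month to roughly 99.6 percent availability.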
Yesterday saw AT&T slammed mightily by The New York Times for frequent outages, dropped calls and data hangs for iPhone users running on its 3G network. The iPhone is drastically changing how consumers use their mobile devices. The story details a user running through a series of typical actions (checking baseball scores, sending a tweet, finding a restaurant, calling up a map for directions), none of which involved a call (or, thankfully, even network-taxing video or music streaming).
The story includes an incredibly open admission by AT&T of the challenges it is facing:
“It’s been a challenging year for us,” said John Donovan, the chief technology officer of AT&T. “Overnight we’re seeing a radical shift in how people are using their phones,” he said. “There’s just no parallel for the demand.”
Success being its own worst enemy is a common theme among popular communications services; Twitter outages, for instance, are quite legendary.
Old-time telcos limited outages through ace engineering, utility-funded network overbuilds and the ability to run the PSTN as a truly private network, which meant no concerns about denial-of-service or other attacks.
Tomorrow’s IP voice services and cloud computing applications don’t have those benefits. And the results are ugly, and likely to get uglier. An SLA (service level agreement) is great, but constantly missing its guarantees (or in the case of mass market services, going down with no guarantees at all) is hardly the way to endear oneself to one’s customers.
Could dependability and uptime be tomorrow’s killer applications? Could be.
Is it more important for your service to be wildly useful, or available? How important, on a relative basis, is uptime to you at a time when providers can’t seem to guarantee the availability of their networks and services? Let us know below.