RIM faces angry users

By Bonnie Tubbs, ITWeb telecoms editor.
Johannesburg, 14 Oct 2011

The widely publicised outage suffered by millions of BlackBerry users, spanning five continents this week, has reportedly come to an end, and the smartphone's maker has finally addressed the public about the incident.

BlackBerry maker Research In Motion (RIM) has come under fire this week, following a pervasive failure of its BlackBerry Internet Services (BIS) that started on Monday at around 11am.

In what RIM has confirmed to be its largest outage ever, many of its customers in Europe, the Middle East, Africa, India, Brazil, Chile and Argentina were unable to browse the Web, use instant messaging or access core Internet functions, including e-mail, for the greater part of this week.

While outrage and frustration abounded, RIM failed to tell its customers or the media what had caused the problem or when it expected to resolve it. Already fighting to preserve its share of a burgeoning market, the Canadian smartphone company rapidly lost traction as a result of the outage and its subsequent evasion of culpability.

Too little too late?

Yesterday saw RIM scrambling to execute damage control, beginning with a live message to its customers at around 12pm GMT from RIM founder and co-CEO Mike Lazaridis. Broadcast on its YouTube video channel and Web site, the message expressed a sincere apology to all BlackBerry customers.

“Since launching BlackBerry in 1999 it has been my goal to provide reliable, real-time communications around the world. We did not deliver on that goal this week, not even close. I apologise for the service outages this week. We have let many of you down. You expect better from us and I expect better from us,” said Lazaridis.

At that stage, he added, it was too soon to say the issue was fully resolved.

At 4pm GMT, RIM's executive team, Lazaridis and co-CEO Jim Balsillie, together with the company's CTO of software, David Yach, officially went public in a BlackBerry service update conference call. The trio outlined what had happened and what the company had done to fix it, and responded to media enquiries.

“The systems are up globally,” said RIM, offering a sincere apology to all its customers for its “inability to quickly fix [the problem]”.

Lazaridis said the protracted outage started with a hardware failure on Monday. “[The hardware failure] caused a ripple effect in our system. A dual redundant high-capacity core switch, designed to protect the infrastructure, failed and caused outages and delays... this caused a cascade failure in our system. There was a backup switch, but the backup didn't function as intended and this led to backlog of data in the system. The failure in Europe in turn overloaded systems elsewhere.

“When we restarted the system based in Europe, the data queue processing took much longer than we had expected to restore to our standard service levels.”

When questioned about the tardy response to its customers and the media, Lazaridis hummed and hawed, saying he had decided to make the video (mentioned above) at that juncture, because “[RIM had] made a lot of progress and I could afford the time to actually produce a video”.

Balsillie promptly intervened, offering to “speak for Mike”. He said Lazaridis was in direct command of the teams that were implementing the restoration of services and “nobody has gone home since Monday”.

The conference call did not, however, establish conclusively what the root cause of the outage was. “We don't know why the switch failed in the particular way it did and did not fail over to its redundant pair.”

The full conference call can be accessed at http://www.rim.com/newsroom/service-update.shtml.
