Can Google charge for a service it mostly gives away — and that doesn't always work? That's the experiment it's conducting with Google Apps. Gmail, the email service at the heart of Google Apps, went down three times earlier this month, and Google has sent a note to customers who pay for its "Premier Edition" — typically colleges and small businesses. As Fortune notes, Google hasn't had much success breaking into the large business accounts where Microsoft rules. The tone of Google's apology speaks volumes. It's mostly apologetic, but there are overtones of Stanford-comp-sci huffiness:
As is typical of things associated with Google, these outages were the subject of much public commentary.
What are the Googlers trying to say here? The implication is that Googlers would be happier if we all didn't talk about their services going down. It's a typical geek reaction ("Stop yelling at me long enough for me to fix the problem!"), but it's downright offensive to paying customers. And there's the problem. Google's not used to having paying customers, other than advertisers, on whom it relies for 99 percent of its revenues. Its business model has long been to provide services for free to Web users, and pay the bills with ads. The letter to Google Apps users goes on to claim:
While we're passionate about excellence, we can't promise you a future that's completely free of system interruptions.
The latter part's a truism — systems inevitably go down. Ex-Googler Sergey Solyanik disagrees with that first part. In a blog post explaining why he returned from Google to Microsoft, the development manager wrote:
Google as an organization is not geared — culturally — to delivering enterprise class reliability to its user applications. The culture part is very important here — you can spend more time fixing bugs, you can introduce processes to improve things, but it is very, very hard to change the culture. And the culture at Google values "coolness" tremendously, and the quality of service not as much. At least in the places where I worked.
Which is exactly why Google would just as soon not have "public commentary" about its outages. They don't just suggest Googlers aren't as supremely competent as they'd like you to think. They don't just suggest Google's Web-computing infrastructure isn't as world-changing as the pundits think. The breakdowns raise larger questions about Google's culture, business strategy, and management. And Google can't afford those questions. The sniffy mea culpa:
From: Google Apps Team
Date: Wed, Aug 27, 2008 at 6:44 PM
Subject: August SLA Credit for Google Apps Premier Customers

We're committed to making Google Apps Premier Edition a service on which your organization can depend. During the first half of August, we didn't do this as well as we should have. We had three outages - on August 6, August 11, and August 15. The August 11 outage was experienced by nearly all Google Apps Premier users while the August 6 and 15 outages were minor and affected a very small number of Google Apps Premier users. As is typical of things associated with Google, these outages were the subject of much public commentary.

Through this note, we want to assure you that system reliability is a top priority at Google. When outages occur, Google engineers around the world are immediately mobilized to resolve the issue. We made mistakes in August, and we're sorry. While we're passionate about excellence, we can't promise you a future that's completely free of system interruptions. Instead, we promise you rapid resolution of any production problem; and more importantly, we promise you focused discipline on preventing recurrence of the same problem.

Given the production incidents that occurred in August, we'll be extending the full SLA credit to all Google Apps Premier customers for the month of August, which represents a 15-day extension of your service. SLA credits will be applied to the new service term for accounts with a renewal order pending. This credit will be applied to your account automatically so there's no action needed on your part.

We've also heard your guidance around the need for better communication when outages occur. Here are three things that we're doing to make things better:

We're building a dashboard to provide you with system status information. This dashboard, which we aim to make available in a few months, will enable us to share the following information during an outage:

A description of the problem, with emphasis on user impact. Our belief is during the course of an outage, we should be singularly focused on solving the problem. Solving production problems involves an investigative process that's iterative. Until the problem is solved, we don't have accurate information around root cause, much less corrective action, that will be particularly useful to you. Given this practical reality, we believe that informing you that a problem exists and assuring you that we're working on resolving it is the useful thing to do.

A continuously updated estimated time-to-resolution. Many of you have told us that it's important to let you know when the problem will be solved. Once again, the answer is not always immediately known. In this case, we'll provide regular updates to you as we progress through the troubleshooting process.

In cases where your business requires more detailed information, we'll provide a formal incident report within 48 hours of problem resolution. This incident report will contain the following information:
a. business description of the problem, with emphasis on user impact;
b. technical description of the problem, with emphasis on root cause;
c. actions taken to solve the problem;
d. actions taken or to be taken to prevent recurrence of the problem; and
e. time line of the outage.

In cases where your business requires an in-depth dialogue about the outage, we'll support your internal communication process through participation in post-mortem calls with you and your management team.

Once again, thanks for you continued support and understanding.

Sincerely,
The Google Apps Team