When I say "lose control of the performance," I mean that despite everything that has been done to ensure scalability and capacity, the Web is inherently an infrastructure that is beyond anyone's direct ability to manage.
This is something that needs to be accepted. And while the datacenter is the only part of an application/infrastructure/network that can be directly managed by the Web site's owners, a company has to accept that the real datacenter is the Internet. Not a datacenter that is on the Internet; the Internet as the datacenter.
Now that your head is spinning, let's step back and consider this idea for a minute. The whole concept of the Internet being the datacenter makes operations and IT folks very uncomfortable. Why? There is no way for one company to manage the Internet. As a result, the general perspective is that the Internet can't be trusted, and all that can be done is manage what can be managed directly.
Ignoring the Internet allows many organizations to leave it out of their application or performance planning entirely. They will measure and monitor, and they may even employ third parties to help improve performance. But when the shiny exterior is peeled back, it's pretty clear that these organizations have built their entire performance culture on the assumption that if a problem exists on the Internet, there is nothing they can do to fix it.
This may be effectively true. But it is not a positive way to ensure effective Web performance.
Having a what-if, emergency response plan in place is never a bad idea. If a problem appears on the Internet, and it affects your Web site, what are you going to do about it? Whine and moan and point fingers? Or take actions that effectively and clearly communicate to customers the steps you are taking to make things right?
Wait. Managing the Internet through customer communication?
I argue that besides working feverishly behind the scenes to resolve the problem, customer communication is the next most critical component of any Web performance issue management plan.
Web performance issue management plan. You have one, don't you?
Well, when you get around to it, here are some concepts that should be built into the plan.
Effectively monitor your site
How can measurement and monitoring be part of issue management? Well, isn't it always good policy to detect and begin investigating problems before your customers do?
Key to the measurement plan is monitoring the parts of your application that customers use. A homepage test will not give you vital information on issues with your authentication process. Relying on it alone is the same as saying the car starts while ignoring the four flat tires.
If you aren't effectively monitoring your site, your business is at risk.
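To make the homepage-versus-authentication point concrete, here is a minimal sketch of a multi-step check that walks a customer path instead of testing a single page. Everything named here (`run_transaction`, `first_failure`, and the three stand-in steps) is hypothetical and invented for illustration; in a real monitor each step would issue actual requests against your own site.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    error: str = ""


def run_transaction(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run each step in order, time it, and stop at the first failure."""
    results: List[StepResult] = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
        except Exception as exc:
            elapsed = time.perf_counter() - start
            results.append(StepResult(name, False, elapsed, str(exc)))
            break  # later steps depend on this one, so there is no point continuing
        results.append(StepResult(name, True, time.perf_counter() - start))
    return results


def first_failure(results: List[StepResult]) -> Optional[StepResult]:
    """Return the failing step, or None if the whole path succeeded."""
    return next((r for r in results if not r.ok), None)


# Hypothetical stand-ins for real requests (assumptions, not a real site).
def load_homepage() -> None:
    pass  # pretend the homepage loaded fine


def log_in() -> None:
    raise RuntimeError("authentication timed out")  # simulated failure


def view_account() -> None:
    pass


results = run_transaction([
    ("homepage", load_homepage),
    ("login", log_in),
    ("account", view_account),
])
failed = first_failure(results)
if failed:
    print(f"ALERT: step '{failed.name}' failed: {failed.error}")
```

In this simulated run the homepage step passes and the login step fails, which is exactly the case a homepage-only test would never catch.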
Measure where the customers are
If your organization is focused on what it can control, then it will want to measure from locations that are controlled, and can provide stable, consistent, repeatable data.
Hate to break this to you, Sparky, but my Internet connection isn't an OC-48 provisioned through a large carrier with a written SLA. Real people have provider networks that are congested, under-built, and deliver bandwidth using the old best-effort approach.
Some customers may have given up on wires altogether, and access the site through wireless broadband or mobile devices.
Understand how your customers use your site. Then plan your response to managing the Internet from the outside-in.
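One way to look at performance from the outside in is to break response-time measurements out by how customers actually connect, rather than averaging everything together. The sketch below assumes you already collect samples tagged with a connection segment; the function names, the segment labels, and the sample values are all made up for illustration.

```python
import math
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Group (segment, seconds) samples and report count, median, and 95th percentile."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for segment, seconds in samples:
        groups[segment].append(seconds)
    return {
        segment: {
            "count": len(values),
            "median": median(values),
            "p95": percentile(values, 95),
        }
        for segment, values in groups.items()
    }


# Made-up page load times in seconds, for illustration only.
samples = [
    ("dsl", 1.0), ("dsl", 3.0), ("dsl", 2.0),
    ("mobile", 4.0), ("mobile", 6.0),
]
summary = summarize(samples)
for segment, stats in summary.items():
    print(segment, stats)
```

Reporting a high percentile alongside the median keeps the slowest customers visible instead of letting a fast, well-connected majority hide them.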
Test with what your customers use
The greatest cop-out any Web site can make is "Our site is best viewed using..."
I'm sorry. This isn't good enough.
Customers demand that your site work the way they want it to, not the other way around. If a customer wants to use Safari on a Mac, or Chromium on Linux, then understanding how the site performs and responds with these browsers is critical.
The one-browser/one-platform world no longer exists. If a large number of customers with one particular configuration indicate that they are having a problem with the new site, what is the proper reaction?
And why did this happen in the first place?
Monitor and respond to social media
No, this isn't just here for buzzwords and SEO. In the last year, Twitter and Facebook have become the de facto soapboxes for people who want to announce that their favorite site isn't working. It wouldn't hurt to monitor these sites for issues that might not be detected by traditional performance monitoring.
This approach means that you have to be willing to accept responsibility when something affects your site performance or availability, even if it isn't your fault. No need to tell folks exactly what the problem is, but acknowledging that there is a legitimate issue that you recognize will go a long way toward making visitors/customers more understanding of the situation.
Get your message out effectively
Communicating about a performance issue means that the Marketing and PR teams will have to be brought in.
What? Marketing and Operations/IT working together? Yes. In a situation where there is a major outage or issue, Marketing will DEMAND to be involved. Wouldn't it be easier if these two parts of the organization knew each other and had a plan for responding to critical performance issues?
If Marketing understands the degree of the problem, what it will take to fix, and what is being done about it, they can craft a message that handles any question that might come in, while acknowledging that there is an issue.
A corollary to this: If there is an issue, don't deny it exists. Denying a problem when it is clear to anyone using the site that there is one is worse than saying nothing at all.
Practicing effective Web performance means a company understands that directly managing the Internet is impossible, but having a process to respond to Internet performance issues is critical. A Web performance incident plan shows that you understand that stuff happens on the Internet and you're working on it.