Monday, March 30, 2009
Its performance gains and light weight were impressive. But its invasion of my system was uncalled for.
No matter which browser you set to be the default browser, Chrome 2.0 prevents that browser from regaining control. If you click a link in another program, Chrome launches, even if you set MSIE or Firefox or Safari as the default browser.
I thought browser manufacturers had left that kind of behavior behind.
Saturday, March 28, 2009
In the 10 days since its public release, MSIE8 has made a run up the charts. Courtesy of the great folks at StatCounter and their public analytics data, this growing browser share for MSIE8 can be easily followed.
In the US, prior to its release, MSIE8 RC1 was in sixth position behind even the old battleship Firefox 2.0, but ahead of Chrome 1.0.
In the week following its release, MSIE8 quickly surpassed Firefox 2.0's browser share in the US. I am not really sure who these Firefox 2.0 users are, but they and the MSIE6 users must be found and encouraged to upgrade immediately.
The values for the first week don't tell the entire story. As it enters its second week of general availability, MSIE8 continues to increase its share of the browser market, moving into fourth place in StatCounter's US stats, overtaking Safari 3.2.
What does this mean? While it still has a long way to go before it comes close to approaching even the dinosaur, MSIE6, it has to be said that this growth in MSIE8 browser share has occurred without the use of Windows Update. People are making a conscious decision to switch to and use MSIE8.
Site and application designers will need to take heed - MSIE8 compatibility initiatives will have to be in place yesterday rather than some vague time in the future.
Friday, March 27, 2009
The Web performance focus for most firms is simple: How quickly can code/text/images/flash be transferred to the desktop?
The question that needs to be asked now is: What effect does my content have on the browser and the underlying OS when it arrives at the desktop?
Emphasis is now put on the speed and efficiency of Web pages inside browsers. How much CPU/RAM does the browser consume? Are some popular pages more efficient than others? Does continuous use of a browser for 8-12 hours a day cripple a computer's ability to do other tasks?
Performance measurement will come to include instrumenting the browser itself, not to capture content performance, but browser performance. Through extensions, plugins, accelerators, or whatever mechanism is available, browsers will be able to report the effect of long-term use on the health of the computer and how it degrades perceived performance over time.
Most browsers provide users and developers tools to debug pages. But what if this data was made globally available? What would it tell us about the containers we use to interact with our world?
Thursday, March 26, 2009
Tonight, I figured out how to add the Resolved IP Addresses for a host to measurement data and store that information for further debugging. It was very simple - I was trying to find complex solutions to this issue.
Turns out the solution is built right into Perl: the Socket module.
My thought is that I will update the table with the test config with three new columns:
- Page information
There will likely be a new table that joins with the raw data and contains a comma-delimited list of all the IP addresses that the agent resolved the hostname to at test time. This lookup will be run after the measurement, so the DNS lookup component of the measurement is not compromised.
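The agent does this lookup with Perl's Socket module; a rough sketch of the same idea in Python (the `resolve_ips` helper name is mine, not GrabPERF's) shows how the standard library can produce the comma-delimited IP list directly:

```python
import socket

def resolve_ips(hostname):
    """Resolve a hostname to a comma-delimited list of its IPv4 addresses."""
    # gethostbyname_ex returns (canonical_name, aliases, ip_list)
    _name, _aliases, ips = socket.gethostbyname_ex(hostname)
    return ",".join(ips)

# Resolving the loopback name works without network access.
print(resolve_ips("localhost"))
```

Run after the timed measurement completes, this adds no overhead to the DNS component of the test itself.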
I don't have an ETA on this, as I want to test it fairly thoroughly before I expose the data. Adding the columns to the test config table will be transparent, but agent modification will need to be verified and then rolled out to all of the folks hosting measurement locations.
What problem does this solve?
It is vital for firms who use geographic load balancing and CDNs to verify that their data is being served from location appropriate IP addresses. I will be able to tie the information collected here into the IP-Location data I collect for other purposes and help companies ensure that this is being done.
This tool, however, does not simply reside in the hands of people looking to maliciously redirect traffic for purposes I can't quite fathom - I'll admit, there is still some simple naivete in my Canadian mind.
Legitimate companies, ISPs, and service providers also have this tool at their disposal for their own purposes. A useful and accepted version of this already exists in the form of content delivery networks (CDNs) and other third parties who take a portion of a company's domain name space and use it to deliver distributed edge content and computing, web analytics, or advertising services.
But let's move this inside the firewall or into the consumer ISP space. These companies provide DNS for millions of customers. As a result, they could easily re-write DNS entries to reflect their own needs rather than those of the consumer.
In the case of corporate IT networks, there is not much that can be done - they own the wire, hardware and software being used, so they will claim it's part of the corporate IT policy that all employees sign and that will be that.
Consumers, on the other hand, should expect free and unencumbered access to the worldwide DNS network, without being intercepted or redirected by their own ISPs. And there is frankly no way to verify that this is not happening unless you run your own caching BIND server on your home network.
The alternative is to use one of the external third-party services (OpenDNS or DNS Advantage). But these services also provide phishing-protection and filtering services, which means they can easily modify and redirect an incoming request using the most basic and critical service on the Internet.
While this may sound like the rant of a paranoid, it is a concept that has practical consequences. As an organization, how do you know that you aren't on the DNS filter list of these providers, or the ISPs? If they can filter and redirect DNS requests, what else are they doing with the information? Are they providing open and trusted access to the core DNS services of the Internet?
Stepping back as far as you can into the process of going to your favorite pages, you will find that you can't get there without DNS. And if DNS can no longer be trusted, even from legitimate providers, the entire basis of the Internet dissolves.
Tuesday, March 24, 2009
This afternoon, the two GrabPERF Agents at Technorati were switched back to using their local copies of caching BIND for resolving DNS entries.
Some folks at Microsoft who stopped by to look at their results on the Search Performance Index noticed that there were one or two outliers in the results from these locations. When I investigated, the OpenDNS name servers I was using were returning odd results.
The reason I had been using OpenDNS is that the local BIND instances were seeing unusual behavior a few months back. So, consider the switch to the local BIND instances probational, pending ongoing review.
Thanks to Eric Schurman for letting me know.
NB: I do work for Gomez.
US data shows that MSIE 7.0 is in a dominant position, with Firefox 3.0 in the 25% range of market share. This trend extends into the North American data, which is heavily influenced by the US trend.
MSIE 8.0 still reports a lower distribution than Firefox 2.0. This data is most likely based on usage of the MSIE 8.0 RC1 version, as MSIE 8.0 only reached general availability last week. It is highly probable that these stats will change in the very near future with the release of MSIE 8.0 to Windows Update.
In the EU, where fear and loathing of Microsoft runs deep and true, Firefox 3.0 is approaching parity with MSIE 7.0. Also, the perennially favoured native son Opera makes a very strong showing with their 9.6 release.
Asia is a Web designer's nightmare, with MSIE 6.0 continuing to be the most reported browser. This is concerning, not simply for design reasons, but for web compliance reasons. Asia has effectively throttled Web development to an old warhorse, and to such a degree that there must be some overriding advantage to using this browser.
As an example, the statistical comparison of four Asian nations is broken out below. We'll start with Japan, where MSIE 7.0 has a clear lead in the statistics.
However, when China (People's Republic), India, and South Korea are added into the analysis, the pull towards MSIE 6.0 is massive.
This trend needs to be studied in greater detail in order to understand why MSIE 6.0 is so popular. Is it because of licensing? Continued use of Windows 2000? Compromised computers? The data doesn't provide any clear or compelling reason for this trend.
Moving to Oceania shows a return to the trend of MSIE 7.0 being the predominant browser with Firefox in second place, with these two browsers showing a substantial lead over the remaining field.
South America sees MSIE 7.0 as having the largest market share, followed by MSIE 6.0 and Firefox 3.0. Effectively there are no other browsers with substantial market share at present.
These statistics show that the three most dominant browser platforms by market share are the two MSIE platforms followed by Firefox 3.0. This is likely to change with the MSIE 8.0 GA last week and its predicted release to the masses via Windows Update in the near future.
However, the release of MSIE 8.0 may not be as exponential as is predicted. Corporate IT policies, which have been slow to embrace MSIE 7.0, are likely not going to make a giant leap to MSIE 8.0 overnight. Adoption among the general population will also depend on the ability of existing Web applications to adapt to a more standards-compliant browser platform.
Noticeably absent from most of these statistics is Safari in a position to challenge the three leading browsers. This indicates that even hardcore Mac users continue to use Firefox as their primary Web application and browsing platform. StatCounter backs this up by indicating that within their data, 8.36% of visitors from the USA were on Macs, while 3.15% of visitors used Safari.
Trends to watch in the near future:
- New browser releases (Firefox 3.1, Safari 4.0) and their effect on browser distribution
- Uptake of MSIE 8.0 once it is released via Windows Update
- Browser distribution in Asia
Saturday, March 21, 2009
Friday, March 20, 2009
The methodology of the Search Performance Index is straightforward: A number of key search providers are selected and HTTP GET requests are sent that directly pull the results page of a query page that is searching for 'New York'.
This is a simple process and one that is a close approximation of the actual search process that hundreds of millions of people perform every day.
alias Response Time Success Attempts Success Rate
-------------------------- ------------- ------- -------- ------------
SEARCH - Google Blogsearch 0.2822545 27242 27244 99.9927
SEARCH - Google 0.3151932 27228 27247 99.9303
SEARCH - Live (Microsoft) 0.3840097 27245 27246 99.9963
SEARCH - Indeed 0.4112960 27240 27241 99.9963
SEARCH - Yahoo 0.4574381 24175 24175 100.0000
SEARCH - Altavista 0.4592764 23922 23922 100.0000
SEARCH - Cuil 0.6757475 23963 23967 99.9833
SEARCH - AOL 0.7822945 23913 23913 100.0000
SEARCH - Ask 0.9025220 24157 24163 99.9752
SEARCH - Technorati 0.9053472 27219 27234 99.9449
SEARCH - Amazon 1.3251402 27245 27251 99.9780
SEARCH - Baidu 1.7409345 23777 23799 99.9076
SEARCH - Blogdigger 1.8960633 25106 26354 95.2645
SEARCH - BlogLines 2.0238809 27233 27248 99.9450
SEARCH - IceRocket 2.1233684 24147 24177 99.8759
SEARCH - Blogpulse 2.4144131 27019 27247 99.1632
As can be seen in the data, there is a substantial degree of difference in the response time results, but the big three search providers (Google, Yahoo, and Microsoft Live) were in the top five. As well, all but three of the providers (Blogdigger, IceRocket and Blogpulse) had success rates (availability) of 99.9% or higher.
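The table's last column is straightforward to derive: availability is simply successes divided by attempts, expressed as a percentage. A minimal sketch of that calculation, using the Google row's numbers from the table above:

```python
def success_rate(successes, attempts):
    """Availability as a percentage, rounded to four decimal places."""
    return round(successes / attempts * 100, 4)

# Google row: 27228 successes out of 27247 attempts
print(success_rate(27228, 27247))  # 99.9303

# Yahoo row: a perfect run
print(success_rate(24175, 24175))  # 100.0
```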
I don't expect to see much in the way of change in these results, but I will post them for comparison each week.
Here's what the difference between the companies boils down to: Microsoft is focusing on today's Web, and the rivals are focusing on tomorrow's.
This sums up many of the comments I have made about browsers over the months [see Does the browser really matter?]. The browser is no longer a passive viewer; it is a Web application portal/container. With MSIE8, Microsoft has not made the leap into the future Web.
It has produced more of the same, and will likely continue to see its percentage of the market drop, at least until Windows 7 rolls with MSIE8 installed as the base browser.
My final comment is that Windows 7 may not be enough to save Microsoft entirely due to the economic downturn and its effect on the upgrade process within households and corporations.
Thursday, March 19, 2009
By 12:00 PDT, the number of Web sites having to put up IE 7.0 only stickers will be in the millions.
I haven't done a lot of testing of the new monster (I mainly use Firefox on Mac OS 10.5.6), but it doesn't seem any weirder than any of the other Microsoft browsers I have used.
Under the hood, it does amp up the number of connections that are used to download objects, but if your infrastructure is prepared, it should be no big deal.
The big concern lies in the rendering. If those users who can upgrade to the new browser (anyone not limited to IE 6 by a corporate IT policy written on parchment) do find there are problems, will they blame the site or the browser?
Have you tested your site to ensure that it actually works with IE 8? Or will you get burned?
What is the greatest benefit you see to your visitors having IE 8 instead of IE 6 or 7?
Is this a good thing?
As information passes upward through a company, it is filtered, shaped, and refined down to the one salient decision point that the executives can then discuss. My concern is whether this devolution of detail within organizations stifles their ability to innovate, especially in times of stress.
Small companies have a short distance from those that create and work with the product to the senior levels. As a result, senior managers and executives are tightly tied to the details of the product, of the company, of the customers. They understand that details are important.
Mature companies discuss how their strategies and initiatives will shape an entire industry and change the way everyone does business. But how that happens is often lost as those concepts flow downward. Just as detail devolves on the way up, detail evolves on the way down.
It is nigh on impossible to participate in an industry-defining paradigm shift when your everyday activities double and triple, leading to a complete dissociation between the executive level and the worker level.
Why does this occur?
It's not just that detail devolves on the way up an organization; it's that each level needs to assure the level above it that everything is OK and that solutions can be found for any challenging issues, so let's just keep pressing forward.
So the devolution of detail coupled with the culture of assurance gets too many companies in trouble.
The devil is in the details. And sometimes, the devil can be your friend.
This is a "re-print" of an article I had on Webperformance.org that I notice that a number of people search for.
See additional information on how not to use Round-Robin DNS.
The use of Round-Robin DNS for load-balancing has been around for a number of years. It is meant to allow multiple hosts with different IP addresses to represent the same hostname. It is useful, but as dedicated load-balancing technology evolved, its use declined: it is not sensitive to the conditions inside a server farm. It simply tells clients which IP to connect to, without any consideration of the condition of the server they are connecting to.
Recently, I have seen a few instances where companies have switched back to Round-Robin DNS without considering the pitfalls and limitations of this particular method of load-balancing. The most dangerous of these is the limitation, set out in RFC 1035, that a DNS message carried in UDP cannot exceed 512 bytes.
2.3.4. Size limits
Various objects and parameters in the DNS have size limits. They are
listed below. Some could be easily changed, others are more
fundamental.

labels          63 octets or less
names           255 octets or less
TTL             positive values of a signed 32 bit number.
UDP messages    512 octets or less
When a UDP DNS message exceeds 512 octets/bytes, the TC (truncated) bit is set in the response, indicating to the client/resolver that not all of the answers were returned and that it should re-query using a TCP DNS message.
It's the final phrase of the previous paragraph that should set everyone's ears a-twitter: the only thing that should be requested using TCP DNS messages is a zone transfer between servers. If someone who is not one of your authorized servers is attempting to connect to port 53 of your DNS server over TCP, they may be attempting an unauthorized zone transfer of your domain information.
As a result, TCP connections to port 53 are usually blocked inbound (and outbound, if you run an ISP and you assume your users are up to no good). So, guess what happens when the truncated DNS information is re-requested over TCP? Web performance is negatively affected by between 21 and 93 seconds.
So, simply put, DNS information needs to be tightly managed and examined to ensure that only the most appropriate information is sent out when a DNS query arrives. If you are using Round-Robin DNS to load-balance more than five servers, you should be examining other load-balancing schemes.
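To see how quickly round-robin A records hit the 512-octet UDP ceiling, a back-of-the-envelope calculation helps. Assuming standard name compression, each additional A record in the answer costs 16 octets (2 for the compressed name pointer, 2 type, 2 class, 4 TTL, 2 RDLENGTH, 4 for the address itself). The sketch below is an estimate for a hypothetical hostname, not a parse of real packets:

```python
def max_a_records(hostname):
    """Estimate how many A records fit in a 512-octet UDP DNS response."""
    header = 12                          # fixed DNS header
    question = (len(hostname) + 2) + 4   # encoded qname + qtype + qclass
    per_record = 2 + 2 + 2 + 4 + 2 + 4   # name ptr, type, class, TTL, rdlength, rdata
    return (512 - header - question) // per_record

# A 15-character hostname leaves room for roughly 29 A records
# before the TC bit forces a TCP retry.
print(max_a_records("www.example.com"))  # 29
```

In practice the limit is lower still once authority and additional records are included, which is why even modest round-robin pools can trip the truncation penalty.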
24 April 2003 - Round-Robin Authority Records
This morning, I spoke with a client who was experiencing extremely variable DNS lookup times on their site after they implemented a new architecture. This architecture saw them locate a full mirror of the production site in Europe, for any potential failover situations.
This looks good on paper (apparently the project plan ran to eleven pages), but they made one very serious technical faux pas: they deployed two of the authoritative name servers in the UK.
The DNS gurus in the audience are saying "So what?" This is because when you set up four authoritative DNS servers correctly, the querying name server tracks the response time of each and favours the fastest one. This usually results from the following type of configuration:
The client this morning, however, had a very different type of configuration.
When the authority record is returned in this fashion, the results are easily understood. The host name is the same for all four IP addresses, so the querying name server does what it is supposed to do in this situation: resort to the Round-Robin algorithm. Instead of favouring the fastest name server, the querying name server rotates through the authoritative names.
Depending on where the authoritative name servers are located, the DNS lookup time could vary wildly. In the case of the client this morning, 50% of the DNS lookups were being routed to the UK, regardless of where the querying name server was located.
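A toy simulation (server names and latencies invented for illustration) shows why the client's lookup times were so variable: rotating through the authority list means every other lookup lands on a transatlantic server.

```python
from itertools import cycle

# Hypothetical round-trip times: two US name servers, two in the UK.
servers = [("ns1-us", 0.02), ("ns2-us", 0.03), ("ns3-uk", 0.15), ("ns4-uk", 0.16)]

def simulate_lookups(n):
    """Round-robin over the authority list, returning (server, latency) per lookup."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(n)]

lookups = simulate_lookups(100)
uk_share = sum(1 for name, _ in lookups if "uk" in name) / len(lookups)
print(uk_share)  # 0.5 -- half of all lookups cross the Atlantic
```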
(This value varies depending on the operating system. For Windows 2000, the TCP timeout is 21 seconds; for Linux, it is 93 seconds.)
Tuesday, March 17, 2009
This is not a bad or evil thing, considering that for at least 18 months, the articles that were hosted at those sites were duplicated here in a more manageable format.
For those who have come looking for the content from those sites, it is here. The search box in the right column can help you locate it.
But for those who would like a refresher, here is a list of the most popular articles on this blog, as selected by Web performance traffic.
Web Performance Concepts Series
- Web Performance, Part I: Fundamentals
- Web Performance, Part II: What are you calling 'average'?
- Web Performance, Part III: Moving Beyond Average
- Web Performance, Part IV: Finding The Frequency
- Web Performance, Part V: Baseline Your Data
- Web Performance, Part VI: Benchmarking Your Site
- Web Performance, Part VII: Reliability and Consistency
- Web Performance, Part VIII: How do you define fast?
- Web Performance, Part IX: Curse of the Single Metric
Why Web Measurement Series
- Why Web Measurement, Part I: Customer Generation
- Why Web Measurement, Part II: Customer Retention
- Why Web Measurement, Part III: Business Operations
- Why Web Measurement, Part IV: Technical Operations
Web Performance Tuning
- The Dichotomy of the Web: Andy King's Website Optimization
- Performance Improvement From Compression
- Baseline Testing With cURL
- Compressing Web Output Using mod_deflate and Apache 2.0.x
- Compressing PHP Output
- Using Client-Side Cache Solutions And Server-Side Caching Configurations To Improve Internet Performance
- Performance Improvement From Caching and Compression
- Compressing Web Output Using mod_gzip for Apache 1.3.x and 2.0.x
- mod_gzip Compile Instructions
- Hacking mod_deflate for Apache 2.0.44 and lower
Monday, March 16, 2009
The goal of the index is to provide performance metrics for a group of search providers around the world. The results are based on a direct HTTP GET request being made for the search results page by the GrabPERF Agent.
Currently, only live data is available. In the near future, archival results will be made available on a week-by-week basis.
If there is a search provider that has been missed, please contact the GrabPERF team.
Sunday, March 15, 2009
On a daily basis, I update the Geographic IP database that I created many years ago. Although not as powerful as some of the commercially available Geographic databases, it has more than served my purposes over the years.
One of the benefits of collecting this data is being able to extract substantial metrics on the distribution of IPV4 addresses. This post is the latest in a series of descriptions of the distribution of addresses at the moment.
Here are the registry stats for IPV4 addresses as of March 14, 2009.
The ARIN IPV4 address space (which includes the US) is still the largest by far, with nearly three times the allocated IPV4 addresses of the next two largest registries, RIPE and APNIC. The dominance of the US is even more noticeable in the IPV4 addresses by country table.
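At its core, mapping an address to its registry is a range lookup over the allocation table. A minimal sketch in modern Python, with a few invented allocation blocks (the real database is far larger, and these ranges are illustrative only, not actual registry data):

```python
import ipaddress
from bisect import bisect_right

# Hypothetical sample allocations: (first address, last address, registry)
ALLOCATIONS = sorted([
    (int(ipaddress.ip_address("3.0.0.0")),   int(ipaddress.ip_address("3.255.255.255")),   "ARIN"),
    (int(ipaddress.ip_address("51.0.0.0")),  int(ipaddress.ip_address("51.255.255.255")),  "RIPE"),
    (int(ipaddress.ip_address("110.0.0.0")), int(ipaddress.ip_address("110.255.255.255")), "APNIC"),
])
STARTS = [start for start, _, _ in ALLOCATIONS]

def registry_for(ip):
    """Return the registry whose allocated range contains the address, if any."""
    value = int(ipaddress.ip_address(ip))
    i = bisect_right(STARTS, value) - 1
    if i >= 0 and value <= ALLOCATIONS[i][1]:
        return ALLOCATIONS[i][2]
    return None

print(registry_for("51.10.20.30"))  # RIPE
```

With the table sorted by range start, each lookup is a binary search, which keeps a daily refresh of millions of rows cheap to query.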
Belying its growing importance on the Internet stage, China has grown from fourth place in the first of these analyses to second place in this study. However, it still has a long way to go before it catches up with the US.
An interesting observation from this data is that China is making do with substantially fewer public IPV4 addresses than the US. This means that either they have wholeheartedly embraced IPV6 (unlikely) or they are using private IP space for most communications.
Saturday, March 14, 2009
What did surprise me was the number of people who are still using MSIE 6.0. I am not sure what is continuing to perpetuate the presence of this percentage of people on this antiquated browser, other than large corporations running this by mandate of the IT department.
Friday, March 13, 2009
Today, the system was offline for several hours before I noticed that the DB had failed to restart properly. Apparently the system simply decided that the InnoDB engine had gone away.
All tables have now been switched back to the good old MyISAM and all the headaches that come with that.
Thursday, March 5, 2009
The dynamic page was starting to push 20-25 seconds just for the Top 20 List. When I switched to the static list, times dropped to less than 1 second.
It's always bad when a Web performance measurement site has poor performance.
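One common way to get that static-list speedup without giving up freshness entirely is a file cache with a short TTL: serve the saved copy while it is recent, and regenerate it only when it goes stale. A minimal sketch, with the path, TTL, and placeholder generator all invented for illustration:

```python
import os
import tempfile
import time

CACHE_PATH = os.path.join(tempfile.gettempdir(), "top20.html")  # hypothetical location
CACHE_TTL = 300  # regenerate at most every 5 minutes

def generate_top20():
    """Stand-in for the expensive database query behind the Top 20 list."""
    return "<ol><li>example entry</li></ol>"

def top20_html():
    """Serve the cached copy if it is fresh, otherwise regenerate it."""
    try:
        if time.time() - os.path.getmtime(CACHE_PATH) < CACHE_TTL:
            with open(CACHE_PATH) as f:
                return f.read()
    except OSError:
        pass  # no cache file yet
    html = generate_top20()
    with open(CACHE_PATH, "w") as f:
        f.write(html)
    return html
```

Every request after the first reads a flat file, which is why the page time dropped from 20-25 seconds to under one.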
- The Agent code was streamlined and removed the connection error sub-routine. It seems that the latest versions of cURL no longer support the connection error determination (I can only imagine the madness of trying to support this on multiple OSs), so it has been removed as part of the error detection process. This change has been pushed out to four agents for testing and will be distributed to all other active agents after this is complete.
- Upgrade of cURL to 7.19.4 on four agents. The same four agents that have the new agent code have also had the underlying HTTP(S) engine (cURL) upgraded to 7.19.4. Although it supports no new features that I am aware of, it is always good to be on top of the latest bug-fix release.
We are also trying to determine how to capture the URL that we connect to when we take a measurement. As far as cURL is documented, it still does not appear to support this feature.
Wednesday, March 4, 2009
This is the kind of data that everyone should be interested in. And it's free. Check it out at StatCounter GlobalStats.
Tuesday, March 3, 2009
There is a methodology statement explaining more about the index below the data table.
Please add any comments or questions to this post.
The switch to InnoDB was done because of the locking issues that were occurring during long queries, especially when doing ad-hoc analysis. The row-level (versus table-level) locking of InnoDB has removed most of these issues.
I have been seeing some strange behavior with the new engine. As a result of this, I will be re-starting the database engine twice a day. There should be no degradation, as this is simply a daemon re-start, not a machine re-start.