Valleywag got it all wrong with Alexa comparison

Peter Rip used Alexa traffic charts in one of his blog posts to show that web 2.0 has peaked, and he got hammered by Valleywag. Unfortunately, Valleywag got it all wrong: it compared Alexa Reach (which represents the number of unique visitors per day) with Techcrunch’s sitemeter pageview numbers. As a matter of fact, Techcrunch’s pageviews from Alexa and the pageviews from its sitemeter stats match quite well.

Here is the pageview chart from Alexa: [chart: Alexa for Techcrunch]

Here is the pageview chart from Techcrunch’s sitemeter web stats: [chart: sitemeter]

This example shows the challenges one faces when comparing numbers from different sources.

Given the recent heated debates on Alexa, I would like to offer a few tips for using Alexa’s stats:

1. Understand sampling biases

Alexa collects data from its free Alexa toolbar, which is used as a webmaster and developer tool. As a result, its sampling panel is heavily biased towards websites targeted at webmasters and developers. There is also a regional bias, as the distribution of Alexa toolbars is not proportional to the number of users in different countries. In general, Alexa is useful only when one compares sites with similar audiences.
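To make the bias concrete, here is a minimal sketch with made-up numbers. It assumes a population split into two groups with different toolbar install rates, and a site that appeals mostly to webmasters. Every figure below (group sizes, install rates, visit rates) is a hypothetical assumption, not Alexa data.

```python
# Hypothetical population: all numbers below are illustrative assumptions.
groups = {
    # name: (population size, toolbar install rate, share visiting the site daily)
    "webmasters": (1_000_000, 0.10, 0.30),
    "everyone_else": (99_000_000, 0.001, 0.01),
}

def true_reach(groups):
    """Share of the whole population that visits the site."""
    total = sum(size for size, _, _ in groups.values())
    visitors = sum(size * visit for size, _, visit in groups.values())
    return visitors / total

def panel_reach(groups):
    """Share of toolbar users (the panel) that visits the site."""
    panel = sum(size * install for size, install, _ in groups.values())
    visitors = sum(size * install * visit for size, install, visit in groups.values())
    return visitors / panel

print(f"true reach:  {true_reach(groups):.2%}")
print(f"panel reach: {panel_reach(groups):.2%}")
```

With these assumed rates the panel overstates the site's reach by more than a factor of ten, because webmasters are heavily over-represented among toolbar users. Two sites with the same kind of audience would be inflated by a similar factor, which is why like-for-like comparisons are safer.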

2. Know what metrics to look for

Alexa tracks “daily unique visitors” (called Alexa Reach) and “daily pageviews”. Note that many stats packages (like sitemeter or Google Analytics) use “number of visits”. A visit is defined as a session; a session ends when a user is inactive on a website for 30 minutes. The number of visits can be much higher than the number of unique visitors, as a user can visit some websites many times in a day (e.g., facebook.com, where time-rich people actually spent 500 hours in six months).
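The difference between visits and unique visitors can be sketched in a few lines. This is only an illustration of the 30-minute inactivity rule described above, not how any particular stats package implements it; the log format, the `count_visits` helper, and the user names are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity gap that ends a visit

def count_visits(hits):
    """Count visits (sessions) and unique visitors from (user_id, timestamp) hits."""
    last_seen = {}
    visits = 0
    for user, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(user)
        # A first hit, or a hit after more than 30 idle minutes, starts a new visit.
        if previous is None or ts - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[user] = ts
    return visits, len(last_seen)

# Hypothetical log: one user comes back three times in a day, another comes once.
day = datetime(2007, 1, 1)
hits = [
    ("alice", day + timedelta(hours=9)),
    ("alice", day + timedelta(hours=9, minutes=10)),  # same visit
    ("alice", day + timedelta(hours=13)),             # new visit
    ("alice", day + timedelta(hours=21)),             # new visit
    ("bob", day + timedelta(hours=10)),
]
visits, uniques = count_visits(hits)
print(visits, uniques)  # 4 2
```

The same log therefore yields three different numbers: 5 pageviews, 4 visits, and 2 unique visitors. Comparing one source's pageviews with another source's unique visitors mixes these up.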

Note that “Alexa Reach” is expressed as a “relative share” (the percentage of all Internet users who visit a given site) rather than as an absolute number of users. For example, if a site like yahoo.com has a reach of 28%, this means that if you took random samples of one million Internet users, you would on average find that 280,000 of them visit yahoo.com in a day.
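The arithmetic behind that example, as a small sketch. The function names are made up for illustration, and the spread calculation assumes each sampled user visits independently (a simple binomial model), which is an assumption of this sketch rather than anything Alexa states.

```python
import math

def expected_visitors(reach, sample_size):
    """Expected number of daily visitors in a random sample of Internet users."""
    return reach * sample_size

def visitor_spread(reach, sample_size):
    """Standard deviation of that count under a simple binomial model (an assumption)."""
    return math.sqrt(sample_size * reach * (1 - reach))

print(round(expected_visitors(0.28, 1_000_000)))  # 280000
print(round(visitor_spread(0.28, 1_000_000)))     # 449
```

So "on average 280,000" means individual samples of one million users would typically land within a few hundred of that figure.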

One problem with this relative share approach is that the size of the “total pie” is growing over time. In particular, Alexa’s international base is growing much more rapidly than its US base. As a result, US websites’ Alexa numbers tend to increase more slowly, or even show some decline, even when internal stats indicate that the number of visitors is growing.
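A minimal sketch of this effect, using invented numbers: a site whose visitors grow by 20% while the total user base doubles. Both sets of figures are hypothetical.

```python
def reach(site_visitors, total_users):
    """Relative share: the fraction of all users who visit the site."""
    return site_visitors / total_users

# Hypothetical figures: the site grows, but the total pie grows faster.
before = reach(site_visitors=1_000_000, total_users=100_000_000)
after = reach(site_visitors=1_200_000, total_users=200_000_000)

print(f"reach before: {before:.2%}")  # 1.00%
print(f"reach after:  {after:.2%}")   # 0.60%
```

The site gained 200,000 visitors, yet its reach fell from 1.00% to 0.60%. A falling reach chart on its own does not show that a site is losing users.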

Here is the user base share data from Alexa. The share of Alexa toolbar users in the US has dropped from 37% to 14% in the last three years.

Country          2007     2004
China            16.44%   18.46%
United States    14.28%   36.91%
Brazil           3.82%    N/A
Japan            3.64%    3.80%
United Kingdom   3.11%    4.49%
Taiwan           2.91%    1.72%
Hong Kong        2.55%    4.59%
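The table gives only shares, not the absolute size of the toolbar base, so one cannot tell from it whether the number of US toolbar users actually fell. As a rough sketch, the snippet below computes how much the total base would have had to grow between 2004 and 2007 for each country's absolute user count to stay flat. The shares are copied from the table; the `break_even_growth` helper is an illustrative name, and Brazil is omitted because it has no 2004 figure.

```python
# Shares of Alexa toolbar users by country, copied from the table above
# as (2007 percent, 2004 percent).
shares = {
    "China": (16.44, 18.46),
    "United States": (14.28, 36.91),
    "Japan": (3.64, 3.80),
    "United Kingdom": (3.11, 4.49),
    "Taiwan": (2.91, 1.72),
    "Hong Kong": (2.55, 4.59),
}

def break_even_growth(share_2007, share_2004):
    """Growth factor of the total base at which a country's user count is unchanged."""
    return share_2004 / share_2007

for country, (s2007, s2004) in shares.items():
    print(f"{country:15s} {break_even_growth(s2007, s2004):.2f}x")
```

For the United States the factor is about 2.58: if the total toolbar base grew by more than that over the three years, the number of US toolbar users rose even though their share fell.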

3. No one can get stats 100% right

Many people are understandably not happy with Alexa’s numbers. But third-party traffic stats are inherently inaccurate, particularly for smaller sites, where the number of the site's visitors in any sample is too small to give a reasonable margin of error. Even for large sites like YouTube, Hitwise and ComScore report very different results.
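To see why small sites suffer most, here is a rough sketch using the textbook normal approximation for the margin of error of a sampled proportion. The panel size is a hypothetical assumption, the approximation is itself poor for very small proportions, and the sketch ignores the panel biases discussed earlier, so treat the output as an order-of-magnitude illustration only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n panelists."""
    return z * math.sqrt(p * (1 - p) / n)

panel_size = 100_000  # hypothetical panel size
for reach in (0.28, 0.001, 0.00001):
    moe = margin_of_error(reach, panel_size)
    print(f"reach {reach:.3%}: +/- {moe:.3%} ({moe / reach:.0%} of the estimate)")
```

With this assumed panel, a site with 28% reach is measured to within about 1% of its value, while a site with 0.001% reach has a margin of error larger than the estimate itself.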

Any sampling-based panel has its biases too. For example, the way ComScore builds its user panel does not give me much confidence either, although companies pay hundreds of thousands a year for the service.

8 comments so far

  2. Hugo on

    nice info, didnt know you had to install a toolbar. That explains the different rankings.

    Thanks

  3. John on

    Thanks for this insightful article. I always thought that Alexa figures were based on Google’s data and that they were unshakable. I was wrong. Thanks for putting this up. Do visit my blog: http://johnpmathew.blogspot.com, and let me know how I am doing!
    :)

    J

  4. [...] Many people are understandably not happy with Alexa’s numbers. But third-party traffic stats are i… [...]

  5. [...] While he took some criticism over the data he used, those arguments have been refuted by Life is an Venture. Tim Berners-Lee was spot on in this podcast when he said “Web 1.0 was all about connecting [...]

  6. sigit super on

    my alexarank is still 8 millions.huh

  7. [...] issues into account, especially if the analyst incorrectly uses statistics. (Luckily, some people get it right.) [...]

  8. [...]   So the answers to the questions posed above depend on what you mean by “Google,” i.e. whether it is a site, a brand, or a company. The first would be only google.com; the second would include google.com, google.fr, google.com.br, google.de, etc. The last would include those properties plus YouTube, Orkut, Blogger, et al. The point is that those are important questions to ask when doing traffic analysis on the web. Be wary of analyses that fail to take these kinds of issues into account, especially if the analyst incorrectly uses statistics. (Luckily, some people get it right.) [...]

