Archive for March, 2007|Monthly archive page

FuckedCompany and Alexa chart

Techcrunch has announced that it has acquired the FuckedCompany website (April Fool 🙂 ). The Alexa chart of fuckedcompany.com is almost a reflection of the size of the startup dead pool. Techcrunch is betting that we are hitting the bottom of the curve.

graph1.png

Valleywag got it all wrong with Alexa comparison

Peter Rip used Alexa traffic charts in one of his blog posts to show that web 2.0 has peaked, and he got hammered by Valleywag. Unfortunately, Valleywag got it all wrong: it compared Alexa Reach (which represents the number of unique visitors per day) with Techcrunch’s sitemeter pageview number. As a matter of fact, Techcrunch’s pageviews from Alexa and pageviews from its sitemeter stats match quite well.

Here is the pageview chart from Alexa: (chart: Alexa for Techcrunch)

Here is the pageview chart from Techcrunch’s sitemeter web stats: (chart: sitemeter)

This example shows the challenges one faces when comparing numbers from different sources.

Given the recent heated debates on Alexa, I would like to offer a few tips for using Alexa’s stats:

1. Understand sampling biases

Alexa collects data from its free Alexa toolbar, which is used mainly as a webmaster and developer tool. As a result, its sampling panel is heavily biased towards websites aimed at webmasters and developers. There is also a regional bias, as the distribution of Alexa toolbars is not proportional to the number of users in different countries. In general, Alexa is useful only when one compares sites with similar audiences.

2. Know what metrics to look for

Alexa tracks “daily unique visitors” (called Alexa Reach) and “daily pageviews”. Note that many stats packages (like sitemeter or Google Analytics) report “number of visits” instead. A visit is defined as a session; a session ends when a user has been inactive on a website for 30 minutes. The number of visits can be much higher than the number of unique visitors, because on some websites a user visits many times a day (e.g., facebook.com, where time-rich people actually spent 500 hours in six months).
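To make the difference between visits and unique visitors concrete, here is a minimal Python sketch of the 30-minute session rule described above. The function name `count_visits` and the sample hits are made up for illustration; real stats packages may handle edge cases (midnight rollover, cookies, bots) differently.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity gap that ends a visit

def count_visits(hits):
    """Return (visits, unique_visitors) for a list of (user_id, timestamp) hits."""
    last_seen = {}  # user_id -> timestamp of that user's most recent hit
    visits = 0
    for user_id, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(user_id)
        # A new visit starts on a user's first hit, or after 30+ minutes of inactivity.
        if previous is None or ts - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[user_id] = ts
    return visits, len(last_seen)

# Hypothetical hits for one day
day = datetime(2007, 3, 1)
hits = [
    ("alice", day + timedelta(hours=9)),
    ("alice", day + timedelta(hours=9, minutes=10)),  # same visit as above
    ("alice", day + timedelta(hours=14)),             # new visit after a long gap
    ("bob", day + timedelta(hours=10)),
]
visits, uniques = count_visits(hits)
print(visits, uniques)  # 3 visits from 2 unique visitors
```

In this toy log the visit count (3) is already higher than the unique visitor count (2), which is why a "visits" number should not be compared directly with Alexa Reach.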

Note that “Alexa Reach” is expressed as “relative share” (the percentage of all Internet users who visit a given site) rather than absolute number of users. For example, if a site like yahoo.com has a reach of 28%, this means that if you took random samples of one million Internet users, you would on average find that 280,000 of them visit yahoo.com in a day.
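The arithmetic behind the yahoo.com example can be written out as a small Python sketch. The function name `expected_daily_visitors` is just an illustrative helper, not anything Alexa provides.

```python
def expected_daily_visitors(reach_percent, sample_size):
    """Expected number of users in a random sample who visit the site in a day."""
    return reach_percent * sample_size / 100

# A reach of 28% applied to a random sample of one million Internet users
print(expected_daily_visitors(28, 1_000_000))  # 280000.0
```

Turning reach into an absolute audience would additionally require an estimate of the total number of Internet users, which Alexa's reach figure does not supply.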

One problem with this relative share approach is that the size of the “total pie” is growing over time. In particular, Alexa’s international user base is growing much more rapidly than its US base. As a result, US websites’ Alexa numbers tend to increase more slowly, or even show some decline, even when internal stats indicate that the number of visitors is growing.

Here is Alexa’s breakdown of its toolbar user base by country. The US share of Alexa toolbar users has dropped from 37% to 14% in the last three years.

Country          2007     2004
China            16.44%   18.46%
United States    14.28%   36.91%
Brazil            3.82%   N/A
Japan             3.64%    3.80%
United Kingdom    3.11%    4.49%
Taiwan            2.91%    1.72%
Hong Kong         2.55%    4.59%
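A rough Python sketch shows how this shift alone can drag down a US site's reach. It uses the US shares from the table, and assumes a hypothetical site that is visited only by US toolbar users and keeps the same share of them (10%, an invented figure) in both years.

```python
US_SHARE_2004 = 36.91  # percent of Alexa toolbar users in the US (from the table)
US_SHARE_2007 = 14.28

def reach(us_share_percent, site_share_of_us_users):
    """Reach (percent of all toolbar users) for a site visited only by US users."""
    return us_share_percent * site_share_of_us_users

# Hypothetical: 10% of US toolbar users visit the site daily, unchanged over time
site_share = 0.10
reach_2004 = reach(US_SHARE_2004, site_share)
reach_2007 = reach(US_SHARE_2007, site_share)
change = (reach_2007 - reach_2004) / reach_2004
print(f"{reach_2004:.2f}% -> {reach_2007:.2f}% ({change:+.0%})")
```

Under these simplified assumptions the site's reach falls by roughly 61% even though its popularity among US users has not changed at all. A site would need to more than double its share of US users just to keep its reach flat.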

3. No one can get stats 100% right

Many people are understandably not happy with Alexa’s numbers. But third-party traffic stats are inherently inaccurate, particularly for smaller sites, where the number of the site's visitors in any sample is too small to give a tight margin of error. Even for large sites like YouTube, Hitwise and ComScore report very different results.
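To see why small sites suffer most, here is a Python sketch of the textbook 95% margin of error for a proportion. It assumes a simple random sample, which no real panel is, and the panel size of 100,000 is a hypothetical figure, not one published by any vendor. The normal approximation is also crude for very small proportions, so treat the output as an order-of-magnitude guide.

```python
import math

def margin_of_error(reach, panel_size, z=1.96):
    """Approximate 95% margin of error for an estimated proportion."""
    return z * math.sqrt(reach * (1 - reach) / panel_size)

PANEL_SIZE = 100_000  # hypothetical panel size, for illustration only

# A very large site, a mid-sized site, and a small site
for site_reach in (0.28, 0.001, 0.00001):
    moe = margin_of_error(site_reach, PANEL_SIZE)
    print(f"reach {site_reach:.3%}: +/- {moe:.3%} "
          f"({moe / site_reach:.0%} of the estimate)")
```

With these assumed numbers, the error for the largest site is about 1% of its estimate, while for the smallest site the error is larger than the estimate itself.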

Any sampling-based panel has its biases too. For example, the way ComScore recruits its user panel does not give me much confidence either, although companies pay hundreds of thousands a year for its services.