When a company here sells its stock on multiple markets, the prices on the different markets are typically very close.

Not for Chinese stocks! The difference can be 200% or more!
Take Air China stock, for example: its price is 15 CNY on the Shanghai Stock Exchange and 7 CNY on the Hong Kong Stock Exchange for the exact same stock (see graphs below). The Shanghai price is more than 200% of the Hong Kong price!
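
For anyone who wants to redo the arithmetic, here is a small sketch. The function names are my own, and I am assuming both prices are already expressed in the same currency (CNY), as in the Air China example. It separates two numbers that are easy to mix up: the A-share price as a percentage of the H-share price, and the premium over the H-share price.

```python
def price_ratio_pct(a_price: float, h_price: float) -> float:
    """A-share price expressed as a percentage of the H-share price."""
    return a_price / h_price * 100.0


def premium_pct(a_price: float, h_price: float) -> float:
    """How much more the A-share costs, as a percentage of the H-share price."""
    return (a_price - h_price) / h_price * 100.0


# Air China prices quoted above, both in CNY.
air_china_a, air_china_h = 15.0, 7.0

print(f"ratio:   {price_ratio_pct(air_china_a, air_china_h):.0f}%")   # 214%
print(f"premium: {premium_pct(air_china_a, air_china_h):.0f}%")       # 114%
```

The two figures always differ by exactly 100 percentage points, so it is worth checking which one a quoted number refers to.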

A-Share = Chinese stock sold on Shanghai or Shenzhen stock markets

H-Share = Chinese stock sold on the Hong Kong stock exchange (some are also listed on US markets as ADRs).

Because of currency exchange restrictions, it is difficult to arbitrage between the two markets. The gap started to grow substantially as the Chinese stock market roared over the past year.

Here are the premiums of A-Shares over H-Shares for stocks that are also listed as US ADRs (based on Aug 20 prices):

Aluminum Corp. of China (ACH): 300%

China Southern Airlines (ZNH): 260%

China Life (LFC): 165%

The gap may start to close now. Last week, the Chinese government announced that it would allow mainland Chinese citizens to invest in the Hong Kong stock market. Chinese stocks on the Hong Kong and US ADR markets have been on a roll. The China market ETF FXI (which holds most Chinese companies on the Hong Kong market) also rose 16% in the past week despite the sub-prime loan crisis.

Installing Ubuntu Feisty Fawn (7.04) on a ThinkPad T60

After much trial and error and googling, I finally got Ubuntu 7.04 working on my ThinkPad T60 the way I wanted:

  • Dual boot with Windows XP
  • Keep the ThinkVantage button working, which makes system recovery and updating a lot easier
  • Keep the partition of the hard disk that contains system backup (for repair and recovery)

Pre-Installation Steps:

  • Make ThinkPad rescue and recovery discs (Start->All Programs->ThinkVantage->Rescue and Recovery)
  • Back up the MBR (in case you install GRUB into the MBR by accident). Follow the instructions here.
  • Make an Ubuntu 7.04 LiveCD, used only for temporarily booting into Ubuntu
  • Make an Ubuntu 7.04 Alternate CD for the actual installation (note the checkbox under the “Start Download” button). This is used instead of the LiveCD because I want to preserve the ThinkVantage functions.
  • USB memory stick for transferring boot file

Installation Steps:

Follow the step-by-step instructions here, except for the following steps (very important: things can become very messy if you don’t):

  • In the “fig 29 ntfs” step, set the “bootable” flag of the Ubuntu partition ON. Otherwise, GRUB will not install onto the Ubuntu partition.
  • In the “fig 65 ntfs” step, choose NOT to install GRUB onto the MBR (select NO to the question in that step).
  • On the next screen, after you select NO to installing GRUB on the MBR, you are asked to enter the partition on which to install GRUB. Type (hd0,1) to indicate the first hard drive and the second partition (the Ubuntu partition). If you have multiple hard drives, you need to figure this out yourself. DO NOT use the /dev/sda3 option shown on the screen. That does not work: if you do that, GRUB will be installed onto the MBR.
  • Boot the laptop with the LiveCD and run gparted. Set the Windows partition as bootable.
  • Follow the instructions here to make the boot file for Ubuntu. Run “dd if=/dev/sda3 of=ubuntu.bin bs=512 count=1”
  • Copy ubuntu.bin to the USB stick or use the shared partition as discussed here.
  • Boot into Windows XP and copy ubuntu.bin from the USB stick or the shared partition to C:
  • Update the Windows bootloader as discussed here.
  • You’re done!
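
Before copying ubuntu.bin over to Windows, it may be worth a quick sanity check that the dd step really produced a boot sector. The sketch below is my own addition, not part of the linked instructions, and the function names are made up. It only checks that the file is exactly 512 bytes long and ends with the standard 0x55 0xAA boot signature; it does not prove that GRUB is actually in there.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a bootable sector


def looks_like_boot_sector(data: bytes) -> bool:
    """True if data is one 512-byte sector ending in the 0x55 0xAA signature."""
    return len(data) == SECTOR_SIZE and data[-2:] == BOOT_SIGNATURE


def check_boot_file(path: str) -> bool:
    """Read a file such as ubuntu.bin and run the check above on its contents."""
    with open(path, "rb") as f:
        return looks_like_boot_sector(f.read())


# Demo on synthetic data: 510 zero bytes followed by the signature.
fake_sector = bytes(510) + BOOT_SIGNATURE
print(looks_like_boot_sector(fake_sector))  # True
print(looks_like_boot_sector(bytes(512)))   # False (all zeros, no signature)
```

On the real file you would call check_boot_file("ubuntu.bin") from the directory where dd wrote it.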


  • ATI graphics card. It is a bit tricky to get it to work. Follow the instructions here.
  • WLAN connection. My T60 uses an Atheros chip and it works out of the box with Network Manager. Other T60s use an Intel chip, which also works out of the box. The wireless connection is not as reliable as in Windows: Network Manager disconnects often and I cannot connect to some WLAN APs (such as the one in our library).

Final thoughts: after this experience of installing Ubuntu on the T60, I have started to appreciate things Windows has to offer that I never noticed before …

FuckedCompany and Alexa chart

Techcrunch has announced that it has acquired the FuckedCompany website (April Fool :-) ). The Alexa chart of FuckedCompany is almost a reflection of the size of the startup dead pool. Techcrunch is betting that we are hitting the bottom of the curve.


Valleywag got it all wrong with Alexa comparison

Peter Rip used Alexa traffic charts in one of his blog posts to show that web 2.0 has peaked, and he got hammered by Valleywag. Unfortunately, Valleywag got it all wrong: it compared Alexa Reach (which represents the number of unique visitors per day) with Techcrunch’s sitemeter pageview numbers. As a matter of fact, Techcrunch’s pageviews from Alexa and its pageviews from sitemeter match quite well.

Here is the pageview chart from Alexa: [chart: Alexa for Techcrunch]

Here is the pageview chart from Techcrunch’s sitemeter web stats: [chart: sitemeter]

This example shows the challenges one faces when comparing numbers from different sources.

Given the recent heated debates on Alexa, I would like to offer a few tips for using Alexa’s stats:

1. Understand sampling biases

Alexa collects data from its free Alexa toolbar, which is used as a webmaster and developer tool. As a result, its sampling panel is heavily biased towards websites targeted at webmasters and developers. There is also a regional bias, as the distribution of Alexa toolbars is not proportional to the number of users in different countries. In general, Alexa is useful only when one compares sites with similar audiences.

2. Know what metrics to look for

Alexa tracks “daily unique visitors” (called Alexa Reach) and “daily pageviews”. Note that many stats packages (like sitemeter or Google Analytics) use “number of visits”. A visit is defined as a session; a session ends when a user is inactive on a website for 30 minutes. The number of visits can be much higher than the number of unique visitors, as a user can visit a website many times in a day on some websites (e.g., where time-rich people actually spent 500 hours in six months on it).
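
To make the visit/visitor difference concrete, here is a rough sketch of how visits could be counted from one user's page hits using the 30-minute rule. This is my own illustration, not how sitemeter or Google Analytics actually implement it; the function name, the sample timestamps, and the choice to start a new visit only when the gap is strictly longer than 30 minutes are all assumptions.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def count_visits(hits: list[datetime], timeout: timedelta = SESSION_TIMEOUT) -> int:
    """Count visits for ONE user: a new visit starts after a gap longer than timeout."""
    visits = 0
    last_hit = None
    for hit in sorted(hits):
        if last_hit is None or hit - last_hit > timeout:
            visits += 1
        last_hit = hit
    return visits


day = datetime(2007, 4, 1)
hits = [
    day.replace(hour=9, minute=0),    # first visit
    day.replace(hour=9, minute=10),   # same visit (10 minute gap)
    day.replace(hour=13, minute=0),   # new visit (gap of several hours)
    day.replace(hour=13, minute=45),  # new visit (45 minute gap)
]
print(count_visits(hits))  # 3 visits, but still only 1 unique visitor
```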

Note that “Alexa Reach” is expressed as a “relative share” (the percentage of all Internet users who visit a given site) rather than as an absolute number of users. For example, if a site has a reach of 28%, this means that if you took random samples of one million Internet users, you would on average find that 280,000 of them visit it in a day.
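
The conversion from reach to expected visitors in a sample is a one-liner. The helper name below is my own; the 28% and one million figures are the ones from the example above.

```python
def expected_visitors(reach_pct: float, sample_size: int) -> float:
    """Expected number of daily visitors in a random sample, given reach in percent."""
    return reach_pct * sample_size / 100


print(f"{expected_visitors(28, 1_000_000):,.0f}")  # 280,000
```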

One problem with this relative share approach is that the size of the “total pie” is growing over time. In particular, Alexa’s international base is growing much more rapidly than its US base. As a result, US websites’ Alexa numbers tend to increase more slowly, or even show some decline, even when internal stats indicate that the number of visitors is growing.

Here is Alexa’s user base share by country. The share of Alexa toolbar users in the US has dropped from 37% to 14% in the last three years.

Country 2007 2004
China 16.44% 18.46%
United States 14.28% 36.91%
Brazil 3.82% N/A
Japan 3.64% 3.80%
United Kingdom 3.11% 4.49%
Taiwan 2.91% 1.72%
Hong Kong 2.55% 4.59%
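
A back-of-the-envelope sketch of what this shift does to a US site's reach. The simplifying assumptions are mine: the site is visited only by US users, and its reach among US toolbar users (a hypothetical 10%) stays constant. Only the two US share figures come from the table above.

```python
US_SHARE_2004 = 0.3691  # US share of Alexa toolbar users, from the table
US_SHARE_2007 = 0.1428


def global_reach(us_reach: float, us_share: float) -> float:
    """Reach among ALL panel users for a site visited only by US users."""
    return us_reach * us_share


us_reach = 0.10  # hypothetical: 10% of US toolbar users visit the site daily

reach_2004 = global_reach(us_reach, US_SHARE_2004)
reach_2007 = global_reach(us_reach, US_SHARE_2007)
change_pct = (reach_2007 / reach_2004 - 1) * 100

# Growth in US reach needed just to keep the Alexa number flat.
breakeven_growth = US_SHARE_2004 / US_SHARE_2007

print(f"Alexa reach 2004: {reach_2004:.2%}, 2007: {reach_2007:.2%}")
print(f"change: {change_pct:.0f}%")                        # about -61%
print(f"break-even growth: {breakeven_growth:.1f}x")       # about 2.6x
```

Under these assumptions, a US-only site with a steady audience would see its Alexa reach fall by roughly 60%, and it would need to grow its US reach about 2.6 times just to look flat.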

3. No one can get stats 100% right

Many people are understandably unhappy with Alexa’s numbers. But third-party traffic stats are inherently inaccurate, particularly for smaller sites, where the number of the site’s visitors in any sample is too small to give a tight margin of error. Even for large sites like Youtube, Hitwise and ComScore report very different results.
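
To see why small sites suffer more, here is a sketch using the textbook margin of error for a sampled proportion. It assumes a simple random sample and the normal approximation, which real panels do not satisfy, so treat it as a best case. The panel size and the two site shares are hypothetical numbers I picked for illustration.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p measured on n panel users."""
    return z * math.sqrt(p * (1 - p) / n)


panel_size = 100_000  # hypothetical panel size

for label, share in [("large site", 0.20), ("small site", 0.0005)]:
    moe = margin_of_error(share, panel_size)
    print(f"{label}: share {share:.2%} +/- {moe:.3%} "
          f"(relative error {moe / share:.0%})")
```

With these numbers, the large site's share is pinned down to within about 1% of its value, while the small site's share is uncertain by more than a quarter of its value, before any panel bias is even considered.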

Any sampling-based panel has its biases too. For example, the way ComScore recruits its user panel does not give me much confidence either, although companies pay hundreds of thousands a year for its services.

Measuring Internet traffic: where are the biases?

There have been quite a few discussions on traffic measurement lately. The general consensus is that all of the services have some sort of problem. It would be an interesting exercise to see where the biases are and how we may be able to compensate for them.

ComScore and Hitwise are two leading paid services. They use two different approaches: ComScore is “Panel based” and Hitwise is “ISP based”.

1. ComScore

ComScore has over 2 million users who have installed ComScore’s data collection software on their computers (although its panel sample is 120K in the US and 500K outside the US). Their users are randomly selected. ComScore recruits them over the web by offering virus protection scanning, web acceleration or sweepstakes prizes under a number of channels (e.g., PermissionResearch, OpinionSquare and Marketscore).

ComScore’s demographic tends to be skewed toward naive Internet users, as more sophisticated users are less likely to install ComScore’s toolbar. Serious security issues have been raised with their software. If you are interested in the details of how ComScore collects user data and the security implications, I recommend reading the articles by Stanford, Cornell and Forbes.

2. Hitwise

Hitwise gets its user data from ISPs that it has partnered with. According to Hitwise, it has over 10 million US and 25 million worldwide users.

While Hitwise has a much larger and more diverse pool of sample users, its data partners are mostly small ISPs, and its data set has many more dial-up users.

In general, Hitwise’s data tends to be skewed towards home use and to underestimate broadband or work use.

3. Alexa

Alexa offers a free traffic data service and is a subsidiary of Amazon.com. Alexa collects information from over 20 million users who have installed the “Alexa Toolbar”. The Alexa toolbar is available for Internet Explorer, and an extension (Status Bar) can be used for Firefox.

The Alexa toolbar is offered as a webmaster tool, and its user panel is biased towards techies/geeks and webmasters in particular. Alexa’s numbers cannot be used to compare two sites with very different demographics.

Another problem with Alexa is that its numbers are relative shares (percentages of the total population). Because Alexa’s international base is growing much more rapidly, US websites’ Alexa numbers tend to increase more slowly than their internal stats, or even show some decline.

4. Compete and Quantcast

Compete and Quantcast are two smaller free services. Compete tries to combine toolbar panel and ISP data, whereas Quantcast requires websites to install a tracking pixel. For some websites they offer good numbers, while for others their numbers can be way off. It is still unproven that their approaches offer better results. You can read detailed discussions from Venturebeat, Matt Cutts and Traffick.

Picking winners: people vs market

Update: Marc Andreessen comes to the same conclusion based on his personal experience with 30-40 startups.

Picking winners in the venture business is always difficult. There is much debate as to whether one should bet on the people or the market. Prof Steve Kaplan of the University of Chicago, in his study titled “Back the Horse or the Jockey”, clearly comes down on the Horse side. His findings can be summarized as:

  • A bad management team does not necessarily kill a good idea, but a bad idea is rarely overcome by a good management team.
  • You can change teams much more easily than you can change businesses and still win.

I am sure that a lot of people will disagree with him. In the end, you need both the right market and the right people to win. You can find his report here.

Youtube vs Napster

Youtube and Napster have a lot of similarities. So how come Youtube succeeded whereas Napster failed?

As seen in many consumer web businesses, sometimes a small difference makes all the difference. In the case of Youtube and Napster, it may lie in the following facts:

  • Music has always been sold as products (CDs)
  • TV programmes have traditionally been a service supported by advertising

This difference is important. Since music can never be supported with advertising (a short ad before a song? no way!), the music industry had to kill Napster as it directly hit their top line.

Youtube is more of a complementary service. It has a lot of popular copyrighted TV programmes (like the Daily Show), but that does not directly hurt the TV programmes’ ad revenue much. Media companies may see Youtube as another distribution channel for generating additional ad revenue.

But the copyright owners still hold all the cards. If they decide that they do not like Youtube after all, they may still kill it. Time will tell.

Interesting HitWise data points about Google

If you read the press or listen to friends in the tech industry, you would get the impression that Google is taking over the web in every area. However, the data below from Hitwise shows a different picture.

Look at Gmail. It is an excellent product that has received rave reviews almost everywhere. But it has only 2% of the market share in email services. Yahoo Mail and Hotmail have 42% and 22% respectively.

The same goes for online maps. Mapquest and Yahoo Maps have 56% and 20% market share, whereas Google Maps and Google Earth have 7.5% and 2%.

Outside the echo chamber of the techies, people may not appreciate things like AJAX as much as we do. I tried to get my wife to move from Y!Mail to Gmail but she just would not do it (even though I opened a Gmail account for her).

What Google is clearly doing well is search. It has close to half of the market and is good at making money out of eyeballs with its PPC ads.

Hitwise Data

Profitable podcasting

There are a lot of people doing podcasting, but the question is how to make money from it. The story below may give you some clues. It is about a Chinese language podcasting service started by Ken Carroll, a Dublin native who has been running English schools in Shanghai. When he was introduced to podcasting, he immediately realized its potential to scale his business to a global audience and started podcasting Chinese lessons. A year later, the service is now one of the most widely downloaded podcasts on Yahoo Podcasts, with more than 5 million downloads.

The interesting part is that the podcasts are free but the transcripts of the podcasts are available for $9 a month. It also sells flashcards and exercises as a premium service. Most subscribers are currently from the US. The Internet operation also scales much better than his brick-and-mortar schools and is already profitable.

So, if you are providing an education service, podcasting may be the way to scale your business.

Which subjects generate more money per pageview?

Advertising is the main revenue source for many websites, but not all web pages are created equal. How much money can one generate per pageview? It really depends on the content of the web pages and on the viewers. In general, the income per thousand pageviews (CPM) can range from $1 to $50 for most websites. Search, travel and local search seem to generate high CPMs.

Here is a list I compiled based on a recent article in Business 2.0 on making money with blogs and on publicly available information (some numbers were mentioned in interviews and blogs and may not be accurate). For Google, only revenue from its own websites is included (60% of its total revenue).

Name Monthly Pageviews Monthly Revenue eCPM
Google 10B $333M $33
[name missing] 27B $25M $1
[name missing] 500M $500K $1
Dogster/Catster 15M $100K $7
[name missing] 23M $83K $4
Gawker/Gizmodo 66M $250K $4
Techcrunch 2M $60K $30
Rocketboom 9M $340K $43
[name missing] 5M $83K $16
[name missing] 40M $800K $20
CNET 2.8B $30M $11
[name missing] 200M $8.3M $42
[name missing] 100M $4.2M $42
[name missing] 110M $5M $45

eCPM = effective income per thousand pageviews
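
If you want to add your own site to the list, the calculation is simple. Here is a small sketch; the function name is my own, and the sample rows are a few of the figures from the table above.

```python
def ecpm(monthly_revenue: float, monthly_pageviews: float) -> float:
    """Effective income per thousand pageviews, in the same currency as revenue."""
    return monthly_revenue / monthly_pageviews * 1000


# (monthly pageviews, monthly revenue in USD), taken from the table above
sites = {
    "Google": (10e9, 333e6),
    "Gawker/Gizmodo": (66e6, 250e3),
    "Techcrunch": (2e6, 60e3),
    "CNET": (2.8e9, 30e6),
}

for name, (pageviews, revenue) in sites.items():
    print(f"{name}: ${ecpm(revenue, pageviews):.0f}")
```

Rounded to whole dollars, this gives $33, $4, $30 and $11 for these four rows.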

Let me know if you have access to information on pageviews and revenue of other websites.


