Testing, 1, 2, 3

Do you remember the story of New Coke?

It’s pretty interesting. In 1985, Pepsi had been beating Coca Cola in lots of taste tests. In addition, Coke’s market share had been slipping over the previous 15 years. After doing a lot of research and internal testing (on 200,000 consumers), Coke decided the problem was that the drink was not sweet enough.

So Coke changed the formula to what we now know as "New Coke". There was a huge consumer reaction. People freaked out. And 79 days later, Coca Cola was forced to bring back the original flavor.

Malcolm Gladwell wrote about this in his book "Blink". After talking with a bunch of market researchers, he found the real reason New Coke failed: Coca Cola did their testing incorrectly.

It turns out that Coke was testing based on the "sip test", which meant you would just take a few sips of different colas in a blind test and see which one you liked best. But is that how you drink a Coke? By taking two sips and then stopping?

Of course not! You drink the entire can!

I have a book by Sergio Zyman, who was the Chief Marketing Officer at Coke and oversaw the testing and the rollout of New Coke. He writes:

"In blind tests, consumers also told us that they preferred the taste of Pepsi to Coke, basically because Pepsi is much sweeter. At first try, people would get a smoother taste on a sip-by-sip basis".

At first try?? On a sip-by-sip basis?? And this is from the Chief Marketing Officer of Coca Cola??

Isn’t that hilarious?!

One of the biggest companies in the world made such a gigantic error – potentially a "bet the farm" error, all because they were testing for the wrong action. They tested against sips, instead of testing against how consumers actually consumed the drink.

Unfortunately, this kind of error is made online all the time.

When you’re designing split tests, you need to make sure you design them properly. In reports from many marketers (including some "famous" split testing companies), I have seen incorrectly designed tests that lead to incorrect conclusions.

Here are some common testing errors to keep an eye out for:

1. Not testing against enough traffic.

The maxim of 40 actions is ridiculous. 40 actions of what? 40 clicks? 40 optins? 40 $1.95 trial signups? 40 $1,000 sales?

The higher the effort required for the user to take the action, the fewer actions are required for an accurate test. If you’re testing against clicks, 40 clicks is virtually meaningless. If you’re testing against something costing $1,000, that’s a large dollar amount, and 40 actions is probably a reliable indicator. The amount of traffic you need for each test depends on what you’re actually testing.

Also, keep an eye on the variation during the test. If a page gets in front and stays there for a while, it’s probably going to win. If the variations switch places frequently, you need to let the test run longer.

The volatility of your traffic can make a big difference as well. Google PPC traffic is far more stable than traffic driven by 1,000 different affiliates sending banner, newsletter, PPC and other types of traffic.
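To make the "enough traffic" point concrete, here’s a minimal sketch of a standard two-proportion z-test using only Python’s standard library. All the traffic and conversion numbers are made up for illustration; they just show how the same relative lift can be meaningless at low volume and convincing at higher volume:

```python
import math

def z_test_two_proportions(conv_a, n_a, conv_b, n_b):
    """Return the z-score for the difference in conversion rates
    between variant A (conv_a of n_a) and variant B (conv_b of n_b)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no real difference)
    p = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# A 20% relative lift on clicks at low volume: |z| is well under 1.96
print(round(z_test_two_proportions(40, 1000, 48, 1000), 2))    # 0.87
# The same relative lift with 10x the traffic clears the bar
print(round(z_test_two_proportions(400, 10000, 480, 10000), 2))  # 2.76
```

A |z| above roughly 1.96 corresponds to the conventional 95% confidence level; with identical lift, only the larger sample gets there.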

2. Testing optin pages against optins

If you’re testing an optin page where you’re asking for someone’s email address, you want to pick the page that gets the most optins possible, right?

WRONG!

For every aspect of your site (creative, landing page, order page), you must test against sales. So you should always pick the optin page that delivers the most sales. The SECOND number you look at is optins, for pages where the sales volume is equal. Many split testers won’t track two variables properly, but you still need to find a way to track both.

If you select an optin page which delivers high numbers of optins, but low sales, you can really damage your website’s performance. How do I know? I’ve done it on one of my own sites. (Yeah, I fixed it.)
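The "sales first, optins second" rule above can be sketched in a few lines of Python. The page names and numbers here are hypothetical:

```python
# Hypothetical per-variant results -- invented for illustration
variants = {
    "page_a": {"optins": 500, "sales": 20},
    "page_b": {"optins": 350, "sales": 32},
}

# Rank by sales first; optins only break ties between equal sales
winner = max(variants, key=lambda v: (variants[v]["sales"], variants[v]["optins"]))
print(winner)  # page_b wins despite getting fewer optins
```

Sorting on the tuple (sales, optins) is what keeps the high-optin, low-sales page from being declared the winner.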

3. Testing unimportant variables

I’ve never seen Taguchi work in a split test; however, one thing I found very useful from Taguchi is called "Design of Experiments". DOE is basically a framework for designing tests. It’s actually quite simple once you understand it, and I would encourage anyone who is running split tests to learn it.

The most important thing DOE teaches is to make sure that you’re testing against the page elements which have the greatest impact. Don’t waste time testing colors and font sizes. Test your offer, price, and headlines. Then look at any other elements on your site that make a big difference like upsells and downsells.
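As a rough illustration of the DOE idea (a plain full-factorial enumeration, not Taguchi’s orthogonal arrays), here’s a short Python sketch; the factors and levels are invented for the example:

```python
from itertools import product

# Hypothetical high-impact factors and levels, per the advice above
offers = ["free trial", "money-back guarantee"]
prices = ["$49", "$97"]
headlines = ["benefit-led", "curiosity-led"]

# A full-factorial design tests every combination of levels
cells = list(product(offers, prices, headlines))
print(len(cells))  # 2 x 2 x 2 = 8 test cells to split traffic across
```

Testing the big levers this way multiplies cells quickly, which is exactly why DOE pushes you to limit the experiment to the few elements with the greatest impact.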

And most of all, don’t run tests like Coca Cola did in 1985! Make sure that whatever you’re testing is as close to the real environment as possible.

Get more tips from Adrian at adrianstips.com
Adrian is interested in meeting readers at Adtech San Francisco. Drop him a line if you will be there.