<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:coop="http://www.google.com/coop/namespace"
		>
<channel>
	<title>Comments on: The bad science of A/B and multivariate testing for e-commerce</title>
	<atom:link href="http://marketplanb.com/blog/index.php/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/feed/" rel="self" type="application/rss+xml" />
	<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/</link>
	<description>Strategies and tactics for online marketing and ecommerce</description>
	<lastBuildDate>Wed, 26 Mar 2008 19:37:21 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Multivariate Testing - Overdoing It? &#124; Data SystemsPlus</title>
		<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/comment-page-1/#comment-1123</link>
		<dc:creator>Multivariate Testing - Overdoing It? &#124; Data SystemsPlus</dc:creator>
		<pubDate>Mon, 28 Jan 2008 19:32:47 +0000</pubDate>
		<guid isPermaLink="false">http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/#comment-1123</guid>
		<description>[...] Why? The first is simple math. Tom Lindmeier explains it better than I could - but the bottom line is that without big, stable traffic numbers and a steady, measured conversion rate, your multivariate testing results are not much better than choosing a &#8216;winner&#8217; sales page at random. Any freshman statistics student learns that for a statistical observation to be reliable, it needs to be derived from a large enough sample. If you are launching a new product or site with zero traffic to start, it is a mistake to make copywriting decisions based on statistics from just a few hundred visitors and a handful of sales. Keep in mind also, the more variables you test, the more observations you will need for a valid test. [...]</description>
		<content:encoded><![CDATA[<p>[...] Why? The first is simple math. Tom Lindmeier explains it better than I could &#8211; but the bottom line is that without big, stable traffic numbers and a steady, measured conversion rate, your multivariate testing results are not much better than choosing a &#8216;winner&#8217; sales page at random. Any freshman statistics student learns that for a statistical observation to be reliable, it needs to be derived from a large enough sample. If you are launching a new product or site with zero traffic to start, it is a mistake to make copywriting decisions based on statistics from just a few hundred visitors and a handful of sales. Keep in mind also, the more variables you test, the more observations you will need for a valid test. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Lindmeier</title>
		<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/comment-page-1/#comment-779</link>
		<dc:creator>Tom Lindmeier</dc:creator>
		<pubDate>Sat, 22 Dec 2007 15:23:42 +0000</pubDate>
		<guid isPermaLink="false">http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/#comment-779</guid>
		<description>Billy,
I agree with you for the most part. I recognize the need to be careful about throwing numbers around unless we state the exact nature of the test. I&#039;m not saying that the Google Optimizer is lacking. (The reference to 1M page views is on page 14 of the &lt;a title=&quot;Demo&quot; href=&quot;http://services.google.com/websiteoptimizer/&quot;&gt;Overview Demo&lt;/a&gt;.) But I am saying that small and medium-sized businesses need to moderate their expectations of how many tests they can run when they have moderate traffic. They are faced with running one test at a time over longer periods.
Here&#039;s an example: Let&#039;s say you&#039;re running a home page offer test and a shopping cart test, and both are multivariate with 4 treatments each. Even if you assign the lion&#039;s share of page views to the control group on the home page test, the traffic left for your cart test is diluted to the point that you cannot run the cart test, and possibly not even the home page test unless it is A/B.
Overall conversion rates for most e-commerce businesses fall into the 2% to 4% range and that is how I came up with the 20-25M sample size.</description>
		<content:encoded><![CDATA[<p>Billy,<br />
I agree with you for the most part. I recognize the need to be careful about throwing numbers around unless we state the exact nature of the test. I&#8217;m not saying that the Google Optimizer is lacking. (The reference to 1M page views is on page 14 of the <a title="Demo" href="http://services.google.com/websiteoptimizer/">Overview Demo</a>.) But I am saying that small and medium-sized businesses need to moderate their expectations of how many tests they can run when they have moderate traffic. They are faced with running one test at a time over longer periods.<br />
Here&#8217;s an example: Let&#8217;s say you&#8217;re running a home page offer test and a shopping cart test, and both are multivariate with 4 treatments each. Even if you assign the lion&#8217;s share of page views to the control group on the home page test, the traffic left for your cart test is diluted to the point that you cannot run the cart test, and possibly not even the home page test unless it is A/B.<br />
Overall conversion rates for most e-commerce businesses fall into the 2% to 4% range and that is how I came up with the 20-25M sample size.</p>
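<p>To put rough numbers on the dilution described above, here is a minimal sketch. It assumes the two tests are fully crossed, that traffic is split evenly across combinations (it ignores any extra weighting toward the control group), and that every visitor reaches both pages. The function names and the figure of 10,000 weekly visitors are hypothetical; only the 4 treatments per test and the 2% to 4% conversion range come from the comment.</p>

```python
import math


def crossed_cells(treatments_per_test):
    """Number of distinct treatment combinations when tests run concurrently."""
    return math.prod(treatments_per_test)


def weekly_orders_per_cell(weekly_visitors, treatments_per_test, conversion_rate):
    """Expected orders per combination per week, assuming an even traffic split."""
    visitors_per_cell = weekly_visitors / crossed_cells(treatments_per_test)
    return visitors_per_cell * conversion_rate


if __name__ == "__main__":
    weekly_visitors = 10_000  # hypothetical traffic figure, not from the comment
    for rate in (0.02, 0.04):  # conversion range cited in the comment
        alone = weekly_orders_per_cell(weekly_visitors, [4], rate)
        crossed = weekly_orders_per_cell(weekly_visitors, [4, 4], rate)
        print(f"rate={rate:.0%}: {alone:.1f} orders/cell alone, {crossed:.1f} crossed")
```

<p>Under these assumptions, crossing two 4-treatment tests leaves each combination with a quarter of the orders a single test would see, which is the kind of dilution that pushes a moderate-traffic site toward one test at a time.</p>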
]]></content:encoded>
	</item>
	<item>
		<title>By: Billy Shih</title>
		<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/comment-page-1/#comment-774</link>
		<dc:creator>Billy Shih</dc:creator>
		<pubDate>Fri, 21 Dec 2007 19:39:08 +0000</pubDate>
		<guid isPermaLink="false">http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/#comment-774</guid>
		<description>Disclaimer: I am an optimization analyst at Widemile and we specialize in multivariate and split testing.

I can see where you are coming from with your issues, but I think there are a lot of misconceptions and mistakes going on with multivariate and A/B testing since it is a very nascent industry in the online world.  I agree that sample sizes are important and that it is unrealistic for every business to be doing large multivariate tests or even split tests, but there are some problems with some of the things you suggest.

Any test that lasts 6 months to a year is probably invalid in itself because that is too long a time period.  If you sell skis, you probably get different types of traffic in the winter than in the spring, so optimizing your site over that long a period will skew your results in different directions.  The longer your test is, the more noise you introduce to your results, which at a certain point makes your results not statistically relevant.  Even for non-seasonal products, assuming that 6 months to 1 year has no significant traffic changes is probably unreasonable since most companies are making PPC and other advertising changes along with SEO over that time.

Also, 99.9% of the time the only thing multivariate testing requires is a certain amount of conversion traffic.  This is simply because conversions are much harder to get, so the sample size of page views is almost never a problem. Also, you cannot say that you accounted for conversion traffic with a range as narrow as 20-25M.  My company has tested sites with conversion rates from as low as 0.1% all the way up to 30%, which makes a huge difference in the amount of traffic required.  You also have to remember that some multivariate tests take longer than others.  A very basic one could take a site one week, while a very complicated one could take the same site 4 weeks or more.

While we rarely work with small businesses, many of our clients are medium-sized.  Not every business is a fit, but when one is, it&#039;s because of its conversion traffic.  We have not had problems optimizing their pages using split and multivariate testing, and we typically get statistically relevant results within a month or less.

Google&#039;s tool, while not perfect, does a great job at driving real results.  I&#039;m not sure where you got the 1M page views a week number from, nor the context of it, but from what I&#039;ve seen, their calculations of how long it takes to run a statistically significant test have been accurate.  Since their tool is free, and only helps to boost their AdWords revenue, Google has no incentive to give people a tool that gives them bad results.  

Also, I can&#039;t speak for other companies, but it is in our best interest to create long-term conversion lifts for our clients.  Many times I have pushed clients to run a test longer simply because we need to get solid results.  I think you should give Google Optimizer another try, if you haven&#039;t already, and see if you still think the same way.  As long as you follow their guidelines of not testing too many things relative to your traffic, it works.</description>
		<content:encoded><![CDATA[<p>Disclaimer: I am an optimization analyst at Widemile and we specialize in multivariate and split testing.</p>
<p>I can see where you are coming from with your issues, but I think there are a lot of misconceptions and mistakes going on with multivariate and A/B testing since it is a very nascent industry in the online world.  I agree that sample sizes are important and that it is unrealistic for every business to be doing large multivariate tests or even split tests, but there are some problems with some of the things you suggest.</p>
<p>Any test that lasts 6 months to a year is probably invalid in itself because that is too long a time period.  If you sell skis, you probably get different types of traffic in the winter than in the spring, so optimizing your site over that long a period will skew your results in different directions.  The longer your test is, the more noise you introduce to your results, which at a certain point makes your results not statistically relevant.  Even for non-seasonal products, assuming that 6 months to 1 year has no significant traffic changes is probably unreasonable since most companies are making PPC and other advertising changes along with SEO over that time.</p>
<p>Also, 99.9% of the time the only thing multivariate testing requires is a certain amount of conversion traffic.  This is simply because conversions are much harder to get, so the sample size of page views is almost never a problem. Also, you cannot say that you accounted for conversion traffic with a range as narrow as 20-25M.  My company has tested sites with conversion rates from as low as 0.1% all the way up to 30%, which makes a huge difference in the amount of traffic required.  You also have to remember that some multivariate tests take longer than others.  A very basic one could take a site one week, while a very complicated one could take the same site 4 weeks or more.</p>
<p>While we rarely work with small businesses, many of our clients are medium-sized.  Not every business is a fit, but when one is, it&#8217;s because of its conversion traffic.  We have not had problems optimizing their pages using split and multivariate testing, and we typically get statistically relevant results within a month or less.</p>
<p>Google&#8217;s tool, while not perfect, does a great job at driving real results.  I&#8217;m not sure where you got the 1M page views a week number from, nor the context of it, but from what I&#8217;ve seen, their calculations of how long it takes to run a statistically significant test have been accurate.  Since their tool is free, and only helps to boost their AdWords revenue, Google has no incentive to give people a tool that gives them bad results.  </p>
<p>Also, I can&#8217;t speak for other companies, but it is in our best interest to create long-term conversion lifts for our clients.  Many times I have pushed clients to run a test longer simply because we need to get solid results.  I think you should give Google Optimizer another try, if you haven&#8217;t already, and see if you still think the same way.  As long as you follow their guidelines of not testing too many things relative to your traffic, it works.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Lindmeier</title>
		<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/comment-page-1/#comment-763</link>
		<dc:creator>Tom Lindmeier</dc:creator>
		<pubDate>Thu, 20 Dec 2007 19:51:58 +0000</pubDate>
		<guid isPermaLink="false">http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/#comment-763</guid>
		<description>Craig, Thanks for the visit.
I assume that you&#039;re referring to site optimization relating to site design, cart functionality, etc., not landing page testing.
Actually, I&#039;m saying that tests could take up to six months or a year.  The 20-25M number takes conversion into account so you are not dealing with an inordinately small number of orders. The lower your expected conversion rate, the higher your test count should be. Visits rather than page views would be your criterion for test sample size (depending on your test). Page views or click-throughs would be the basis for landing page tests.
Yes, an A/B test is better but may require a second or third round of testing so it could take more than a year to test one hypothesis. Nobody wants to hear that because we all want to move swiftly when out-flanking the competition.</description>
		<content:encoded><![CDATA[<p>Craig, Thanks for the visit.<br />
I assume that you&#8217;re referring to site optimization relating to site design, cart functionality, etc., not landing page testing.<br />
Actually, I&#8217;m saying that tests could take up to six months or a year.  The 20-25M number takes conversion into account so you are not dealing with an inordinately small number of orders. The lower your expected conversion rate, the higher your test count should be. Visits rather than page views would be your criterion for test sample size (depending on your test). Page views or click-throughs would be the basis for landing page tests.<br />
Yes, an A/B test is better but may require a second or third round of testing so it could take more than a year to test one hypothesis. Nobody wants to hear that because we all want to move swiftly when out-flanking the competition.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Craig</title>
		<link>http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/comment-page-1/#comment-762</link>
		<dc:creator>Craig</dc:creator>
		<pubDate>Thu, 20 Dec 2007 16:01:08 +0000</pubDate>
		<guid isPermaLink="false">http://marketplanb.com/blog/2007/12/19/the-bad-science-of-ab-and-multivariate-testing-for-e-commerce/#comment-762</guid>
		<description>I think you&#039;re assuming tests only take a week.  Most experts recommend tests over a month.  Also, doesn&#039;t significance depend on the conversion rate, relative difference in treatments, and number of treatments as much as it does on traffic?  Lastly, if you don&#039;t have a lot of traffic, isn&#039;t a simple A/B test still fine?</description>
		<content:encoded><![CDATA[<p>I think you&#8217;re assuming tests only take a week.  Most experts recommend tests over a month.  Also, doesn&#8217;t significance depend on the conversion rate, relative difference in treatments, and number of treatments as much as it does on traffic?  Lastly, if you don&#8217;t have a lot of traffic, isn&#8217;t a simple A/B test still fine?</p>
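<p>The dependence on conversion rate, relative difference, and number of treatments can be sketched with the textbook normal-approximation sample size for comparing two proportions. This is an illustration only, not the method of any tool mentioned in this thread. The function names, the 5% significance level, the 80% power, and the 20% relative lift are all assumptions, and no multiple-comparison correction is applied when several treatments are tested.</p>

```python
import math
from statistics import NormalDist


def visitors_per_treatment(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per treatment to detect the given lift.

    Uses the two-sided, two-proportion normal approximation:
    n = (z_alpha + z_beta)^2 * (p1*(1-p1) + p2*(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def total_visitors(baseline_rate, relative_lift, treatments, **kwargs):
    """Total visitors when every treatment gets an equal share (a simplification)."""
    return treatments * visitors_per_treatment(baseline_rate, relative_lift, **kwargs)


if __name__ == "__main__":
    # Baseline rates are illustrative; the 20% lift is an assumed target.
    for rate in (0.001, 0.03, 0.30):
        per_arm = visitors_per_treatment(rate, 0.20)
        print(f"baseline={rate:.1%}: {per_arm:,} per treatment, "
              f"{total_visitors(rate, 0.20, treatments=4):,} for 4 treatments")
```

<p>Under these assumptions, a lower baseline conversion rate or a smaller expected lift pushes the per-treatment requirement up sharply, and the total grows with the number of treatments, so traffic alone does not settle whether a test is feasible.</p>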
]]></content:encoded>
	</item>
</channel>
</rss>

