Is A/B Testing Completely Useless?

Not entirely. 

A/B and multivariate testing can be truly useful in the right situations, when the tests are defined by desired outcomes and the proper inputs. They can also be incredibly harmful when you place too much emphasis on the results or read into the wrong data sets.

I’d like to share my experiences with A/B testing, some proven results, and my general opinion on the matter. I’d also love to hear about your experiences: any successes or failures you’ve had with similar initiatives, or whether you think I’m just blowing hot air. Please comment and let me know what you think.

When A/B testing is useful

A/B testing is useful on static items where you can clearly define an acquisition formula.

Say you have two calls to action that you want to test, or you want to find out whether an orange button performs better than a green one, and you’re marketing both to similar sets of acquisition targets. A/B (and multivariate) testing is fickle by nature. True insights need to be gleaned from tests in their most elemental form, ideally in a vacuum.

By vacuum, I mean a controlled environment. Optimization is based on a unique formula that starts at the acquisition level. In a perfect situation, the RIGHT targets are driven to the best possible landing page, with the best possible calls to action and the best possible conversion mechanisms (sign-up buttons, forms, etc.). Each point in this interaction is a term in the formula, and when one of them changes (your calls to action, your forms, your targeting, etc.), your formula has changed. 1+1=2 in every ecosystem, not just in physics, and business and marketing are no different. If you want to achieve a desired outcome (or result, in mathematical terms), you must adjust your formula accordingly.

Therefore, keep your formula simple if you want to understand it. If you’re a marketer, you’re not a data scientist, and vice versa.

Do you want to find out if one CTA or form/button performs better than another with similar audience sets? Then A/B testing is for you.
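
For readers who want to see what such a simple comparison can look like in practice, here is a minimal sketch in Python. It uses a two-proportion z-test, which is one common way to compare two conversion rates; this post does not prescribe a particular method, so treat the choice of test as an assumption. The function name and the visitor and sign-up counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates.

    conv_a, conv_b: number of conversions for variants A and B
    n_a, n_b:       number of visitors shown variants A and B
    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: orange button vs. green button, similar audiences
z, p = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=150, n_b=2000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The sketch only makes sense under the condition described above: one change (the button), one audience type, one outcome. Once more inputs change at the same time, a single number like this no longer tells you which change caused the difference.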

When A/B testing is harmful

A/B testing becomes harmful when you have too much data. When a data set is too large, it rapidly becomes unmanageable for the human brain to make sense of. While systems and analytical databases have advanced to the point where we can see “any view, into any segment,” our brains have not.

Because our brains cannot digest complicated formulae as easily as computers can, A/B testing becomes very harmful when you slice your data at an extremely fine level. There is no way to define an uptake in user engagement (for the purpose of scaling) with one formulaic output that does not apply to ALL of your targets. Especially if you are not a robot.

At the elemental level, if you are testing items with different audiences, additional CTAs, new buttons, and new forms at the same time, your formula is already broken. How can you comprehend a calculation of true uptake from ALL of these inputs?
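
To put a rough number on how quickly this gets out of hand, here is a small Python sketch that counts the combinations you would have to reason about. The audience, CTA, button, and form values are hypothetical placeholders, not taken from any real test.

```python
from itertools import product

# Hypothetical inputs, for illustration only
audiences = ["new visitors", "returning visitors", "newsletter subscribers"]
ctas = ["Sign up", "Start free trial", "Get a demo"]
buttons = ["orange", "green"]
forms = ["short form", "long form"]


def count_variants(*dimensions):
    """Count every combination of the given inputs."""
    return len(list(product(*dimensions)))


simple_test = count_variants(buttons)
everything_at_once = count_variants(audiences, ctas, buttons, forms)

print(simple_test)         # 2 combinations: orange vs. green
print(everything_at_once)  # 36 combinations: 3 * 3 * 2 * 2
```

With a single button test there are two outcomes to compare. Change four inputs at once and, in this example, there are 36 distinct "formulas" in play, each seen by only a fraction of your traffic.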

I read an excellent article from the folks at VERO yesterday. It illustrates how human interaction is the key opportunity to recognize data deficiencies, not the other way around. If you give this idea some thought, you’ll realize that it’s true.

When you take human interaction out of a system, you’re removing key opportunities to see what really happens along the way. You miss stories, experiences, and struggles – and that’s often where the real insights are hiding.


To VERO’s point, and in my professional experience, it’s likely that your organization’s CSR is the one pointing out product deficiencies, and doing so based on conversations with clients and partners, not on minute optimizations in on-page strategies.

Additionally, it’s philosophically hard to comprehend why it would be beneficial to put “many feet forward.” I was always taught to put my “best foot first” and optimize from those results (in every aspect of my life).

A/B and multivariate testing are not useless. But it’s imperative to set yourself up with the right formulas for success, or the outcome can be meaningless and non-actionable. To define your best “formulae,” there are three things you should always ask yourself: What are the goals I am trying to achieve? How can I keep these tests organized? And how does this translate to the actual consumer experience?