If you came here from the related talk, thanks for watching! If not, thanks all the same, and you’ll probably be able to use this as a starting point to do more reading about whatever part of AB testing piques your curiosity.
There was a lot of info I wasn’t able to squeeze into my talk for the 2019 AB Testing Summit, including proper references and sourcing. I’ve supplied that info here, along with a few comments I hope will be useful to those looking to do further reading about Bayesian AB Testing in Marketing.
Also, if you have a question that won’t fit on Twitter, please leave a comment and we can have a more detailed discussion that will be easier for people with similar questions to find in the future.
You can find the slides here.
Introduction to Bayesian Methods in Marketing
As a first look at Bayes’ Theorem, you could do a lot worse than An Intuitive (and Short) Explanation of Bayes’ Theorem.
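As a quick illustration of the theorem itself (the traffic and conversion numbers below are invented for the example), here’s how the posterior update looks in plain Python:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical marketing example: a visitor segment "A" makes up 10% of
# traffic and converts at 20%; everyone else converts at 5%. Given that a
# conversion happened, how likely is it the visitor came from segment A?

p_a = 0.10                 # prior: P(A)
p_conv_given_a = 0.20      # likelihood: P(conversion | A)
p_conv_given_not_a = 0.05  # likelihood: P(conversion | not A)

# total probability of a conversion: P(B)
p_conv = p_conv_given_a * p_a + p_conv_given_not_a * (1 - p_a)

# posterior: P(A | conversion)
p_a_given_conv = p_conv_given_a * p_a / p_conv
print(round(p_a_given_conv, 3))  # → 0.308
```

The segment triples in importance once we condition on the conversion, which is the whole trick: the prior gets reweighted by the evidence.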
We almost immediately get into considering where Bayesian priors come from and whether the processes that produce them are valid, which is hotly debated. That means we need to consult several sources to form an informed perspective.
If you don’t read any other source I provide, 5 Reasons to Go Bayesian in AB Testing – Debunked is probably the fastest way to see a well-argued perspective that diverges from the generally accepted hype. If you want to learn more, the author, Georgi Georgiev, is knowledgeable and prolific. There are several other great articles and whitepapers that explain how Bayesian methods may not always live up to the accompanying spin, and they were immensely useful in building my understanding and preparing this talk. We’ll mention more of his work later, but the blog post above and the two white papers below offer a good overview of the contention.
- Issues with Current Bayesian Approaches to A/B Testing in Conversion Rate Optimization
- The Google Optimize Statistical Engine and Approach
We dismissed the overwhelming majority of discussion about Bayesian methods in AB testing because it came from sources with a clear conflict of interest, an almost complete lack of transparency, and, in some cases, total incoherence. The pro-Bayesian materials we did find useful came from refereed journals or technical whitepapers with a high degree of transparency, for which the vendors should be commended.
- Bayesian Statistics and Marketing – Published back in 2003, this lays a lot of the groundwork about how Bayesian methods work and why they are useful to marketers, even though it predates the prevalence of AB testing that online businesses enjoy today.
- Bayesian A/B Testing at VWO
- The New Stats Engine (at Optimizely)
- If you know where I can get my hands on a Google Optimize white paper let me know.
We also looked at a modern non-Bayesian CRO tool that claims to deliver advantages similar to those claimed by Bayesian tools. We wanted to establish whether the claimed advantages of Bayesian tools are only possible with Bayesian tools, or whether non-Bayesian methods that are appropriately advanced and customized for CRO can deliver them as well. It may not be a huge surprise that this method was created by the aforementioned Georgi Georgiev.
Introducing Bayes Visually With Python
Introduction to Bayesian Inference is a great first look at how doing your own Bayesian analysis could work, and even if that isn’t your goal, it provides a very helpful visual interpretation of how Bayes works in marketing.
The 3D graphs were based, like most of the code holding the internet together, on a Stack Overflow answer.
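To give a flavour of the kind of analysis that tutorial walks through, here is a minimal, standard-library-only sketch (the visitor and conversion counts are invented, and a uniform Beta(1, 1) prior is assumed): the standard Beta-Binomial conjugate update, followed by a Monte Carlo estimate of the probability that variant B beats variant A.

```python
import random

random.seed(42)

# Conjugate Beta-Binomial update: start from Beta(1, 1) (uniform prior)
# and add observed successes and failures to the two shape parameters.
def posterior(conversions, visitors, alpha=1, beta=1):
    return alpha + conversions, beta + (visitors - conversions)

a_alpha, a_beta = posterior(conversions=120, visitors=1000)  # variant A
b_alpha, b_beta = posterior(conversions=140, visitors=1000)  # variant B

# Monte Carlo estimate of P(B > A): draw from both posteriors and count
# how often B's sampled conversion rate exceeds A's.
samples = 100_000
wins = sum(
    random.betavariate(b_alpha, b_beta) > random.betavariate(a_alpha, a_beta)
    for _ in range(samples)
)
print(f"P(B beats A) ≈ {wins / samples:.2f}")
```

This “probability to beat” number is exactly the headline metric the Bayesian vendor dashboards report, which is part of why the approach is so easy to sell.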
Ease of Explanation
Sorry, but I’m going to include a bit of a rant that I didn’t have time for in the talk. Skip to the TLDR if you’d like.
The whole “frequentist methods are ridiculous because [some variety of absurd claims about how frequentists aren’t allowed to discuss results or use their brains except under the strictest possible interpretation of the math they use]” line of attack, such as the one included by Google in their Optimize documentation, is a textbook example of a fallacious “straw man” argument.
Simply put, you make up a version of your enemy that doesn’t reflect reality but conveniently supports your argument, then argue against that instead of what the actual enemy says. I say enemy rather than interlocutor or opponent because I consider this kind of straw man accusation defamatory. The harm in this case is done not just to the victim who is falsely accused of whatever failings the fictional straw man has, but also to the public trying to use these tools and understand the documentation.
This argument has been raised and conclusively shut down repeatedly, perhaps most devastatingly by A “Bayesian Bear” rejoinder practically writes itself… where a mathematician gives a hilarious example of what happens if you make a Bayesian straw man to compare to the Frequentist one made by unscrupulous Bayesians.
TLDR: If you hold mathematicians to the strictest possible requirements of their assumptions and definitions, the only people who will understand them are other mathematicians, and even they will think you are a jerk who is just wasting everyone’s time. For Bayesians to pretend frequentist methods are the only part of statistics facing this issue reduces the credence we can give to other claims made by this particular subset of Bayesian proponents.
Optional Stopping and Bayesian Suitability for CRO
There’s a lot of material out there about optional stopping, but here are some excellent reads on the topic that go deeper than your average blog post without turning into heavy math and coding.
- Bayesian AB Testing is Not Immune to Optional Stopping Issues
- How Not To Run an A/B Test
- Is Bayesian A/B Testing Immune to Peeking? Not Exactly
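To make the peeking problem those articles describe concrete, here is a small A/A simulation using only the standard library (all parameters are chosen purely for illustration). Both arms share the same true conversion rate, so any declared winner is a false positive; checking a naive z-test at every interim look inflates the error rate well past the nominal 5%:

```python
import math
import random

random.seed(0)

p = 0.10          # true conversion rate for BOTH arms (A/A test)
n = 2000          # visitors per arm per trial
peek_every = 200  # an eager analyst checks after every 200 visitors
trials = 1000

# Two-proportion z-statistic with a pooled standard error.
def z_stat(ca, cb, n_seen):
    pooled = (ca + cb) / (2 * n_seen)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_seen)
    return (cb / n_seen - ca / n_seen) / se if se else 0.0

fixed_fp = 0    # false positives when looking only once, at the end
peeking_fp = 0  # false positives when stopping at any significant peek
for _ in range(trials):
    ca = cb = 0
    stopped = False
    for i in range(1, n + 1):
        ca += random.random() < p
        cb += random.random() < p
        if not stopped and i % peek_every == 0 and abs(z_stat(ca, cb, i)) > 1.96:
            stopped = True  # a peeker would declare a winner here
    peeking_fp += stopped
    fixed_fp += abs(z_stat(ca, cb, n)) > 1.96

print(f"fixed-horizon false positive rate: {fixed_fp / trials:.3f}")
print(f"peeking false positive rate:       {peeking_fp / trials:.3f}")
```

The fixed-horizon rate lands near the nominal 5%, while the peeking rate is several times higher. Whether a given Bayesian engine escapes this effect, and at what cost, is precisely what the three articles above dig into.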
Faster CRO Results
Information about specific products comes from either the aforementioned whitepapers or website documentation accessed during May 2019.
Recommended Non-Bayesian Products
Optimizely, VWO, and Google Optimize, even once the hype is stripped away, all make compelling cases to any org without the data science resources to do its own modelling. There are some non-Bayesian alternatives, however, that let users compare Bayesian tool results against traditional or more advanced frequentist methods, for free or for very little investment.
Google Analytics users can take advantage of Stéphane Hamel’s DaVinci Tools. This Chrome extension, even in its free version, makes a million and one quality-of-life improvements to the Analytics UI, as well as adding some useful features. One of these is the ability to run a t-test with a few clicks on any Google Analytics report. If you also use Optimize, your experiment data is already in GA in a format that makes this very easy.
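If you’d like to sanity-check that kind of report-level significance test by hand, the arithmetic is short enough to do in plain Python. This is a generic two-proportion z-test with invented numbers, not the extension’s exact method:

```python
import math

# Generic two-proportion z-test, the kind of comparison you'd run on two
# variants' sessions and conversions pulled from a GA report.
def two_proportion_test(conv_a, n_a, conv_b, n_b):
    pa, pb = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (pb - pa) / se
    # two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Invented example: 12.0% vs 16.0% conversion on 1,000 sessions each.
z, p = two_proportion_test(conv_a=120, n_a=1000, conv_b=160, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

If a tool’s reported numbers are wildly different from a back-of-the-envelope check like this, that’s a good prompt to read its whitepaper and find out what it’s actually computing.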
For a more sophisticated approach, have a look at the Agile A/B Testing Calculator included with Georgi Georgiev’s Analytics Toolkit. It’s very inexpensive, and Google users can connect it to Analytics/Optimize easily.
Sorry for the Google-centric view, but it’s what I spend most of my time on. If there are other tools out there that you feel provide a good way to double-check what the Bayesian “Big Three” are recommending, I’d love to hear about them.
Please let me know in the comments or on Twitter, or reach out with any other questions or comments!