The standard A/B test rests on a fundamentally flawed assumption: that a single version of a treatment is best for every user, so whichever version wins on average should be shipped to everyone. Analytics practitioners should reject this assumption of homogeneity and start designing systems that allow for (and encourage) non-binary test outcomes.
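To make the homogeneity problem concrete, here is a minimal sketch in Python. The segment names, arm labels, and outcome values are all invented for illustration and do not come from any real experiment. It compares the pooled difference in means, which is what a binary "A or B" decision looks at, with the same difference computed within each segment.

```python
from statistics import mean

# Hypothetical per-user outcomes (say, revenue per visit), keyed by
# (segment, arm). Every number here is made up for illustration.
outcomes = {
    ("new", "A"): [0.5, 1.0, 1.0, 1.5],
    ("new", "B"): [1.5, 2.0, 2.0, 2.5],
    ("returning", "A"): [2.5, 3.0, 3.0, 3.5],
    ("returning", "B"): [2.0, 2.5, 2.5, 3.0],
}


def average_effect(data):
    """Pooled difference in means (B minus A), ignoring segments."""
    a = [x for (_seg, arm), xs in data.items() if arm == "A" for x in xs]
    b = [x for (_seg, arm), xs in data.items() if arm == "B" for x in xs]
    return mean(b) - mean(a)


def segment_effects(data):
    """Difference in means (B minus A) computed separately per segment."""
    segments = sorted({seg for seg, _arm in data})
    return {
        seg: mean(data[(seg, "B")]) - mean(data[(seg, "A")])
        for seg in segments
    }


if __name__ == "__main__":
    print("pooled effect:", average_effect(outcomes))
    for seg, effect in segment_effects(outcomes).items():
        print(f"{seg} effect:", effect)
```

In this toy data, B wins when all users are pooled, yet it is worse for the returning segment. A binary "ship B" decision would hide that, which is the kind of outcome the argument above says a testing system should be able to surface.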
In the past few weeks, two really interesting articles about non-standard interpretations of A/B tests have been published. One, from Uber’s engineering blog, is about calculating “quantile treatment effects” and
…