There is at least one caveat with multi-armed bandit testing: it assumes that the site/app remains constant over the entire experiment. That is often not the case, or even feasible, especially for websites with large teams deploying constantly.
When your site is constantly changing in other ways, dynamically changing odds can skew the results: you could be serving more of A than B while some other change that affects conversions is live, so A's numbers absorb more of that change's effect than B's do. You have to normalize for that somehow. A/B testing doesn't have this issue because the odds are constant over time.
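To make the skew concrete, here is a minimal sketch with made-up numbers. It assumes two periods of 1000 visitors each, an unrelated deploy that lifts conversion from 10% to 20% for both variants in the second period, and a bandit that has shifted the odds from 50/50 to 90/10 in favor of A by then. A and B are identical by construction, and the code uses expected counts rather than random draws so the effect is not noise. The function names and the per-period weighting used to normalize are illustrative assumptions, not a standard recipe.

```python
# Hypothetical scenario: A and B convert identically, but an unrelated
# deploy raises the baseline for everyone in period 2.
PERIODS = [
    {"visitors": 1000, "rate": {"A": 0.10, "B": 0.10}},
    {"visitors": 1000, "rate": {"A": 0.20, "B": 0.20}},
]


def expected_counts(periods, share_a_by_period):
    """Expected (visitors, conversions) per arm for each period."""
    rows = []
    for period, share_a in zip(periods, share_a_by_period):
        row = {}
        for arm, share in (("A", share_a), ("B", 1.0 - share_a)):
            n = period["visitors"] * share
            row[arm] = (n, n * period["rate"][arm])
        rows.append(row)
    return rows


def pooled_rate(rows, arm):
    """Naive estimate: total conversions over total visitors for one arm."""
    visitors = sum(row[arm][0] for row in rows)
    conversions = sum(row[arm][1] for row in rows)
    return conversions / visitors


def stratified_rate(rows, arm):
    """Normalized estimate: each period's observed rate for the arm,
    weighted by that period's total traffic (both arms combined)."""
    total = sum(row["A"][0] + row["B"][0] for row in rows)
    weighted = sum(
        (row[arm][1] / row[arm][0]) * (row["A"][0] + row["B"][0])
        for row in rows
    )
    return weighted / total


# Bandit-style allocation: odds move toward A before the deploy lands.
bandit = expected_counts(PERIODS, [0.5, 0.9])
# Classic A/B allocation: odds stay fixed for the whole experiment.
fixed = expected_counts(PERIODS, [0.5, 0.5])

for label, rows in (("bandit", bandit), ("fixed", fixed)):
    print(
        f"{label:>6} pooled:     A={pooled_rate(rows, 'A'):.3f} "
        f"B={pooled_rate(rows, 'B'):.3f}"
    )
    print(
        f"{label:>6} stratified: A={stratified_rate(rows, 'A'):.3f} "
        f"B={stratified_rate(rows, 'B'):.3f}"
    )
```

With the shifting odds, the pooled numbers come out to roughly 16.4% for A versus 11.7% for B even though the two variants are the same, because most of A's traffic arrived after the deploy. With fixed odds, or with the per-period weighting, both land on 15%.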