Many types of A/B tests are designed to increase conversion - to get a user to buy something, sign up, etc. I have personally (and I'm sure lots of folks on this site) been involved in A/B tests that specifically test what many would consider "dark patterns" to increase conversion.
Just take a look at Booking.com, which is famous for its A/B testing. Right now I get a popup banner when I hit that site which says "Welcome back! It's always a pleasure to see you! Sign in to see deals of up to 50% off." I guarantee the text in that banner has been A/B tested 9 ways to Sunday. I'd even bet they tested the percentage amount (i.e. whether it was 50%, 30%, etc.). Of course "up to 50%" could mean 0%, which it probably is in most cases. And the whole purpose of that banner is to get you to authenticate and sign in, so they can track you better.
So yes, it most definitely will apply to certain forms of A/B testing. That also appears to be the point.
So, users would much rather get ads they aren't interested in? The point of A/B testing is to show you something you want to see and might be interested in purchasing.
> The point of A/B testing is to show you something you want to see
Not always. A/B testing is the reason that we now have weapons-grade clickbait headlines, and those terrible little grids of ads at the bottom of blog posts. Neither of those are good things.
If they kill A/B testing then probably a generation of startups that modeled themselves after Google and their diaspora won't know how to design products anymore.
Good riddance! It’s high time companies stop gravitating to the local maxima for every decision. As a user, I want thoughtfully developed experiences; not everything has to be a news feed.
A/B testing is an important, super basic step to improving the user experience. Without it, how would you know what users are looking for? It's important to test the right factors though. I can see it being done wrong and winding up detrimental to the user experience, but not doing it at all is definitely not a solution.
How essential is it though really? How did we ever manage before it became a thing?
Obviously a good number of A/B tests are pretty innocent, but if it's non-trivial to differentiate between them and https://en.wikipedia.org/wiki/Nudge_theory then I'm 100% for completely ditching A/B tests.
I think the medium, data collection, and scale matter. It's never been so effective or efficient as it is now (and will become).
Gathering data from a million people on which shade of red makes them more likely to click a button is entirely different today due to the scale, how cheap it is to set up, and how cheap it is to tweak. This data can then be used to "nudge" people towards a direction that you benefit from (and they may or may not benefit from, and society at large may or may not benefit from). At scale, these very small nudges can have an impact. The unregulated methods we use for this keep improving (AI).
Not to throw shade, but there's a reason why Amazon has been hiring behavioural psychologists. We should be aware and thinking about this.
I disagree in that I feel the higher effectiveness results in better UI.
Perhaps we simultaneously need to inform people, through better education, about how to resist the urge to spend borrowed money whenever possible?
Incentives of the publisher and the consumer aren't always aligned. The publisher might want you to spend / use the mobile app (tracking) / nudge your political leaning / confuse you with disinformation / etc. The consumer / user is simply outgunned, and it's getting more and more lopsided. Regulation is inevitable.
This isn't just about good UI. Not everyone is using these sort of behavioural tests to present a better UI. It's also about influence (micro influence). I'm not sure you're seeing the whole picture.
It sounds like a conspiracy theory, but Obama and Cameron had "Nudge Units", and that was 5 years ago.
I think it's good policy and will also be really funny, which is why I think it should become law. It'll force Silicon Valley to learn empathy overnight.
How would this even work? Say I decided to buy two billboards with different designs selling the same product, and put a different phone number at the bottom of each. Looking at my phone bill at the end of the month, I have a count of how many responses I received from each billboard.
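To make the analogy concrete: the billboard version runs on the same arithmetic a web A/B test does. Here's a rough sketch in plain Python with made-up call counts (not real data): under the null hypothesis that both designs pull equally well, the calls to the two numbers should split roughly 50/50, and you check whether the observed split is more than noise.

```python
# Rough sketch (hypothetical numbers): treat the calls logged against each
# phone number as a 50/50 split under the null hypothesis that both billboard
# designs pull equally well, then check the split with a normal approximation.
import math

calls_a = 37   # responses to the number on billboard A (made up)
calls_b = 58   # responses to the number on billboard B (made up)

n = calls_a + calls_b
# Under H0 each call is equally likely to come from either design (p = 0.5).
expected = n * 0.5
std_dev = math.sqrt(n * 0.5 * 0.5)
z = (calls_b - expected) / std_dev
# Two-sided p-value from the normal approximation to the binomial.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Design B likely pulls more calls; the difference looks real.")
else:
    print("Not enough calls to tell the designs apart yet.")
```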
> BEHAVIORAL OR PSYCHOLOGICAL EXPERIMENTS OR RESEARCH—
> The term "behavioral or psychological experiments or research" means the study, including through human experimentation, of overt or observable actions and mental phenomena inferred from behavior, including interactions between and among individuals and the activities of social groups.
Are you even serious? We're not talking about A/B testing prescription drugs with placebos. We're talking about testing different images. Different colors for buttons.
"It has published details of a vast experiment in which it manipulated information posted on 689,000 users' home pages and found it could make people feel more positive or negative through a process of "emotional contagion".
But it's really not a one-off. This has become a modern-day marketing tactic. I guarantee someone gave a presentation today to a bunch of execs about how to manipulate a percentage of your users to achieve [x] goal by lightly "nudging" them.
Saying that A/B testing is just different colors for buttons is intentionally ignoring the past 10 years of Facebook's development process. Every single aspect of the platform is A/B tested, and that platform has a big effect on people's lives.
I'm still confused about why I'm supposed to be upset that Facebook A/B tests their features on their users. It seems to me that if they're allowed to do either A or B, they're allowed to measure the influence of A vs B. I don't see where the outrage is.
You shouldn't be upset about that. You should be upset that Facebook is performing tests on its users to optimize against the interests of those same users, without letting the user know what they're doing.
As I said in another comment this is about consent.
It's not that testing the conversion rates of button A versus button B is in and of itself immoral; it's that experimenting on people without their informed consent, under any circumstances, is. I'm intimately familiar with FB's platform as a developer and a user, and it's my intuition that 9/10 people aren't aware of the degree to which they are being experimented on via multivariate testing. I think a reasonable person would say they have a right to be informed of this.
Another note is that after years of using the platform I can tell that when non-technical people DO become aware that their experience using the application is sometimes fundamentally different from others' because they're in a non-control bucket, they generally react pretty negatively to the notion. Sure, some of this is the standard "users always hate every UI change no matter what it is" syndrome, but I've noted a lot of "this is creepy and I wonder how much it's been happening before", which is, imo, a super legitimate response and shouldn't be disregarded because it's inconvenient for FB to get consent.
Consent only applies for things you wouldn't be allowed to do without consent in the first place. What if Walmart decided to have the greeters at half of their stores be rude to customers and compare sales numbers? Would that require advance consent? Clearly it wouldn't because there is no law against bad service. The fact that the click whores who call themselves journalists (who are also competitors of FB) call it "psychological experiments" to scare non-technical people is irrelevant.
"What if Walmart decided to have the greeters at half of their stores be rude to customers and compare sales numbers? Would that require advance consent? "
To me, this could definitely qualify as "psychological experiments" if it were intentional as you describe. Most likely it would be a failed and useless experiment, though, due to the medium and the difficulty of implementing it correctly (how would you guarantee none of your greeters step out of line? What if you wanted to quickly evolve and modify the experiment?).
The fact is that it's much easier to run these sorts of experiments on a web site than it is in meatspace. It can also be much subtler and far more specific. It would be impossible to manipulate the variations in the real world as efficiently (or at all) as you can online.
The ability to actually do this stuff efficiently and at scale is pretty recent, and we ought to consider and deliberate over the consequences.
Feature experiments are also a thing that exists. I want to deploy a new widget, and need to check that it works, and hasn't done something unexpected that drives users away. Experiments are how you do it.
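For concreteness, here's a toy sketch of what I mean (hypothetical names and data, not any particular platform's framework): users are bucketed deterministically by ID so each person always sees the same variant, and a guardrail metric tells you whether the new widget is quietly driving people away.

```python
# A minimal sketch of a feature experiment (hypothetical names, toy data).
# Users are bucketed deterministically by ID so each person always sees
# the same variant, and a guardrail metric (here, 7-day retention) is
# compared between buckets before the widget ships to everyone.
import hashlib

def variant_for(user_id: str) -> str:
    """Stable 50/50 assignment: hash the user ID, not random() per request."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return "new_widget" if int(digest, 16) % 2 == 0 else "control"

def retention_rate(users: dict, bucket: str) -> float:
    """Share of users in a bucket who came back within 7 days (toy data)."""
    in_bucket = [came_back for uid, came_back in users.items()
                 if variant_for(uid) == bucket]
    return sum(in_bucket) / len(in_bucket) if in_bucket else 0.0

# Hypothetical observations: user_id -> returned within 7 days?
observed = {"u1": True, "u2": False, "u3": True, "u4": True, "u5": False}
print("control:   ", retention_rate(observed, "control"))
print("new_widget:", retention_rate(observed, "new_widget"))
```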
How about these: Are corner stores allowed to experiment with pricing? Are restaurants allowed to experiment with new menus? These are experiments involving humans. Are you just asking for poorly designed experiments?
What you're asking for is companies to launch once and never know if it worked. And indeed, software used to be like that, and it sucked...
> Feature experiments are also a thing that exists. I want to deploy a new widget, and need to check that it works, and hasn't done something unexpected that drives users away. Experiments are how you do it.
Experiments and experiments on live non-consenting users are two different things.
> How about these: Are corner stores allowed to experiment with pricing? Are restaurants allowed to experiment with new menus? These are experiments involving humans. Are you just asking for poorly designed experiments?
Let a corner store charge different people different prices and let me know how far you get. They also have to deal with consequences from their experiments: if a customer sees the price of an item has doubled in an experiment, they're unlikely to come back. There's an asymmetry issue, and not coming back is often not an option you have in an environment with lock-in and network effects.
> What you're asking for is companies to launch once and never know if it worked. And indeed, software used to be like that, and it sucked...
Yes, developers had to think through design decisions, stick to well-defined HIGs, and use controlled test groups. Truly a dark age.
Explicit consent is already given for feature changes just by using the site. How does the act of gathering scientifically valid information on those features substantively change the dynamic such that extra consent is required? It doesn't seem to me that it does.
These mega websites should probably be held to a different standard than the rest. No one's life is changing when I try out different colors, but some of the stuff Facebook is testing is very unethical.
I think the standard shouldn't be size, but type of software. Facebook is a platform. People expect (reasonably or not) some element of stability in a platform. I don't want even a small platform doing tests on me and my data.
But if it's a game, or a blog? Knock yourself out, no matter how big it is.
I agree with you, but personally I don't see how it's so onerous for Blizzard or Rockstar to tell me in plain language what it intends to do with its behavior tracking (or really that it's tracking my behavior at all). For me this is about consent, and I'm willing to consent to things that I'm made aware of. I mean, I'm a software developer too, I know there are legitimate use-cases here.
The alternative is to not know at all that this is possible and accidentally design something that makes people more negative.
Knowing this is possible and how to measure the effect, lets them detect when they accidentally do it and reverse course.
Making it illegal to figure out the negative impacts of your decisions will make it harder to avoid them.
It would be much better to require disclosure when these negative impacts are detected and require that this information can only ever be used in the best interest of the user.
Please find me one real scientist who would argue it's ethical to encourage some people to commit suicide to find out how to avoid encouraging people to commit suicide.
Just because the outcome of the research is potentially valuable doesn't mean it's ethical to conduct it on people, especially without their consent.
More accurately, expose people in a controlled setting to what they're already being exposed to, to find out its impact. But this sounds pretty neutral.
I wholly disagree. Facebook only shows some posts algorithmically, and emotionally, they were probably more or less neutral. They chose to expose some people to predominantly sad and depressing posts, which is not "what they're already being exposed to", and it's without the more positive posts to balance it out.
They explicitly created a situation to depress people, which could definitely increase the likelihood of suicide, particularly if they happened to randomly select someone who already was predisposed to that for other reasons.
I would argue someone at Facebook should've been brought up on criminal charges for this "experiment".
>Facebook only shows some posts algorithmically, and emotionally, they were probably more or less neutral.
I'm assuming you mean in aggregate?
> They chose to expose some people to predominantly sad and depressing posts, which is not "what they're already being exposed to"
Can you source your "predominantly" here?
If their algorithm is operating randomly, it stands to reason that some amount of people will get a "predominantly" negative feed from time to time. So in this sense, some people were unwittingly being exposed to a predominantly negative feed. So it seems reasonable to understand the results of this.
If their experiment resulted in people seeing negativity far beyond what is a possible outcome from their algorithm, then you might have a point about it being unethical.
Criminal Negligence/Manslaughter. You don't have to break a specific law; you just have to be culpable in something that could foreseeably have a reasonable chance of someone being harmed.
Criminally negligent homicide, and manslaughter are specific laws. Conviction generally requires proof that the subject's actions were the proximate cause (a legal term with its own case law) of the victim's death.
I'm aware of only one case where someone was convicted of such a crime in the US without being physically involved in the death: Michelle Carter, who directly and repeatedly encouraged her boyfriend to kill himself, and goaded him into continuing what was ultimately a successful suicide attempt when he started to back out. Despite her active encouragement and unambiguous intent, the legal theory was controversial and the case has seen multiple appeals.
I find it quite unlikely that a court will accept the argument that intentionally making someone sad is the proximate cause of their death by suicide, even if done to a large number of people at the same time. Were that argument accepted, it could be applied to other situations affecting the emotions of many people just as easily, such as producing a sad song or movie.
Indeed. And note that these laws are generally state laws in the US, not federal. So just in the US, there's a lot of possible versions of the law/jurisdiction Facebook could face.
I didn't think of that; the jurisdiction would be where the person died, so potentially anywhere in the world with an extradition treaty could charge them.
If their feature A is OK to do in isolation, and could conceivably lead someone to commit suicide, then I don't see the problem with measuring that impact vs some feature B.
This seems to be ignoring consent. I'd bet you that if I went around asking people whether or not they realized their FB app experience was being consistently multivariate tested, I'd be on the street a while before someone said yes.
This is FAR different from product testing, say in the hardware world, where you tell people you want them to come test a product, or in the design world where you show them various things and quiz them on their feelings. In these situations they all know they're being tested on.
So no, this isn't "the least" user-hostile thing you can do. Doing things without consent is basically the prerequisite for hostility here.
> Doing things without consent is basically the prerequisite for hostility here.
Do you understand how often websites change without asking the user? Websites are constantly being updated, algorithms tweaked, features being added and taken away. You seem to be taking offense to the fact that they're providing a different experience to different subsets of the userbase? Is that what you're trying to ban? What could that possibly accomplish?
If you don't have A/B testing, then websites are just going to do it the old fashioned way: collect data, make the feature change, compare the data. What does this solve?
Look, this clearly is not in the spirit of the argument I'm making. Why would I advocate for rules that restrict a website's ability to change? Again, I'm talking about consent when it comes to how your behavior is going to be used.
Further, I would say that the parent post isn't even about this. It's about protecting the consumer, and yes, I would go so far as to say that if the "change" the websites want to make violates the rights of the user, then yeah, they should be restricted in their ability to do so!
No company in any domain rolls out products globally all at once. McDonald's introduces new menu items in test markets, TV shows start as pilots, and software is deployed gradually; as it's deployed it's usually measured and rolled back if it's not working as expected. A/B tests are hardly any different. Smart companies experiment.
Why do users need to be explicitly informed of A/B tests but not about other new gradual feature rollouts?
Frankly I think when you use a web site you are giving consent for your behavior on that site to be analyzed. I wouldn’t act indignant at traditional retailers attempting to learn from my shopping behavior in their stores so that they can improve their shopping experience. That’s just how businesses work.
> Again, I'm talking about consent when it comes to how your behavior is going to be used.
Why? To make a physical analogy, you're on their property. You're in their store, walking around perusing their wares, using their tools, so of course they have the complete right to watch you.
There is no way to legislate this, your only option is to raise a stink about it and hope that they'll be more transparent in the future. You can't "require" companies to tell you how they're using your data. Once you've consented to your data being collected, that's it.
It is very rare to see this happening with existing features that the users use and like.
And, they do have consent to change the site at any point. The opposite would be for websites to never be allowed to do any kind of update because they didn't have user consent beforehand.
"Better" for who? The website owners interests are rarely in alignment with my interests. They want increased sales, higher user engagement, etc. I want less engagement and the ability to make informed choices on products.
Let's say there's a product listing with its list of features, and this list has been extensively tweaked to maximize sales. As a result of that tweaking they took away a line item that would have caused me not to buy it, say an annoying LED status indicator. This is good for the website owner but bad for me, because I've lost the ability to make an informed decision. It's asymmetric manipulation and I'd regard it as immoral.
For these companies "improve their experience" = optimizing for maximum time spent with the product even when it's not in the best interest of the people using the product. i.e. Netflix autoplay, algorithmic newsfeed that tends to show more outrage-inducing content, and artificial notifications that aren't from people you know, but are engineered to get you back into the product unnecessarily. They just want to capture as much of your attention as possible because it increases the number of ads they can show/sell. That's not improving user experience, that's hijacking attention to maximize profit. Considering attention is the primary instrument nature has given us for crafting our lives, that's a pretty user-hostile thing to do.
I hope everyone down voting this realizes that A/B testing has an "infinite" resolution.
Most people when they hear A/B testing think of something like switching the color of a button, or what have you. But when you start A/B testing a complex series of permutations of components in aggregate, specifically designed to target certain psychological profiles, you can actually start to learn a lot about the group you're testing.
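As a toy illustration of that resolution (the component names below are entirely made up), just four components with three variants each already multiply out into 81 distinct experiences, and each combination can be targeted at a different user segment:

```python
# Toy illustration of why multivariate testing has effectively "infinite"
# resolution: a handful of components, each with a few variants, multiplies
# out into a large space of distinct experiences that can be assigned per
# user segment. Component names here are made up.
from itertools import product

components = {
    "headline_tone":        ["neutral", "urgent", "fear_of_missing_out"],
    "button_color":         ["blue", "red", "green"],
    "social_proof":         ["none", "friend_count", "recent_purchases"],
    "notification_cadence": ["daily", "hourly", "adaptive"],
}

combinations = list(product(*components.values()))
print(f"{len(combinations)} distinct experiences from just 4 components")
# 3 * 3 * 3 * 3 = 81 -- and each can be targeted at a different segment.
```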
To pretend that A/B testing is some simple "what image do you like more?" game is entirely disingenuous and is so characteristic of the attitude people in tech have when dealing with people or any kind of social side effects of systems they build.
We absolutely do need to be very careful with this. Careful with how we test people, careful with how the information is recorded, and careful with what we do with the data.
I'm going to start calling A/B testing "psychological side channel attacks" and maybe then places like HN will appreciate what's happening more.
I think people on HN generally have a good understanding of what A/B testing is: "if I change X, will users do more of Y?"
Of course Y is usually something that makes the site owner money. Sometimes A, B, both, or the entire business model of the site is unethical. In such cases, they would still be unethical if a comparison test was not run. In cases where none of them are unethical, I have trouble imagining a realistic scenario in which the act of running a comparison test makes it unethical.
>I think people on HN generally have a good understanding of what A/B testing is
Not only do I not believe this, since most people on HN tend to have a very superficial understanding of whatever tech is being discussed, but there is also always a dismissal of any social consequences that may happen as a result of using any kind of technology.
>I have trouble imagining a realistic scenario in which the act of running a comparison test makes it unethical.
I can't tell if you're being serious or not. You don't need any kind of testing framework for a few famous examples to satisfy your "realistic" qualification. I see a lot of bland contrarian stuff on HN, but I'm kind of speechless right now.
I'm entirely serious. To cite one of the most famous examples, Facebook ran a sort of A/B test where its algorithm was adjusted to attempt to make users happy or sad.
There's a fairly strong case to be made that intentionally making a large number of people sad just to see if you can is unethical. There's a somewhat weaker case to be made that manipulating the happy group was also an unethical distortion of their reality. I fail to see an ethical problem with the fact that it was an A/B comparison. Instead, A and probably B would be unethical to attempt under any conditions without consent.
Wait, does this refer to A/B testing?