Nov 2, 2025

Why I don’t test different designs at the same time

By Adam Silver, originally published on adamsilver.io

A year ago I posted this on LinkedIn:

I tell my students to avoid select boxes.

Because it’s often better to use radio buttons.

Students often say, “But it’ll make the page too long.”

Yep, but that doesn’t necessarily mean it’s bad UX.

See the page I designed to let users select a course. Huge list of radio buttons.

But no issues in user research whatsoever.

Does this mean you should always use radio buttons?

No.

But most designers would balk at a design like this even though it worked perfectly well for users.

Here’s the screenshot I shared with the post:

Long list of radio buttons for users to select a course

It got a lot of comments, one of which was:

“What other options did you test? Just because your design worked, doesn’t mean it’s the best!”

Here’s what I said in response at the time:

I rarely test two solutions at once.

Don’t get me wrong, I consider many options. But only one gets tested. If it fails, I try another. That’s because there’s usually a clear winner.

Plus testing two versions is full of pitfalls.

[…]

I’ve worked with quite a few product managers and designers who suggest testing multiple versions.

It sounds sensible, right?

But more work does not always mean a better result.

Here’s why (according to UX expert Caroline Jarrett, who wrote about it in “Designing comparative evaluations”):

Reason #1: You won’t get a clear answer

You’re hoping for “Version A wins!” but what you’ll actually get is:

Parts of A are better, parts of B are better, and there’s probably a Version C that would beat them both.

Not the clear direction you were looking for.

Reason #2: Your results will get contaminated

If you test both versions with the same participants, they’ll learn from the first one.

So some users may prefer the second version because they already understood the task - even when that version was objectively worse.

Reason #3: You need a lot more participants

Comparative tests need 3x your normal participant numbers.

You need to balance who sees what first, and if you’re testing separate groups, you need even more people for the statistics to be meaningful.
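To make the counting concrete, here’s a small sketch of why balancing viewing orders inflates participant numbers. It’s an illustration only: the group size of 5 is an assumption, not a figure from Jarrett’s article.

```python
from itertools import permutations

def counterbalanced_schedule(versions, per_order):
    """Assign participants so every viewing order (A-then-B,
    B-then-A, ...) is seen by the same number of people."""
    orders = list(permutations(versions))  # every possible order
    schedule = []
    for order in orders:
        # the same number of participants for each order
        schedule.extend([order] * per_order)
    return schedule

# Two versions, 5 participants per order:
# 2 orders x 5 people = 10 participants,
# versus ~5 for a usability test of a single design.
schedule = counterbalanced_schedule(["A", "B"], per_order=5)
print(len(schedule))  # 10
```

With three versions the orders multiply (6 permutations), which is why comparative tests escalate so quickly.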

Reason #4: Your big differences look identical to users

Differences that seem big to you often look identical to users.

So the variations you’re testing might not even register.

But most importantly, testing two versions is almost always unnecessary.

Instead:

Less work = better process

If you’d like to learn how to design forms that users fly through, using patterns that come from designing one version properly, learning what’s wrong, and then fixing it, then:

https://formdesignmastery.com
