Jason Carmel is a colleague of mine I learn a ton from. His expertise is in web site optimization--running experiments where he tests versions of web pages against each other to see which performs best. (Not to be confused with search engine optimization, improving a site's visibility in search engines.)
Jason is a fairly unflappable guy. Nonetheless, I recently started making an effort to get his goat. He gives me just enough encouragement that I keep going. The gist of my teasing is that optimization is nothing but a mechanical exercise to determine whether a red button works better than a blue button. "Glad to hear that red button worked out better by 2.84 percent, Jason. The sum of your creative energy has produced yet another quarter million in revenue. You must really love your life, man. Hey, have you thought of trying one of those animated GIFs instead of a regular button?"
Fortunately, Jason is twice as nice as me, as well as twice as smart. He takes my ribbing well--and responds thoughtfully to the serious question underlying my teasing: We know optimization can move big numbers in terms of revenue, but can it do more than simply tweak pages to bump up conversion? Can it vet creative concepts? Can it maximize the creation of mutual value between businesses and customers? Can it help create more engaging experiences?
The short answer here, according to Jason, is that it depends, partly on what you're trying to achieve. If all you're focused on is moving business value measures, you're probably putting lipstick on a pig. But testing against value creation has the potential to uncover game-changing opportunities.
Here's an email exchange between Jason and me, in which he explains in a little more detail:
RYAN:
That whole web site optimization thing—isn’t it really just putting lipstick on a pig?
JASON:
I think “I hate you so much” might be a succinct way of responding, but I'll provide a little more detail:
Web Site Optimization is exactly like putting lipstick on a pig, but only if you start out with a pig. And if you are starting out with a pig, your opportunities for improving things are limited, and you’d be using the wrong tool to fix the problem. We are talking here about the concept of a “local maximum,” which is a fancy math term for “the best something can be within a limited range of possibilities.” Consider the aforementioned pig’s ability to fly, which, metaphorically speaking, is not particularly developed. We could take a pig and genetically modify it to be more aerodynamic. We could investigate building pig hang-gliders and attempt to train the smartest pigs to use them. But even in the best case, with the most aerodynamic pig, benefiting from the best training, and using the best pig flying technology, it will never fly as well as a bird. The best case flying scenario for a pig (the pig’s local maximum as far as flight is concerned) is nowhere near as effective as a bird’s. In that scenario, you’d be better off exchanging the pig for a bird at the start, rather than wasting any time or effort teaching a pig to fly better.
Applied to the web: if a site sucks so much that the goals and purpose are unclear, the information architecture looks like my desk (at the moment), the navigation is counterintuitive and the messaging has absolutely no intersection with the audience, then no amount of optimization in the world will make it right. The local maximum of that crappy site is too low for any optimization to matter. Or (even worse) you’d need the infinite number of monkeys to stop typing Shakespeare and to start applying experiments to your site to get the right combination where testing would make a real difference. Neither is very efficient. If your site is the pixelated equivalent of a pig, you need much more elemental help from a user experience expert first (know any?). Until you fix the fatal flaw(s) in a site, anything else you do will be throwing good money after bad.
Site Side Optimization works well in circumstances where the local maximum is high, but for some reason, the site is not achieving it. This can be due to single points of failure on the site, like a specific conversion path or page underperforming, or because the audience needs to be targeted more specifically, or because the existing content is stale/irrelevant. In each of these cases, experimental testing can make a huge difference. Optimization also works exceptionally well (and this is far more interesting to me) when applied as a method of trying out a new (and potentially risky) idea that could radically change and significantly improve an experience. In both of these examples, the basic site is healthy, and the optimization program serves as a tool to reach its fullest potential.
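To make the “local maximum” idea concrete, here is a minimal sketch of a greedy search that only ever accepts small improvements, which is roughly how incremental testing behaves. Everything in it is an assumption made up for illustration: the `site_quality` landscape, its two hills, and the numbers have nothing to do with any real site or with Jason's actual methods.

```python
def hill_climb(score, start, step=1, lo=0, hi=100):
    """Greedy search: move to a neighbor only while it scores higher."""
    current = start
    while True:
        neighbors = [n for n in (current - step, current + step) if lo <= n <= hi]
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current  # no neighbor improves on this: a local maximum
        current = best


def site_quality(x):
    # Hypothetical landscape: a low hill peaking at x=20 (the "pig")
    # and a much higher hill peaking at x=80 (the "bird").
    pig = 10 - abs(x - 20) * 0.5
    bird = 60 - abs(x - 80) * 2
    return max(pig, bird, 0)


pig_peak = hill_climb(site_quality, start=10)   # starts on the pig's hill
bird_peak = hill_climb(site_quality, start=60)  # starts on the bird's hill
print(pig_peak, site_quality(pig_peak))
print(bird_peak, site_quality(bird_peak))
```

Starting on the low hill, the search tops out at the pig's peak and never finds the higher one, no matter how many small steps it takes. Getting to the bird's hill takes a different starting point, not more steps.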
RYAN:
But what I keep looking for is the way to test birds against pigs, not in the sense of which flies better, because as a user experience expert I do have the capability to predict the winner of that contest—but when I don’t have a clear sense of the best conceptual solution. For example, maybe I just can’t decide between eggs and bacon. Can optimization help design a better breakfast, or only decide between pulp and no-pulp in the OJ?
JASON:
Optimization can test more conceptual ideas, but it will be really hard to unpack the WHY after we determine which one wins. Most sites aren’t deciding between bacon and eggs, but rather between the bacon, eggs and hash browns with coffee or the granola, fruit and yogurt with yerba mate. If the former wins in a test, I don’t know whether it’s because of the bacon (which will usually win over everything) or the coffee, or because the person deciding had granola for the past three days, and would have taken ANYTHING other than more granola.
The other trick about testing high concepts in a website format is that you would have to build each solution to test them, which is usually more expensive than testing out wireframes or front-end prototypes in front of a more controlled audience.
RYAN:
First, you seem to be suggesting that a test win for the bacon breakfast might not imply extensibility for bacon breakfasts in general: that because the test lacks controls, the results might be idiosyncratic and therefore might not apply broadly. Next week you might get a different result. My question is, why does that matter? And why does “knowing why” the bacon breakfast worked matter, as long as you know it worked?
JASON:
It is definitely a question I get from clients a lot. Why do I care about the individual elements of a variant? If the variant as a whole makes us more money, let’s just launch it and move on. I can’t fault the sentiment, but knowing why the bacon worked could lead to better tests, more focused messaging and (even more) cash money. I want to know that it’s the bacon by itself that is the motivating factor outside of all the other influences. Let’s say that we ran a test breakfast against a bowl of Total cereal and we tested bacon with powdered eggs as the experimental variant. Now let’s assume that bacon and powdered eggs lost to the control by 1%. You could take the position that we would do better to serve Total because we want to avoid losing that 1%, and you would be right. But what if you knew that the bacon by itself actually improved the breakfast by 15%, but the powdered eggs were so crappy that they hurt the breakfast by 16%, so you netted the 1% loss? If I controlled for all the variables in my breakfast, that knowledge would a) help me make a better breakfast overall (just serve bacon), and b) prevent me from throwing away a positive variable simply because it was paired with a really negative one.
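The bacon arithmetic above can be sketched as a tiny two-factor test. This assumes each element can be served with or without the other and that we measure every combination, which is more than the exchange describes. The 15, -16 and -1 figures are the hypothetical ones from the example; the `lifts` table and the `main_effects` helper are invented for illustration, and the bacon-only and eggs-only cells are exactly what a bundled test would never see.

```python
def main_effects(lift):
    """lift maps (has_bacon, has_eggs) -> lift versus control, in percentage points."""
    # Effect of adding bacon, averaged over breakfasts without and with eggs.
    bacon = ((lift[(True, False)] - lift[(False, False)])
             + (lift[(True, True)] - lift[(False, True)])) / 2
    # Effect of adding powdered eggs, averaged over breakfasts without and with bacon.
    eggs = ((lift[(False, True)] - lift[(False, False)])
            + (lift[(True, True)] - lift[(True, False)])) / 2
    # Whatever the two separate effects do not explain when combined.
    interaction = (lift[(True, True)] - lift[(True, False)]
                   - lift[(False, True)] + lift[(False, False)])
    return bacon, eggs, interaction


# Hypothetical results, in percentage points relative to the control breakfast.
lifts = {
    (False, False): 0,    # control
    (True, False): 15,    # bacon only (assumed: never measured in a bundled test)
    (False, True): -16,   # powdered eggs only (assumed: never measured either)
    (True, True): -1,     # the bundled variant that "lost"
}

bacon, eggs, interaction = main_effects(lifts)
print(bacon, eggs, interaction)
```

With only the control and the bundled variant measured, all you can see is the -1. The extra two cells are what let you separate the bacon from the eggs, and they are also what make the test more expensive to run.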
I imagine you run into that a lot with prototyping and wonder how you deal with it in the UX world. If a subject totally fails at a task, are you ever afraid of overcorrecting a prototype to account for it? Do you ever throw the baby out with the bathwater? How do you control for that?
RYAN:
A great question. One of the answers is that in usability testing, you're looking for usability problems. So as long as your test participants are representative of your users as a whole, major failures are, practically speaking, never anomalous. If your user population is one million, and one of the eight people participating in your test has trouble understanding some aspect of the interface, what are the odds they're the only person who's going to have that problem?
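As a rough back-of-the-envelope check on those odds, here is a sketch that assumes participants are drawn independently and at random from the user population (real recruiting is messier). The function name and the example rates are illustrative assumptions, not data from any study.

```python
def p_at_least_one(rate, participants):
    """Chance that at least one participant hits a problem affecting `rate` of users."""
    return 1 - (1 - rate) ** participants


# If the problem truly affected only one user in a million,
# seeing it even once among eight participants would be wildly unlikely.
print(p_at_least_one(1e-6, 8))

# If it affects one user in ten, seeing it at least once is more likely than not.
print(p_at_least_one(0.1, 8))
```

Under those assumptions, a problem that shows up in a small test is far more plausibly a common one than a one-in-a-million fluke.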
The other thing I wonder about is what happens when what you’re trying to accomplish is harder to measure than conversion (e.g. brand lift) or if you want to measure it over time (e.g. engagement). Especially in social media, it’s quality that matters, not quantity. You want to know how valuable your user-generated videos are more than you want to know how many of them you have. Can web site optimization help you get to answers?
JASON:
Ah, you and your social media. When are you going to come to terms with the fact that this whole thing is a fad? The future is in email, Ryan, and lots of ‘em. Mark my words.
Absent a more qualitative tie-in with optimization (surveys, satisfaction scores, etc.) you will be hard-pressed to get good data about branding or the impact of social media. But I’m not saying you shouldn’t optimize for branding or social media. I’m saying you need to get that qualitative kicker. I’ve done a few branding tests, and I think they provide some interesting feedback. But I’ve never optimized where a KPI has to be judged on quality (e.g., good comments vs. troll comments) or off the site entirely (e.g., buzz in the blogosphere). Sounds fun.



You guys are so cute.
Posted by: Justin | November 26, 2008 at 02:46 PM
Interesting comments guys. Sure seems like there is a great divide between driving optimization and concept. You bring up some interesting points regarding the nuances between the two. This post ironically foreshadowed some of the rationale behind the major design jump from Google to Twitter this Spring.
http://stopdesign.com/archive/2009/03/20/goodbye-google.html
Posted by: Brett Schwager | June 03, 2009 at 01:00 PM