Metadata

October 21, 2008

Mining the Business Value of the Social Web: Behavioral Metadata

As an information architect, I work with metadata a lot. I help define interfaces based on information about the content. For example, an object on a home page might be the "newest" object in a system, or it might be a rotating series of "newest" objects by each "author" with "home page" authority. See my other posts on quality and self-organization in the presentation layer if you want more on that.

I work at a company with a strong focus on performance-driven design, so I work constantly with two groups: web analysts, who specialize in understanding behavior on web sites and using that behavioral data to drive business decisions, and designers, researchers, and information architects, whose focus is creating innovative, valuable stuff people want to use.

I'm working where I work partly because I think there's tremendous potential in the intersection of information architecture and web analytics, in particular on the social web.

While some companies have been implementing sophisticated web analytics programs for years, many are just starting to measure behavior on their sites. Very frequently, there's lots of data collected but no real plan for taking action based on it.

At the same time, many companies are just beginning to experiment with social media marketing. They're getting out there and participating, and their customers are creating content, but that content is hard to present meaningfully, and partly because of that most social media marketing efforts fail to realize their full promise.

And naturally, the new focus on measurement and data raises the question of how the success of those social media marketing efforts should be measured. I've posted in the past about measuring communities, and there are a lot of people out there contributing to that discussion. What I think is frequently overlooked, though, is behavioral data helping to create value in social systems. Not data about value, but data creating value.

I made this handy chart to help explain what I mean. Click the image for a more-legible view:

 

When users interact with web sites, information about their behavior accumulates--information attached both to the user as object and content as object. That information about people and the things they do--what's called behavioral or tacit metadata--tells a lot about people and content. And in complex, emergent and self-organizing systems, behavioral metadata is especially powerful, because in those kinds of systems it's a challenge on the one hand to present content in meaningful ways and on the other hand to generate business intelligence to drive smart decisions.

As content and people increasingly cross freely between domains in the emerging standards-based social universe, I think we'll see a shift in business models from today's focus on owning content to a new focus on owning the metadata about the content and the people interacting with the content. I'll risk a prediction: In a short time, behavioral metadata, the information about what people do with content, will be more valuable than the content itself.

What do you think?

July 18, 2008

Samantha Starmer on Metadata and Human Relationships

Samantha Starmer of REI entertainingly titled her talk at last week's event Single Athletic Female Seeks Single Slender Male: The Marriage of Metadata and Social Media. Her talk surveyed a number of the challenges and opportunities of working with metadata in social computing.

Here's the 15-minute video:

When Samantha posts her slides I'll add them here.

March 18, 2008

Self-Organizing for Discovery: Relatedness in User-Generated Content

The quality of user-generated content varies widely. As I discussed in an earlier post, it's possible to separate the wheat from the chaff using combinations of explicit and implicit metadata. But once you've identified the good stuff, you start to find more and more of it. User-generated content on successful sites accumulates in real time--lots of it. How do you present it in meaningful ways?  How do you keep the presentation of "best" content fresh? How do you make it findable, rememberable, parsable? You need to set it up to self-organize, and creating a folksonomy is a great way to start.

In a traditional folksonomy (there are several uncommon kinds I won't get into now), users add "tags" or labels to individual content objects. These tags become the basis for a living, breathing categorization scheme that informs search and navigation. On sites like Flickr, folksonomy is used in powerful ways to organize photos into a multi-level hierarchy, which can be filtered by "interestingness" (Flickr's quality concept), by location, by camera, and more to produce dazzlingly multifaceted content organization. Take a look at this page, a Flickr "tag cluster" filtered by interestingness:

 

One strength of tagging systems is that they can organize content across an unlimited number of pivots (though the value of that capability, in terms of informing navigation, decreases as the number of pivots increases). For example, an apple can be tagged with both "fruit" and "red," making it findable within category schemes based on either food type or color.
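Here's a minimal sketch of that pivoting, assuming a hypothetical in-memory set of tagged objects (the names and data are made up for illustration). Inverting the object-to-tags mapping into a tag-to-objects index turns every tag into its own pivot:

```python
from collections import defaultdict

# Hypothetical content objects, each with user-applied tags.
objects = {
    "apple": {"fruit", "red"},
    "banana": {"fruit", "yellow"},
    "fire truck": {"red", "vehicle"},
}

def build_tag_index(tagged_objects):
    """Invert object -> tags into tag -> objects so every tag is a pivot."""
    index = defaultdict(set)
    for name, tags in tagged_objects.items():
        for tag in tags:
            index[tag].add(name)
    return index

tag_index = build_tag_index(objects)
print(sorted(tag_index["fruit"]))  # ['apple', 'banana']
print(sorted(tag_index["red"]))    # ['apple', 'fire truck']
```

The apple shows up under both pivots without anyone having decided in advance whether food type or color is the "right" scheme.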

This is really wonderful stuff. But folksonomy as rendered by tags has its limitations, especially in contexts where there are fewer content objects or less incentive for users to take action to tag them.

In such situations, creating dynamic relationships between objects based on combinations of explicit and implicit metadata adds new layers of meaning, helping users discover content of interest.

There are lots of ways to accomplish this. I'll describe a few in this post, but what it all boils down to is increasing discoverability by grouping objects and presenting them in association with each other. If you're interested in one battery charger, for example, there's a decent chance you'll be interested in another. But from there it gets a little more complicated.

 

Basic Similarity

When I say basic similarity, I actually have in mind a specific kind of rule governing the association of content objects, namely, that they share an attribute. For example, when I view a video on a social media site, the system might suggest other videos I might want to see based on a common tag, a shared word in the title, or a common creator.

But in systems with user-generated content, there are often a huge number of objects. Most often, there needs to be a threshold of similarity applied in order to narrow the number of similar items, such as a certain number of common tags applied, shared tags within taxonomic groupings, or association within purchase patterns.
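A minimal sketch of a similarity threshold, assuming a hypothetical catalog that maps each video to its set of tags. The function name, the data, and the default of two shared tags are my own illustration, not any real site's rule:

```python
def similar_by_tags(target, catalog, min_shared=2):
    """Return (name, shared_count) for objects sharing at least min_shared tags with target."""
    target_tags = catalog[target]
    matches = []
    for name, tags in catalog.items():
        if name == target:
            continue
        shared = len(target_tags & tags)
        if shared >= min_shared:
            matches.append((name, shared))
    # Strongest overlap first; the name breaks ties so output is stable.
    return sorted(matches, key=lambda m: (-m[1], m[0]))

# Hypothetical videos and tags.
videos = {
    "v1": {"cats", "funny", "piano"},
    "v2": {"cats", "funny"},
    "v3": {"cats", "dogs"},
    "v4": {"piano", "funny", "cats", "fail"},
}
print(similar_by_tags("v1", videos))  # [('v4', 3), ('v2', 2)]
```

Raising min_shared narrows the list; with a threshold of one, every cat video in the system would qualify.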

 

Complex Similarity

Basic similarity is rarely enough. Imagine shopping, for example, for a camera lens. Looking at a detail page for a particular lens, you see a list of "related items." If this list were to include every other lens on Amazon.com, you'd have a gigantic list that wouldn't be helpful. Likewise with a list of all Canon products. A list with multiple shared attributes, such as "lens" and "Canon," is potentially more useful. And you can take it even further than that by layering in implicit metadata--information provided by people.

In this screenshot from Amazon, there are implicit and explicit metadata layers added to the basic similarity construct. In this case, the user-driven similarity is among search queries. The set of similar queries describes a set of user sessions in which purchases were completed from within the objects returned by the searches. The objects purchased are therefore similar.

You might wonder: Couldn't you get to this set of relationships using simple metadata from within a controlled vocabulary? The answer is no, because the similarity is ultimately constructed of value judgements by humans. People interested in the same stuff as me decided to buy these items. That's a layer of social information you can only get with robust behavioral metadata. Here's an abstract picture of how this looks:

 

Each of the ovals represents a content object, and each line represents a set of shared attributes. The attributes shared are the same in every case.
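To make the behavioral layer concrete, here's a rough sketch. It assumes a hypothetical session log of (search query, purchased item) pairs, and it simplifies "similar queries" down to identical query strings. I don't know how Amazon actually does this; the names and data are invented:

```python
from collections import Counter, defaultdict

# Hypothetical session log: (search query, item purchased in that session).
sessions = [
    ("canon telephoto lens", "lens-A"),
    ("canon telephoto lens", "lens-B"),
    ("canon telephoto lens", "lens-A"),
    ("canon zoom lens", "lens-B"),
    ("canon zoom lens", "lens-C"),
    ("tripod", "tripod-X"),
]

def related_by_purchase(item, log):
    """Count how often other items were bought from the same queries that sold `item`."""
    items_by_query = defaultdict(list)
    for query, purchased in log:
        items_by_query[query].append(purchased)
    related = Counter()
    for purchased_items in items_by_query.values():
        if item in purchased_items:
            related.update(p for p in purchased_items if p != item)
    return related.most_common()

print(related_by_purchase("lens-B", sessions))  # [('lens-A', 2), ('lens-C', 1)]
```

Nothing in a controlled vocabulary links these items. The link exists only because people searching the same way decided to buy them.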

 

Complementarity

Complementarity is not the same thing as similarity, and it's very useful to think specifically about the difference. Objects that are complementary are not "like each other" in the sense that similar objects are. Instead, they sort of... "go together."

But what does it mean for objects to go together? How can we understand the relationships between peanut butter and jelly, peanut butter and honey, peanut butter and bananas? Each peanut butter complement has a relationship with peanut butter, but they don't share the same relationship with each other (I'm sure someone out there eats honey and jelly sandwiches, but that's not complementarity, it's surrealism). 

So the relationships among complementary items are differently structured than the relationships among similar items. Whereas items similar to a given object are also generally similar to each other, that's not generally the case among complementary items. You can picture webs of similarity, but complementarity, from a structural perspective, looks more like spokes on a wheel.

Here's an example from Amazon to illustrate the point:

This illustration is from the detail page of a camera lens. The lens is the primary content object on this page. Each of the items in this list is a secondary content object--each "goes with" the lens. The secondary objects are related to the primary object in a singular relationship, and they are meaningfully related to each other mainly by virtue of their parallel relationships with the primary object.

The lens is the hub of the wheel, and each of the items pictured above is connected to the hub via a spoke.

But if complementarity is a wheel-shape, how do we understand the nature of the spokes in a way that allows us to build complementary relationships into a web site architecture?

Complementarity is about supporting a core function of, completing, or adding value to an object. So to architect complementarity we need to understand a type of "aboutness," the thing that the object is good for.
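One way to sketch the wheel as data, assuming a hypothetical set of accessories and role labels that I made up (they aren't taken from the Amazon page): each spoke records what the secondary object does for the hub, and no relationship between spokes is stored at all.

```python
# Hypothetical hub-and-spoke records: each spoke names what the
# secondary object does for the primary one (its "aboutness").
complements = {
    "camera lens": [
        ("lens cap", "protects"),
        ("UV filter", "protects"),
        ("lens case", "stores"),
        ("camera body", "completes"),
    ],
}

def goes_with(primary, spokes=complements):
    """Return the secondary objects attached to a primary object."""
    return [item for item, _role in spokes.get(primary, [])]

def are_complementary(a, b, spokes=complements):
    """True only when one object is a spoke on the other's hub."""
    return b in goes_with(a, spokes) or a in goes_with(b, spokes)

print(goes_with("camera lens"))
print(are_complementary("camera lens", "lens cap"))  # True
print(are_complementary("lens cap", "lens case"))    # False
```

The second check is the honey-and-jelly case: two spokes share a hub, but that alone doesn't make them go together.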

Behavioral metadata doesn't tell the whole story of complementarity. Here's a case where hybrids of taxonomy and folksonomy come in handy.

 

Preference Among Similar

Adding yet another layer of social data to groups of similar objects, you can create another kind of value. In this example from Amazon, similar items are ranked by strength of correlation between object views and purchases. Here an implicit judgement, expressed through sales conversion, shows which of the similar objects is most preferred by other users.

Very useful for comparison shopping, especially among groups of complex, similar, or specialized objects (like digital cameras). Here's what this type of relatedness looks like in an abstract architectural view:

 

The objects are identified as similar by virtue of their shared attributes. The percentage indicated on each object indicates its percentage of total purchases within the group as a whole.
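The percentage on each object is simple arithmetic. A minimal sketch, assuming hypothetical purchase counts for a group of objects already identified as similar:

```python
def purchase_share(purchases):
    """Map each item to its percentage of total purchases in the group."""
    total = sum(purchases.values())
    if total == 0:
        return {item: 0.0 for item in purchases}
    return {item: round(100 * n / total, 1) for item, n in purchases.items()}

# Hypothetical purchase counts within one group of similar cameras.
group = {"camera-A": 60, "camera-B": 30, "camera-C": 10}
shares = purchase_share(group)
ranked = sorted(shares.items(), key=lambda kv: -kv[1])
print(ranked)  # [('camera-A', 60.0), ('camera-B', 30.0), ('camera-C', 10.0)]
```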

Preference needn't always be based on implicit metadata like sales conversion, though. Here's an example of preference among similar from YouTube that uses ratings. After a video plays all the way through, the YouTube video player offers up some suggestions about what to watch next. The suggestions are similar to the video that just played, in this case on the basis of their shared authorship and title words. Preference is expressed via ratings, so the suggestions are the top-rated similar videos.

And this makes perfect sense: If you watch a video all the way through, there's a decent chance you liked it and would be interested in discovering similar videos of high quality.

 

Affinity Recommendations

Simply put, affinity recommendations are recommendations based on people with whom you have preferences in common. The logic goes: We both loved Friday the 13th Parts 1 through 6; you've seen Halloween IV and liked it; therefore there's a decent chance I'll like Halloween IV.

Netflix has made a huge investment in its recommendation engine, and affinity recommendations are a huge part of how it works. Netflix has recognized that choosing a movie to rent is very often a socially-driven activity. Faced with thousands of choices, we turn to friends for advice. But the best recommendations aren't made just by friends--they're made by people with whom we share a common taste in movies. Netflix makes the degree of commonality explicit. Here's how it looks:

 

Netflix has built in a number of social features around explicit relationships with the people we know, and they're constantly tinkering. But affinity recommendations aren't always situated within existing relationships. In many cases the shared preferences are enough (including on Netflix, in the absence of "friends").

Here's what this looks like in the abstract:

 

 

In this diagram, the big bubbles represent people. The people are color-coded to indicate a profile of preferences. In the Netflix example, these preferences are explicit--ratings of particular movies. (I'm not sure whether Netflix also looks at rating patterns within classes of similar movies--but they certainly could if they needed to build a more extrapolated flavor of affinity. My guess is that they have sufficient volume of ratings that they don't need to extrapolate.) But preferences needn't be explicit in all situations. Preferences can also be gleaned through behavioral metadata and through algorithmic combinations of explicit and implicit metadata.

In this diagram, the two people represented by red bubbles share preferences for objects represented by small bubbles A, B, C, D, and E. Because person 2 also liked objects F and G, the system can present affinity recommendations of objects F and G to person 1.

Obviously, relatedness gets pretty complicated at this level. For example, if person 1 has already expressed a non-preference for objects F and G, they'll be annoyed if you keep recommending them. So you need to build controls for that kind of scenario.
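Here's a minimal sketch of the diagram's logic, including a control for stated non-preferences. It assumes hypothetical dictionaries of likes and dislikes per person, and an arbitrary threshold of three shared likes before two people count as having an affinity. It's an illustration, not a description of how Netflix does it:

```python
def affinity_recommendations(person, likes, dislikes, min_shared=3):
    """Suggest items liked by people who share at least min_shared likes with person."""
    mine = likes[person]
    rejected = dislikes.get(person, set())
    scores = {}
    for other, theirs in likes.items():
        if other == person:
            continue
        shared = len(mine & theirs)
        if shared < min_shared:
            continue
        # Skip anything the person already likes or has explicitly rejected.
        for item in theirs - mine - rejected:
            scores[item] = scores.get(item, 0) + shared
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical preference profiles matching the diagram.
likes = {
    "person1": {"A", "B", "C", "D", "E"},
    "person2": {"A", "B", "C", "D", "E", "F", "G"},
    "person3": {"A", "X"},
}
print(affinity_recommendations("person1", likes, {}))                  # ['F', 'G']
print(affinity_recommendations("person1", likes, {"person1": {"G"}}))  # ['F']
```

Person 3 shares only one like with person 1, so object X never gets recommended.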

Nonetheless, the payoff for a strong system of affinity recommendations can be huge, in terms of overall perceived quality, conversion, and social collateral. If you're working with a system that includes a strong base of dedicated users and many content objects, you can add a lot of value.

 

The Devil in the Details

As with all social systems, even the most carefully-built system is likely to function a little differently than you imagine after you let a bunch of unpredictable humans play with it for a while.

Keep close tabs on the health of your related items engine. Plan for and retain budget for ongoing tweaks. Establish KPIs to measure system health, and run A/B tests to optimize performance.

Above all, as always, have fun with your metadata!

January 11, 2008

Tag Clouds Are Bad (Usually)

Tag clouds started showing up three or four years ago on sites that use folksonomies to organize and describe content. I've never liked them, but only recently have I been asked to justify that sensibility. So here's my rationale, starting with an example.

Here's one from Flickr, one of my most favoritest web sites:

 

Tag Clouds Are Unreadable (Usually)

I've heard more than one designer complain about how unreadable tag clouds are, and even better, how "ugly." 

I'd have to agree: Tag clouds are absolutely both unreadable and, yes, ugly. Seriously, I want to avert my eyes! And I have good vision--if I find tag clouds illegible, I can hardly be the only one. I haven't seen an actual study of the accessibility of tag clouds. Anyone?

 

Tag Clouds Convey Meaningless Information (Usually)

Legibility aside, let's take a closer look. The idea of the tag cloud is that it adds layers of information to the simple list of tags applied to the content objects in the system. The bigger the font size of an individual tag in the cloud, the greater the number of content objects with the tag attached. At a glance, you can get a sense of the topical landscape of the system, including where the most content is concentrated.
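For reference, the sizing rule is easy to sketch. This version assumes linear scaling between a minimum and maximum pixel size, which is my assumption; real sites may scale differently, and the tag counts here are invented:

```python
def tag_font_sizes(tag_counts, min_px=12, max_px=36):
    """Scale each tag's font size linearly between min_px and max_px by object count."""
    low, high = min(tag_counts.values()), max(tag_counts.values())
    span = high - low
    sizes = {}
    for tag, count in tag_counts.items():
        # If every tag has the same count, fall back to the smallest size.
        weight = (count - low) / span if span else 0.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes

# Hypothetical counts of content objects per tag.
counts = {"wedding": 500, "beach": 260, "gorilla": 20}
print(tag_font_sizes(counts))  # {'wedding': 36, 'beach': 24, 'gorilla': 12}
```

Notice what the output encodes: volume, and nothing else.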

But is the relative volume of objects attached to individual tags helpful? Meaningful? I guess maybe, a little, in some cases. Is it worth the legibility trade-off? I have to think, at least most of the time, it's not.

Individual site visitors have an individual thing of most interest at a given time. In most cases, the task is to help them find that thing as easily as possible. The relative volume of tag use doesn't help, and the varying font sizes in a tag cloud make it harder to scan, decreasing ease of use.

 

Tag Clouds Obscure the Difference Between Noise and Signal (Usually)

One of the strengths of folksonomies lies in the redundancy of tags. Folksonomies accommodate many alternative ways of describing, as opposed to controlled vocabularies, which rely on universal understandability of "official" labels. Folksonomies even address issues related to common misspelling of tags, and they grow and change organically, responding to emerging vocabularies. These virtues are particularly valuable in the context of search.

But in a tag cloud format, that strength turns into a weakness. All that redundancy becomes visual noise, and the topical landscape turns out to be skewed toward the most ambiguous tags, which appear in many forms, and the least ambiguous, which appear largest. Not only do tag clouds fail to convey meaning about the system, they actually mislead.

 

Alphabetization Is Not Helpful (Usually)

That's not all. The alphabetical ordering of tags in the tag cloud is effectively meaningless. Alphabetized lists only make sense when users know the name of the thing they're looking for. And in a folksonomic system, agreement about descriptive terms isn't the goal. So in most cases, alphabetizing tag clouds is the same as randomizing them. No meaning is added.

I looked around for a tag cloud ordered by currency of use, but I couldn't find one. That kind of approach would at least add meaning--and I still think that particular piece of meaning wouldn't, most of the time, be relevant.

 

There Are Exceptions (Theoretically)

Having said all that, I also think it's perfectly reasonable to think there could be cases where a tag cloud might be rendered legibly, conveying meaningful and relevant information to a highly-acclimated audience. And given rocket propulsion, pigs could fly.

But maybe that's not fair. There really could be outlying cases of tag clouds making sense, perhaps with some kind of filter reducing noise, sufficient white space aiding legibility, a display of relative popularity of individual tags adding value, and in a context where either labels are familiar or currency is the higher-priority piece of information. Or maybe someone's done tag clouds differently, with different rules governing how tags are displayed.

Anyone have a great example of a tag cloud working perfectly? I'd really love to know about it!

December 11, 2007

Separating the Wheat from the Chaff: Quality in Emergent Social Content

Users create tons of terrible content!

We must not say so. We'll say instead, users create content that ranges widely in terms of its degree of interest to other users.

I've been doing a ton of work on quality the past few weeks, in several different contexts. Consider the following challenges:

  1. Our design for a client's corporate group blog includes a prominent list of "Most Popular" posts. In this context, where the blog is fully integrated with the main site via primary navigation, we expect many non-regular blog readers to wander over to the blog landing page, and we want to increase the likelihood that they'll see something both compelling and fresh.
  2. Another client's technical support forums include thousands of posts, many of them addressing the same or similar issues, and only one of them providing the single best solution for a particular user's problem at a particular time. The most common handful of solutions address 90% of users' issues.
  3. A client's social media site uses a set of algorithms based on user behavior to rank user-created content objects, and the best are exposed on the home page of the site. However, their exposure on the home page garners them more attention than average, strengthening their ranking and producing static "best" lists. The rich get richer.
  4. A client's community site is wide-open, allowing posting on a range of subject areas. The home page of the site shows the most popular posts, but the popular posts have tended to be less than germane to the business rationale behind the site.

Let me take a step back, though, and talk a bit about the general approaches to organizing emergent user-generated content:

 

Emergent Content

Emergent content is content that shows up automatically, based on a set of rules. No single person decides what content is visible and what content isn't. Instead, content objects appear in various contexts based on attributes encoded in their metadata. Sometimes this means the good stuff shows up, sometimes it means stuff shows up based on similarity to other stuff, sometimes it means complementarity, and so on.

In the business, we call it "The Magic of the Web."

 

Algorithmic Approaches

In many contexts, especially on web sites with huge volumes of user-created content, quality is derived algorithmically, based on user behavior. And user behavior can be thought of as including explicit behavior like voting and rating and implicit behavior as well. For example, the number of people viewing a video object can reflect its quality--and beyond that, quality measures can include views of the entire video, comments on the video, time on page for the video, qualified page views, and more. And other measures can reflect negatively on the video's quality--incomplete views, time on page below a qualifying threshold, bounce rate, and more. 

Bits and pieces of behavioral metadata stick to content objects--less like fingerprints than like the folded corners, weakened binding, marginalia, and coffee stains on an often-loaned book--and provide a basis for us to construct the rules that govern how the objects behave.

Algorithmic approaches to measuring quality take into account a combination of these measures in ways appropriate to the type of object and culture of the site. Which measures are more important (and how much more important) depend on the context of user needs and business requirements--every site is unique. "Quality" is quantifiable only with the right formula.
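A minimal sketch of what "the right formula" might look like as a weighted sum of behavioral signals, normalized per 100 views. The signal names and weights are hypothetical placeholders; choosing them is exactly the site-specific work described above:

```python
# Hypothetical weights; which signals matter, and how much, depends on the site.
WEIGHTS = {
    "complete_views": 3.0,
    "comments": 2.0,
    "qualified_page_views": 1.0,
    "incomplete_views": -1.0,
    "bounces": -2.0,
}

def quality_score(signals, weights=WEIGHTS):
    """Weighted sum of positive and negative behavioral signals per 100 views."""
    views = signals.get("views", 0)
    if views == 0:
        return 0.0
    raw = sum(weights[name] * signals.get(name, 0) for name in weights)
    return round(100 * raw / views, 1)

# Hypothetical behavioral metadata stuck to one video object.
video = {"views": 200, "complete_views": 80, "comments": 10,
         "qualified_page_views": 120, "incomplete_views": 120, "bounces": 30}
print(quality_score(video))  # 100.0
```

Dividing by views keeps an object from scoring well on exposure alone.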

The most bad-ass example of algorithmically-derived quality is, no doubt, the Explore category on Flickr. Based on its concept of "Interestingness," Flickr finds photos that are interesting and shows them in views filterable by date, camera, location, tag, and tag cluster.

What's so amazing about the Flickr Explore feature is that it's constantly refreshed and the photos are always, indeed, interesting. Flickr has an advantage: Millions of content objects. Naturally, the best hundred are pretty darn good.

 

Editorial Approaches

As appealing as the idea of bottom-up, self-organizing content architecture is, editorial participation has a key role to play in many branded online communities. First and foremost, an editorial presence can provide a human face for the company, a point of social contact between the community and the brand. An editorial presence can also model appropriate behavior, highlight the kind of contributions that best serve the needs of the community, and speak for the brand without being perceived as a constraining or intervening voice. It just has to be done with a delicate touch.

A while back I wrote about a quality problem on the social media site Treemo. Since I wrote that post, Treemo has rearchitected its home page, which used to highlight popular content, to feature editorially-selected content. The result is a much stronger affordance on the home page for content that aligns with the site's mission. Over time, that affordance will help improve the overall quality of content on the site--though Treemo's too-simplistic separate display of most viewed, favorited, and commented still renders horrors upon the unsuspecting eye.

 

Hybrid Approaches

Combination approaches can offer the benefits of both algorithmic and editorial quality measures. Typically, an initial level of quality valuation emerges from user behavior, qualifying a top tier of content for editorial review.

A great example of the hybrid approach is JPGmag.com. JPG is a social media site that announces a theme for a period of time and invites users to submit photos related to the theme. Users view photos one at a time and vote whether each photo is "good for (theme)." The experience is compelling, partly because the emergence of photos is affected by the votes of other users, so the photos that aren't voted up tend to disappear quickly. On the surface, the interaction is extremely simple. But there's some good, solid magic going on behind the scenes.

Photos that rise to the top based on user voting are qualified for review by the editorial panel, which selects photos for inclusion in the theme-based print edition of the magazine.
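The qualifying step can be sketched as a simple filter that hands a short list to the editors. I don't know JPG's actual rules; the vote minimum and approval rate below are hypothetical thresholds, and the vote tallies are invented:

```python
def editorial_queue(photos, min_votes=20, min_approval=0.7):
    """Return photo ids that qualify for editorial review, best approval first."""
    qualified = []
    for photo_id, (yes, no) in photos.items():
        total = yes + no
        if total >= min_votes and yes / total >= min_approval:
            qualified.append((yes / total, photo_id))
    return [photo_id for _rate, photo_id in sorted(qualified, reverse=True)]

# Hypothetical (yes, no) vote tallies per photo.
votes = {"p1": (45, 5), "p2": (8, 1), "p3": (30, 30), "p4": (75, 25)}
print(editorial_queue(votes))  # ['p1', 'p4']
```

The algorithm only narrows the field. The final call stays with people.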

 

theme-based voting

 

Beyond the Top-10 List

Ranking content objects based on quality makes possible some powerful uses of quality content--including but greatly improving on the typical idea of the "top 10" list. Quality can be superimposed on other metadata to produce content presentations that are both highly faceted and of high quality. For example, you can search Flickr for photos tagged "gorilla" within a geographic range limited to Rwanda, then sort the results by interestingness, and even filter by the type of camera you own. You can see the best photos of Rwanda's gorillas taken with your particular camera model, and see how your own adventure travel pictures measure up.
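In code, superimposing quality on other metadata amounts to filtering on the facets and then sorting by the quality score. A minimal sketch, assuming hypothetical photo records with made-up interestingness values (this is not Flickr's API):

```python
# Hypothetical photo records; the interestingness values are made up.
photos = [
    {"id": "a", "tags": {"gorilla", "jungle"}, "country": "Rwanda", "camera": "D80", "interestingness": 0.62},
    {"id": "b", "tags": {"gorilla"}, "country": "Rwanda", "camera": "D80", "interestingness": 0.91},
    {"id": "c", "tags": {"gorilla"}, "country": "Uganda", "camera": "D80", "interestingness": 0.99},
    {"id": "d", "tags": {"gorilla"}, "country": "Rwanda", "camera": "G9", "interestingness": 0.75},
]

def best_photos(items, tag, country, camera=None, limit=10):
    """Filter by tag, country, and optionally camera, then sort by quality score."""
    matches = [
        p for p in items
        if tag in p["tags"]
        and p["country"] == country
        and (camera is None or p["camera"] == camera)
    ]
    matches.sort(key=lambda p: p["interestingness"], reverse=True)
    return [p["id"] for p in matches[:limit]]

print(best_photos(photos, "gorilla", "Rwanda"))                # ['b', 'd', 'a']
print(best_photos(photos, "gorilla", "Rwanda", camera="D80"))  # ['b', 'a']
```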

Amazon's customer reviews page is another great example. Customer-written product reviews are ranked based on a simple "usefulness" vote by users. Amazon highlights the most helpful reviews above and below a rating threshold. The most helpful favorable and unfavorable reviews appear side by side, an instant conversation that socializes the shopping experience, adds credibility to the buying process, and creates a unique value for online shoppers.
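The side-by-side pairing can be sketched like this, assuming hypothetical review records and a rating threshold of three stars. Both are my own placeholders, not Amazon's actual values:

```python
def spotlight_reviews(reviews, rating_threshold=3):
    """Pick the most helpful review above the rating threshold and the most helpful at or below it."""
    favorable = [r for r in reviews if r["rating"] > rating_threshold]
    unfavorable = [r for r in reviews if r["rating"] <= rating_threshold]

    def most_helpful(group):
        # default=None covers products with no reviews on one side.
        return max(group, key=lambda r: r["helpful_votes"], default=None)

    return most_helpful(favorable), most_helpful(unfavorable)

# Hypothetical reviews with star ratings and "helpful" vote counts.
reviews = [
    {"id": "r1", "rating": 5, "helpful_votes": 40},
    {"id": "r2", "rating": 4, "helpful_votes": 55},
    {"id": "r3", "rating": 2, "helpful_votes": 31},
    {"id": "r4", "rating": 1, "helpful_votes": 12},
]
best_pro, best_con = spotlight_reviews(reviews)
print(best_pro["id"], best_con["id"])  # r2 r3
```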

 

 

Quality in Action

All of the challenges I described earlier require a solution with some kind of quality measure. Over the past three or four years, I've found that my information architecture work is less about labeling and organizing content and more about creating mechanisms to apply descriptors and rules to govern emergence. As user-created content becomes more important, the IA work happens at a remove. It also feels more critical to get it right and to design flexible systems that can adapt to unpredictable user behavior.

Likewise, new possibilities to create value have emerged--and quality is just the beginning. Let me know what you think, and stay tuned for more posts on this topic.

November 30, 2007

What Is a Blog?

It's a perplexing question, when you think about it, and it gets more perplexing all the time. From a technical perspective, a blog is just a web site. Blogs tend to have certain features, but those features aren't unique to blogs, and not all blogs have them. Blogs are usually, but not always, published by individuals, but not all web sites published by individuals are blogs. Blogs are frequently-updated, but not all frequently-updated web sites are blogs. These days, blogs are not always even web sites per se--they are integrated with instant messaging, social media APIs, mobile devices, and so on. Is your IM-driven Twitter stream, transmitted through your blog via RSS, actually a blog when I view it on my cell phone?

The seemingly easy question of what exactly is a blog turns out to actually be pretty interesting.

And Lee and Sachi at Commoncraft have busted out a nice, basic answer. What I really like here is the focus away from technology and toward authorship, utility, and relationships. This is great stuff for a person who's unfamiliar with blogs and needs to grasp the basics:

 

 

What a great introduction to blogs! Give those two a hand.

Of course, for those familiar with blogs, it's not hard to see that there's much more to the story than Lee and Sachi included in this video. (They understand the complexities as well as anyone.)

You can see blogs as a set of conventions or features:

  • Posts appear in reverse-chronological order (newest at the top of the page). Except when there's a persistent initial post or welcome page.
  • People can comment on posts. Except when comments are disabled.
  • Incoming links are displayed along with comments. Except when they aren't.
  • There's a persistent link to an "About" page that introduces the blog and / or author(s). Except when there isn't.
  • Posts are archived either by category or date. Except when they are organized by tags, or not archived at all.
  • You can subscribe by RSS and / or email. Except when not.
  • Blogs link to each other. Most of the time, but not always.
  • A list of links to other blogs (the blogroll) is included in the margin. Usually, that is.
  • Blogs are standalone sites. Unless they're integrated into a bigger site, excerpted, or aggregated.

These conventions have too many exceptions to meaningfully differentiate blogs from other kinds of web sites and services. So here's another way to look at it:

From a social architecture perspective, blogs are structured around a singular voice, even when that "voice" is multivocal and multifaceted. The voice is the container. Within that container, the individual post is the primary content object, and the secondary (comment) objects that attach to posts are important mainly as attributes of post objects.

Likewise the author or authors of a blog are secondary objects, experientially subordinate to individual posts and important mainly as attributes of posts. In other words, in the context of a blog, what you say is more important than who you are.

You can contrast that structure with the structure of social networks (Lee and Sachi have a video about this topic too), where the individual profile (or "about page") is the primary content object, and blog posts by the author are secondary content objects that are important mainly as attributes of their author. Here identity is primary, and blog posts, along with other types of secondary objects, are important mainly as attributes of identity. In other words, who you are is the thing--what you say is how you construct and transmit who you are.

For me, most of all, blogs are social. They include commenting, subscription, and trackback. You can contact the author. They are human, authentic, a little rough around the edges. If none of the above, then it ain't a blog. Architectural considerations aside, you know it when you see it. A blog is, ultimately, an aesthetic.

What do you think?

August 04, 2007

Measuring Community Success: Two Contrarian KPIs for Web 2.0

We've seen a shift recently in how some of the major players in Web measurement are evaluating success. Nielsen/Netratings switched from counting page views to counting time spent--to the derision of some in the analytics community.

Time spent has problems, it's true. But I haven't heard a lot of great suggestions about how you do measure success on participatory web sites, social media sites, or online communities. The truth is, the devil's in the details, there is no silver bullet, and most of all, it depends what you're trying to accomplish.

Yes, everything in web analysis should track back to your goals. Repeat that ten times.

Nonetheless, I will offer up a couple of ideas about how measurement in online community is different from measurement on brochure sites, and how standard measures, to the extent there are any, fall short in typical online communities.

My suggestion is basically that we need to think differently about success, and abandon some of the old-school, quantity-focused measures, like number of posts, number of comments, and number of registered users, for an approach that emphasizes quality. Here are two examples of what I mean:

 

Return Visitor Sign-In, Not New User Registrations

Sign-in is more important than return visit--it's even more important than registration. Participants in online communities typically register to access content they can't access otherwise, to take an action they can't take without registering, or to streamline a future process (e.g. saving credit card information).

So, participants typically register for the promise of a reward. Sign-in is different. When a registered user returns to the site and authenticates, you are a little closer to knowing you gave that person something valuable. In other words, when the cousins get up for more, you can be pretty sure they like your mashed potatoes. If you want to gauge the success of your meal, measure the number of second helpings, not the number of people at the table.
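To make the "second helpings" idea concrete, here's a minimal sketch of a sign-in rate. The function name and the sample user IDs are made up for illustration, and I'm assuming you can export two lists from your own system: everyone registered, and everyone who signed in during the period you care about.

```python
def sign_in_rate(registered_users, signed_in_users):
    """Share of registered users who came back and signed in during the period."""
    registered = set(registered_users)
    if not registered:
        return 0.0
    # Only count sign-ins from people who are actually in the registered set.
    returning = registered & set(signed_in_users)
    return len(returning) / len(registered)


# Hypothetical data: four registered users, two of whom signed in again.
rate = sign_in_rate(["ann", "bo", "cy", "di"], ["bo", "di", "zed"])
print(f"Sign-in rate: {rate:.0%}")
```

Tracked over time, this tells you more than the raw registration count, because it only moves when people come back.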

 

Qualified Content Consumption, Not Number of Posts

I've written in some detail about the value of so-called "passive" participation (I personally wouldn't call information-seeking online a "passive" activity). In short, content consumption tends to be greatly undervalued relative to content creation. While content creation introduces the potential for value, content consumption represents the realization of that value. Tracking content consumption, then, helps you learn not just how much stuff you have, but, more importantly, how good your stuff is.

Part of the problem with quantity measures in social media is that not all content is created equal. The best content, while it's created only once, creates value many times over, while the worst content can actually be detrimental to the participant experience. So the number of content objects on your site tells you much less than does the number of times those objects are consumed--in total, on average, and individually.

But how do you know content consumption when you see it? Content detail page views can be misleading, but I differ from those who'll tell you it's all about time on site in Web 2.0. Instead, count page views qualified by either a time-based or action-based threshold. 15 seconds on a detail page is a decent place to start, but it doesn't account for the effect of tabbed browsing, which can skew time-based measures upward.

Here are some examples of action-based measures for qualified content consumption:

  • file downloads
  • play embedded media
  • copy embed code
  • forward via email
  • scroll to bottom of page
  • print
  • favorite
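
The time-based and action-based thresholds above can be combined in a few lines. This is only a sketch under stated assumptions: the `PageView` record, the action names, and the sample data are all hypothetical, and your analytics tool will have its own event names and export format.

```python
from dataclasses import dataclass, field

# Hypothetical action names, loosely based on the list above.
QUALIFYING_ACTIONS = {
    "file_download", "play_media", "copy_embed", "forward_email",
    "scroll_bottom", "print", "favorite",
}
MIN_SECONDS = 15  # the starting threshold suggested above


@dataclass
class PageView:
    content_id: str
    seconds_on_page: float
    actions: set = field(default_factory=set)


def is_qualified(view, min_seconds=MIN_SECONDS):
    """A view qualifies if it meets the time threshold OR includes a qualifying action."""
    return view.seconds_on_page >= min_seconds or bool(view.actions & QUALIFYING_ACTIONS)


def qualified_views_by_content(views):
    """Count qualified views per content object."""
    counts = {}
    for view in views:
        if is_qualified(view):
            counts[view.content_id] = counts.get(view.content_id, 0) + 1
    return counts


views = [
    PageView("post-1", 4.0),             # too short, no action
    PageView("post-1", 40.0),            # qualifies on time
    PageView("post-1", 3.0, {"print"}),  # qualifies on action
    PageView("post-2", 2.0),             # a bounce
]
counts = qualified_views_by_content(views)
print(counts)
```

Because the action-based test doesn't depend on the clock, it sidesteps the tabbed-browsing problem for any view where the visitor actually did something.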

There are many more, and there's a lot more to say on this topic. Drop me a note in the comments with your thoughts, ideas, and questions.

June 29, 2007

Tagging the Semantic Web with ESP

Go to espgame.org

Here's an interesting bit of online sociability: an anonymous, cooperative game that adds tags to millions of online photos, filling in a part of the metadata required to realize the Semantic Web.

The ESP Game pairs you up anonymously with another player, and shows you both an image. You both type words describing the image, and when one of your words matches a word from the other player, you get a new image. You try to match tags on 15 images in two and a half minutes.

I played for a few minutes this morning, and I have to say I found it much harder than I expected and very addictive. Basically I totally sucked at it...

...and felt strongly compelled to try to improve my score.

The social element is interesting, too. Even though the other player is entirely anonymous, and I had no way to communicate with them, I felt an obligation to do my best, so as not to "let down" my playing partner. Cooperation is a powerful motivator.

I've written in the past about using positive, intrinsic motivation to encourage desired behaviors in community. This is a different kind of example than the one I cited earlier, which focused on displaying the community's behavior around an individual's content. Here the reward is primarily personal, and secondarily social.

An interesting tactic to fill in the Semantic Web's metadata gap. I'm resoundingly unconvinced, though, that this kind of approach could ever generate the consistent, complete, and ongoing tagging required to provide the human-added data element. The game is kind of fun, but I won't be back. I've got plenty of more important things to do.

In contrast, consider Flickr. People tag in Flickr to organize photos, to publicize them, and to re-find them. On Flickr the value of the tagging activity is vested in the tagged object itself, rather than in the entertainment value of the tagging activity. This strikes me as much more promising.

June 05, 2007

The Lurker Myth: Measuring the Value of Passive Participation in Community

In conversations about measurement of community sites, I frequently find myself championing the value of passive participation. Intuitively, it's easy to feel like "lurkers" are somehow taking advantage of a community without doing their fair share. In fact, I believe not only that so-called passive participation is the very lifeblood of community, but also that there are better ways to think about who your users are.

My basic philosophy on this point boils down to this axiom:

Active participation creates potential value, and passive participation realizes that value. Most people do both.

In other words, active and passive participation in and of themselves are equally valuable. While it's tempting to "reward" content contributions, which often reflect a higher degree of engagement, with a higher valuation, active and passive participation actually have a symbiotic relationship. In some situations, such as on a technical support forum, "passive" participation by people who need help accessing helpful content is the whole point.

 

 

It's a nice way of making the point that you need to measure, value, and optimize for passive participation. But of course it's not quite so simple. A single piece of content can be accessed many times, returning value over and over. In terms of individual actions, adding content is more important than accessing it. And some contributions create more potential value than others.

 

 

In this view, you can rank actions like posting a video highest, commenting on the video next highest, rating a video one notch lower, and viewing a video one notch lower still. That kind of engagement-based ranking can help you compile an overall index of community engagement, as I discussed in an earlier post.

This kind of relative view of the value of actions also provides a different lens to look at your users. While it's tempting to think of users as either content contributors or content consumers, participants or lurkers, the truth is nowhere near that tidy. 

This is the old-school way of looking at user segments in online community:

 

 

The basic idea here is that a core group of highly engaged users adds the content that fuels the community. The problem is that this model isn't accurate.

In fact, the majority of your online community users probably both produce and consume content:

 

 

So it's not necessarily useful to think about your participants as either "contributors" or "lurkers." It makes much more sense to think of them in terms of their degree of engagement. Using the community engagement index, you can identify and reward highly engaged participants. You can reach out to less engaged participants with invitations to get more involved. Using behavioral targeting, you can even serve up content reflecting users' degrees of engagement to orient new participants, deepen engagement, and encourage exemplary behavior.

June 01, 2007

The Basics of Online Community Measurement and Analysis

There are a number of ways to think about measuring the value of your online community, and ROI is only one of them. ROI is, frankly, a head-scratcher, and most people in the online community world throw up their hands when it comes to thinking about how to dollarize their community initiatives. The reasoning goes: "We can't tie community participation to purchases, so instead we'll talk about community participation in terms of branding." I will, however, make a suggestion about how to show ROI that will work for some people in some contexts. 

Online Community Brand Lift Survey

The idea here is to get a sense of how exposure to a branded community experience affects brand perception. There are few on-site behaviors that give reliable qualitative measurements of brand perception, so analyzing visitor behavior isn't the best answer. In the usability lab, the tendency of participants to seek to please the test facilitator (a bias related to the Hawthorne Effect) casts doubt on measurements of brand perception.

A series of surveys is the answer. Start with an entry survey that gives you a measurement of brand perception prior to exposure to the community site, and follow up with an exit survey measuring brand perception after exposure to the community site.

You have to do this right for it to work. Find a qualified researcher in your organization, hire a partner, or get yourself a graduate student intern with a background in survey research. You want to present the entry and exit surveys to different groups of people, for example, and make sure the survey is written in a way that gathers exactly the information you want to have. It's not rocket science, but it's a little harder than, say, algebra.
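
At its simplest, the comparison between the two survey groups is a difference of averages. Here's a rough sketch, assuming each respondent gives a single brand-perception rating on a 1-to-5 scale; the scale, the function name, and the sample ratings are all invented for illustration. A qualified researcher would also test whether the difference is statistically meaningful, which this sketch does not do.

```python
from statistics import mean


def brand_lift(entry_scores, exit_scores):
    """Average exit rating minus average entry rating (two different groups)."""
    return mean(exit_scores) - mean(entry_scores)


# Hypothetical 1-to-5 ratings from two separate groups of visitors.
entry_group = [3, 4, 3, 2]  # surveyed before seeing the community site
exit_group = [4, 4, 5, 3]   # surveyed after using the community site
lift = brand_lift(entry_group, exit_group)
print(f"Brand lift: {lift:+.2f} points")
```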

Another critical consideration for your exit survey is to think about what site behaviors qualify users to participate. In other words, how engaged with the community site, at a minimum, do users need to be for you to expect the experience to influence their perception of your brand? Obviously, folks who land on your home page by accident and immediately leave shouldn't be seeing your exit survey--their perception of your brand hasn't likely been influenced. But how much engagement is reasonable to expect? Over 20 seconds on the home page? Fully completed profile, 5 return visits, posted original content?

What is engagement, anyway, and how do you know it when you see it? 

Online Community Engagement Index

Here's an approach to measuring participants' degree of engagement with your site. Start by making a list of all the important trackable actions on your site. Here are some examples: 

  • Home page view 
  • Detail page view 
  • Profile view 
  • Download file 
  • View video 
  • Register 
  • Add comment

You can also include measures like time on site, number of page views, and more. And you should include important combinations of actions, like registration followed by return visit.

Next, rank the actions in order of degree of engagement. For example, registration usually reflects less engagement than completing a profile, posting video content reflects more engagement than commenting on a video, and so on. 

If you want to get fancy about it, you can do a weighted ranking, where for example uploading a photo for your profile counts 5 times more than filling in a where-do-you-live field. 

For a simpler approach, you can group actions into 3-5 categories reflecting different degrees of engagement. (This is a relative measure, so precise numbers about degree of engagement are less important than consistently measuring engagement over time and above all taking action to increase engagement.) 

Then, you can tally up your users' scores. There are lots of ways to slice up the index--median by month, week, day, quarter, year; median among content contributors, registered users, lurkers; proportion of users above and below certain thresholds. Community engagement is a rich measure that can teach you a lot about your site. 
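
Here's a minimal sketch of the weighted version of the index, from action list to tallied scores. The weights are placeholders I made up to show the mechanics, not recommended values, and the event format (pairs of user ID and action name) is an assumption about what your tracking data looks like.

```python
from statistics import median

# Hypothetical weights: higher means the action reflects more engagement.
WEIGHTS = {
    "home_page_view": 1,
    "detail_page_view": 2,
    "profile_view": 2,
    "download_file": 3,
    "view_video": 3,
    "register": 4,
    "add_comment": 5,
}


def engagement_scores(events):
    """Tally a score per user from (user_id, action) pairs."""
    scores = {}
    for user_id, action in events:
        # Actions without a weight contribute nothing to the score.
        scores[user_id] = scores.get(user_id, 0) + WEIGHTS.get(action, 0)
    return scores


def share_at_or_above(scores, threshold):
    """Proportion of users whose score meets a threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores.values() if s >= threshold) / len(scores)


events = [
    ("ann", "home_page_view"), ("ann", "register"), ("ann", "add_comment"),
    ("bo", "home_page_view"),
    ("cy", "detail_page_view"), ("cy", "view_video"),
]
scores = engagement_scores(events)
median_score = median(scores.values())
print(scores, median_score, share_at_or_above(scores, 5))
```

Run the same tally per week or per month and you have the trend line. The exact weights matter less than keeping them consistent from one period to the next.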

And where the learning really gets significant is as you look at the changes over time and across iterations of the site. You can get powerful feedback about how your site management affects the experience of your community. 

One could say engagement is a good thing in and of itself, and I think that's generally true. But when it comes time to pitch for budget dollars, "engagement" doesn't necessarily get you past the skeptical VP. 

The Holy Grail: Return on Community Investment

Here's a way to think about ROI in terms of the relationship between engagement and dollarized conversion. The logic is to show conversion relative to engagement, that is, the degree to which people are more valuable to your business the more engaged they are with your community. 

If your community site has an e-commerce component, or your community is woven throughout your larger site experience, it's easy to look at the relationships between engagement and conversion rate, average order value, and so on. If not, the story gets a little more complicated. 

In the consulting business, we're increasingly working with clients to coordinate constellations of web sites, and tracking behavior across those constellations is part of the leading edge in web analysis. Here's a conceptual picture of how some of this works: 


Understanding the relationship between community engagement and dollarized conversion enables you to calculate the value of a customer who's engaged with the community compared to the value of a customer who isn't. From there you can look at the total number of community participants, total up their relative value, and subtract the cost of the community. Poof! ROI. 
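
The arithmetic is simple enough to sketch. Every number below is invented for illustration, and the function and parameter names are mine. Note one addition that isn't in the description above: besides the net return (value minus cost), the sketch also divides by cost, which is the conventional way to express ROI as a ratio.

```python
def community_roi(value_engaged, value_unengaged, participants, community_cost):
    """Return (net_return, roi_ratio) for a community initiative.

    value_engaged / value_unengaged: average value of a customer who is,
    or is not, engaged with the community.
    """
    lift_per_customer = value_engaged - value_unengaged
    incremental_value = lift_per_customer * participants
    net_return = incremental_value - community_cost
    # The ratio is undefined when the community costs nothing.
    roi_ratio = net_return / community_cost if community_cost else None
    return net_return, roi_ratio


# Hypothetical figures: engaged customers are worth $120 vs. $100,
# 5,000 participants, and the community costs $60,000 to run.
net, roi = community_roi(120.0, 100.0, 5000, 60000.0)
print(f"Net return: ${net:,.0f}, ROI: {roi:.0%}")
```

The hard part isn't this calculation. It's getting a defensible number for the value difference between engaged and unengaged customers.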

Some words of caution. This approach doesn't tell you: 

  • The long-term value of engagement. 
  • The value of brand lift produced by community. 
  • The value of community participants' influence on non-participants.

These are important caveats. It's critical to remember that for community, ROI isn't the whole picture.

And, this approach isn't trivial to implement. You'll need solid in-house expertise or a qualified partner.

Nonetheless, contrary to what some might tell you, and despite the complexity, it IS possible to calculate ROI for some community web sites when you need to show the VP that dollar-sign bottom line. It's a powerful basis for goal setting--and for holding yourself accountable.

For additional reading about site analysis and measurement, I highly recommend the blog of my respected colleague, Anil Batra, along with that of analytics luminary Avinash Kaushik. For in-depth reading, check out Actionable Web Analytics, a new book from (full disclosure) colleagues of mine at ZAAZ.
