Testing your content: what's the best approach?

Matt Fenwick

Content Strategist, True North Content

There’s one way to know for sure if content works: ask the people we’re writing it for.

When we’re making decisions about content, it’s so easy to get caught in duelling opinions. To get attached. Whether we work in a technical field, a corporate team or communications, there’s a way of writing that makes sense to us. So when we're working on content with people from other teams, everyone digs in and defends their version.

Evidence calls a cease-fire. Instead of trying to argue the other side round to our opinion, we ask what the data tells us about what users need. It shifts the conversation from what we think to what we can see out there.

When we look for that evidence, plenty of quantitative approaches will indicate whether content works.

Quantitative approaches can be useful: they’re easy to scale up and they capture trends across a whole site. Google Analytics, for example, will show how people behave when they interact with our content. Where do they enter the site? What do they do next?

Quantitative approaches tell us the 'what', but don’t tell us why. Why are people leaving our site after five seconds? What’s going on for them?

To understand whether your content works for people, there’s no substitute for user-testing.

So how do you do it?

I’ll run through three approaches to user-testing content that I’ve found. And share some thoughts on which one works best for me.

The Cloze test

The Cloze test uses a written comprehension exercise.

1. Take a passage of text


I’ll follow the version of the Cloze test as laid out by Jakob Nielsen. Let's start with a simple sentence:

“Yesterday, I went down to the supermarket to buy some fruit, toilet paper and laundry detergent.”

2. Remove words


Now remove every sixth word and replace it with a blank (you can remove words less often if you like):

“Yesterday, I went down to the _________ to buy some fruit, toilet ______ and laundry detergent.”
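
If you’re preparing a longer passage, the word-removal step is easy to script. Here’s a minimal Python sketch of it; note that it splits on whitespace, so punctuation stays attached to words, and exactly which words land on the blanks depends on where you start counting.

    # A rough sketch of the word-removal step for a Cloze test.
    # It blanks every nth word and keeps the removed words for scoring later.

    def make_cloze(passage: str, interval: int = 6) -> tuple[str, list[str]]:
        """Replace every `interval`-th word with a blank; return the text and the removed words."""
        words = passage.split()
        removed = []
        for i in range(interval - 1, len(words), interval):
            removed.append(words[i])
            words[i] = "_________"
        return " ".join(words), removed

    passage = ("Yesterday, I went down to the supermarket to buy some fruit, "
               "toilet paper and laundry detergent.")
    cloze_text, answers = make_cloze(passage)
    print(cloze_text)  # the passage with blanks
    print(answers)     # the original words, to compare against users' responses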

3. Test with users


Give the text to a user and ask them to fill in the blanks. Let’s say the user writes:

“Yesterday, I went down to the grocery store to buy some fruit, toilet paper and laundry detergent.”

4. Score their response


Count how many times the user inserts the “correct” word.

When you’re marking the answers, look for whether the user has captured the intent of the passage, rather than used exactly the same words as you. I’m writing this in Australia, where we say “supermarket”. But if someone wrote “grocery store” instead, that’s still consistent with the logic of the sentence, so you’d mark it as correct.

If the user gets more than 60% right, the Cloze test says that the content is reasonably clear.
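
Scoring can be scripted too, though the call on acceptable synonyms (like “grocery store” for “supermarket”) is still a human judgment. A rough Python sketch, assuming you record the accepted words for each blank by hand:

    # A rough sketch of Cloze scoring against the "more than 60%" guideline.
    # `accepted` holds the words you'd count as correct for each blank,
    # including synonyms you've judged to fit the intent of the passage.

    def cloze_score(accepted: list[set[str]], responses: list[str]) -> float:
        """Return the share of blanks where the user's response matches an accepted word."""
        correct = sum(1 for ok, given in zip(accepted, responses)
                      if given.strip().lower() in ok)
        return correct / len(accepted)

    accepted = [{"supermarket", "grocery store"}, {"paper"}]
    responses = ["grocery store", "paper"]

    score = cloze_score(accepted, responses)
    print(f"{score:.0%} correct; clears the 60% guideline: {score > 0.6}")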

Does Cloze testing work?


The Cloze test has plenty of support, and has a solid pedigree in teaching English as a second language. But it has some downsides for testing content.

The result looks definitive: the word the user chooses either matches our version or it doesn’t. But a match doesn’t get us any closer to knowing whether the content actually works. Here’s why:

  1. By removing words mechanically, regardless of what they mean, you strip away context and meaning. On the most basic level, people aren’t using the content. They’re using a patchy reproduction that’s been stripped of information.
  2. It tests people’s ability to fill in blanks, not their ability to understand the language in the context of a task. It doesn’t tell us whether the content is useful.

Let’s tease out the last point — filling in blanks versus being useful. Can you guess the next shape in the series?

An image shows a sequence of shapes: a square, a triangle and a circle, then a square, a triangle and a question mark.


It's a circle, right? So the pattern is clear. But is this row of shapes useful? Does it help you to do anything?

That’s why the Cloze test is attractive but artificial. It looks scientific, without indicating how people use your content in the real world.

The tap test

This test observes users’ behaviour as they read the text, and it’s pretty straightforward to run. You ask a user to read a passage of text and tap a steady beat as they go. The theory is that the parts where they slow down are the parts that are hard to read.

Does the tap test work?

I don't think so. There are two problems with the tap test:

1. It’s unnatural

When we’re testing, we want to observe users in the wild, behaving as they would normally, so that the data from the test reflects how they’ll interact with the content in the real world.

Asking them to tap distorts the environment. Nobody taps a steady beat while they’re doing something else (except for drummers!). Introducing this task requires extra concentration and distracts the user from the text — the very thing we’re trying to measure.

2. It’s misleading

Slowing down could indicate that the text is difficult to read, but that's not the only explanation. There are plenty of other times when we might slow down:

  • when we’re sad
  • when the thing we’re reading about is important
  • when the thing we’re reading about is unusual
  • when we’re overwhelmed by beauty

When I applied for a visa, the form instructions were pretty straightforward. But I still read everything three times because I really wanted to get it right. A tap test would say the content was problematic. My experience says otherwise.

Like the Cloze test, the tap test is attractive because it gives us data, but it’s ultimately unhelpful. It doesn’t get us any closer to answering the question: can people actually use this content?

Task-based testing

That leaves us with task-based testing: giving users some content and asking them to complete a task using that content.

The technique I use is adapted from Meghan Casey’s excellent book, The Content Strategy Toolkit.

1. Define a task with detail

Frame the task in a way that makes the content necessary. User-testing tasks that evaluate information architectures and interfaces are often too broad to test the content. Here’s an example:

“Find information on bin collections.”

With a task like that, the test ends as soon as the user lands on the right page. As Angela Colter put it in one of the first articles on testing content:

“It’s tempting to ask these questions, but they won’t help you assess whether your content is appropriate for your audience.”

We want to go further, using something like this:

“You’ve noticed that your bin was not collected on time. Find out what to do about this.”

In the second version, the task draws on information from the content: the user can’t complete it without reading and understanding the page.

2. Give them the actual product — or close to it

Structure and content work together. Content informs how people use the structure, and vice versa. So don’t just hand them a print-out of plain text in a Word document. Give them a high-fidelity prototype (wireframes with actual content), or the website itself.

You can print out web pages, but I use an iPad because it’s shiny and saves wrangling bits of paper.

3. Capture their experience

There are two parts to this: talk-aloud and markup.

Talk aloud

Talk-aloud is the classic user-testing technique: ask people to verbalise their experience as they go. This article from Nielsen Norman Group has good advice on how to do it.

Markup

Ask users to highlight any sections they find confusing. Other versions of this test ask people to mark up both the sections they find useful and the sections they find confusing. I prefer to keep it simple and just capture the confusing parts, for two reasons:

  1. This is usability testing, so we’re not as interested in likes/dislikes. We want to know “can people use this?” Concentrate on pinpointing problems.
  2. Any additional activity, even the talk-aloud protocol, introduces an element of artificiality. There’s a risk that you distort behaviour. Minimise the complexity of the additional work you’re asking users to do.

If you like, you can add a reflection question at the end of the test that does capture emotional response or brand experience.

Once you have the feedback from sessions, collate it. If there’s a passage with a lot of red highlighter, you know that content has problems.
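
Collating is easy enough with highlighters and a tally sheet, but if you log the markup digitally, a few lines of Python will surface the worst passages. The passage labels below are invented for the example; use whatever headings or section IDs your content actually has.

    # A rough sketch of collating markup across test sessions.
    # Each set holds the passages one participant flagged as confusing.
    from collections import Counter

    sessions = [
        {"missed collection", "report a problem"},  # participant 1
        {"missed collection"},                      # participant 2
        {"missed collection", "collection days"},   # participant 3
    ]

    counts = Counter(passage for session in sessions for passage in session)
    threshold = len(sessions) / 2  # flagged by half or more of participants

    for passage, count in counts.most_common():
        status = "rewrite" if count >= threshold else "keep an eye on"
        print(f"{passage}: {count}/{len(sessions)} participants found it confusing ({status})")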

It's time to test your content

Task-based testing wins, because it’s accurate and honest. The insights get you pretty close to predicting how people will respond to your content in real life. There’s no secret formula and no shortcuts. Just users and you.


About the author

Matt Fenwick

Matt Fenwick is a content strategist for government and peak bodies. He enjoys big hairy content projects, like rewriting government travel information for most of the countries in the world. Matt runs True North Content and lives in Canberra, Australia, which is the one city where kangaroos really do roam the streets at night. You can connect with him on LinkedIn.
