February 27th 2020

How to Classify User Intent at Scale

As a Machine Learning Research Engineer at MarketMuse, I spend my days immersed in computer code and numbers. But for this post, let’s put that all aside as we look into the exciting field of user intent classification. 

The ability to classify user intent at scale has enormous potential for marketers everywhere. In this post, I touch briefly on what user intent is, the benefits classification can bring, how some people are approaching the problem, and the strategy MarketMuse is taking.

What Is User Intent?

We use human language to express our intention. In the course of normal conversation, this is rarely a problem. However, in the case of search queries, where a searcher typically uses as few words as possible, intent can be difficult to discern. 

We’ve written extensively on the subject of search intent, so I’ll skip the preliminaries and get right to the heart of the matter. Behind every search phrase, there is an intent. The ability of content to satisfy that intent is a major factor in its placement in the search results. Even a well-written and topically rich article will perform poorly in search if it does not adequately address that intent.

It’s for this reason that we’re looking to classify user intent at scale.

The Benefits of User Intent Classification

Classifying user intent will significantly impact the MarketMuse platform. There are a number of ways in which those using MarketMuse can potentially benefit. 

For example, we maintain a historical SERP service where we store the features of 60 million topics. Adding user intent into the mix allows us to observe correlations between changes in features and intent. We can analyze top-performing pages in the SERP that have maintained their standing and see how they have adapted in response to changing intent.

We also maintain a knowledge graph service with over 50 million topics that powers our MarketMuse Content Briefs, providing topical recommendations. Intent classification will help refine those recommendations. The same principle applies to our questions engine.

These improvements have a trickle-down effect. Because our content briefs drive MarketMuse First Draft, our natural language generation (NLG) engine, any improvements to the briefs will be reflected in the NLG output.

Current State of User Intent Research

There’s not a lot of research on user intent as it relates to Search. Most of the studies focus on general user intent in client-facing applications. For example, if you’re developing a chatbot for a website, you want to classify user intent so you can answer a visitor or direct them to the appropriate resource.

Slowly, interest is developing in trying to classify user intent at scale for Search. A couple of popular examples, of which you may be aware, come from Kane Jamison at Content Harmony and Hamlet Batista at RankSense.

Content Harmony takes a heuristic approach to classifying search intent. Looking at the SERP features for a given query, they determine its type. For example, if they see a shopping list, it must be a transactional query; if they see a news item, it must be informational.
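A feature-based heuristic of this kind can be sketched in a few lines. This is only an illustration of the general idea, not Content Harmony’s actual implementation: the feature names `shopping_results` and `news_results` and the function `intent_from_serp_features` are hypothetical, and the mapping covers just the two examples mentioned above.

```python
# Hypothetical SERP feature names; the mapping only mirrors the two
# examples given in the text and is not a real product's rule set.
FEATURE_TO_INTENT = {
    "shopping_results": "transactional",
    "news_results": "informational",
}


def intent_from_serp_features(features):
    """Return the intent of the first recognized SERP feature, else 'unknown'."""
    for feature in features:
        intent = FEATURE_TO_INTENT.get(feature)
        if intent is not None:
            return intent
    return "unknown"


print(intent_from_serp_features(["image_pack", "shopping_results"]))
```

Because every rule is tied to a specific feature name, any change to how the SERP is laid out silently changes the output of a function like this.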

As they have acknowledged in their blog post, this method has many drawbacks. It’s volatile. Any changes Google makes to the format of their search features can cause misclassification of intent. The same goes for seasonal changes to the intent of a query, such as around Black Friday or Christmas. According to Content Harmony, with their approach, you need to keep an eye on changes in the rankings from season to season to detect those shifts in user intent.

The approach that Hamlet Batista discusses in his Search Engine Journal article is similar to our strategy. So I’ll discuss how we’re going about classifying search intent in a scalable manner.

The MarketMuse Approach to User Intent Classification

At MarketMuse, we’re taking advantage of the recent advances in machine learning and natural language generation to develop a reliable classifier that can be employed at scale. We’ve created a deep learning user intent classification model based on a dataset of thousands of queries. Our model takes an active learning approach, continually collecting new data and improving its abilities.

Our approach takes to heart Occam’s Razor, a problem-solving principle stating that “entities should not be multiplied without necessity.” To keep things simple, we’ll first test whether search queries can be classified using the query text alone.

Looking at the thousands of MarketMuse Content Briefs generated, we notice that the search intent typically falls into one of these categories: 

  • Comparison
  • Local
  • Informational
  • Transactional

So that’s where we’re starting.
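To make the query-only idea concrete, here is a minimal sketch that classifies a query into the four categories above using nothing but its words. It is not our deep learning model: it swaps in a simple multinomial Naive Bayes classifier with add-one smoothing, and the class name `QueryIntentClassifier` and the training queries are invented for illustration.

```python
import math
from collections import Counter, defaultdict


class QueryIntentClassifier:
    """Multinomial Naive Bayes over query words with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # intent -> word -> count
        self.label_counts = Counter()            # intent -> number of queries
        self.vocabulary = set()

    def train(self, labeled_queries):
        for query, intent in labeled_queries:
            words = query.lower().split()
            self.label_counts[intent] += 1
            self.word_counts[intent].update(words)
            self.vocabulary.update(words)

    def scores(self, query):
        """Return a log-probability score for each known intent."""
        words = query.lower().split()
        total_queries = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for intent, count in self.label_counts.items():
            score = math.log(count / total_queries)
            denominator = sum(self.word_counts[intent].values()) + vocab_size
            for word in words:
                score += math.log((self.word_counts[intent][word] + 1) / denominator)
            result[intent] = score
        return result

    def predict(self, query):
        scores = self.scores(query)
        return max(scores, key=scores.get)


# Invented training queries, for illustration only.
TRAINING_QUERIES = [
    ("iphone vs pixel", "Comparison"),
    ("best laptop vs tablet", "Comparison"),
    ("coffee shop near me", "Local"),
    ("plumber near me", "Local"),
    ("how to bake bread", "Informational"),
    ("what is user intent", "Informational"),
    ("buy running shoes", "Transactional"),
    ("buy cheap flights", "Transactional"),
]

classifier = QueryIntentClassifier()
classifier.train(TRAINING_QUERIES)
print(classifier.predict("dentist near me"))
```

A toy like this only picks up on obvious trigger words such as “vs”, “near me”, or “buy”. Whether the query text alone carries enough signal for harder cases is exactly the open question.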

The question that remains is whether it will be that simple. Is the model strong enough to classify based on just a keyword phrase? If not, there are other things we can take into consideration.

Of course, we need to account for seasonality. From experience, we know that the intent Google serves for many search queries changes with the season. One of the main parameters we will keep track of is user intent over long periods of time. That will be very helpful in our classification because, for many queries, the time of year directly influences search intent.
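One simple way to picture tracking intent over time is to store dated intent labels per query and summarize them by calendar month. The sketch below is an assumption about how such a record could look, not a description of our production service; `IntentHistory` and the sample observations are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


class IntentHistory:
    """Stores dated intent labels per query and summarizes them by month."""

    def __init__(self):
        # query -> month number (1-12) -> Counter of observed intents
        self._by_month = defaultdict(lambda: defaultdict(Counter))

    def record(self, query, observed_on, intent):
        self._by_month[query][observed_on.month][intent] += 1

    def dominant_intent_by_month(self, query):
        """Return {month: most frequently observed intent} for the query."""
        months = self._by_month.get(query, {})
        return {
            month: counts.most_common(1)[0][0]
            for month, counts in sorted(months.items())
        }


# Invented observations, for illustration only.
history = IntentHistory()
history.record("gift ideas", date(2019, 6, 10), "Informational")
history.record("gift ideas", date(2019, 11, 29), "Transactional")
print(history.dominant_intent_by_month("gift ideas"))
```

A query whose dominant intent differs between months is a candidate for seasonal handling.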

Perhaps the most exciting aspect of our model is that it will continuously learn and improve. Typically, a model is trained to perform a function and that’s it. When it stops working well, it has to be retrained from scratch with a new set of data.

With our approach, the model is not static. It’s always feeding on new data, learning from its mistakes and correcting itself. That means we can respond to those changes in search intent as they vary over time.
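A common ingredient of active learning is deciding which new queries are worth sending for labeling. The sketch below uses margin-based uncertainty sampling, which is one standard strategy and an assumption on my part for this illustration. The function `select_for_labeling` and the probabilities are hypothetical.

```python
def select_for_labeling(predictions, budget):
    """Pick the queries whose top two class probabilities are closest.

    predictions maps each query to a dict of {intent: probability}.
    """
    def margin(probabilities):
        top_two = sorted(probabilities.values(), reverse=True)[:2]
        if len(top_two) < 2:
            return top_two[0]
        return top_two[0] - top_two[1]

    ranked = sorted(predictions, key=lambda query: margin(predictions[query]))
    return ranked[:budget]


# Invented model outputs, for illustration only.
predictions = {
    "apple": {"Informational": 0.40, "Transactional": 0.38, "Local": 0.22},
    "buy iphone 11": {"Transactional": 0.95, "Informational": 0.05},
}
print(select_for_labeling(predictions, budget=1))
```

The queries selected this way are the ones the model is least sure about, so labeling them and adding them to the training data tends to be a better use of effort than labeling queries it already handles confidently.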

Summary

Searchers use keyword phrases to express their intent. The ability to recognize that intent, and its changes over time, can significantly affect content strategy and how content is created. At MarketMuse, we are pursuing intent classification with the help of artificial intelligence. Specifically, we’re using a machine learning model that is continually training itself, learning from its mistakes, adjusting and making improvements.

Written by Ahmed Dawod ahmed_n1