AI Marketing
July 2nd 2020

The Natural Language Generation Landscape

The commercial application of natural language generation (NLG) is still in its infancy. Unlike the crowded martech environment, with over 7,000 participants, the NLG landscape is very sparse. In this post, we look at organizations using NLG to craft both long and short-form content, create narratives from structured data, and convert text to speech.

Long-form Content Generation (750+ words)

MarketMuse First Draft is the first and only platform to offer long-form content created using natural language generation. We generate long-form text with the help of deep learning neural networks plus MarketMuse Content Briefs.

These MarketMuse Content Briefs are the exact same ones given to human writers to help them craft better content. The briefs provide a detailed framework from which to create content. With their topics, questions, and subtitles, MarketMuse Content Briefs provide context for the NLG engine to generate relevant text.

Example output from MarketMuse First Draft natural language generation.

The result is a first draft of content that hits all the essential KPIs while requiring minimal editing.

Text Generation (less than 750 words)

For our purposes, we’re defining anything less than 750 words as simple text generation. There are certain situations where a shorter-form narrative is more appropriate, such as email and web copywriting.

Two offerings fall into this category, although their purposes are radically different.

Articoolo is aimed at publishers who need website articles up to 500 words. All that’s required is a topic of two to five words and the desired word count. While the technology is encouraging, the value of such short and superficial articles is limited.

Phrasee has done an excellent job in tailoring its product offerings for specific use cases requiring text that’s short in length. These are high-value situations that benefit from high-impact and concise copy.

Phrasee Email is used for email subject lines, preheaders, headlines, subhead copy, and calls to action. Phrasee Push is used for mobile app push messages. Phrasee Social is used for crafting Facebook and Instagram messages, while Phrasee Everywhere helps with AdWords, landing page, and display ad copy.

Article Rewriters and Mixers

The most primitive form of computer-assisted article generation is article rewriting, also known as article spinning. Article spinners have existed for over ten years, and SEOs use them to quickly produce massive quantities of low-quality content for linking networks.

This is not natural language generation.

Humans rarely visit these sites or read these pages. These are blog networks designed to exploit Google PageRank so that certain pages rank well in search.

The premise of article spinning is simple. Take an original piece of text and substitute different words to create a new version. Early attempts suffered from poor word choice when selecting substitutions. 

Here’s the above paragraph, run through an article spinner.

“The reason of text rewriting is straightforward. Take a unique bit of content and substitute various words to make new unique content. Early endeavors experienced poor word decision while choosing substitutions.”

While grammatically correct, it’s awkward. Improvements using this approach so far have been minimal. Some do use Google’s Natural Language Processing API to conduct syntax analysis, identifying parts of speech (PoS) and extracting tokens and sentences. However, the quality of the output continues to be weak, and the target market for these products remains the same. 
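To make the substitution premise concrete, here is a minimal sketch of a toy spinner in Python. The synonym table and the `spin` function are hypothetical and written only for illustration; they are not taken from any of the products discussed here, which presumably rely on far larger thesauri.

```python
import re

# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "premise": "reason",
    "simple": "straightforward",
    "original": "unique",
    "piece": "bit",
    "text": "content",
    "different": "various",
    "create": "make",
}


def spin(text, synonyms=SYNONYMS):
    """Swap each known word for its synonym, keeping initial capitalization."""

    def swap(match):
        word = match.group(0)
        replacement = synonyms.get(word.lower())
        if replacement is None:
            return word  # unknown words pass through unchanged
        return replacement.capitalize() if word[0].isupper() else replacement

    # Substitute word by word, with no awareness of the surrounding context.
    return re.sub(r"[A-Za-z]+", swap, text)


print(spin("Take an original piece of text."))
# prints: Take an unique bit of content.
```

Because each word is replaced in isolation, the output keeps "an" in front of "unique". That is the kind of poor word choice that context-free substitution tends to produce.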

A few companies working in this space include WordAi, SEO Article Generator, AI Spinner, and Chimp Rewriter. No doubt, there are many more, but none of them are any good. Although they may try to position themselves as such, these products have little to do with artificial intelligence or natural language processing.

Article mixers are another class of content generator that has little to do with natural language generation, despite how they may be marketed. As the name implies, article mixing involves mixing sentences from topically-related pages, weaving them into a narrative, and substituting specific phrases using synonyms.

There are issues here at both the macro and micro levels. There’s no real overall structure to these pieces. Even at the sentence level, the choices made seem somewhat arbitrary.

Two companies whose products fit into the category of article mixing are Article Forge and AI Writer.

Structured Data Narrative 

Applications in this category take sets of highly-structured data and turn them into a narrative. The Associated Press produces nearly 4,000 company earnings articles quarterly with the help of artificial intelligence. Ecommerce sites can also create product descriptions, category stories, and newsletters using this method. 

There are numerous use cases for this approach, as long as you have the structured data to support it. That’s the critical factor in making this work at scale. In the earnings report example, the overall story is quite simple and never changes. What makes each story different are the variables. Here’s an example of an Apple earnings report from the Associated Press.

Here are some brands working in this space:

These platforms use either a template-based approach or create documents dynamically. The simplest is a gap-fill approach where data is filled in the gaps within the template.
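A gap-fill template can be sketched in a few lines of Python. The template wording, field names, and figures below are all made up for illustration; they do not come from the Associated Press or from any platform covered in this post.

```python
# Hypothetical earnings template; the gaps are the named fields in braces.
TEMPLATE = (
    "{company} reported {period} revenue of ${revenue_billions:.1f} billion "
    "and earnings of ${eps:.2f} per share."
)


def fill(template, record):
    """Fill the gaps in the template with values from one structured record."""
    return template.format(**record)


# Made-up record standing in for one row of structured data.
record = {
    "company": "Example Corp",
    "period": "second-quarter",
    "revenue_billions": 12.5,
    "eps": 1.2,
}

print(fill(TEMPLATE, record))
# prints: Example Corp reported second-quarter revenue of $12.5 billion and earnings of $1.20 per share.
```

The story never changes; only the variables do. Feeding a different record through the same template yields a different article.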

Using web templating languages, scripts, or rules to produce text is a step up from simple gap filling. But without sophisticated linguistic capabilities, this approach struggles to generate high-quality text.
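As a rough sketch of what rules add over gap filling, the hypothetical function below chooses its wording based on the data instead of only dropping values into fixed slots. The function name, thresholds, and figures are assumptions made for this example, and it assumes the previous figure is not zero.

```python
def revenue_sentence(company, current, previous):
    """Choose wording with simple rules based on how revenue changed.

    Assumes `previous` is non-zero; amounts are in billions of dollars.
    """
    change = (current - previous) / previous * 100
    if change > 0:
        verb = "rose"
    elif change < 0:
        verb = "fell"
    else:
        # A separate sentence pattern when there is nothing to compare.
        return f"{company}'s revenue was unchanged at ${current:.1f} billion."
    return f"{company}'s revenue {verb} {abs(change):.1f}% to ${current:.1f} billion."


print(revenue_sentence("Example Corp", 11.0, 10.0))
# prints: Example Corp's revenue rose 10.0% to $11.0 billion.
```

Every new situation needs another hand-written rule, which is one reason this style is hard to scale to varied, high-quality text.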

Word-level grammatical functions make it relatively easy to write complex templates, as they can deal with orthography, morphology, morphophonology, and their exceptions. But make no mistake, generating quality output in this way remains a significant challenge.
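The helpers below are a minimal sketch of what word-level grammatical functions might look like: one handles plural spelling (orthography and morphology), the other picks "a" or "an" (morphophonology), and both carry small exception lists. The function names and the tiny exception tables are hypothetical; a production system would need far broader coverage.

```python
# Hypothetical exception tables, deliberately tiny.
IRREGULAR_PLURALS = {"child": "children", "person": "people"}
AN_EXCEPTIONS = {"hour", "honest"}        # silent "h", so the word takes "an"
A_EXCEPTIONS = {"unique", "user", "one"}  # vowel letter but consonant sound


def pluralize(noun, count):
    """Return the noun in singular or plural form to agree with count."""
    if count == 1:
        return noun
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and noun[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"  # consonant + y, as in category/categories
    return noun + "s"


def indefinite_article(word):
    """Pick "a" or "an" based on the word's initial sound."""
    lower = word.lower()
    if lower in AN_EXCEPTIONS:
        return "an"
    if lower in A_EXCEPTIONS:
        return "a"
    return "an" if lower[0] in "aeiou" else "a"


def count_phrase(count, noun):
    """Build a phrase such as "3 categories" with number agreement."""
    return f"{count} {pluralize(noun, count)}"


print(count_phrase(3, "category"))                # prints: 3 categories
print(indefinite_article("unique"), "offer")      # prints: a offer
```

A template can then call these functions instead of hard-coding word forms, so one template covers singular and plural cases. The exception tables hint at why quality remains hard: every language has many more irregular cases than a short list can capture.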

Text to Speech

Text to speech converts written text into natural-sounding audio in a variety of languages. It can be used in chatbot and voice assistant interactions, for turning ebooks into audiobooks, and in in-car navigation systems.

Recently, companies have been using deep neural networks to synthesize speech that is nearly identical to human recordings. Human-like speech patterns, intonation, and articulation significantly reduce listening fatigue when interacting with AI systems.

A handful of well-known organizations dominate this area:

Summary

In the last couple of years, natural language generation has primarily focused on text-to-speech and generating narratives from highly-structured data. With MarketMuse First Draft, marketers can now take advantage of NLG to produce long-form content.

Stephen Jeske
