Market Research using Natural Language Processing

Natural language processing (NLP) is a powerful tool that enables us at Monte to analyze large volumes of text, look for hidden patterns, and “data mine” the text to summarize opinions and feelings. We use NLP to understand trends and market segmentation, and to compute correlations between brands and specific terms, such as product quality or specific feature words.

Sentiment analysis of social media comments data by Monte

Figure 1 plots the feelings measured in social media comments containing one of three words: birthday, ethnography, or funeral. This kind of natural language processing is called sentiment (feelings) analysis. The plot shows that birthday consistently scores higher than ethnography or funeral, and that funeral scores below 0 (neutral).
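As a rough illustration of how a keyword-level sentiment score of this kind might be computed, here is a minimal sketch that assumes a simple word-polarity lexicon. The lexicon entries, the sample comments, and the function names are all invented for illustration; this is not Monte's proprietary scoring system.

```python
import re
from statistics import mean

# Hypothetical polarity lexicon (word -> score in [-1, 1]); illustrative only.
POLARITY = {"happy": 1.0, "love": 1.0, "fun": 0.8,
            "sad": -1.0, "loss": -0.8, "miss": -0.5}

def score_comment(text):
    """Mean polarity of the lexicon words in a comment; 0.0 (neutral) if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [POLARITY[w] for w in words if w in POLARITY]
    return mean(hits) if hits else 0.0

def mean_score_by_keyword(comments, keywords):
    """Average comment score per keyword, over the comments that mention it."""
    result = {}
    for kw in keywords:
        scores = [score_comment(c) for c in comments if kw in c.lower()]
        result[kw] = mean(scores) if scores else None
    return result

# Invented sample comments
comments = [
    "Happy birthday! Love you, have fun",
    "The funeral was so sad, we miss him",
    "Reading an ethnography for class",
]
scores = mean_score_by_keyword(comments, ["birthday", "ethnography", "funeral"])
print(scores)
```

With these toy inputs, birthday scores above zero, ethnography sits at neutral, and funeral falls below zero, which is the same ordering the figure describes.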

In language research, we treat text as a data source, which allows us to conduct market studies and draw inferences about market drivers and brand perceptions. We can answer questions such as “How is the public perception of our industry changing over time?” When a brand name is discussed widely enough in public forums, we can use big data methods to extract “opinion signals” from the subtle ways people write about products on social media.

Public sentiment analysis market research data by Monte

Figure 2 is an example of researching comments that include a brand name, in this case Jeep, and measuring the feelings expressed in those social media statements. The plot is a time series showing the public response to several positive events, followed by much lower scores after Springsteen’s DUI was announced.
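A sentiment time series like this can be built by averaging per-comment scores by day. The sketch below assumes each comment mentioning the brand has already been scored; the dates and scores are invented placeholders and do not reflect the data behind Figure 2.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def daily_mean_sentiment(scored_comments):
    """Group (date, score) pairs by day; return (date, mean score) pairs in date order."""
    by_day = defaultdict(list)
    for day, score in scored_comments:
        by_day[day].append(score)
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]

# Invented (date, sentiment score) pairs for comments that mention a brand
scored = [
    (date(2021, 1, 1), 0.6), (date(2021, 1, 1), 0.4),
    (date(2021, 1, 2), 0.5),
    (date(2021, 1, 4), -0.3), (date(2021, 1, 4), -0.5),
]
series = daily_mean_sentiment(scored)
for day, value in series:
    print(day.isoformat(), round(value, 2))
```

A drop in the daily mean after a negative news event would show up as a dip in this series.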

Replacing Surveys with NLP & Data Mining

This NLP-based opinion signal lets us derive an understanding of how ad campaigns, events, tradeshows, and PR are impacting the public. We make this data understandable by producing well-designed visualizations, including semantic network diagrams and multidimensional plots, and by employing best practices from data science.

Semantic network diagram word map by Monte

Figure 3 is a semantic network diagram showing correlations between words for an Olympic sponsor during the 2021 Olympic Games in Japan. Nodes are scaled by word frequency, positive sentiment words are colored in purple, and negative sentiment words are colored orange. We use these plots to see the “whole mind” of social expression and understand how topics are linked together. See Figure 4 below for a nice example.

Correlation network semantic cluster by Monte

Figure 4 is a close-up of a correlation network with a semantic cluster illustrating Coca-Cola’s desired sponsorship association with the Olympic Games, but also the undesirable link with protests against human rights violations in China, the host of the 2022 Winter Olympic Games.
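One simple way to build a word correlation network like those in Figures 3 and 4 is to count how many comments each word appears in (node size) and compute a correlation between the presence of each word pair (edge weight). The sketch below uses the phi coefficient as the correlation measure, which is an assumption; the source does not say which measure is used, and the sample comments and threshold are invented.

```python
import math
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Set of distinct lowercase words in a comment."""
    return set(re.findall(r"[a-z]+", text.lower()))

def phi(n11, n1, n2, n):
    """Phi coefficient between two binary indicators (word present or absent)."""
    denom = math.sqrt(n1 * (n - n1) * n2 * (n - n2))
    return (n * n11 - n1 * n2) / denom if denom else 0.0

def build_network(comments, min_phi=0.3):
    """Return (node sizes, edges) where edges map word pairs to their phi value."""
    docs = [tokenize(c) for c in comments]
    n = len(docs)
    freq = Counter(w for d in docs for w in d)  # number of comments containing each word
    pairs = Counter(p for d in docs for p in combinations(sorted(d), 2))
    edges = {}
    for (a, b), n11 in pairs.items():
        r = phi(n11, freq[a], freq[b], n)
        if r >= min_phi:  # keep only reasonably strong positive links
            edges[(a, b)] = r
    return freq, edges

# Invented sample comments
comments = [
    "olympics sponsor pride",
    "olympics sponsor",
    "protest rights",
    "protest rights olympics",
    "weather",
]
freq, edges = build_network(comments)
for pair, r in sorted(edges.items(), key=lambda kv: -kv[1]):
    print(pair, round(r, 2))
```

Words that always appear together, such as "protest" and "rights" in this toy data, form the tight clusters that the close-up highlights.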

Sentiment analysis of Olympic sponsorships by Monte

Figure 5 shows sentiment analysis of an Olympic sponsor brand (Samsung) vs a non-sponsoring competitor (Apple) during the 2021 Olympic and Paralympic Games in Japan. Samsung appears to have experienced a lift during the period spanning the games. This research was the first to analyze sponsorship from a socially interactive response perspective.

Comparing our Sentiment Models to Benchmark Business Metrics

Our proprietary NLP methods for investigating consumer sentiment provide insights that are actionable on short timescales, whereas the University of Michigan Consumer Sentiment Index is published only once per month. We nevertheless validate our monthly mean values against this benchmark to monitor our performance.

Figure 6 shows that our proprietary sentiment scoring system can take comments including a single term (economy) and provide a time-varying response very similar to the University of Michigan Consumer Sentiment Index. This NLP time series analysis could be enhanced by factoring in more terms, to make it track the index more closely.
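One common way to check a monthly sentiment series against a benchmark is the Pearson correlation between the two. The sketch below is an assumption about how such a validation could look, not a description of Monte's procedure, and both series are invented placeholder values rather than real index data.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Placeholder monthly means (invented, four consecutive months)
monthly_sentiment = [0.10, 0.12, 0.08, 0.05]   # mean score of comments containing a term
benchmark_index = [80.0, 83.0, 78.0, 76.0]     # stand-in for a published monthly index

r = pearson(monthly_sentiment, benchmark_index)
print(round(r, 3))
```

A value of r close to 1 would indicate that the two series rise and fall together month to month.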

Diverse Text Sources

We analyze data mined from social media platforms and websites such as Twitter, Reddit, and MarketWatch. Furthermore, we have published academic research on natural language processing for marketing purposes with collaborator Dr. Jun Min. Publishing NLP research ensures that we remain at the forefront of both technology and market direction. In 2021 our presentation to the Society for Marketing Advances was awarded best paper. Our most recent peer-reviewed manuscript was published by the American Marketing Association and analyzed the social impact of Olympic sponsorship by Samsung, Visa, and Coca-Cola.

Language Research is Business Science

The market research and analytics we provide, sometimes called “business science,” give our clients the advantage of studying the broader mind of the market: monitoring key topics in the public ‘conversation’ and recognizing when opinions shift. This lets them respond quickly to positive events to maximize their benefit and, when the news is negative, to mitigate the impact on their brand. We give them a new pair of eyes, which can “see into the mind” of the public reflected in patterns of word use. Using NLP, we can score sentiment on a simple positive/negative scale or assign categories of sentiment and score how many words align with each category. For example, innovation is a topic of interest to us, so we can review the feelings correlated with tweets containing this word.

Natural language processing data categorized by types of feelings

Figure 7 shows how natural language processing enables us to categorize text by the types of feelings expressed.
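Category-based scoring of this kind can be sketched as a lookup from words to feeling categories, counting how many words in the matching text fall into each category. The category lexicon, sample tweets, and function name below are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Hypothetical category lexicon (word -> feeling category); illustrative only.
CATEGORY = {
    "excited": "joy", "love": "joy",
    "trust": "trust", "reliable": "trust",
    "worried": "fear", "risky": "fear",
}

def category_counts(texts, keyword):
    """Count lexicon words per category, over the texts that contain the keyword."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        if keyword in words:
            counts.update(CATEGORY[w] for w in words if w in CATEGORY)
    return counts

# Invented sample tweets
tweets = [
    "Excited about innovation, love it",
    "Innovation feels risky",
    "I trust reliable brands",
]
counts = category_counts(tweets, "innovation")
print(counts.most_common())
```

Only the first two tweets mention the keyword, so the third contributes nothing to the category totals.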

Machine Learning for Language Analysis

By utilizing the most appropriate methods in machine learning for text analysis, we can evaluate content from external sources, like customer or social comments, or from company internal text sources, such as sales calls or customer service. Of course, data without interpretation is not useful, and at Monte we take great pride in developing highly effective visualizations. We have decades of experience creating powerful graphics, and ensure the output of our analytical tools is understood and interpretable by our clients.

Figure 8 depicts our artificial intelligence/machine learning capabilities applied to language analysis. In this case we use machine learning to predict whether a movie review is good or bad. This analysis allows us to identify the most important terms that distinguish positive ratings from negative ratings.
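To illustrate how a classifier can both predict a review's rating and surface the terms that separate positive from negative reviews, here is a small naive Bayes sketch. Naive Bayes is swapped in here as a simple stand-in; the source does not say which model is behind Figure 8, and the four training reviews are invented.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(reviews):
    """reviews: list of (text, label) pairs with label 'pos' or 'neg'."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in reviews:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab

def log_prob(word, label, model):
    """Log P(word | label) with add-one (Laplace) smoothing."""
    word_counts, _, vocab = model
    total = sum(word_counts[label].values())
    return math.log((word_counts[label][word] + 1) / (total + len(vocab)))

def predict(text, model):
    """Return the label with the higher log prior plus summed word log-likelihoods."""
    _, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for label in ("pos", "neg"):
        prior = math.log(doc_counts[label] / n_docs)
        likelihood = sum(log_prob(w, label, model) for w in tokenize(text) if w in vocab)
        scores[label] = prior + likelihood
    return max(scores, key=scores.get)

def top_terms(model, k=3):
    """Terms with the highest and lowest log-odds of positive vs negative usage."""
    _, _, vocab = model
    odds = {w: log_prob(w, "pos", model) - log_prob(w, "neg", model) for w in vocab}
    ranked = sorted(odds, key=odds.get)
    return ranked[-k:][::-1], ranked[:k]

# Invented training reviews
reviews = [
    ("a wonderful moving film", "pos"),
    ("wonderful acting and a great story", "pos"),
    ("a boring dull film", "neg"),
    ("dull story and terrible acting", "neg"),
]
model = train(reviews)
pos_terms, neg_terms = top_terms(model, k=1)
print(predict("a wonderful story", model), predict("dull and boring", model))
print(pos_terms, neg_terms)
```

The log-odds ranking is what identifies the most distinguishing terms: words used far more often in one class than the other end up at the extremes of the list.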

By incorporating this research into our creative services, we develop highly effective product names, headlines, and taglines. By advising startups on how to structure their brand creation processes and employing these proprietary technologies, we keep research budgets to a minimum while providing deep insights from rich data sources. In addition to computational methods, we also employ the human intuition developed over our company’s 20+ years of consulting in marketing and sales, primarily in business-to-business environments.

More than Research: Creating with Technology

We develop new technologies to give our clients a creative edge as well as deep insights: generative tools that help us discover word candidates for naming products and companies. The example seen here is a network diagram built from food-related input terms. Our software looks for related word use patterns derived from 6 billion terms. That’s big data by anyone’s definition.

Figure 9 shows word relationships based on word embeddings. This proprietary technology allows us to generate highly relevant name candidates rapidly. Network plots show relationships between ideas and let us quickly follow concepts into the namespace.

We can see, for example, which terms bridge communities of ideas and which concepts activate large regions of the mind through rich semantic connectivity. “Cake” is deeply connected to both ingredients and to similar foods like shortbread. Semantic networks such as this are one of the tools we employ when devising and evaluating tag lines and slogans. This ensures that we include both broad research into how ideas are connected as well as current social media assessments of sentiment. We thus consider both short-term and long-term patterns and associations at the societal level.
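Word embeddings represent each word as a vector, and related words are typically found by ranking vectors by cosine similarity. The sketch below shows that general technique with tiny three-dimensional vectors invented for illustration; real embeddings have hundreds of dimensions, and nothing here reflects Monte's proprietary tooling.

```python
import math

# Toy vectors, invented for illustration only.
EMBEDDINGS = {
    "cake": [0.9, 0.8, 0.1],
    "shortbread": [0.8, 0.9, 0.2],
    "flour": [0.7, 0.5, 0.3],
    "engine": [0.1, 0.0, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(word, k=2):
    """The k words most similar to `word`, as (word, similarity) pairs."""
    sims = [(other, cosine(EMBEDDINGS[word], vec))
            for other, vec in EMBEDDINGS.items() if other != word]
    return sorted(sims, key=lambda pair: pair[1], reverse=True)[:k]

neighbors = nearest("cake")
print([(w, round(s, 3)) for w, s in neighbors])
```

Linking each word to its nearest neighbors in this way yields the edges of a semantic network, which can then be explored for name candidates.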

Custom Lexicon Development

Custom lexicon development is a service that generates scoring mechanisms for researching specific aspects of product design or service quality. For example, below is part of a network of terms related to modern electronics product design and interface. This type of lexicon is a special area of language processing and enables us to engineer metrics that identify how the public feels about specific product features, such as usability or design aesthetic. Contact us to discuss your custom language analysis and research needs.

Figure 10 shows part of a design quality term network. These terms are colored by sentiment class, showing positive design words in purple, negative design words in teal, and neutral in yellow.

If you want to pick the absolute most powerful words for your product name or tagline, or wish to understand the “social mind” of your market using text mining, contact us today to schedule a call with Matt. He’s the mad scientist behind this language processing research and will determine if these methods offer potential insights for your business or organization.