Next time you review a restaurant on Yelp, you might want to choose your words carefully.
Online review sites like Yelp are a gold mine of valuable yet mostly untapped information about businesses, according to a team of researchers at the University of Maryland, College Park, who are using online reviews to predict whether a restaurant will close its doors within the next three months.
The model isn’t perfect yet, but it did improve the ability to predict whether a business would stay open, said Shawn Mankad, an assistant professor of business analytics who worked on the study.
Mankad’s team compiled a database of 130,000 reviews, written over a period of nine years, of about 2,600 restaurants in Washington, D.C. The researchers then identified 450 of those restaurants that had closed.
Using text-mining techniques, Mankad created an algorithm that scans each review to identify its major themes, such as customer service, food quality and atmosphere.
Then Mankad, along with associate professor Anandasivam Gopal and doctoral student Jorge Mejia, translated those themes into variables. They used their model to test whether the variables were correlated with an economic outcome: specifically, whether the restaurant would close within three months.
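For readers curious what turning review text into theme variables can look like, here is a minimal sketch of latent semantic analysis, the technique named in the working paper's title. It is not the researchers' code. The four reviews are invented, the function names `term_document_counts` and `top_themes` are hypothetical, raw word counts are assumed as the input, and a simple power iteration stands in for the library SVD routine a real project would more likely use.

```python
import math
import re
from collections import Counter

# Invented toy reviews; the study drew on about 130,000 real ones.
REVIEWS = [
    "tasty food great food",
    "great flavor tasty food",
    "slow service rude waiter",
    "rude waiter slow slow service",
]


def term_document_counts(reviews):
    """Return (vocabulary, rows); each row holds one review's word counts."""
    counts = [Counter(re.findall(r"[a-z]+", text.lower())) for text in reviews]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[word] for word in vocab] for c in counts]


def top_themes(rows, k, iterations=200):
    """Score every review on the k strongest latent themes.

    Runs power iteration on the review-by-review matrix G = A * A^T, whose
    eigenvectors are the left singular vectors of the count matrix A.
    Assumes k does not exceed the rank of A.
    """
    n = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]
    themes = []
    for _ in range(k):
        v = [1.0 + i for i in range(n)]  # fixed start vector, so runs repeat
        for _ in range(iterations):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(
            v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)
        )
        themes.append((eigenvalue, v))
        # Deflate: remove the theme just found so the next pass finds another.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    # Scale each eigenvector by its singular value (square root of eigenvalue).
    return [[math.sqrt(max(ev, 0.0)) * vec[i] for ev, vec in themes]
            for i in range(n)]


vocab, rows = term_document_counts(REVIEWS)
scores = top_themes(rows, k=2)
for text, row in zip(REVIEWS, scores):
    print([round(x, 2) for x in row], text)
```

In this toy data, the two food reviews load on one theme and the two service reviews on the other. The sign of each score is arbitrary, so only its magnitude is meaningful. One plausible next step, again an assumption rather than something the article describes, would be to average these scores per restaurant and use them as inputs to a closure model.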
“There’s a long history of using the 5-star rating of a restaurant to make predictions, but we wanted to see if there was some additional value you could bring by looking at the actual text of a review,” Mankad said. “And we found that looking at the semantic structure [of online reviews] really improves your forecasting ability.”
“Constructing the variables, putting it into a predictive model — this is something that has never been done before,” he added.
But the model isn’t foolproof. After all, it doesn’t take into account other factors that would influence a restaurant’s success, such as managerial issues or local competition. The restaurant industry is ferociously competitive, margins are tight, and failures are more common than successes.
For every 100 restaurants that the model predicted would close, 70 actually did. That means 30 percent of the predicted closures were false positives, Mankad said, adding that his team is working to reduce that share.
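The arithmetic behind those figures is simple enough to sketch. In standard terminology, the share of predicted closures that really happened is the model's precision, and the remaining share is its false discovery rate. The function name `closure_prediction_quality` is hypothetical, and the inputs are just the article's 100 and 70. A false-positive rate in the strict statistical sense would also require the number of restaurants that stayed open, which the article does not report.

```python
def closure_prediction_quality(predicted_closures, confirmed_closures):
    """Summarize how often a predicted closure turned out to be real."""
    false_alarms = predicted_closures - confirmed_closures
    return {
        # Share of predicted closures that actually happened.
        "precision": confirmed_closures / predicted_closures,
        # Share of predicted closures that did not happen.
        "false_discovery_rate": false_alarms / predicted_closures,
    }


# Figures from the article: 70 of every 100 predicted closures occurred.
quality = closure_prediction_quality(100, 70)
print(quality)
```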
The software developed for this project is not commercially available, and there are no concrete plans to bring it to market. The working paper describing it is titled “More Than Just Words: Using Latent Semantic Analysis in Online Reviews to Explain Restaurant Closures.”
The researchers are, however, working with a handful of analytics companies that might be interested in testing and fine-tuning the model.
Investment firms might be interested in using the model, for example, to gauge market conditions in a particular location, Mankad said.
“I think the biggest takeaway for me is that there’s actually value in all this free data that’s online,” he said. “All the stuff that’s generated on Twitter, Yelp, OpenTable, Foursquare and so on — what good is all that? Is it actually informative for any kind of economic outcome? This is just one study showing there is some value in it, and that you can use that information to make decisions.”