Pundits Vs. Machine: Predicting Controversies In The Presidential Race
Predictions are for psychics — and in this very unpredictable political season they might do a better job than the pundits. But what about a computer? I set out to see how well it could predict which controversies around the candidates were likely to re-emerge over the course of a month. And two human pundits have agreed to compete against the machine.
Meet the Contestants
The computer is run by Quid, a data analytics firm that uses proprietary software to search, visualize and analyze text. Since the computer can't speak, Dan Buczaczer, Quid's head of marketing, is going to speak for it and explain how it "thinks."
"Quid uses proprietary software to search, visualize, and then analyze massive amounts of text," Buczaczer says. "In this case, what we're talking about today, that massive amount of text happens to comes from news sources and blogs written about a particular topic, in this case the presidential election."
And Buczaczer is talking about some 300,000 U.S. blogs and publications amounting to nearly 7.5 million articles — everything that has been written about Hillary Clinton and Donald Trump since they announced they were running for president.
Quid's computers sifted through all that coverage to find every controversy that has plagued each candidate. "Two-thirds of them were Donald Trump, one-third Hillary Clinton," Buczaczer says. "So Trump is the winner in terms of overall number of controversies generated." Though Clinton had fewer controversies, they still generated as much coverage as Trump's.
The computer sifted through all of them for patterns, like which ones kept reappearing.
"We kind of mapped it against both reoccurrence and importance — what sort of an impact did it have at its peak?" Buczaczer says. "In a lot of ways this was probably the heaviest part of computation around what we think is going to show up again and again for each candidate."
Buczaczer and his computer have done some forecasting about which controversies will get the most coverage for each candidate between Sept. 12 and Oct. 12. But I'm not going to tell you what they are — not yet.
In the right corner: Jonah Goldberg, who writes for the conservative National Review.
"This is a very intimidating thing to do because obviously you're putting me up against Skynet," Goldberg says.
Skynet is the computer in the Terminator movies that ends civilization as we know it. While Quid searches through millions of articles looking for patterns, Goldberg has a much more poetic way of predicting the future.
"It's always safer to bet making predictions that are in line with the character and personality of the people you're making predictions about," Goldberg says. "Like in Aesop's fable, it's really easy to predict that the scorpion is going to sting the frog — because that's the scorpion's nature."
In the left corner: Simon Maloy, a political writer at the liberal Salon.
Maloy agrees that the personalities of the candidates are part of his prediction process. But so is the media itself. It has to draw readers and viewers and create buzz, which is why the media likes controversies.
"I think media organizations understand that there's value in writing a story that people will fight about and people will talk about. I think that absolutely factors into it," Maloy says.
The Rules of the Game
Each contestant will predict which controversies will get the most coverage over a month-long period. They each gave a list of predictions for each candidate, ranked from the controversy they expect to get the most coverage to the one they expect to get the least.
On Oct. 12, Quid's computers will search the Web to find out which controversies got the most coverage. And we will announce the winner on air and on this blog. Each contestant will get their own NPR T-shirt. Then, we will do some deeper analysis of other areas where both humans and machines are used to make predictions.