Talkographics: Using What Viewers Say Online to Calculate Audience Affinity Networks for Social TV-based Recommendations
Viewers of TV shows are increasingly taking to online sites like Facebook and Twitter to comment on the shows they watch as well as to contribute content about their daily lives. We present a novel recommendation system (RS) based on the user-generated content (UGC) contributed by TV viewers via the social networking site Twitter. In our approach, a TV show is represented by all of the tweets of its viewers who follow the show on Twitter. These tweets, in aggregate, enable us to reliably calculate the affinity between TV shows and to describe, in a privacy-friendly way, how and why certain shows are similar in terms of their audiences. This paper’s two main contributions are: 1) a new methodology for collecting data from social media, including information about product networks (how shows are connected through users on a social network), geographic location, and user-contributed text comments, which can be used to generate affinity networks and test them; and 2) a new privacy-friendly UGC-based RS that relies on all publicly available text contributed by viewers, rather than only on pre-selected keywords extracted from the UGC associated with the shows or on a specific ontology or taxonomy, which makes our approach more flexible and generalizable than those used in any prior research. We show that our approach predicts remarkably well the TV shows that Twitter users follow.
We also explain why the approach works so well. First, we show that the UGC reflects the demographics, geographic location, and psychographics of viewers, and we coin the term talkographics to refer to descriptions of a TV show’s viewers (or, more generally, any product’s audience) that are revealed by the words used in text messages sent by Twitter-using TV viewers. Second, we show that Twitter text can represent many complex, nuanced combinations of the demographic, geographic, and psychographic features of the audience. Third, we show that talkographic profiles can be used to calculate similarities between TV shows and that these similarities can then be used reliably in RSs. Fourth, we show that our approach can be combined with a product association network approach to achieve even better recommendations. Finally, we show that our text-based approach performs better for shows whose viewing audience has a demographic bias than for shows whose audience does not. To demonstrate that our RS is generalizable, we apply the same approach to followers of clothing and automobile retailers.
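
To illustrate the idea summarized above, the following minimal sketch represents each show by the pooled words of its followers' tweets and ranks the other shows by text similarity. The toy data, the function names (`show_profiles`, `cosine`, `recommend`), the whitespace tokenizer, and the choice of raw term counts with cosine similarity are all assumptions made for illustration; the abstract does not state which weighting scheme or similarity measure the system actually uses.

```python
import math
from collections import Counter

# Hypothetical toy data: show -> tweets written by its followers (not from the paper).
FOLLOWER_TWEETS = {
    "ShowA": ["love my truck and football", "football tonight with the guys"],
    "ShowB": ["football game day truck tailgate", "watching football"],
    "ShowC": ["new shoes and makeup haul", "shopping for shoes today"],
}

def show_profiles(tweets_by_show):
    """Pool every follower tweet of a show into one bag-of-words profile."""
    return {
        show: Counter(word for tweet in tweets for word in tweet.lower().split())
        for show, tweets in tweets_by_show.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(followed_show, profiles, k=1):
    """Rank the other shows by text similarity to a show the user follows."""
    scores = {
        other: cosine(profiles[followed_show], profile)
        for other, profile in profiles.items()
        if other != followed_show
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

profiles = show_profiles(FOLLOWER_TWEETS)
print(recommend("ShowA", profiles))
```

Because each profile aggregates text across all followers of a show, the similarity scores describe audiences rather than individual users, which is in the spirit of the privacy-friendly framing above. A realistic version would likely need stop-word handling and term weighting, which this sketch omits.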