“Misinformation Online to Offline: A Twitter Field Study”

Abstract: Roughly 15-20% of the US population gets its news from social media, while most other Americans get their news from TV, news websites, or traditional print sources. In this paper, we characterize how viral content (of all kinds) on social media might spread beyond online platforms to other forms of news (electronic, print, televised, or talk). In particular, we examine “phrases” (or n-grams) that appear on social media, including those often associated with misinformation (e.g., “Let’s Go Brandon”), and investigate where they originated and to what extent they infiltrate these broader sources of news. First, we build a universe of n-grams that appear more frequently in news than in standard English (using Google Books frequency as a baseline). Second, using NLP tools trained on labeled articles from third-party fact-checkers, we categorize each n-gram according to its importance and its likelihood of being associated with misinformation. Third, we study the time series of each n-gram's appearance on Twitter as compared to other news sources, and use a Granger causality test to determine whether misinformation tends to originate on social media and then spread elsewhere, or vice versa. Our results help assess whether the impacts of social media misinformation are understated when spillovers to broader sources of news are not considered.
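
As a rough illustration of the first step described above, the following sketch selects n-grams whose relative frequency in a news corpus exceeds their frequency in a baseline corpus by some ratio. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the `min_ratio` threshold, the `floor` value for n-grams missing from the baseline, and the toy data are all hypothetical, and the baseline dictionary merely stands in for Google Books frequencies.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams in a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(documents, n):
    """Count n-grams across documents and normalize to relative frequencies."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc.lower().split(), n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def overrepresented(news_freq, baseline_freq, min_ratio=5.0, floor=1e-9):
    """Keep n-grams whose news frequency is at least min_ratio times the baseline.

    `floor` replaces the baseline frequency of n-grams absent from the
    baseline. Both thresholds are illustrative assumptions.
    """
    selected = {}
    for gram, freq in news_freq.items():
        ratio = freq / max(baseline_freq.get(gram, 0.0), floor)
        if ratio >= min_ratio:
            selected[gram] = ratio
    return selected


# Toy data (hypothetical): two "news" snippets and a made-up baseline.
news_docs = ["crowd chants lets go brandon", "lets go brandon trends again"]
baseline = {"lets go": 0.1, "go brandon": 1e-6}

news_freq = relative_frequencies(news_docs, 2)
universe = overrepresented(news_freq, baseline)
```

In this toy run, "go brandon" is retained because it is far more frequent in the news snippets than in the baseline, while "lets go" is dropped because it is common in the baseline as well. Bigrams that are absent from the baseline also pass the filter because of the `floor` value, so a real pipeline would likely need smoothing or a minimum-count cutoff.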