Fake news on Twitter during the 2016 U.S. presidential election


Major Results


Primary Research Questions

  1. How many stories from fake news sources did individuals see and share on social media?
  2. What were the characteristics of those who engaged with these sources?
  3. How did these individuals interact with the broader political news ecosystem?

"We wish to understand how fake news sources were positioned within this ecosystem. In particular, if people who saw content from fake news sources were isolated from mainstream content, they may have been at greater risk of adopting misinformed beliefs."

Data and Definitions

Fake news sources

Three Classes of Fake News Sources

The categories differ both in how likely a source is to produce fake news and in how each list was created:

  1. Black: websites taken from preexisting lists of fake news sources constructed by fact checkers, journalists, and academics who identified sites that published almost exclusively fabricated stories [source1] [source2] (taken from the survey and web browsing data links above)

Additional sites were taken from Snopes.com (identified as sources of "questionable claims") and manually annotated by the authors of this paper as red or orange:

  2. Red: sites (e.g., Infowars.com) that spread falsehoods which clearly reflected a flawed editorial process
  3. Orange: sites where annotators were less certain that the falsehoods stemmed from a systematically flawed process

Fake News Site-Type Count

  Black: 171 sites
  Red: 64 sites
  Orange: 65 sites

Voters on Twitter


Users are split into five political categories by comparing an individual's Twitter feed to the feeds of registered Democrats and Republicans.

Political Categories

  1. Extreme Left (L*)
  2. Left (L)
  3. Center (C)
  4. Right (R)
  5. Extreme Right (R*)

How they do this, from the supplementary materials

We devise a continuous political affinity score for panel members and evaluate its accuracy in three different ways. The score estimates the similarity of an individual’s exposures to those of registered Democrats and Republicans using a logistic regression model, given individuals’ news diet on Twitter and precinct-level vote share in the 2012 general election. With respect to vote share, we use as a feature, for each individual panel member, the percentage of the vote received by Obama in the precinct in which their voter registration address is located. As a measure of people’s news diet, we average the political alignment of news sources the individual is exposed to on Twitter as described below.
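As a rough illustration of the affinity score described above, here is a minimal sketch of a two-feature logistic regression written in plain Python. It is not the authors' implementation: the function names, the toy training rows, the learning rate, and the choice to read the predicted probability as the affinity score (0 = resembles registered Democrats, 1 = resembles registered Republicans) are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Affinity in (0, 1): modeled probability of resembling a registered Republican."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit [bias, w_1, ..., w_k] by batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict(weights, features) - label
            grad[0] += error
            for j, x in enumerate(features, start=1):
                grad[j] += error * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Toy, made-up rows: (news diet alignment, Obama 2012 precinct vote share).
# Alignment is assumed to run from 0 (Democratic audience) to 1 (Republican audience).
rows = [(0.1, 0.8), (0.2, 0.7), (0.3, 0.65), (0.7, 0.35), (0.8, 0.3), (0.9, 0.2)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = registered Democrat, 1 = registered Republican

weights = fit_logistic(rows, labels)
print(round(predict(weights, (0.85, 0.25)), 3))  # above 0.5 on this toy data
```

In practice a library such as scikit-learn or statsmodels would replace the hand-written optimizer; the loop is spelled out here only to keep the sketch dependency-free.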

We infer the political alignment of news sources using a method similar to the one used by Bakshy et al., with a key distinction: we use exposure information rather than sharing of the source by partisans. This distinction lets us base our score on a much larger set of people who consume politics but rarely tweet about it. As such, we compute a news source’s alignment as the proportion of registered Republicans and Democrats who were exposed to the source, and re-weight to correct for the imbalance of the two parties in our sample. In order to reduce the impact of cases where exposure to a news source is unlikely to reflect one’s political affinity, we only consider individuals with a minimum of 100 observed exposures to politics, and sites occupying 1% or more of all political URLs in a person’s timeline. In addition, we only compute alignment scores for news sources with at least 30 registered voters. Fake news sources were excluded from the score computation since a major part of our analysis pertains to the consumption of fake news as a dependent variable.
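The source-alignment step above can be sketched as follows. The notes do not give the exact re-weighting formula, so this sketch assumes one plausible choice: compare the share of each party's sample that was exposed to a source, which cancels out the party imbalance. The function names and data layout are hypothetical, and the per-individual filters (at least 100 political exposures, sites at 1% or more of a timeline) are assumed to have been applied before this step.

```python
MIN_VOTERS_PER_SOURCE = 30  # from the text: at least 30 registered voters per source


def source_alignment(exposed, n_rep_total, n_dem_total,
                     min_voters=MIN_VOTERS_PER_SOURCE):
    """Return {source: alignment in [0, 1]}; 1.0 means only Republicans were exposed.

    exposed: dict mapping source -> list of party labels ("R" or "D"),
             one label per registered voter exposed to that source.
    n_rep_total, n_dem_total: party counts in the whole sample, used to
             correct for the imbalance between the two parties.
    """
    scores = {}
    for source, parties in exposed.items():
        if len(parties) < min_voters:
            continue  # too few registered voters to score this source
        # Share of each party's sample that was exposed to this source.
        rep_rate = parties.count("R") / n_rep_total
        dem_rate = parties.count("D") / n_dem_total
        scores[source] = rep_rate / (rep_rate + dem_rate)
    return scores


def news_diet(user_sources, scores):
    """Average alignment of the scored sources in one user's timeline."""
    known = [scores[s] for s in user_sources if s in scores]
    return sum(known) / len(known) if known else None


# Made-up example: equal raw counts, but Democrats are twice as common in the
# sample, so the source leans Republican after re-weighting.
example = {"a.example": ["R"] * 30 + ["D"] * 30}
print(source_alignment(example, n_rep_total=100, n_dem_total=200))
```

The example shows why the re-weighting matters: a raw proportion would score the source as perfectly balanced, while the rate-based version accounts for Republicans being the smaller group in the sample.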



Prevalence and Concentration


"Supersharers" and "Superconsumers"

Median Daily Number of Tweets

[table: Supersharers vs. Avg. Users]

Median Daily Number of Exposures to Political URLs

[table: Superconsumers vs. Avg. Users]

Median Daily Supersharer Political URL Sharing Habits

  Political URLs Shared per Day: 7.6
  Fake News URLs Shared per Day: 1.7 (of the 7.6)

Superspreaders and superconsumers of fake news sources even stand out among the overall most politically active accounts within the panel, i.e., the top 1% of sharers and consumers (Fig. 2).

Given the high volume of posts shared or consumed by superspreaders of fake news, as well as indicators that some tweets were authored by apps, we find it likely that many of these accounts were cyborgs: partially automated accounts controlled by humans. Their tweets included some self-authored content, such as personal commentary or photos, but also a large volume of political retweets. For subsequent analyses, we set aside the supersharer and superconsumer outlier accounts and focused on the remaining 99% of the panel.

Who was Exposed to Fake News Sources?

Fake News in One's News Feed, by Ideology


Fake News Exposure, Likelihood of Sharing, and Predictors

According to binomial regressions fit separately to each political affinity group, the strongest predictors of the proportion of fake news sources in an individual’s feed were the individual’s age and number of political URLs in the individual’s feed (Fig. 4, A & B)

A 10-fold increase in overall political exposures was associated with doubling the proportion of exposures to fake news sources (Fig. 4A). Because total exposures rise 10-fold while the fake news share of them doubles, this amounts to a 20-fold increase in the absolute number of exposures to fake news sources. This superlinear relationship holds for all five political affinity groups and suggests that a stronger selective exposure process exists for individuals with greater interest in politics.
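The arithmetic behind the 20-fold figure can be checked with a short sketch. It assumes the doubling per 10-fold increase holds as a constant slope on a log-log scale, which implies a power law with exponent 1 + log10(2), roughly 1.30. The function name and the extrapolation beyond a single 10-fold step are illustrative assumptions, not results from the paper.

```python
import math


def fake_exposure_multiplier(total_fold, proportion_fold_per_decade=2.0):
    """Fold change in absolute fake news exposures for a given fold change in
    total political exposures, assuming the fake news proportion multiplies by
    `proportion_fold_per_decade` for every 10-fold increase in the total."""
    exponent = 1.0 + math.log10(proportion_fold_per_decade)  # about 1.30
    return total_fold ** exponent


print(round(fake_exposure_multiplier(10), 6))   # 10 x 2
print(round(fake_exposure_multiplier(100), 6))  # 100 x 4, if the slope stays constant
```

An exponent above 1 is what makes the relationship superlinear: absolute exposure to fake news sources grows faster than overall political exposure.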

Other factors associated with small increases in exposures to fake news sources

These findings are in line with previous work that showed concentration of polarizing content in swing states [source] and among older white men [source]. However, effects for the above groups were small (less than one percentage point increase in proportion of exposures) and somewhat inconsistent across political groups.

Who shared fake news sources?

Sharing Conditional on Prior Exposure [figure] (Fig. 4, F-I)

These findings highlight congruency as the dominant factor in sharing decisions for political news. This is consistent with an extensive body of work showing that individuals evaluate belief-incongruent information more critically than belief-congruent information [source]. Our results suggest that fake news may not be more viral than real news [source].

Fake News and the Media Ecosystem

In a manner similar to other analyses of media co-consumption [source], we constructed this co-exposure network by using a technique that identifies statistically significant connections between sites [source] (Fig. 5).
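The notes do not name the significance technique the authors used, so the sketch below substitutes a simple stand-in: a hypergeometric tail test that asks whether two sites share more audience than random overlap would produce. The function names, the `alpha` threshold, and the toy audiences are hypothetical, and no multiple-comparison correction is applied.

```python
from math import comb


def coexposure_p_value(n_users, n_a, n_b, n_both):
    """P(overlap >= n_both) if site B's audience of size n_b were drawn at
    random from n_users people, n_a of whom are exposed to site A."""
    total = comb(n_users, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_users - n_a, n_b - k)
               for k in range(n_both, upper + 1))
    return tail / total


def significant_edges(audiences, n_users, alpha=0.01):
    """Return (site_a, site_b, p) for pairs with unusually large audience overlap.

    audiences: dict mapping site -> set of user ids exposed to that site.
    """
    sites = sorted(audiences)
    edges = []
    for i, a in enumerate(sites):
        for b in sites[i + 1:]:
            both = len(audiences[a] & audiences[b])
            p = coexposure_p_value(n_users, len(audiences[a]),
                                   len(audiences[b]), both)
            if p < alpha:
                edges.append((a, b, p))
    return edges


# Made-up audiences over 20 users: x and y share every reader, z shares none.
toy = {"x": set(range(8)), "y": set(range(8)), "z": set(range(12, 20))}
print([edge[:2] for edge in significant_edges(toy, n_users=20)])
```

The resulting edge list is the kind of input a community-detection step would take when grouping sites into clusters.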


The websites were clustered into four groups, three of which were identified consistently by three different clustering algorithms. Group 4 consisted of the remaining nodes.

In summary, fake news sources seem to have been a niche interest: Group 2 accounted for only a fraction of the content seen by even the most avid consumers of fake news but nonetheless formed a distinct cluster of sites, many of them fake, consumed by a heavily overlapping audience of individuals mostly on the right.




Points of Leverage to Reduce Misinformation

Such interventions do raise the question of what roles platforms should play in constraining the information people consume. Nonetheless, the proposed interventions could contribute to delivering corrective information to affected populations, increase the effectiveness of corrections, foster equal balance of voice and attention on social media, and more broadly enhance the resiliency of information systems to misinformation campaigns during key moments of the democratic process.

Notes by Matthew R. DeVerna