The spreading of misinformation online
Sections: Major Findings; Intro and Lit Review; Methods; Data Collection; Preliminaries and Definitions; Results and Discussion; Anatomy of Cascades; Homogeneous Clusters; The Percolation Model
In this article, the authors use the Facebook Graph API to investigate how users consume scientific and conspiratorial information, looking at Facebook groups that are each assigned to one of these two categories.
Although consumers of scientific and conspiracy stories present similar consumption patterns with respect to content, cascade dynamics differ
Selective exposure to content appears to be the primary driver of content diffusion, generating homogeneous clusters, i.e. "echo chambers"
They also provide a data-driven percolation model which mimics rumor spreading and shows that homogeneity and polarization are the main determinants for predicting cascade size.
To select groups, they mimic Bessi et al. (2015) and end up with:
67 public pages:
Conspiracy Theory: 32 pages
Science: 35 pages
They also gather two additional pages — which they refer to as "troll pages" — which are used as a benchmark to fit the data-driven model
These are "those pages that intentionally disseminate sarcastic false information on the Web with the aim of mocking the collective credulity online."
It is not entirely clear how they gather these groups, even referencing the original Bessi et al. paper, but what they say is:
Using the approach described in ref. 10, we define the space of our investigation with the support of diverse Facebook groups that are active in the debunking of misinformation.
Collect all data for a 5-year period (2010-2014) by using the Facebook Graph API
Sharing trees = the oriented successive sharing behavior of a news item on Facebook
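To make the sharing-tree idea concrete, here is a minimal sketch that stores a cascade as (parent, child, timestamp) share records and derives three summary quantities from it. The record layout, the function names, and the exact definitions of size (number of nodes), height (longest root-to-leaf path), and lifetime (time between first and last share) are assumptions made for illustration, not the paper's code.

```python
from collections import defaultdict


def build_tree(shares):
    """Build a sharing tree from (parent, child, timestamp) records.

    The record whose parent is None is the original post (the root).
    Hypothetical layout, chosen only for this sketch.
    """
    children = defaultdict(list)
    times = {}
    root = None
    for parent, child, ts in shares:
        times[child] = ts
        if parent is None:
            root = child
        else:
            children[parent].append(child)
    return root, children, times


def cascade_size(times):
    # Number of nodes in the tree, root included.
    return len(times)


def cascade_height(root, children):
    # Longest root-to-leaf path, counted in edges (iterative depth-first walk).
    height = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        height = max(height, depth)
        for child in children.get(node, []):
            stack.append((child, depth + 1))
    return height


def cascade_lifetime(times):
    # Time elapsed between the first and the last share.
    return max(times.values()) - min(times.values())


# Toy cascade: "a" posts, "b" and "c" share from "a", "d" shares from "b".
shares = [
    (None, "a", 0.0),
    ("a", "b", 1.5),
    ("a", "c", 2.0),
    ("b", "d", 5.0),
]
root, children, times = build_tree(shares)
print(cascade_size(times), cascade_height(root, children), cascade_lifetime(times))
```

Keeping the timestamps next to the edges means lifetime and size can be read off the same structure, which is the pairing the figures in these notes rely on.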
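User polarization is defined as $\sigma = 2\varrho - 1$,
where $0 \le \varrho \le 1$ is the fraction of "likes" a user gives to conspiracy-related content; hence, $-1 \le \sigma \le 1$.
With user polarization they define the edge homogeneity, for any edge $e_{ij}$ between nodes $i$ and $j$, as $\sigma_{ij} = \sigma_i \sigma_j$,
with $-1 \le \sigma_{ij} \le 1$.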
Edge homogeneity reflects the similarity level in polarization between two nodes connected by that edge
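A small sketch of these two quantities may help. It assumes polarization is computed as 2ρ − 1, with ρ the fraction of a user's likes on conspiracy content, and that edge homogeneity is the product of the two endpoint polarizations. The function names and the toy like counts are hypothetical.

```python
def polarization(conspiracy_likes, total_likes):
    """Polarization in [-1, 1], assuming sigma = 2 * rho - 1."""
    if total_likes <= 0:
        raise ValueError("user must have at least one like")
    rho = conspiracy_likes / total_likes  # fraction of likes on conspiracy content
    return 2 * rho - 1


def edge_homogeneity(sigma_i, sigma_j):
    """Assumed definition: product of the two users' polarizations."""
    return sigma_i * sigma_j


def mean_edge_homogeneity(edges, sigma):
    """Average edge homogeneity over the edges of one sharing tree."""
    return sum(edge_homogeneity(sigma[i], sigma[j]) for i, j in edges) / len(edges)


# Toy data: (likes on conspiracy content, total likes) per user.
likes = {"a": (3, 4), "b": (4, 4), "c": (1, 4)}
sigma = {user: polarization(c, t) for user, (c, t) in likes.items()}

# "a" and "b" lean the same way, so their edge is positive;
# "a" and "c" lean in opposite directions, so their edge is negative.
edges = [("a", "b"), ("a", "c")]
print(sigma)
print(mean_edge_homogeneity(edges, sigma))
```

Under the product definition, an edge is positive exactly when both users lean the same way, which is why a cascade with high mean edge homogeneity reads as an echo chamber.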
Fig. 1 shows the sharing pattern over time for each type of content.
Fig. 2 shows each category's cascade lifetime as it relates to cascade size.
Fig. 3 plots the PDF of the mean edge homogeneity, computed over all cascades, for each topic.
Notice that, although viral patterns related to distinct contents differ, homogeneity is clearly the driver of information diffusion. In other words, different contents generate different echo chambers, characterized by a high level of homogeneity inside them.
I would agree with the second sentence above, but not the first. It is not clear to me that homogeneity is "clearly the driver of information diffusion." I see a relationship, but causality is up in the air here.
They also compare the CCDF (complementary cumulative distribution function) of all tree paths with that of only the homogeneous tree paths, and they find no statistically significant difference between them.
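For reference, an empirical CCDF can be computed as in the sketch below, together with the largest vertical gap between two CCDFs (the two-sample Kolmogorov-Smirnov statistic). These notes do not record which test the authors used, so the comparison function is an assumption for illustration, and the sample values are made up.

```python
from collections import Counter


def empirical_ccdf(values):
    """Return (x, P(X > x)) pairs for each distinct x, in increasing order."""
    n = len(values)
    counts = Counter(values)
    remaining = n
    ccdf = []
    for x in sorted(counts):
        remaining -= counts[x]  # observations strictly greater than x
        ccdf.append((x, remaining / n))
    return ccdf


def ccdf_at(values, x):
    # Fraction of observations strictly greater than x.
    return sum(v > x for v in values) / len(values)


def max_ccdf_gap(a, b):
    """Largest absolute difference between the two empirical CCDFs."""
    points = set(a) | set(b)
    return max(abs(ccdf_at(a, x) - ccdf_at(b, x)) for x in points)


# Made-up path counts per tree: all paths vs. homogeneous paths only.
all_paths = [1, 1, 2, 3]
homogeneous_paths = [1, 2, 2, 3]
print(empirical_ccdf(all_paths))
print(max_ccdf_gap(all_paths, homogeneous_paths))
```

A small gap relative to the sample size is what "no statistical difference" would look like under this kind of comparison.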
As users tend to select and share only content which agrees with a specific narrative, this suggests confirmation bias is at play
Thus they develop a percolation model of rumor spreading to account for homogeneity and polarization
See the original text for details on the model itself; I will simply describe the ways in which they tested the model and the final conclusions.
The model parameter space is tested with a fixed number of nodes and of news items.
The number of first sharers in a simulation is drawn from one of four different distributions.
Figure 5 shows that the inverse Gaussian is the distribution of first sharers that best fits real-world data
Notes by Matthew R. DeVerna