Characterizing social media manipulation in the 2020 U.S. presidential election

Contents: Key Insights · Intro · Data Collection · Descriptive Stats on Dataset · Top 30 hashtags and mentions · Twitter unhashed banned user dataset · Bot Detection · Characterizing User Political Bias · Automation · Top 15 Hashtags Utilized by Bots · Top 15 Hashtags Utilized by Humans · Bots in Campaign Discourse · Republican Campaign-related Hashtags · Democratic Campaign-related Hashtags · Hashtags Utilized by Right-leaning Humans/Bots · Hashtags Utilized by Left-leaning Humans/Bots · Tweets Split by Party and Botness · Republican Bots Change Tweeting Behavior Around DNC · Human-bot Interactions and Echo Chambers · Foreign Interference Operations · Political Bias of Foreign Interference Accounts · Political Affiliation and Banned User Interaction · Distortion · The Conspiracy Theories · Nine Popular Conspiracies · Top four tracked conspiracy theory hashtags · Self-Reported Geographic Location of Q-Anon Users · Conspiracies and Media Bias · Political Ideology and Conspiracy Endorsement · Conspiracies and Bots · Hyper-partisan Media Outlets and Bots
Analyzes a unique dataset of over 240 million election-related tweets recorded between 20 June and 9 September 2020
Focus on characterizing two signatures of manipulation: automation and distortion
Discover that bots exacerbate the consumption of content produced by users with their same political views, worsening the issue of political echo chambers.
They also discuss interference efforts carried out by Russia, China, and other countries
Draw a clear connection between bots, hyper-partisan media outlets, and conspiracy groups, suggesting the presence of systematic efforts to distort political narratives and propagate disinformation.
Using a combination of state-of-the-art machine learning technologies and human validation, we investigate a number of research questions pertaining to two signatures of manipulation: automation and distortion
A 600-million-tweet dataset related to the election was gathered.
In this paper they focus on a smaller subset during a time period that is closer to the election
The data analyzed in this paper track these politicians' behavior on Twitter, as well as users who mention them in their tweets
As part of Twitter's Transparency Center initiative, a large set of banned accounts (backed by governments) were published by Twitter
Dataset can be found here: information operations dataset
Countries included in the dataset:
Each country is associated with two datasets:
The banned users including metadata
The tweets of these users
… use a conservative approach to classify bots as accounts that sit at the top end of the bot score distribution, rather than carrying out a binary classification of accounts into bots and humans
This addresses the problem of determining the nature of borderline cases, for which detection can be inaccurate, and conversely allows them to focus on accounts that exhibit clear bot traits. The results are manually validated for accuracy.
Similar to prior work (Bovet and Makse, 2019; Badawy, et al., 2019), we identify a set of 29 prominent media outlets that appear on Twitter.
Using allsides.com's non-partisan ratings, they categorize media outlets into five political categories (left, lean left, center, lean right, right)
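A minimal sketch of how this labeling could be used to score a user's political bias from the outlets they share. The outlet-to-score mapping and the numeric scale are my own illustrative assumptions; the paper uses 29 outlets rated via allsides.com.

```python
# Illustrative outlet-bias mapping on a [-1, 1] scale (left to right).
# These entries and scores are assumptions, not the paper's exact ratings.
OUTLET_BIAS = {
    "nytimes.com": -0.5,          # lean left
    "washingtonpost.com": -0.5,   # lean left
    "cnn.com": -1.0,              # left
    "reuters.com": 0.0,           # center
    "foxnews.com": 1.0,           # right
    "oann.com": 1.0,              # right
}

def user_political_bias(shared_domains):
    """Average the bias ratings of the rated outlets a user shared.

    Returns None if the user shared no rated outlets.
    """
    scores = [OUTLET_BIAS[d] for d in shared_domains if d in OUTLET_BIAS]
    if not scores:
        return None
    return sum(scores) / len(scores)

print(user_political_bias(["nytimes.com", "cnn.com"]))  # -0.75
```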
Botometer V3 is utilized to calculate botscores for 32% of the accounts in their dataset
Botometer V4 is utilized for manual validation (via the web interface) and for examples in the paper
They look at only tweets sent by users they have botscores for and then split them into top and bottom deciles
While we do recognize that there may be users who use these hashtags in tweets with opposing viewpoints, vast amounts of research in political polarization assert that that is relatively infrequent (Jiang, et al., 2020; Bail, et al., 2018). Hence, we selected hashtags that were most relevant to both campaigns, such as “trump2020” and “bidenharris2020”.
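A sketch of labeling tweets by campaign from their hashtags. The paper names "trump2020" and "bidenharris2020"; the other hashtags below are illustrative additions, not the paper's curated lists.

```python
# Hashtag sets are illustrative; only "trump2020" and "bidenharris2020"
# are confirmed examples from the paper.
REPUBLICAN_TAGS = {"trump2020", "maga"}
DEMOCRAT_TAGS = {"bidenharris2020", "voteblue"}

def campaign_label(hashtags):
    """Label a tweet by the campaign hashtags it contains."""
    tags = {h.lower().lstrip("#") for h in hashtags}
    rep = bool(tags & REPUBLICAN_TAGS)
    dem = bool(tags & DEMOCRAT_TAGS)
    if rep and not dem:
        return "republican"
    if dem and not rep:
        return "democrat"
    return "mixed/none"
```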
Bots are generally much more active, as we would expect.
Around specific events, the two volumes of activity become close and comparable
A manual inspection was done to confirm that this surge was seen for both humans and bots alike.
The right-leaning discourse also exhibits similar phenomena surrounding the Republican National Convention that took place from 24–27 August
Interestingly, we also see right-leaning discourse increase in activity during the DNC
Bots almost exclusively retweet accounts that are humans.
This is consistent with:
There are proportionally more right-leaning than left-leaning bot accounts, at roughly a 4:1 ratio
Among likely humans, right-leaning accounts outnumber left-leaning accounts roughly 2:1
Of total retweets… (Fig. 2a)
Looking within group (Fig. 2b), however, we see clear political echo chambers
Left-leaning users retweet around 13 percent across the aisle, whereas right-leaning users retweet 10 percent across the aisle, indicating a slightly higher propensity for within-cohort interaction among right-leaning users.
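The cross-aisle measurement can be sketched as below; the edge representation (pairs of retweeter/retweeted leanings) is an assumption, and the sample counts are chosen to reproduce the reported 13%/10% shares.

```python
from collections import Counter

def cross_aisle_share(edges):
    """edges: iterable of (retweeter_leaning, retweeted_leaning) pairs.

    Returns, per cohort, the share of its retweets that crosses the aisle.
    """
    totals, crossed = Counter(), Counter()
    for src, dst in edges:
        totals[src] += 1
        if src != dst:
            crossed[src] += 1
    return {side: crossed[side] / totals[side] for side in totals}

# Made-up edge counts matching the reported proportions.
edges = ([("left", "left")] * 87 + [("left", "right")] * 13
         + [("right", "right")] * 90 + [("right", "left")] * 10)
shares = cross_aisle_share(edges)
print(shares)  # {'left': 0.13, 'right': 0.1}
```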
Bots retweet both left-leaning and right-leaning users, but predominantly retweet from the same side of the aisle.
This indicates left-leaning bots have a more diverse retweet appetite than right-leaning bots do
Looking at the Twitter data of foreign interference operations, where users are categorized by country, the authors determine the propensity of these accounts, by country, to be either left- or right-leaning politically. [Fig 3.]
Next, they plot specific targets of banned users, and their relative position in the Twitter network.
Edges are weighted by the number of retweets or quoted tweets between users
Only include users who have:
Shared more than five politically oriented URLs and
Links with weights greater than 100
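The two filtering rules above can be sketched as follows; the input data shapes (a per-user URL count and a per-pair edge weight) are assumptions about how one might structure the data, not the authors' code.

```python
def filter_network(url_counts, edge_weights, min_urls=5, min_weight=100):
    """Keep users who shared more than `min_urls` politically oriented
    URLs, and edges between kept users with weight above `min_weight`.

    url_counts:   user -> number of political URLs shared
    edge_weights: (user_a, user_b) -> retweet/quote count
    """
    users = {u for u, n in url_counts.items() if n > min_urls}
    edges = {e: w for e, w in edge_weights.items()
             if w > min_weight and e[0] in users and e[1] in users}
    return users, edges

url_counts = {"a": 10, "b": 6, "c": 2}
edge_weights = {("a", "b"): 150, ("b", "a"): 50, ("a", "c"): 500}
kept_users, kept_edges = filter_network(url_counts, edge_weights)
```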
Visualization was generated using the distributed recursive layout algorithm and the Force Atlas algorithm (Jacomy, et al., 2014)
Roughly six left-leaning clusters (in blue) and two right-leaning clusters (in red).
We also show the position of banned Chinese-ops users (green diamonds) and Russian operations (yellow diamonds)
Since banned users often interact with Twitter celebrities, the users shown are ones exclusive to each cohort. That is, yellow diamonds are users who have only been associated with banned Russian accounts.
We also observe that Chinese state-sponsored users tend to interact with Republican users more.
Russian sponsored interactions also emerged outside of the right-leaning and left-leaning cores.
In order to stay away from the "conundrum" of establishing veracity, the authors decide to focus on conspiracy theories.
Conspiracy theories are most likely false narratives, oftentimes postulated upon rumors or unverifiable information, that appear in social networks shared by users or groups with the aim to deliberately deceive unsuspecting individuals who genuinely believe in such claims (van Prooijen, 2019)
They focus on three conspiracy theories:
QAnon: A far-right conspiracy movement which gained popularity in the run-up to the 2020 election. This group’s theory suggests that President Trump has been battling against a Satan-worshipping global child sex-trafficking ring, and an anonymous source called ‘Q’ is cryptically providing secret information about the ring (Zuckerman, 2019).
“gate” conspiracies: Another indicator of conspiratorial content is the suffix ‘-gate’, with theories such as obamagate, an unvalidated claim that Barack Obama’s officials conspired to entrap Trump’s former national security adviser, Michael Flynn, as part of a larger plot to bring down the then-incoming president. Another example of a “gate” conspiracy theory is pizzagate, a debunked claim connecting several high-ranking Democratic Party officials and U.S. restaurants to an alleged human trafficking and child sex ring.
Covid conspiracies: A plethora of false claims related to the coronavirus pandemic emerged recently. They are mostly about the scale of the pandemic and the origin, prevention, diagnosis, and treatment of the disease. The false claims typically go alongside the hashtags such as #plandemic, #scandemic or #fakevirus.
Q-Anon narratives have more highly active and engaged users (as measured by the ratio of tweets to unique users) than -gate narratives
Using VADER (Valence Aware Dictionary and sEntiment Reasoner; Hutto and Gilbert, 2014), the sentiment of hashtags is calculated as well
Nine conspiracy narratives are broken down below ([Fig. 5])
Word Clouds of co-occurring hashtags were utilized to find additional hashtags that were popular and to further investigate.
We see that conspiracy theory hashtags tend to drop off in mid-July, likely because Twitter engaged in a takedown of over 7,000 QAnon-associated accounts.
57.6% of users report a location in their Twitter profile...
They then look at whether left vs. right users share conspiratorial narratives differently
Users are considered "endorsing" a narrative if they retweet a tweet with a conspiracy hashtag
Users are given a score from 0 to 1, which represents the proportion of their endorsements which are conspiratorial narratives
Users are then dichotomized into two groups:
This binary classification is based on a threshold value t applied to the endorsement score:
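The endorsement score and the threshold-based split can be sketched as below. The threshold value used here is illustrative, not the paper's, and the data shape (a list of per-retweet hashtag sets) is an assumption.

```python
def endorsement_score(retweeted_hashtag_sets, conspiracy_tags):
    """Fraction of a user's retweets that contain a conspiracy hashtag."""
    hits = sum(1 for tags in retweeted_hashtag_sets
               if tags & conspiracy_tags)
    return hits / len(retweeted_hashtag_sets)

def is_conspiratory(score, t=0.5):
    """Dichotomize users by threshold t (t = 0.5 is illustrative)."""
    return score > t

tracked = {"qanon", "obamagate", "plandemic"}
retweets = [{"qanon"}, {"maga"}, {"obamagate", "wwg1wga"}, {"vote"}]
score = endorsement_score(retweets, tracked)  # 2 of 4 retweets match
```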
Groups are significantly different
Conspiratory users tend to skew to the right
The non-conspiratory users, those unlikely to share conspiratory narratives, are distributed more evenly across the political spectrum, with significant proportions on the left and center.
A two-sided t-test performed on both the continuous (t=5.17) and discrete (t=7.5) data confirms that the two distributions are significantly different with p<0.005.
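A stdlib sketch of the two-sample (Welch's) t statistic used in this kind of comparison; the sample bias scores below are made up for illustration, and the paper does not specify which t-test variant was used.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)

# Made-up political-bias scores for the two user groups.
conspiratory = [0.8, 0.9, 0.7, 0.95, 0.85]
non_conspiratory = [0.1, -0.2, 0.0, 0.3, -0.1]
t_stat = welch_t(conspiratory, non_conspiratory)
```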
Note: I believe the two shades of red/brown are supposed to be the same color (conspiratory). This distinction in color is not mentioned in the publication at all so this is my own assumption.
Almost a quarter of users who endorse predominantly right-leaning media platforms are likely to engage in sharing conspiratory narratives. Out of all users who endorse left-leaning media, approximately two percent are likely to share conspiratory narratives.
The main question [the authors] seek to answer is: are bots used to target groups and how do they push conspiracy narratives with news related media?
We compare the botscores of four groups
Non-conspiracy user = someone who did not use any of the tracked conspiratorial hashtags
A user who uses conspiratorial hashtags can be assigned to multiple conspiracy groups if they use keywords from more than one category
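The multi-label group assignment can be sketched as below; the keyword sets are illustrative examples drawn from the hashtags named earlier in these notes, not the paper's full tracked lists.

```python
# Keyword sets are illustrative subsets of the tracked hashtags.
GROUPS = {
    "qanon": {"qanon", "wwg1wga"},
    "gate": {"obamagate", "pizzagate"},
    "covid": {"plandemic", "scandemic", "fakevirus"},
}

def conspiracy_groups(user_hashtags):
    """Assign a user to every conspiracy group whose keywords they used."""
    tags = {h.lower().lstrip("#") for h in user_hashtags}
    groups = {name for name, kw in GROUPS.items() if tags & kw}
    return groups or {"non-conspiracy"}
```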
We can see that Q-Anon and -Gate communities are quite similar
Covid has the highest level of bot activity and, thankfully, the non-conspiracy community is much less bot-like than the others
This overlap suggests that users who share conspiracy related content are prone to adopting multiple conspiracy narratives and that the communities are highly connected.
The authors also investigate the proportion of users sharing URLs from these news Web sites who have also used QAnon hashtags at any point in our dataset
Hyper-partisan news outlets like One America News Network (OANN) and Infowars are outliers, seeing the greatest proportion of their user base tweeting QAnon material
Left-leaning news outlets such as the New York Times and Washington Post have a low Botometer score and proportion of QAnon users, but the volume of tweets mentioning these URLs is much (approx. 29 times) larger.
The proportion of users using QAnon keywords is highly correlated with the average Botometer score (correlation coefficient: 0.947) across the spectrum of left, right and neutral outlets.
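A stdlib sketch of the Pearson correlation behind that coefficient; the per-outlet data points below are made up for illustration.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Made-up per-outlet values: share of users using QAnon keywords vs.
# the outlet's average Botometer score.
qanon_share = [0.02, 0.05, 0.10, 0.18, 0.25]
avg_botscore = [0.10, 0.15, 0.22, 0.35, 0.48]
r = pearson_r(qanon_share, avg_botscore)
```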
Also explore the proportion of bots and compare it to their political leaning and usage of conspiratory language.
Bots appear across the political spectrum and are likely to endorse polarizing views.
The smallest fraction of automated accounts is among the users who endorse centric media outlets (4%).
A larger proportion of automated accounts endorse right-leaning media outlets.
By performing a t-test on the distributions of bot scores they confirm that the differences between the pairs (left-center, center-right, and left-right) are all significant with p<0.005.
The proportion of bots varies between users who are likely to share conspiratory narratives and those who are not.
Almost 13% of all users that endorse conspiracy narratives are likely automated.
It is possible that such observations are in part the byproduct of the fact that bots are programmed to interact with more engaging content, and inflammatory topics such as conspiracy theories provide fertile ground for engagement (Stella, et al., 2018). On the other hand, bot activity can inflate certain narratives and make them popular. The automated accounts that are a part of an organized campaign can purposely propel some of the conspiracy narratives, further polarizing the political discourse.
Notes by Matthew R. DeVerna