Is Twitter data going to disappear? Our research community is bracing for possible consequences

In just over a month since the change in Twitter leadership, the social media platform has changed significantly in its new “Twitter 2.0” version. For researchers who use Twitter as a primary source of data, including many computer scientists in our research community, the effects could be debilitating.

Over the years, Twitter has been extremely friendly to researchers, providing and maintaining a robust API (application programming interface) specifically for academic research. The Twitter API for Academic Research allows researchers with specific objectives who are affiliated with an academic institution to gather historical and real-time data sets of tweets, and related metadata, at no cost. Currently, the Twitter API for Academic Research continues to be functional and maintained in Twitter 2.0.

The data obtained from the API provides a means to observe public conversations and understand people’s opinions about societal issues. Indeed, Twitter represents “a primary platform to observe online discussion tied to political and social issues.” Twitter itself touts its API for Academic Research as a way for academic researchers to use data from the public conversation to study topics as diverse as the conversation on Twitter itself. This article presents the potentially negative impact of the new “Twitter 2.0” on our research at USC’s Information Sciences Institute (ISI).

One project we are currently working on aims to understand how Twitter users differ in their susceptibility to misinformation, conspiracy theories, and online harms in general. In one of our recent papers, we try to understand how people get radicalized to certain conspiracies, like QAnon. Our team wants not only to detect deceptive and inauthentic activity, but also to see how we can protect users from it. We want to understand how Twitter users deal with fake news, misinformation, and conspiracy theories, and who the most vulnerable users are. But we can’t do that without the data.

The possibility that we will lose access to the data is, of course, a problem, because our work relies on Twitter data sets and was also designed to produce findings that might be helpful to Twitter itself. We are looking to measure the effectiveness of moderation policies while observing users’ engagement with harmful content. Our findings can help social media providers, regulators, and policy makers formulate strategies to counter the circulation of conspiracy theories and misinformation on social media. For example, understanding who the most vulnerable users are might allow Twitter to decide how to protect these users, and perhaps avoid exposing them to these attacks.