The Belt and Road Initiative on Twitter: An Annotated Dataset
This repository stores a collection of 500,711 posts (tweets) and 714,794 reposting threads (retweets) related to the Belt and Road Initiative (BRI) on Twitter. The dataset was collected through Twitter APIs using the terms “belt and road”, “one belt one road”, “new silk road”, “maritime silk road”, and “silk road economic belt”, covering both the phrases and their hashtag forms. The dataset spans 7 September 2013 to 30 November 2021. It has also been annotated for language, sentiment, and geopolitical entities, using language detection, neural machine translation, lexicon-based sentiment analysis, and web automation. To facilitate insight discovery, we divided the dataset into four databases that can be analyzed separately. Researchers in fields such as social science, network science, sociology, and others can use them to study dynamic activity on Twitter, keyword trends, users’ characteristics, and the social networks formed by citing patterns.
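As a rough illustration of the search terms described above, the following Python sketch builds each phrase together with a hashtag variant and checks whether a piece of text mentions any of them. It is not the collection script used for this dataset. The hashtag form (a "#" followed by the phrase with spaces removed) and the helper names `to_hashtag`, `build_terms`, `build_pattern`, and `matches_bri` are assumptions made for this example.

```python
import re

# Search phrases listed in the dataset description.
PHRASES = [
    "belt and road",
    "one belt one road",
    "new silk road",
    "maritime silk road",
    "silk road economic belt",
]


def to_hashtag(phrase):
    """Assumed hashtag form: '#' plus the phrase with spaces removed."""
    return "#" + phrase.replace(" ", "")


def build_terms(phrases):
    """Return every phrase followed by its assumed hashtag variant."""
    return [term for p in phrases for term in (p, to_hashtag(p))]


def build_pattern(phrases):
    """Compile a case-insensitive regex that matches any term."""
    # Longer terms go first so the alternation prefers the longest match.
    terms = sorted(build_terms(phrases), key=len, reverse=True)
    return re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)


BRI_PATTERN = build_pattern(PHRASES)


def matches_bri(text):
    """True if the text contains any phrase or assumed hashtag form."""
    return BRI_PATTERN.search(text) is not None
```

For example, `matches_bri("Leaders met at the #BeltAndRoad forum")` is true under these assumptions, while a text that mentions only "silk" is not matched.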
All programming scripts required to reproduce the dataset are available at: https://github.com/edmangog/The-BRI-on-Twitter