Weiboscope Open Data
Dataset posted on 2021-09-27, 04:02, authored by King Wa Fu
Welcome to the Open Weiboscope Data Access website. Weiboscope is a data collection and visualization project developed by the research team at the Journalism and Media Studies Centre (JMSC), The University of Hong Kong. One objective of the project is to make censored Sina Weibo posts from a selected group of Chinese microbloggers publicly accessible, so that researchers can use the data to better understand social media in China and to make the Chinese media system more transparent.

Since January 2011, the project has regularly sampled the timelines of more than 350,000 Chinese microbloggers who each have more than 1,000 followers. The methodology is detailed in an IEEE Internet Computing article (Fu, Chan, Chau, 2013). In addition, we have sampled Sina Weibo accounts randomly since 2012, and the most recent timelines of the sampled accounts were collected and stored in the dataset. This sampling approach is reported in a PLOS ONE article (Fu, Chau, 2013).

This site contains all the Weiboscope data collected in the year 2012. We are delighted to share the data for open access. For ethical reasons, however, the data are anonymized: real user and message IDs are replaced by pseudo IDs.

When using the data, please cite the paper below.

King-wa Fu, CH Chan, Michael Chau. Assessing Censorship on Microblogs in China: Discriminatory Keyword Analysis and Impact Evaluation of the 'Real Name Registration' Policy. IEEE Internet Computing. 2013; 17(3): 42-50. http://doi.ieeecomputersociety.org/10.1109/MIC.2013.28

Data Set Statistics:
Number of weibo messages: 226,841,122
Number of deleted messages: 10,865,955
Number of censored ('Permission Denied') messages: 86,083
Number of unique weibo users: 14,387,628

Enquiry: Send your questions or comments to firstname.lastname@example.org.

The project is funded by the University of Hong Kong Seed Funding Program for Basic Research.
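To put the data set statistics above in proportion, the following minimal Python sketch derives a few ratios from the four published counts. The variable names are illustrative only and are not part of the dataset. The sketch also assumes that every count refers to the same 2012 collection, so that dividing by the total number of messages is meaningful.

```python
# Counts copied from the "Data Set Statistics" above.
total_messages = 226_841_122
deleted_messages = 10_865_955
censored_messages = 86_083   # 'Permission Denied' messages
unique_users = 14_387_628

# Shares relative to all collected messages (assumption: all counts
# describe the same 2012 collection).
deleted_share = deleted_messages / total_messages
censored_share = censored_messages / total_messages

# Average number of collected messages per unique user.
messages_per_user = total_messages / unique_users

print(f"Deleted share of all messages:  {deleted_share:.2%}")
print(f"Censored share of all messages: {censored_share:.4%}")
print(f"Messages per unique user:       {messages_per_user:.2f}")
```

These ratios are simple descriptive arithmetic on the published totals. They say nothing about why a given message was deleted or censored.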
Fu KW, Chan CH, Chau M. Assessing Censorship on Microblogs in China: Discriminatory Keyword Analysis and the Real-Name Registration Policy. IEEE Internet Computing. 2013; 17(3): 42-50.