Supporting data for "Investigating Changes of COVID-19 Epidemiological Parameters from Different Perspectives"
My PhD thesis, titled "Investigating Changes in COVID-19 Epidemiological Parameters from Different Perspectives", focuses on using anonymized line list data, patient hospitalization data, and viral load data to improve the estimation of key epidemiological parameters during the COVID-19 pandemic in Hong Kong.
This dataset contains supporting data for reproducibility. It has 6 subfolders corresponding to the 6 chapters of the thesis that contain figures and data analyses (chapters 2, 4, 5, 6, 7, and 8). Each subfolder contains the data and R code for reproducing the figures and other analytical results, and is accompanied by a README file.
In chapter 2, I provided an overview of the COVID-19 pandemic in Hong Kong and worldwide; subfolder dataset chapter 2 therefore contains the case incidence datasets and the R code to generate the incidence figure. I also conducted a systematic review of latent period estimation, and the same subfolder includes the EndNote library together with a spreadsheet of the EndNote output that documents my paper screening process.
In chapter 4, I conducted a detailed statistical analysis of the changing serial interval of COVID-19 in Hong Kong. Subfolder dataset chapter 4 contains the anonymized transmission pair line list data used to estimate the serial interval, together with the R code and an essential subset of the data output for reproducing my results. The related work is published in the American Journal of Epidemiology; the DOI of the paper is given in README chapter4.txt.
In chapter 5, I developed an inferential framework to infer the generation interval on a temporal scale. Subfolder dataset chapter 5 contains publicly available line list data from mainland China, together with the R code and an essential subset of the data output for reproducing my results. The related work is published in Nature Communications, and the data and code are also available on GitHub; the DOI and the GitHub link are given in README chapter5.txt.
In chapter 6, I investigated the superspreading potential and the setting-specific generation interval in Hong Kong. Subfolder dataset chapter 6 contains simplified and anonymized transmission cluster size information and the related R code to reproduce the results, as well as the R code for model building and a summary of the generation interval estimates.
In chapter 7, I estimated the latent period of COVID-19 in different settings in Hong Kong. Subfolder dataset chapter 7 contains processed and anonymized viral load records and transmission pair information for COVID-19 cases in Hong Kong, the related R code to reproduce the results, and two spreadsheets summarizing the estimates. The full analysis involves many R scripts, which I placed in two subfolders (R and Stan) under subfolder dataset chapter 7; the original GitHub link for the R implementation of the method is given in README chapter 7.txt.
In chapter 8, I analyzed the length of hospital stay of COVID-19 patients in Hong Kong and its potential association with vaccination status. Subfolder dataset chapter 8 contains a simplified and anonymized dataset of patients' hospitalization records, including their vaccination status and length of hospital stay, together with the R code and an essential subset of the data output to reproduce the results.