The data sources provided here are used to reproduce the results presented in the thesis titled “Burden and control of chronic obstructive pulmonary disease”. Specifically, data from the Global Burden of Disease Study 2017 were used to derive the results in Chapter 3. Data from Hong Kong’s Elderly Health Service Cohort were used to develop and calibrate the risk prediction models for chronic obstructive pulmonary disease (COPD) in Chapter 4. Summary genetic statistics from INTERVAL, YFS, FINRISK, and ICGC were used to investigate the role of interleukin 1 family members and their receptors in airflow limitation, a surrogate of COPD. Summary genetic statistics from MR-Base were used to illustrate the application of control exposures in Mendelian randomization in the presence of potential selection bias. Furthermore, data from RegulomeDB, PhenoScanner, and LDlink were used to validate the genetic instruments selected in Chapters 5 and 6.