Supporting data for "Development, evaluation and implementation of an HPV vaccine chatbot for parents to improve HPV vaccination uptake among middle school girls in China"
Scope of data: This dataset supports research on the development, evaluation, and implementation of an HPV vaccine chatbot intervention for parents of middle school girls in China. The research employed a multi-phase approach. The first phase developed and evaluated an HPV vaccine chatbot based on a large language model (LLM) integrated with a China-specific HPV vaccine knowledge base. The second phase implemented the chatbot and assessed its effectiveness in improving HPV vaccination uptake among middle school girls and HPV-related literacy among their parents through a cluster randomized trial (cRCT). The trial involved 2,671 parents from 180 middle school classes across four sites (the megacity of Shanghai and three regions in Anhui province: one urban and two rural areas) and was conducted from January to May 2024.
Data files description: The dataset comprises seven key files organized by research phase. The development phase (Chapter 3) data include: (1) an HPV vaccine knowledge benchmark dataset with 212 validated questions used to evaluate LLM performance; (2) an HPV vaccine question dataset containing 107 parent-relevant questions with responses from different chatbot personas; and (3) a Chinese HPV vaccine knowledge base with 40 core knowledge points specific to the Chinese context. The implementation phase (Chapter 4) data include: (4) a comprehensive dataset of the 2,671 trial participants with demographic information and outcome measures; (5) baseline data for all 3,304 initially consented participants; (6) detailed chatbot engagement metrics for 1,051 intervention group parents; and (7) a collection of 16,950 user questions submitted to the chatbot during implementation.