A machine learning scan of popular social media platforms could help inform public health education and outreach during COVID-19, according to a new study published in the Journal of General Internal Medicine.
The study measured daily changes in the frequency of discussion topics across 94,467 COVID-19-related comments posted on Reddit* in March 2020 to test the following hypotheses:
- The pandemic has been accompanied by an “infodemic,” an overabundance of information and misinformation.
- Public response to the pandemic and infodemic is important, but undermeasured.
- Real-time analysis of public response could lead to earlier recognition of changing public priorities, fluctuations in wellness, and uptake of public health measures, all of which carry implications for individual- and population-level health.
*Reddit is the 19th most popular website in the world with 420 million monthly active users.
Tracking daily variations in the average prevalence of topics across all comments, the study identified topics that fell into three categories of interest:
- response to public health measures
- impact on daily life
- sense of pandemic severity
This analysis indicates that longitudinal topic modeling of Reddit content is effective in identifying patterns of public dialogue and could be used to guide targeted interventions.
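The daily tracking described above can be illustrated with a minimal sketch. It assumes each comment has already been assigned per-topic proportions by some topic model; the input format, the function name `daily_topic_prevalence`, and the example values are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date


def daily_topic_prevalence(comments):
    """Average each topic's weight over all comments posted on the same day.

    `comments` is an iterable of (day, weights) pairs, where `weights` is a
    list of per-topic proportions for one comment (hypothetical input format).
    Returns {day: [mean weight of topic 0, mean weight of topic 1, ...]}.
    """
    sums = {}
    counts = defaultdict(int)
    for day, weights in comments:
        if day not in sums:
            sums[day] = [0.0] * len(weights)
        for k, w in enumerate(weights):
            sums[day][k] += w
        counts[day] += 1
    # Divide each day's per-topic totals by that day's comment count.
    return {
        day: [total / counts[day] for total in totals]
        for day, totals in sorted(sums.items())
    }


# Made-up topic weights for illustration only (not data from the study).
example = [
    (date(2020, 3, 1), [0.5, 0.25, 0.25]),
    (date(2020, 3, 1), [0.25, 0.5, 0.25]),
    (date(2020, 3, 2), [0.0, 0.5, 0.5]),
]
prevalence = daily_topic_prevalence(example)
```

Plotting each topic's daily mean over the month would then show when a given concern rises or fades, which is the kind of pattern the study reports.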
Early recognition of this reality could have led to more specific information dissemination campaigns and earlier public acknowledgment of disease severity.
Questions about safely spending time outdoors peaked in mid-March, representing a missed opportunity for public guidance.
Tracking and responding proactively to common questions, such as what material is best used for a homemade mask, may minimize the spread of misinformation.
Notably missing from these Reddit topics were discussions of contact tracing, a growing area of public concern.
Limitations of this study include that Reddit users are not representative of all segments of the population, and that Reddit data is not associated with a geographic location. Real-time monitoring of online COVID-19 dialogue holds promise for more dynamically understanding and responding to needs in public health emergencies.
Stokes, D.C., Andy, A., Guntuku, S.C. et al. Public Priorities and Concerns Regarding COVID-19 in an Online Discussion Forum: Longitudinal Topic Modeling. J Gen Intern Med (2020). https://doi.org/10.1007/s11606-020-05889-w