Aggregated, analyzed web searches predict spikes, falloffs in COVID cases
Emulating finance’s use of satellite imagery of parking lots to guide retail investments, researchers have tapped Google search patterns to predict the ebbs and flows of COVID cases across the U.S.
The work was conducted at New York University and is presented in Social Network Analysis and Mining.
The team focused on broad and specific search terms that web surfers tend to use before heading out to dine, shop, socialize or be entertained.
They then checked these “mobility predictors” against COVID case counts in respective states 10 to 14 days after the searches, when exposure would likely begin to produce symptoms.
The study authors report finding agreement between their net movement index and weekly reported cases in 42 of 50 states.
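For readers curious how a lagged comparison of this kind might look in practice, here is a minimal sketch. It is not the authors' method: the article does not say which statistic the team used, so the Pearson correlation, the daily granularity and every name below (`pearson`, `lagged_correlation`, `best_lag`) are assumptions made purely for illustration. The only detail taken from the study is the 10-to-14-day window between searches and case counts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(search_index, cases, lag_days):
    """Correlate the search index on day t with case counts on day t + lag_days.

    Both inputs are assumed to be daily series covering the same dates.
    """
    if len(search_index) != len(cases):
        raise ValueError("series must cover the same days")
    if lag_days < 0 or lag_days >= len(search_index) - 1:
        raise ValueError("lag must leave at least 2 overlapping points")
    earlier_searches = search_index[: len(search_index) - lag_days]
    later_cases = cases[lag_days:]
    return pearson(earlier_searches, later_cases)


def best_lag(search_index, cases, lags=range(10, 15)):
    """Return the lag in the 10-to-14-day window with the highest correlation."""
    return max(lags, key=lambda lag: lagged_correlation(search_index, cases, lag))
```

Given a state's daily mobility search index and its daily case counts as two lists of equal length, `best_lag(search_index, cases)` would report which delay in that window lines the two series up most closely. A high correlation at that delay would be consistent with the agreement the authors describe, though it would not by itself show that searches predict cases.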
Further, homing in on five states for a special case study—Arizona, California, Florida, New York and Texas—they found mobility indexes dropped during initial lockdown periods and shot up when lockdowns ended. Similarly, sudden declines in mobility-indicative Google searches matched with clear declines in infection rates.
Most tellingly, four of the five states saw rises in COVID cases during post-lockdown spikes in mobility-indicative searches.
In addition, when Google search patterns and volumes indicated that people were preparing to hole up at home en masse, as shown by “isolation predictors” like “home yoga,” “food delivery” and “how to cut my own hair,” the researchers found dropping infection rates.
“Our work illustrates how a tool based on Google search volumes could, with further study, form part of an early warning system for COVID case resurgence or for other future epidemics,” the authors comment in their discussion. “In combination with other standard public health metrics and social statistics, this is a first step toward building a tool to feed real-time changes in behavior into prediction models.”
The study’s lead author is computer scientist Anasse Bari, PhD, of NYU’s Courant Institute of Mathematical Sciences. Its senior author is infectious disease specialist Megan Coffee, MD, PhD, of NYU Langone Health.
Anticipating concerns over privacy, Bari et al. underscore their use of only aggregated and anonymized data from massive repositories.
In internal coverage of the research by NYU’s news operation, undergraduate co-author Aashish Khubchandani says the team next plans to “build a knowledge base on human behavior change from alternative data during the life cycle of the pandemic in order to allow machine learning to predict behavior in future epidemics.”
The study is available in full for free.