Aggregated, analyzed web searches predict spikes, falloffs in COVID cases

Emulating finance’s use of satellite parking-lot imagery to guide retail investments, researchers have tapped Google search patterns to predict ebbs and flows of COVID cases across the U.S.

The work was conducted at New York University and is presented in Social Network Analysis and Mining.

The team focused on broad and specific search terms that web surfers tend to use before heading out to dine, shop, socialize or be entertained.

They then checked these “mobility predictors” against COVID case counts in the respective states 10 to 14 days after the searches, when exposure would likely begin to produce symptoms.

The study authors report finding agreement between their net movement index and weekly reported cases in 42 of 50 states.
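For readers curious how a lagged comparison of this kind can be set up, the following is a minimal sketch, not the study authors' method. It assumes two equal-length weekly series for one state and computes a plain Pearson correlation between search activity in one week and case counts a fixed number of weeks later. The function name, the two-week lag (standing in for the 10-to-14-day window) and all of the numbers are hypothetical and chosen purely for illustration.

```python
import math


def lagged_correlation(search_index, case_counts, lag):
    """Pearson correlation between searches at week t and cases at week t + lag.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(search_index) != len(case_counts):
        raise ValueError("both series must have the same length")
    if lag < 0 or lag > len(search_index) - 2:
        raise ValueError("lag must leave at least two overlapping points")

    # Pair each week's searches with the case count `lag` weeks later.
    xs = search_index[: len(search_index) - lag]
    ys = case_counts[lag:]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up weekly values, NOT data from the study.
LAG_WEEKS = 2  # assumption: roughly matches the 10-to-14-day window
searches = [10, 12, 15, 20, 26, 30, 28, 22, 16, 12, 10, 9]
# Synthetic cases built to trail the searches by exactly LAG_WEEKS.
cases = [100] * LAG_WEEKS + [50 + 5 * s for s in searches[:-LAG_WEEKS]]

same_week = lagged_correlation(searches, cases, 0)
two_weeks_later = lagged_correlation(searches, cases, LAG_WEEKS)
print(f"same week: {same_week:.2f}, two weeks later: {two_weeks_later:.2f}")
```

Because the synthetic case series is constructed to follow the searches by two weeks, the lagged correlation comes out stronger than the same-week one. Real search and case data would be far noisier, and a strong correlation alone would not show that the searches predict the cases.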

Further, homing in on five states for a special case study (Arizona, California, Florida, New York and Texas), they found mobility indexes dropped during initial lockdown periods and shot up when lockdowns ended. Similarly, sudden declines in mobility-indicative Google searches were matched by clear declines in infection rates.

Most tellingly, four of the five states saw rises in COVID cases during post-lockdown spikes in mobility-indicative searches.

In addition, where Google search patterns and volumes indicated people preparing to hole up at home en masse, with “isolation predictors” like “home yoga,” “food delivery” and “how to cut my own hair,” the researchers found dropping infection rates.

“Our work illustrates how a tool based on Google search volumes could, with further study, form part of an early warning system for COVID case resurgence or for other future epidemics,” the authors comment in their discussion. “In combination with other standard public health metrics and social statistics, this is a first step toward building a tool to feed real-time changes in behavior into prediction models.”

The study’s lead author is computer scientist Anasse Bari, PhD, of NYU’s Courant Institute of Mathematical Sciences. Its senior author is infectious disease specialist Megan Coffee, MD, PhD, of NYU Langone Health.

Anticipating concerns over privacy, Bari et al. underscore their use of only aggregated and anonymized data from massive repositories.

In internal coverage of the research by NYU’s news operation, undergraduate co-author Aashish Khubchandani says the team next plans to “build a knowledge base on human behavior change from alternative data during the life cycle of the pandemic in order to allow machine learning to predict behavior in future epidemics.”

The study is available in full for free.

Dave Pearson

Dave P. has worked in journalism, marketing and public relations for more than 30 years, frequently concentrating on hospitals, healthcare technology and Catholic communications. He has also specialized in fundraising communications, ghostwriting for CEOs of local, national and global charities, nonprofits and foundations.
