U. of Chicago launching clinical data commons for oncology

BOSTON—Clinical data commons is the latest initiative at the University of Chicago that aims to foster innovation around data.

Robert Grossman, PhD, chief research informatics officer and director of the Center for Data Intensive Science at the university, spoke at Bio-IT World Congress.

His team plans to launch a large genomic and associated clinical data commons early this June. It’s designed for API access. “What better place than here to find like-minded individuals?”

Data commons co-locate data, storage and computing infrastructure and commonly used tools for analyzing and sharing data to create an interoperable resource, he explained. It’s a term used by the National Institutes of Health and is an idea that’s been around for a long time, he added.

Data commons can vary widely in size and in how they operate. This commons will go live with about 3 petabytes of oncology data, Grossman said, and 1.54 petabytes of legacy data. “The goal is to make this available to the research community. We want to interoperate with other people’s systems.”

The team is motivated by the International Cancer Genome Consortium (ICGC). “It gave us a nice, modern interface.”

Cancer is a very complex disease, Grossman said, but “the surprising thing is that a lot of papers are making inferences from nonharmonized data. Papers are typically saved by using biological validation. I’m not saying there’s anything wrong with the results but these cloud-based systems are one of the ways we can deal with the lack of harmonization.”

Standards are emerging for identification so the commons’ infrastructure is designed to support multiple digital identifications, he said. “The assumption is that for the next 50-75 years, we won’t decide on the right way to do it.”

Research data should be available at no charge, he said. Core access is free among this consortium. The internet was built by large ISPs that agreed to share traffic at no cost to users, with switches placed in critical locations, Grossman explained. “We don’t have that principle right now with data. It’s always inexpensive to get data in but can be quite difficult to figure out how much it costs to get it out.”

Grossman also wants to make portability easy. The preferred access is API, not download. Getting data out in a way that it can be put somewhere else is extremely difficult, he said. He’s seeking an “indigo button,” noting the success of the Blue Button initiative. “We’re looking for people to work with us to discover the format.”

The University of Chicago’s data commons is cloud neutral, interoperating with Amazon today and shortly with more cloud providers. “We just want to use the cloud that makes sense,” said Grossman.

With current scaling, he said they can get about 10,000 more patients into the commons every three years. The goal is to try to scale up to 100,000 patients. “It’s not that we couldn’t use cloud centers to do this now but I want to do it in such a way that we can interoperate and work with the community in a federated, safe way.”

It’s clear you can make a lot of money in this space, he said, but “we want sustainability.” Companies can acquire data, silo it and monetize it, but will there be enough research for researchers to treat cancer the way they want, he asked. “We need a critical mass of data. What are the rules of that ecosystem?” By the time data is in a clinical trial, some of it is available but not all of it. There’s an intermediate ground of off-label use and weighing the strength of evidence to decide how to treat. That needs further discovery, Grossman said, regarding how it works, what’s public and what should be shared.

Beth Walsh, Editor

Beth earned a bachelor’s degree in journalism and master’s in health communication. She has worked in hospital, academic and publishing settings over the past 20 years. Beth joined TriMed in 2005 as editor of CMIO and Clinical Innovation + Technology. When not covering all things related to health IT, she spends time with her husband and three children.
