Patient Data Matching

Beth Walsh | January 30, 2015 | EMR/EHR

Meaningful Use has driven a dramatic increase in EHR adoption nationwide, and the focus on information exchange makes accurately linking patients with their health information more important and more complex than ever.

The Office of the National Coordinator for Health IT (ONC) issued a report last February that included an environmental scan of current industry capabilities and best practices, along with a review of the patient matching literature. The agency, however, is still working on its strategy and continually updates its position.

Indiana’s success

Among the organizations recognized for successful patient data matching is the Indiana Health Information Exchange (IHIE). The exchange’s relationship with the Regenstrief Institute, an Indianapolis-based informatics and healthcare research organization, led to its creation.

IHIE patient matching is done as part of a repository system that has been functioning in Indiana since the mid-1990s. “We’ve been doing patient matching since day one,” says Interim President and CEO John Kansky. “The patient matching algorithms we rely on today were developed at Regenstrief and continue to be fine-tuned by informaticists from Regenstrief.” He notes that while organizations across the globe have been debating how to go about patient matching, IHIE has been quietly doing so for years.

More than 100 hospitals and different data sources in the state participate and each new organization becomes part and parcel of the patient matching system. “Everything we do at the repository statewide can benefit patients. It’s very egalitarian. If you participate in the exchange then your data get matched. It’s a broader value proposition when done at the HIE level—not just within an enterprise with their business partners.”

Regenstrief published a paper that studied the sensitivity and specificity of the matching algorithm. For reasons of accuracy and privacy, the algorithm is tuned conservatively to avoid false positives, even at the cost of missing some true matches. “I’m sure every single day we’re leaving out information when we go to capture a patient’s data,” Kansky says.
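
To make that trade-off concrete, here is a minimal sketch of how sensitivity and specificity are computed for a matcher. The counts are invented for illustration and are not taken from the Regenstrief paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true matches that the algorithm actually links."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of true non-matches that the algorithm correctly leaves unlinked."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for a conservatively tuned matcher:
# no false links at all, but some genuine matches are missed.
tp, fn, tn, fp = 900, 100, 9000, 0

print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
```

With these assumed counts the matcher never links the wrong patient, but it leaves out one in ten records that truly belong together, which is the kind of cost Kansky describes.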

Among IHIE participants, patient matching is primarily used when a patient walks into an emergency department (ED), Kansky explains, although it applies to other settings as well. In an ED, patients see a triage nurse they’ve never met before and are treated by a physician they’ve never seen before.

“When that patient registers in the ED, we have a real-time interface that gives us enough information about this person that allows us to go to our system of repositories, which contains data from all of the participating organizations, and match that patient with their data from the various organizations and assemble their personal jigsaw puzzle.” By compiling all the information from imaging, specialists, labs and more, ED physicians get as much clinical context as possible.

When the patient matching algorithm surveys different repositories and finds even a small chance that data do not belong to that patient, it leaves them out, says Kansky. “There haven’t been any incidents of mismatched data but I’m absolutely certain we don’t match data that could have been helpful because we’re being conservative.”

He says other regions have not been successful in this area because marketplaces have not collaborated. “It’s not the technology—that’s not the hard part. It’s getting competitors to recognize that all boats can be raised and clinical quality and safety can be raised if you cooperate and collaborate.”

Changes in reimbursement could drive competitors to this type of collaboration, Kansky notes. “Having the ability to match these patients across many, many data silos gives the opportunity for population health management that didn’t exist before. It’s no longer just a person walking into the ED. The market is more motivated to crack this nut because of the need to do population health management.”

Another challenge in patient matching is accuracy evaluations, says Shaun Grannis, MD, MS, medical informatics research scientist at Regenstrief. "Few people in the record linking space are doing accuracy evaluations. Vendors don’t do them and others lack access to large, heterogeneous representations of patient data." Indiana is fortunate in this area because it has more than 26 million records across diverse settings, he notes.

There is an idea that record linkage is a "black box," Grannis says, so "the more details we can get out there, the better." He worked with a hospital system on its matching algorithm and, "once we got under the hood, we were doing exact matching on five variables. It's incredibly simplistic. I think a lot of people either lack the desire or the capability to dig into this. People will tackle other priorities before diving into the complexity of patient matching."

Investing in the challenge

Intermountain Healthcare has both the desire and capacity, spending more than $2 million a year just trying to manage and correct errors caused by identity, according to Sid Thornton, PhD, medical informatics director at the Salt Lake City organization’s Homer Warner Center for Informatics Research. “It’s a continuing challenge and we’ll be investing in it for quite a while until it’s solved.”

Patient matching ties into Intermountain’s commitment to providing its patients with the best possible care at the lowest possible cost through its computer-assisted decision-making strategy. “Computers can help reduce deviations in standards of care and we can understand and interpret what is true and justified variability from just errors.”

Doing that requires that computers can talk to each other, he says. “Shared accountability or a shared risk payment model relies on the efficiency of our computers to understand whether they are talking about the same person.”

Currently, Intermountain patients are represented in the information systems by constellations of demographic data which evolve and change over time, Thornton explains. Many times those changes are based on the context of how the information is gathered or what the current conditions are.

Intermountain has a group of eight employees dedicated to correcting errors after they’ve been detected. They are working on reducing those errors by training employees in such tactics as always checking patients against government-issued identification. And, he suggests that organizations have patients “give a full complement of information even if you’re confident that you know who they are.”

Awareness of the issue is another consideration. A clinician might think a lab specimen or visit summary isn’t a big deal when it will actually be associated with a lifetime record.

While you might think that large systems are the only providers tackling this issue, that’s not necessarily the case. “We have found organizations, large and small, and their interest and support of issues of identity really correlate with their mission to provide longitudinal care vs. episodic care. There are small offices that are very concerned with this.”

When everything was organization-centric the primary concern was false-positive linking of information. Despite using the best algorithms and best practices in scheduling and registration, “because of fundamental flaws of demographic representation, we still have to employ people to fix mistakes,” he says.

The shift to community-based care plans and reliance on information from multiple organizations is turning attention away from misassociated data to false-negatives, Thornton says, which is equally bad. “Missing data or not finding them when they should be there is as risky and as costly to the organization as mislinked information.”

Intermountain has had to up its game, he says, tightening thresholds so they tolerate far fewer false negatives without letting in more false positives. “We’ve got to get smarter.” He’s begun working with national efforts, such as Healtheway protocols, but that introduces new problems as well. For example, they need to make sure computers can work to support efficiency and quality gains without having to increase the size of Intermountain’s manual error correction processes.

The organization is working on a pilot project to get partners across its community to talk about what the qualitative information should be. “Historically, our computers have made binary decisions. Yes, you are this person or no, you are not associated with this identifier,” says Thornton. It’s now more of a probabilistic question as in there is a 75 percent or 51 percent chance that this is the correct patient. “If it’s just over the threshold, we’re golden,” says Thornton. But, “there’s quite a bit of difference between linkage with 30 years of attested link information and a new record where no clinical information is associated but it looks like it’s close.”
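
One common way to turn the yes/no decision into a graded one is log-odds scoring in the style of the Fellegi-Sunter record linkage model. The sketch below is illustrative only and is not Intermountain’s method: the field names, the m and u probabilities and the two thresholds are all assumed values.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records are the same person)
#   u = P(field agrees | the two records are different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.97, 0.001),
    "zip_code": (0.80, 0.05),
}


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Sum log2 agreement/disagreement weights over the compared fields."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            score += math.log2(m / u)  # agreement raises the score
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return score


def classify(score: float, upper: float = 10.0, lower: float = 0.0) -> str:
    """Assumed thresholds: link, reject, or send to a person for review."""
    if score >= upper:
        return "link"
    if score <= lower:
        return "no link"
    return "manual review"


incoming = {"last_name": "SMITH", "first_name": "JEFF",
            "birth_date": "1980-03-07", "zip_code": "46202"}
on_file = {"last_name": "SMITH", "first_name": "GEOFF",
           "birth_date": "1980-07-03", "zip_code": "46202"}
print(classify(match_score(incoming, on_file)))
```

Keeping the numeric score, rather than only the final decision, is what lets a reviewer distinguish a pair that barely clears the threshold from one backed by strong agreement.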

Computers today are smart enough to tell the difference, he notes, when you think about Google and other sites using ranking mechanisms to sort through billions of records to bring forth those that are the most relevant. “That type of technology is available to us for linking and changing demographic profiles.” In that way, healthcare can start to bring qualitative information into reviews whether they are computer-assisted or manual. “If we have a way to retain and build that qualitative scoring of those linkages over time you can improve and reuse it. That’s the philosophic basis of these services.”

As part of the pilot project, Intermountain is actively researching the qualitative impact of understanding how one entity is associated with another, whether that is provider-provider, patient-provider, patient-patient and more. “The patterns that emerge based on both the active and passive usages give really tremendous insight into some of these hard-to-find statistical dilemmas which are, unfortunately, part of the imperfect demographic representation we use today.”

Thornton is interested in investing in standards and an open source, open communication toolset everyone can access. Pooling resources, use cases and building services will raise the whole ecosystem, information exchange in particular, he says.

The ‘three-legged stool’

New innovators-in-residence are part of the federal government’s patient matching initiative. A past fellow of biomedical informatics at both the National Library of Medicine and the Regenstrief Institute, Adam Culbertson, MS, recently began a two-year stint as innovator-in-residence at the Department of Health & Human Services. Sponsored by HIMSS, he will spend his time working on patient data matching.

Culbertson describes a “three-legged stool” of record linkage: data quality, processes and algorithms.

Data quality is the most important, he says, because algorithms are the downstream consumer of the data and they can’t resolve sloppy or incorrect information. Improved data quality improves match rates. “As we increase the number of patients, matching those individual patients as distinct and unique entities becomes more and more difficult.”

For example, in a small town there might be just one patient named Jeff Smith. But, let's say we move to a larger area and there are now 1,000 Jeff Smiths. Which Jeff Smith are you looking for? They could each be distinct or many could be duplicates. “We have to go to date of birth and start adding different unique pieces of information about that individual to say he is the right person.” Without one defining identifier, matching occurs by combining different attributes. However, dates of birth can be entered incorrectly or in different ways. Names can be entered incorrectly or in different ways.
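
As a rough illustration of the “entered in different ways” problem, the sketch below normalizes names and dates of birth before any comparison takes place. The accepted date formats and the cleanup rules are assumptions made for the example (including US month-first ordering), not a standard.

```python
import unicodedata
from datetime import datetime

# Assumed input formats; month-first ordering is a US convention.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%m-%d-%Y")


def normalize_name(name: str) -> str:
    """Strip accents and punctuation, collapse spaces, and uppercase."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = "".join(c for c in decomposed if c.isalpha() or c.isspace())
    return " ".join(kept.upper().split())


def normalize_birth_date(raw: str):
    """Return the date as YYYY-MM-DD, or None if no known format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(normalize_name("  jeff   Smith "))
print(normalize_birth_date("03/07/1980"))
print(normalize_birth_date("1980-03-07"))
```

Normalization only removes differences in formatting. It cannot repair a date of birth that was typed incorrectly, which is why the other attributes still matter.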

Regarding processes, healthcare organizations have a wide range of variability in quality assurance programs around patient matching, Culbertson says. For example, before registering a new patient, the registrar should query the master patient index to make sure the patient doesn’t already exist in the system rather than create a duplicate record. “Having duplicate records is a challenge. If there is no process to make sure a patient is only added once, he or she could exist as two separate entities and the records are now fragmented.”
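
The query-before-create step can be sketched in a few lines. The class below is a hypothetical, in-memory stand-in for a master patient index and uses a deliberately simple exact lookup on name and date of birth. A production index would search far more flexibly than this.

```python
class MasterPatientIndex:
    """Hypothetical in-memory index, for illustration only."""

    def __init__(self):
        self._records = {}  # lookup key -> patient id
        self._next_id = 1

    @staticmethod
    def _key(last_name: str, first_name: str, birth_date: str):
        # Trim and uppercase so trivial typing differences do not split a patient.
        return (last_name.strip().upper(), first_name.strip().upper(), birth_date)

    def find(self, last_name: str, first_name: str, birth_date: str):
        """Return the existing patient id, or None if nothing is on file."""
        return self._records.get(self._key(last_name, first_name, birth_date))

    def register(self, last_name: str, first_name: str, birth_date: str):
        """Query first; create a record only if none exists.

        Returns (patient_id, created).
        """
        existing = self.find(last_name, first_name, birth_date)
        if existing is not None:
            return existing, False
        patient_id = self._next_id
        self._next_id += 1
        self._records[self._key(last_name, first_name, birth_date)] = patient_id
        return patient_id, True


mpi = MasterPatientIndex()
print(mpi.register("Smith", "Jeff", "1980-03-07"))
print(mpi.register(" smith ", "JEFF", "1980-03-07"))
```

The second registration returns the existing identifier instead of creating a new one, so the patient’s records stay together.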

While there is room for improvement of patient matching algorithms, Culbertson notes that they are only as good as the data with which they are working. “Algorithms can’t resolve terrible data.”

Grannis agrees. "The better the fuel you put in the engine, the better the engine will run over time. We would need fewer algorithms if we didn’t have to accommodate for the messiness of the data gathered today." He also cites the need for ongoing training of the people who gather these data. It's typically the lowest paid employees who are gathering data and they need education on the importance of their role.

At the end of his two-year residency, Culbertson says he hopes for a common national discussion around patient matching. "There are varied views on what people view as patient matching. There are a lot of different perspectives. As we move forward, it’s important to have a common language and shared discourse about what we mean by data quality and how we evaluate algorithm performance and processes.”

Rather than federal oversight, industry consensus and buy-in are more likely to advance successful patient matching, he says. And, while healthcare can learn from other industries about technological solutions for identity management that could be applied to healthcare, patient matching certainly is a unique challenge. Vendors, too, play a big role, Culbertson says. “There is a big opportunity for both vendors and providers. The end goal is getting all of us focused on how to make care the most cost-effective and safest for patients through the sharing of patient health data in a safe and secure manner.”

Looking ahead

Patient matching remains a complex problem because the healthcare system is always changing, says Grannis. "Fifteen years ago, we weren't contemplating HIE and all these different sources of data. The needs of the healthcare system continue to evolve so the requirements for patient matching evolve. The more sources of data I have, the more sophistication my matching approach has to have."

Issues that capture national attention, such as the recent Ebola scare, expose weaknesses of the system, Grannis says. "If we get enough cases like that, we will see more rapid change than seven to 10 years." Anytime the issue of patient identification comes up, the question of a national identifier also is raised. However, ONC and other federal agencies have been legislatively prohibited from spending time and money on such a system.

“A lot of the arguments presented in favor of having a national identifier really are valid in the underpinnings of our methodology but without having to have the administration,” says Thornton.

The system he has been working on, however, is similar to a national identifier maintained within the Intermountain community but built from the grassroots and based on data access as opposed to a mandated enrollment-type process. Comparing the system to credit cards, “we get these de facto frameworks that allow for the trusted linkages of data to the identifiers that are assigned by the computer systems.”

And speaking of a national scope, “a national HIE is not going to happen,” says Kansky. The federal government is focused on a solution that will work for all 50 states but most healthcare is local. And, existing HIEs are in kindergarten or first grade when considering the entire scope of information exchange, he says.

“There are HIEs that aren’t attempting to patient match or save data. They’re routing data but they’re not going into the more difficult, sophisticated waters of having to persist data on behalf of their participants.” He predicts more HIEs going into these areas, however, and points to pressure from two directions. One is that “it’s hard to run a sustainable organization if the only value proposition doesn’t require persisting of data or patient matching. They’re leaving a lot of value opportunities off the table.”

Meanwhile, there are increasingly compelling business reasons to offer these more sophisticated processes, he says. “People understand interoperability better than they did five years ago which means that the right amount of leaders in a given market or state are going to understand the problem and be compelled to take on more sophisticated data persisting and patient matching.”

In other predictions, Thornton says 95 percent of the healthcare population is going to be resolved in terms of identity within the next five years. In another five to seven years, their care team linkages are going to be resolved.

Grannis agrees with that timeframe. "I'm hopeful that over the next five to 10 years we'll see some real progress in this space. Identity management is one of the linchpins of a functional learning integrated healthcare system. It's going to be important that we develop common understanding and a strategy for addressing this problem. A key component of this system is really left unaddressed."

The lack of an optimal approach to matching leads to inappropriate care and higher costs, he says. Indiana has been doing well with its system that matches patients across the entire state. However, "if we grow to 10 or 20 million people, I have no doubt that eventually these probabilistic, sophisticated matching systems will begin to fail. If we want to match at large scale, we need more discriminating data."

Patient matching has been part of the fundamental issues going into Meaningful Use Stage 3, Thornton adds, with proposals for standard mechanisms for best practices for identity management. “Stage 2 bypassed a lot of that by insisting that we get data moving through Direct first but once we get into XDA mechanisms where we have much more structured exchange, we will have to have adopted with them best practices for identity management.”

Industries that have the information, such as communications and finance, will come up with affordable and clever ways to help us refine the process, he says. Rather than repeating the same learning over and over every time, organizations won’t have to start from scratch.

“It’s no longer about having a better algorithm,” says Thornton. Healthcare is still working on 1960s technology because it works. The flaws are in the way people represent themselves and the way we associate demographics with entities, he says. “That’s where we’ll start to see a lot of innovation. It’s not a coding problem, it’s a cultural issue.”

Another look at health data privacy

Are the privacy and security issues seen in healthcare really any different from those in other industries?

Micky Tripathi, president and CEO of the Massachusetts eHealth Collaborative and member of ONC’s federal policy committees, isn’t so sure. “My healthcare data has never been breached that I know of but my credit card information has at least four or five times.”

While he acknowledges there should be an expectation of zero breaches, the same standards for encryption and other security measures are used in healthcare and other industries. Healthcare, however, has a huge range of implementation, because the industry is so fragmented.

In banking, for example, a small bank is still a big organization. In healthcare, there are organizations with just one or two people. It’s unlikely that those organizations are implementing the latest security standards. Healthcare needs greater education and better implementation and monitoring processes, Tripathi says. “It’s getting better but it’s still an issue.”

As chair of the federal JASON Task Force that is working on a national interoperability roadmap, he says one big concept in their work is privacy bundling. “The problem with that is that no one knows what it is.” And, the JASON process does not allow for anyone to go back and learn more. The general concept is one place with a list of all providers and perhaps the types of information they have.

Consumers could go in and allow a hospital to release, for example, their blood pressure, height and weight to other providers, researchers and health plans. There would be a complex matrix to check off all those things and it would somehow be communicated so all providers and all potential recipients would understand in a way that doesn’t reveal private information.

It’s very complex, however. For example, McLean Hospital is a widely recognized psychiatric hospital in Massachusetts. Just the fact that a patient was treated there is revealing of his or her condition. The work group is trying to untangle these vexing issues, Tripathi says.

Aside from ensuring that senders and receivers are who they say they are, there is an appropriateness question. “Let’s say I have your consent and we’ve determined that I am who I say I am and I have given my provider consent—patients and vendors still have a very distinct interest in appropriate use of that information,” says Tripathi. Beyond that, misinterpretation is a real concern. “One provider could document in a certain way and another read that and not interpret the information in the same way.”

Tripathi has been hard at work on the interoperability roadmap, which focuses on public APIs. The term has some people confused. “It does not mean that anyone can come and get data. The idea is more from the developer perspective in that the technical specifications would be made public.” Developers and entrepreneurs would be able to look at the specs, much as they can today when writing an app that builds upon Twitter or Facebook, and successfully interface new products without needing a software engineer.

This kind of modern software design and use of nonproprietary standards would make it possible for ONC to certify products and offer its seal of approval.

Beth Walsh, Editor

Editor Beth earned a bachelor’s degree in journalism and master’s in health communication. She has worked in hospital, academic and publishing settings over the past 20 years. Beth joined TriMed in 2005, as editor of CMIO and Clinical Innovation + Technology. When not covering all things related to health IT, she spends time with her husband and three children.