
Data Quality Matters

Written by Charlie Harp

February 24, 2021 at 10:45 AM

It has been a while since I posted a blog, so first I want to say thanks for hanging in there.


I have spent decades working in healthcare IT and every now and then I encounter situations where someone is struggling to understand why data quality is an issue. What difference does it make if the data is not great? What is the impact? Now, I can always come up with examples, often scary examples where bad quality or missing data can result in something terrible happening. They illustrate the point but are often cloaked in informatics-speak or clinical nuance. So, when I come across an example that I think is fairly accessible and something that I could put a lighthearted spin on, I am delighted to share it, and who better to share it with than our blog subscribers?

So, grab a cup of coffee, sit back in your favorite work-from-home chair and I will tell you a story.

This story takes place across the sea in a kingdom on an island called Great Britain. The protagonist of our story is a strapping young-ish journalist named Liam. As you likely know, Great Britain has a National Health Service, or NHS, and every patient has a unique identifier, and their health information is somewhat standardized. Furthermore, their clinical data is aggregated within regional Clinical Commissioning Groups or CCGs. Compared to the siloed chaos we have here in the United States this environment sounds like a veritable data quality utopia. What kind of cautionary tale could possibly arise from such an idyllic ecosystem, you might ask? Well, during this time of COVID-19 the NHS decided to use algorithms operating against the data to help determine which citizens should be prioritized for vaccination based on their age and medical conditions.

As a person who has committed much of their adult life to healthcare information technology, this situation is exactly the kind of thing we are trying to enable: software that helps make good decisions quickly. So, I want to say that I applaud the initiative and nothing I say from this point forward should detract from that...

The algorithms were run and the invitations to get the vaccine, or ‘the jab’ as they call it in the UK, were sent out. When our hero, Liam, got his invitation he was ‘really confused’ as to why he was receiving it so early in the process. After all, he is in his thirties and has no chronic conditions. Why was he being prioritized above other people who were obviously more vulnerable?

Liam contacted the CCG to ask why he had been prioritized. What did they know that he did not? Was he a long-lost member of the royal family or did he have some terrible condition that the NHS had not informed him of? It turns out it was the latter. The CCG informed Liam that he was morbidly obese...

This was a surprise to Liam, who did not feel morbidly obese. Could he lose a few pounds? Maybe, but morbidly obese? "You are, indeed," the CCG informed him, presumably assuming Liam was in denial, because his record showed a Body Mass Index, or BMI, of over 28,000. Now, for those of you who are not familiar with the formula used to calculate BMI, it is your weight in pounds times a constant of 703 divided by the square of your height in inches. At a height of 6 feet 2 inches, to achieve a BMI of greater than 28,000, Liam would have had to weigh in at an impressive 224,000 pounds, which is roughly the weight of a railroad locomotive. Now, Liam glanced around his flat, making sure it was not a roundhouse, and informed them that there must be some kind of mistake. They confirmed that his weight on record was a bit over 200 pounds but that was not the issue. The issue was his diminutive height of 6.2 centimeters. That’s right, according to the NHS Liam was two and a half inches tall.
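For readers who like to see the arithmetic, here is a minimal Python sketch of the BMI formula described above. The function name `bmi_imperial` and the weight of exactly 200 pounds are my own assumptions for illustration (the story only says "a bit over 200 pounds"), so the exact result will differ from the figure on Liam's record.

```python
CM_PER_INCH = 2.54


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """BMI from pounds and inches: 703 * weight / height squared."""
    return 703 * weight_lb / height_in ** 2


# Assumption: the story only says "a bit over 200 pounds".
weight_lb = 200.0

# Liam's actual height: 6 feet 2 inches, expressed in inches.
actual_height_in = 6 * 12 + 2

# The height on record: 6.2 centimeters, converted to inches.
recorded_height_in = 6.2 / CM_PER_INCH

bmi_actual = bmi_imperial(weight_lb, actual_height_in)
bmi_recorded = bmi_imperial(weight_lb, recorded_height_in)

print(f"BMI at 6 ft 2 in: {bmi_actual:.1f}")
print(f"BMI at 6.2 cm:    {bmi_recorded:,.0f}")
```

With the assumed 200 pounds, the first figure lands in the mid-20s and the second in the tens of thousands. Nothing about the weight changed; only the unit attached to the height did.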

The moral of the story, ladies and gentlemen, is that data quality matters the minute we ask software to help us make decisions using that data. In this case it resulted in a whimsical story that conjures adorable images of a tiny, albeit a bit chunky figure, maybe dressed in a green suit and top hat (sorry Liam), getting an early immunization. But it could also be something more serious, something that resulted in a missed intervention or a wrong intervention. What I also would like to highlight about this story is that it brings into focus that data quality is not always about terminology. In this scenario it was a ubiquitous entity that lives between terminology and quantitative data, the unit of measure, that caused the issue. One little mistake with our friend the unit of measure opens the door to algorithmic mayhem.
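One way to keep the unit of measure from causing this kind of mayhem is to normalize every height to a single unit and reject values that are not plausible for a human being before any algorithm sees them. The sketch below is purely illustrative: the function `height_in_cm`, the unit table, and the plausibility bounds are hypothetical choices of mine, not anything the NHS or a CCG is known to use.

```python
# Conversion factors to centimeters for the units this sketch accepts.
TO_CM = {"cm": 1.0, "m": 100.0, "in": 2.54, "ft": 30.48}

# Assumed plausibility bounds for a recorded human height (illustrative only).
MIN_HEIGHT_CM = 30.0
MAX_HEIGHT_CM = 275.0


def height_in_cm(value: float, unit: str) -> float:
    """Convert a height to centimeters, rejecting unknown units and implausible values."""
    if unit not in TO_CM:
        raise ValueError(f"unknown unit of measure: {unit!r}")
    cm = value * TO_CM[unit]
    if not MIN_HEIGHT_CM <= cm <= MAX_HEIGHT_CM:
        raise ValueError(f"implausible height: {value} {unit} is {cm:.1f} cm")
    return cm


# A value of 6.2 is plausible as feet but not as centimeters.
print(height_in_cm(6.2, "ft"))

try:
    height_in_cm(6.2, "cm")
except ValueError as err:
    print(f"rejected: {err}")
```

A check like this would not tell anyone what Liam's height really is, but it would flag the record for a human to review instead of letting it drive a prioritization decision.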

Thanks to the for sharing Liam’s story and providing me with the opportunity to turn it into a cautionary tale, and thanks to Liam for being a thoughtful member of society and thinking of others.