Four Keys to Understanding Big Data in Healthcare


Extracting value from “Big Data” has become a pursuit across all industries as the Internet of Things and a proliferation of data-producing devices in manufacturing, retail, consumer goods, communications and healthcare overwhelm typical software and database tools.  Our research on big data in healthcare is unearthing many viewpoints, including IBM’s finding that so much information is being generated (and at an increasing pace) that 90% of the data in the world has been created in the last two years alone.  For industries like retail, unlocking certain data sets around inventory controls and consumer targeting can provide a tangible financial incentive through cost reductions and increased top-line revenue.  However, the benefits to complex industries like healthcare are more ambiguous.

Big Data, as defined by the SAS Institute, is “the exponential growth, availability and use of information, both structured and unstructured.”  McKinsey adds a slight twist on the definition by describing big data as “datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.”  Big data, then, should be considered a relative term – as the pace of data production increases, so must the capability of the technology that captures and dissects the data.

The migration away from paper records in healthcare is a timely example of increases in data production.  While the paper-to-digital transition is important, the road to attaining meaningful results from the records is fraught with technological challenges in both the capture (EHR, record storage) and analysis (data mining, business intelligence) of the data.  Current initiatives such as the Health Information Technology for Economic and Clinical Health (HITECH) Act support the effort to aggregate, analyze, and electronically share previously disparate records across healthcare constituents.  They are a giant step in the right direction but fail to take full advantage of the granular data produced by the systems they require.  Clearly, the current initiatives are not a panacea for all healthcare data problems.

Still, the HITECH Act is serving as a catalyst for change across other healthcare issues.  Tom Zajac, the President of MEDai, argues that “government regulations often act to jump start the industry into moving into a direction it already knows is the right course.  And once the industry gains traction, it outruns the government by internally implementing the changes that are needed to take health care to the next level.”  With massive industry transitions such as EMR implementations, ICD-9 to ICD-10 conversions, and the increasing reliance on mobile technology in healthcare all converging at the same time, it is apparent that the Pandora’s Box of healthcare data production has been opened.  In light of these challenges, our research has identified four components of the shift to Big Data.

  • Privacy:  Healthcare information is heavily regulated and protected.  Rules such as HIPAA set standards for electronic protected health information and the privacy of individually identifiable health information, and current policies may need to be adjusted or rewritten to optimize the flow of data across the healthcare system.
  • Fragmentation:  The fragmented nature of the healthcare industry restricts the free flow of data between systems and seemingly limits the value of big data within the industry (see McKinsey diagram below).  Currently, there is little overlap or sharing of these data sets.  While big data approaches can provide some benefit to each of the four primary data pools (clinical, claims, R&D, and behavioral), the benefit would be much greater if all forms of data were combined.  Providing the right set of incentives and regulations to promote collaboration is a formidable obstacle to overcome.




  • Standards:  The many types of data in healthcare pose a large problem due to incompatible formats and lack of standardization.  Without data standardization, it is impossible to arrive at meaningful, trustworthy conclusions from data analysis – i.e. a single version of the truth.  In a clinical setting, the implementation of EHR systems will alleviate some issues, but extracting meaningful information from them is still in its infancy.
  • Incentives:  The financial incentives of using big data within the health system are not equal across healthcare constituents or differing reimbursement models.  Consider traditional “fee-for-service” models – why would a provider invest in systems to capture and analyze big data when it could result in more efficient patient treatment, translating into less reimbursement?  However, in a gain-sharing or capitation world, the value proposition for providers grows markedly.

Leveraging big data concepts and implementing solutions will be a worthwhile pursuit in healthcare, but the range of issues listed above points to government needing to become a catalyst in leading the way.  Healthcare costs will be the key driver, but we’d like to know what you think.

Jeff Farnell