10 essential ingredients for digital twins in healthcare

This story looks at some fundamental building blocks that work together to build a digital twin infrastructure for medicine. It explains how promising techniques like APIs, graph databases, ontologies, and electronic health records are being combined to unlock digital transformation in healthcare.

Digital twins could transform healthcare with a more integrated approach for capturing data, providing more timely feedback, and enabling more effective interventions. The information required to allow for better simulations lies scattered across medical records, wearables, mobile apps, and pervasive sensors.

Medical digital twins can use raw digital ingredients like natural language processing (NLP), APIs, and graph databases to make sense of all the data and cut through the noise to summarize what is going on. Equally important, these raw ingredients can be reconstituted to craft digital twins of healthcare organizations or of drugs and medical devices to improve medical outcomes and reduce costs. Other industries are likely to benefit by adapting similar ingredients to similar workflows in construction, product development, and supply chain management.

A living data system

One of the key promises of medical digital twins is not just to fix us when we have broken down but to reduce the rate at which we break down. Dan Fero, managing director of OMX Ventures, a new firm investing in digital medicine, told VentureBeat, “A digital twin should represent a living data system that can take in longitudinal biodata over time and track and learn from that evolving data set to give a reflection of a person’s health and, more importantly, health trajectory.”

This starts with measuring and tracking biodata such as cholesterol levels, vitamin panels, and medical imaging results. It will also need to include more complex datapoints, such as genomic, epigenetic, metabolomic, and immune function data.

“At present, we have ‘some’ idea of the importance of these datasets in isolation, but we aren’t truly capable of linking these datasets and using that linkage to understand likely changes to future health outcomes,” Fero said. He believes the next phase lies in encoding the data to create digital twins at scale, an effort currently pursued by only a handful of companies, such as Q Bio.

“I think this is a super fascinating space and will be an evolving area for decades to come as we continue to understand how to take in new biologic data points, sift through them to understand what is important and prognostic of a health change (good or bad), correlate the massive data sets to make sense of the full operating system of life and how that can be tracked longitudinally to track health or disease and to alter long term patient outcomes,” Fero said.

The ingredients for building a digital twin are still a work in progress. But the promise of doing this well is immense. Here is an overview of 10 of these essential ingredients and the role they play in creating medical digital twins:


Electronic health records

The first ingredient is the system of record, which is the Electronic Health Record (EHR) in the healthcare industry. EHR systems capture the interaction with physicians, tracking medications, treatment plans, and outcomes. Leading EHR system providers include Cerner Corporation, Epic Systems, and Meditech.

These systems provide a baseline for organizing static information. They also face challenges when extending beyond existing healthcare workflows or across providers. One University of Utah study found that, in 2018, most implementations failed to catch dangerous or deadly drug combinations 33% of the time. That is a noticeable improvement over 2011, when they missed 46% of prescription errors.

These EHR packages all included the ability to detect when drug combinations would be a problem. The researchers surmised that the issues arose from how each hospital customized these systems for its unique workflows. The upshot is that more work is required to improve data quality and integrate it across multiple systems.

Health Data Analytics Institute CEO Nassib Chamoun told VentureBeat, “Physicians have to make dozens of important decisions on diagnosis and treatment with limited time and incomplete information. Unfortunately, with current EHRs, the quantity and displays of data are overwhelming and disjointed.”


Ontologies

Language is a byproduct of how people describe things in different organizations and contexts. Ontologies help provide order to this chaos by standardizing the meaning of data and its links to other concepts. The medical industry has evolved across many disciplines, leading to a wealth of ontologies. The National Center for Biomedical Ontology currently lists 953 medical ontologies with thirteen million classes.

“Medicine is complicated, and it does not have a complete data model,” said Dave McComb, president of Semantic Arts, a business consulting firm specializing in applying ontologies to business systems, and author of Data-Centric Revolution.

Efforts are afoot to unite these disparate ontologies, including SNOMED-CT, the most exhaustive medical ontology. McComb said these efforts would also need to address the way programmers encode the structure of this data, such as its naming, validation, security, integrity, and meaning in application code. In the meantime, digital twins will rely on tools like intelligent API gateways, NLP, and real-world evidence platforms to bridge the gaps between data silos.
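As a rough illustration of what “standardizing the meaning of data” can look like in practice, the Python sketch below maps the different labels two sources might use onto one shared concept identifier. The concept IDs, the labels, and the `normalize_term` helper are hypothetical placeholders invented for this example; they are not real SNOMED-CT codes or part of any vendor’s product.

```python
from typing import Optional

# Hypothetical mapping from local labels to shared concept IDs.
# The IDs are invented placeholders, not real ontology codes.
SYNONYMS = {
    "heart attack": "CONCEPT:0001",
    "myocardial infarction": "CONCEPT:0001",
    "mi": "CONCEPT:0001",
    "high blood pressure": "CONCEPT:0002",
    "hypertension": "CONCEPT:0002",
}


def normalize_term(label: str) -> Optional[str]:
    """Return the shared concept ID for a label, or None if it is unmapped."""
    return SYNONYMS.get(label.strip().lower())


if __name__ == "__main__":
    for label in ["Heart attack", "Myocardial Infarction", "sleep apnea"]:
        print(label, "->", normalize_term(label))
```

Unmapped labels return `None` rather than a guess, which mirrors the article’s point that gaps between vocabularies still have to be bridged by other tools.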

Graph databases

Graph databases are great for tying together heterogeneous data about different concepts like symptoms and diseases with medical records, test results, and diagnoses into one system. Many digital twin use cases involve weaving together many different types and sources of data to see patterns, which is one of the strengths of graph databases.
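To make the idea concrete, here is a minimal sketch of a graph of patients, symptoms, tests, and diagnoses, written in plain Python rather than against any real graph database. The node names, relationship labels, and helper functions are all assumptions made for illustration; a production system would use a graph database and its own query language.

```python
from collections import defaultdict
from typing import Set

# Tiny in-memory graph: node -> list of (relationship, neighbor) pairs.
# All node names and relationship types are invented for illustration.
graph = defaultdict(list)


def add_edge(source: str, relation: str, target: str) -> None:
    """Store a directed, labeled edge."""
    graph[source].append((relation, target))


add_edge("patient:1", "REPORTED", "symptom:chest_pain")
add_edge("patient:2", "REPORTED", "symptom:chest_pain")
add_edge("patient:3", "REPORTED", "symptom:headache")
add_edge("patient:1", "HAD_TEST", "test:ecg")
add_edge("patient:2", "DIAGNOSED_WITH", "disease:angina")


def neighbors(node: str, relation: str) -> Set[str]:
    """Return the nodes reachable from `node` over edges labeled `relation`."""
    return {target for rel, target in graph[node] if rel == relation}


def patients_sharing_symptoms(patient: str) -> Set[str]:
    """Find other patients who reported at least one of the same symptoms."""
    symptoms = neighbors(patient, "REPORTED")
    return {
        other
        for other in list(graph)
        if other.startswith("patient:")
        and other != patient
        and neighbors(other, "REPORTED") & symptoms
    }


if __name__ == "__main__":
    print(patients_sharing_symptoms("patient:1"))
```

Because symptoms, tests, and diagnoses are all just nodes, a question that spans several record types becomes a short traversal instead of a multi-table join.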

Neo4j director of graph data science Alicia Frame told VentureBeat, “We see many pharmaceutical and insurance companies using graph databases to get more out of their EHRs – importing EHR data into a graph DB to better understand how relationships impact outcomes, or to identify anomalous patterns of behavior.” For example, AstraZeneca uses EHR data and graph databases to better target new-to-market drugs and improve patient outcomes.

One large insurance company uses TigerGraph graph databases to integrate data from over two hundred sources to improve patient history visibility during call center interactions. This gives the agent an instant picture of all diagnoses, claims, prescription refills, and phone interactions. This reduced call center handling time by 10% and increased its net promoter score, reflecting customer satisfaction.

But Frame has seen more limited adoption of graph databases as the database of record for hospital EHR systems like Epic, Cerner, and others. “I attribute this to legacy systems using older technology, and the divide between storing the data (EHRs) and making sense of the data – where we often see graph databases coming into play,” she said.

Down the road, TigerGraph’s healthcare industry practice lead, Andrew Anderson, expects to see graph databases playing a larger role in building community digital twins to measure and improve population health. “Access to care, food insecurities, demographics, and financial factors can only be addressed and predicted by leveraging medical information with, and benchmarking against, the social determinants of health,” he said.


APIs

Whether modeling a patient or a hospital, digital twins are created by leveraging data sources, including electronic health records (EHRs), disease registries, wearables, and more. Gautam Shah of Change Healthcare told VentureBeat, “Regardless of model type, APIs can play an integral role in driving the effective, scalable use of digital twins to improve the healthcare cost-quality curve.”

“Healthcare data sources and formats are highly fragmented in many cases,” said Shah. APIs can help smooth the subtle differences in how data is named, organized, and managed across sources. APIs can also reduce the time to gather, correlate, and prepare data to focus on creating the mechanisms that deliver the underlying value of the digital twin.
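A small sketch can show what smoothing over naming differences might involve. The Python below renames fields from two imagined sources into one common schema. The field names, the alias table, and the `normalize_record` helper are assumptions made up for this example and do not reflect any particular vendor’s API or a healthcare data standard.

```python
from typing import Any, Dict

# Hypothetical aliases: different source field names for the same fact.
FIELD_ALIASES = {
    "patient_id": "patient_id",
    "patientId": "patient_id",
    "hr": "heart_rate_bpm",
    "heartRate": "heart_rate_bpm",
    "heart_rate": "heart_rate_bpm",
}


def normalize_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known fields to a common schema; keep unknown ones under 'extra'."""
    normalized: Dict[str, Any] = {"extra": {}}
    for key, value in raw.items():
        if key in FIELD_ALIASES:
            normalized[FIELD_ALIASES[key]] = value
        else:
            normalized["extra"][key] = value
    return normalized


# Two invented payloads describing the same patient.
ehr_record = {"patientId": "p-1", "heartRate": 72}
wearable_record = {"patient_id": "p-1", "hr": 75, "steps": 4200}

if __name__ == "__main__":
    print(normalize_record(ehr_record))
    print(normalize_record(wearable_record))
```

Keeping unrecognized fields under `extra` instead of dropping them is one possible design choice: nothing is lost while the shared schema is still being worked out.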

Modern API platforms are evolving beyond data delivery pipes to function as intelligent connections. For instance, APIs can help build digital twins for precision medicine by capturing feedback and data and delivering them back to the twin, so the digital twin model is constantly refreshed and updated.

Natural language processing

Medical data often exists across various sources, which can confound efforts to form a holistic picture of a patient, much less a population. “Digital twins can improve overall care by helping with information overload. We’re generating more data than ever before, and no one has time to sort through it all,” said David Talby, CTO of John Snow Labs. For example, if a person goes to see their regular primary care physician, that physician will have a baseline understanding of the patient, their medical history, and medications. If the same patient goes to see a specialist, they may be asked many repetitive questions.

Clinical NLP software can extract information from imaging and free-text data, serving as connective tissue for what can be found in EHRs. For example, Roche uses NLP to build a clinical decision support product portfolio, starting with oncology. The NLP extracts clinical facts from pathology and radiology reports and marries them with other information found in unstructured free-text data to inform better clinical decision-making.

Structured data often characterizes details like whether the patient has a chronic condition, is taking any medication, or has insurance. But other considerations that affect a hospital stay, such as pain level, appetite, and sleep patterns, can only be found in free-text data. NLP can help connect these dots.
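As a deliberately simple illustration of pulling a structured value out of free text, the sketch below uses a regular expression to find a pain score in a clinical note. Real clinical NLP relies on far more capable models; the pattern, the sample phrasing, and the `extract_pain_score` helper are assumptions invented for this example.

```python
import re
from typing import Optional

# Matches phrases such as "pain 4/10" or "pain level of 7 out of 10".
PAIN_PATTERN = re.compile(
    r"pain(?:\s+(?:level|score))?(?:\s+(?:is|of|at))?"
    r"\s+(\d{1,2})\s*(?:/|out of)\s*10",
    re.IGNORECASE,
)


def extract_pain_score(note: str) -> Optional[int]:
    """Return a 0-10 pain score mentioned in the note, or None if absent."""
    match = PAIN_PATTERN.search(note)
    if match is None:
        return None
    score = int(match.group(1))
    # Discard values outside the 0-10 scale rather than guessing.
    return score if 0 <= score <= 10 else None


if __name__ == "__main__":
    print(extract_pain_score("Patient reports pain level of 7 out of 10, poor appetite."))
    print(extract_pain_score("Slept well. No complaints."))
```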


Biosimulation

Biosimulation is a computer-aided mathematical simulation of what happens when a dose of a drug is introduced to a human body. It is a large, complex model that simulates the drug’s transport, metabolism, excretion, and action over time to increase safety and efficacy. Better models promise to increase the productivity of the $200 billion spent on drug development globally. “The development of biosimulation software platforms has been transformative in drug development over the past couple of decades, and this trend is expected to continue,” Certara CEO William Feehery, PhD, told VentureBeat.

The US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) have issued more than two dozen modeling & simulation-related guidance documents addressing drug-drug interactions. And the number of scientific publications that include biosimulation has tripled over the last decade.

One of the most promising areas has been mechanistic biosimulation, which integrates drug and physiological information to create a mathematical modeling framework. These models are instrumental in drug development to predict various untested clinical outcomes. Companies like Certara are taking the concept further by making digital twins of individual patients, replicating the physiological attributes that shape how a drug behaves in each patient’s body and, hence, its effects.
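For a sense of how patient-specific parameters change a simulated result, the sketch below uses the textbook one-compartment model for a single intravenous dose, where concentration is C(t) = (dose / V) · exp(−(CL / V) · t). This is far simpler than the mechanistic models described here and is not Certara’s method; the `VirtualPatient` class and every parameter value are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualPatient:
    """Hypothetical patient parameters for a one-compartment model."""
    volume_l: float           # volume of distribution, in liters
    clearance_l_per_h: float  # clearance, in liters per hour


def concentration(patient: VirtualPatient, dose_mg: float, hours: float) -> float:
    """Drug concentration (mg/L) at `hours` after a single IV bolus dose."""
    elimination_rate = patient.clearance_l_per_h / patient.volume_l  # per hour
    return (dose_mg / patient.volume_l) * math.exp(-elimination_rate * hours)


# Two invented virtual patients that differ only in clearance.
typical = VirtualPatient(volume_l=40.0, clearance_l_per_h=4.0)
reduced_clearance = VirtualPatient(volume_l=40.0, clearance_l_per_h=2.0)

if __name__ == "__main__":
    for hours in (0, 6, 12):
        print(
            hours,
            round(concentration(typical, 100.0, hours), 3),
            round(concentration(reduced_clearance, 100.0, hours), 3),
        )
```

With the same dose, the virtual patient with lower clearance keeps a higher concentration for longer, which is the kind of difference that motivates tailoring doses to subpopulations.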

These advances have helped better target dosing for different subpopulations of patients, such as the elderly and children. “The next step is to take the virtual twin technology into patient care and clinical decision-making to guide personalized medicine,” Feehery said.

Real-world evidence

Researchers often need to query data from various sources to generate insight into a particular question. Real-world evidence (RWE) platforms aggregate and vet raw data to ensure it is used correctly to determine the causal relationships that inform critical decisions. About 75% of all new drug approvals by the FDA in 2020 included some form of RWE. Real-world data can come from EHRs, insurance claims, product and disease registries, medical devices, and wearables. Gathering complete and high-quality data is challenging due to the large variety of data sources and interoperability limitations.

Dr. Khaled El Emam, SVP & general manager of replica analytics at Aetion, said, “These platforms will increase the value of synthetic data or digital twins by enabling customers to infer the same causalities that a researcher would discover in the source data. This goes beyond observable patterns a researcher may spot in analyzing digital twins without the support of an RWE platform to create the appropriate context.”

One big takeaway for other safety-critical industries is the role that RWE workflows can play in improving the management of evidence to ensure the safety of buildings, vehicles, and other things. El Emam said, “Careful consideration of criteria to assure quality and feasibility is a major component in RWE workflows – and should be applied across the entire RWE generation process, from data sources and data processing to defining appropriate use cases.”

Surgical intelligence

Surgical intelligence is a new concept coined by Theator to characterize tools for capturing surgical process data from the surgical theater. “The main innovation lies not only in the structuring of data and new ontologies we create but in the immediate feedback surgeons receive, as soon as they scrub out of a case,” Dr. Tamir Wolf, CEO and cofounder of Theator, told VentureBeat.

It’s similar to other kinds of physical process capture tools in industries like manufacturing and logistics from companies like Drishti and Tulip Interfaces. In medicine, these tools allow surgeons to zero in on specific stages in surgical operations and capture minute details on how procedures were performed.

Wolf said, “One of the first and most crucial steps in enabling hospital systems to deploy digital twins effectively will lie in their ability to collect robust high-quality data about the care being provided, connect performance to outcomes, and disseminate best practices.”

Predictive analytics

One promising aspect of digital twins is that they can help predict the course of a specific combination of symptoms and then assess the odds that various combinations of interventions will lead to recovery. Predictive analytics tools can collaborate with digital twins to match a patient’s digital twin to others with a similar profile.
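One simple way to picture matching a twin to others with a similar profile is a nearest-neighbor search over a few numeric features. The sketch below is an assumption-laden illustration, not HDAI’s or any vendor’s method: the cohort, the feature names, the pre-scaled values, and the `most_similar` helper are all invented.

```python
import math
from typing import Dict, List

Profile = Dict[str, float]

# Invented feature values, already scaled to the 0-1 range, for illustration only.
COHORT: Dict[str, Profile] = {
    "twin_a": {"age": 0.60, "bmi": 0.40, "ldl": 0.70},
    "twin_b": {"age": 0.20, "bmi": 0.30, "ldl": 0.20},
    "twin_c": {"age": 0.65, "bmi": 0.45, "ldl": 0.60},
}


def distance(a: Profile, b: Profile) -> float:
    """Euclidean distance between two profiles with the same feature keys."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def most_similar(target: Profile, cohort: Dict[str, Profile], k: int = 2) -> List[str]:
    """Return the names of the k cohort members closest to the target."""
    ranked = sorted(cohort, key=lambda name: distance(target, cohort[name]))
    return ranked[:k]


new_patient = {"age": 0.62, "bmi": 0.42, "ldl": 0.66}

if __name__ == "__main__":
    print(most_similar(new_patient, COHORT))
```

Once the closest matches are known, their recorded treatments and outcomes could be reviewed as a starting point for the current patient.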

Health Data Analytics Institute CEO Nassib Chamoun said, “Advanced statistical techniques are used to determine the prospective health risk profile, and the clinician can then assess what types of treatments have worked for these types of patients in the past and make more informed decisions on care for the current patient.” Predictive analytics tools can help predict various treatment approaches’ costs and clinical outcomes.

Predictive analytics tools work with digital twins to generate different UI experiences that surface important insights. For example, HDAI has developed custom views for clinicians, patients, and population health managers. The clinician views are embedded into EHRs, while the population and patient views are embedded into various apps.


Medical imaging

In medicine, it’s often more important to highlight salient medical details than to simply display realistic ray-traced imagery. For example, better insight can help physicians improve their use of medical imaging to make essential decisions on factors such as implant size and positioning. FEops CEO Matthieu De Beule said, “This is not always straightforward, since it can often become challenging to imagine how devices will interact with different patients.”

Regulatory-certified medical digital twins of organs can improve surgical planning and guidance. For example, FEops has developed a regulatory-cleared heart simulation to reduce procedure time and radiation exposure. Major heart valve manufacturers also use it for next-generation implant development.

De Beule said his company is working with big medical imaging players like GE, Philips, and Siemens. The FEops HEARTguide product uses AI to calibrate the raw imaging data to the patient’s unique anatomy and physiology. This helps accentuate the landmarks that guide doctors during surgery for appropriate device placement.

