In your pioneering work, how have you managed to seamlessly integrate EHR data and advanced data science techniques to develop predictive models that personalize treatment strategies for patients with complex health conditions?
Seamlessly? Alas, not quite. But here’s where we moved the needle:
a. Liberating EHR data for genomics and precision medicine research: Back in the early 2000s, it became clear to us that despite all of its warts, EHR data was an untapped resource. The narrative text could be used for phenotyping patients at an unprecedented scale using natural language processing (NLP). Without NLP, the cost of clinical characterization was far higher than the rapidly decreasing cost of genotyping. So, we developed a suite of tools, funded by the NIH, called i2b2 (Informatics for Integrating Biology and the Bedside). This free and open-source toolkit enabled thousands of studies, including pioneering work in pharmacovigilance, genomic studies in underrepresented populations, and COVID-related therapeutic studies.
b. Providing programmatic access to EHR data: We developed the SMART-on-FHIR API so that a broader developer community could innovate on EHR platforms. It has been deployed on most leading EHR vendor platforms, allowing third-party functionality to be added to them. Notably, Apple adopted the API, enabling over 800 hospitals to make data accessible to patients, furthering patient-facing healthcare support. The 21st Century Cures Act highlighted this as an example of democratizing patient data.
2. Data Science & Patient Engagement
Your contributions have significantly enhanced patient engagement through data science. What innovative data science techniques have proven most effective in improving patient adherence to personalized treatment plans?
No single technique is sufficient to improve patient adherence, as patients are autonomous and won’t respond uniformly to any directive. However, giving patients direct feedback on their progress through a continuous communication loop with clinicians (even an AI-augmented one) is the most reliable method to improve adherence. This feedback loop requires a combination of technology and human involvement.
3. Scalability of Precision Medicine Solutions
Given your contributions to precision medicine at scale, what challenges have you encountered in standardizing and implementing EHR-based personalized treatment solutions across diverse healthcare systems, and how have you addressed them?
The largest challenges have been around data sharing. There are many barriers to effectively sharing patient data for clinical care and research. While technical solutions (like SMART-on-FHIR, the SHRINE system, and Indivo/PING) exist, concerns around safety, privacy, and compliance remain. However, legislation such as the 21st Century Cures Act and the ONC’s interoperability initiatives are making slow but steady progress in breaking down information-blocking practices.
4. Evolution of Data Science and Machine Learning in Precision Medicine
How do you see data science and machine learning evolving in precision medicine, and what specific impacts do you anticipate on clinical practice over the next five years?
The combination of “liberated” multimodal clinical data and increasingly powerful multimodal AI will transform precision medicine faster and more dramatically than the Human Genome Project did. Within five years, I expect AI-augmented healthcare support services staffed by paraprofessionals, with intermittent involvement from costly hospital-based resources. This infrastructure will also revolutionize clinical trials, patient recruitment, and therapeutic development.