Figure 1: The gap between data collection and model deployment
The time between data collection and final model deployment can be significant, ranging from weeks and months to years. MLOps helps close this gap in several ways:

- Through self-service data and development environments
- Through enterprise model registries and monitoring operations that also improve auditability
- By adopting robust data and model governance best practices
- Through feature/code containerization and automated model training, evaluation, versioning, and deployment steps (see the sketch after this list)
- By incorporating continuous integration (CI), continuous delivery (CD), and continuous monitoring (CM) best practices
- By responding to business opportunities and changes quickly and incorporating product enhancements on a regular basis
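As a concrete illustration of the automation and CI points above, here is a minimal sketch of a training job that evaluates a freshly trained model and blocks promotion when it misses a quality bar. The dataset, model, `ACCURACY_GATE` threshold, and the promotion step are illustrative stand-ins, not a specific platform's API.

```python
"""Minimal sketch of an automated train-evaluate-deploy gate,
assuming a scikit-learn style workflow with illustrative names."""
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ACCURACY_GATE = 0.90  # hypothetical quality bar enforced by the CI job

def run_pipeline() -> None:
    # Stand-in for automated data extraction and processing.
    X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Automated training step.
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Automated evaluation step acting as a deployment gate.
    accuracy = accuracy_score(y_test, model.predict(X_test))
    if accuracy >= ACCURACY_GATE:
        print(f"accuracy={accuracy:.3f}; promoting model to deployment")
    else:
        raise SystemExit(f"accuracy={accuracy:.3f} below gate; blocking deploy")

if __name__ == "__main__":
    run_pipeline()
```

In a CI/CD setup, a job like this would run on every change to the feature or training code, so a regression is caught before it reaches production.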
| Principles | Key steps |
| --- | --- |
| **Integration of code.** MLOps integrates and versions every piece of code to improve its accessibility, traceability, and accuracy. This makes it possible to share and reproduce the code base across projects and teams. | ■ Automated data extraction and processing<br>■ Data validation checks<br>■ Feature engineering on processed data<br>■ Automated monitoring |
| **Scalability.** MLOps enables organizations to scale their ML initiatives by providing a model registry to store and version trained ML models. This greatly simplifies tracking models as they move through every stage of the ML lifecycle. | ■ Creation of a model registry (to capture metadata, metrics, inputs, etc.)<br>■ Continuous versioning of model and data artifacts<br>■ Seamless tracking of artifacts for quick debugging, better traceability, and auditability |
| **Continuous monitoring and continuous training.** MLOps frameworks are powered by robust monitoring techniques that enable the continuous learning, training, and retraining of ML models. | ■ Model performance monitoring to identify model and data drift<br>■ Monitoring of model latency, system metrics, and operations of ML pipelines<br>■ Measurement of business impact based on predefined use cases and KPIs |
| **Operationalizing.** MLOps orchestrates all the steps of a data pipeline, including batch or real-time processing and end-to-end automation. | ■ Automated triggers for model training and retraining alerts<br>■ Access to a centralized dashboard for seamless tracking |

The short sketches that follow illustrate each of these principles in turn.
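First, integration of code: a minimal data validation check of the kind that would run between automated extraction and feature engineering. The column names and value bounds are hypothetical.

```python
"""Minimal sketch of a data validation step in a pipeline,
assuming a pandas DataFrame input with illustrative columns."""
import pandas as pd

def validate(df: pd.DataFrame) -> None:
    # Schema check: required columns must be present.
    required = {"user_id", "amount", "event_time"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Integrity checks: no nulls in keys, values within expected bounds.
    if df["user_id"].isna().any():
        raise ValueError("null user_id found")
    if not df["amount"].between(0, 1e6).all():
        raise ValueError("amount outside expected range [0, 1e6]")

df = pd.DataFrame({"user_id": [1, 2], "amount": [9.5, 42.0],
                   "event_time": pd.to_datetime(["2024-01-01", "2024-01-02"])})
validate(df)  # raises on failure, passes silently here
```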
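Next, scalability via a model registry. This is a minimal in-memory sketch of registering and versioning models with their metadata and metrics; a production system would typically use a persistent, managed registry instead.

```python
"""Minimal sketch of a model registry capturing metadata, metrics,
and versions; names and fields are illustrative."""
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class RegistryEntry:
    name: str
    version: int
    metrics: dict
    params: dict
    created_at: str

class ModelRegistry:
    def __init__(self) -> None:
        self._entries: dict[str, list[RegistryEntry]] = {}

    def register(self, name: str, metrics: dict, params: dict) -> RegistryEntry:
        # Versions increment monotonically per model name for traceability.
        versions = self._entries.setdefault(name, [])
        entry = RegistryEntry(name, len(versions) + 1, metrics, params,
                              datetime.now(timezone.utc).isoformat())
        versions.append(entry)
        return entry

    def latest(self, name: str) -> RegistryEntry:
        return self._entries[name][-1]

registry = ModelRegistry()
registry.register("churn-model", {"auc": 0.91}, {"n_estimators": 200})
print(registry.latest("churn-model"))
```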
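For continuous monitoring, a common way to flag data drift is a two-sample statistical test between a training-time reference window and a live serving window. This sketch uses SciPy's Kolmogorov-Smirnov test on a single feature; the synthetic data and the significance threshold are illustrative choices.

```python
"""Minimal sketch of data-drift detection with a two-sample
Kolmogorov-Smirnov test; windows here are synthetic."""
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # reference window
live_feature = rng.normal(loc=0.4, scale=1.0, size=5_000)      # serving window

statistic, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.05:  # illustrative alert threshold
    print(f"drift alert: KS={statistic:.3f}, p={p_value:.2e}")
```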
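Finally, operationalizing: an automated trigger that starts the retraining pipeline when a monitored drift score breaches its threshold. `run_training_pipeline()` and `notify()` are hypothetical hooks into an orchestrator and an alerting channel, not a specific tool's API.

```python
"""Minimal sketch of an automated retraining trigger wired to a
monitoring metric; all hooks and thresholds are illustrative."""

DRIFT_THRESHOLD = 0.2  # illustrative bound on the monitored drift score

def run_training_pipeline() -> None:
    print("retraining pipeline started")  # placeholder for a real pipeline run

def notify(message: str) -> None:
    print(f"alert: {message}")  # placeholder for paging/dashboard update

def on_metrics(drift_score: float) -> None:
    # Called on each monitoring cycle, e.g. by a scheduler.
    if drift_score > DRIFT_THRESHOLD:
        notify(f"drift score {drift_score:.2f} exceeded {DRIFT_THRESHOLD}")
        run_training_pipeline()

on_metrics(0.31)  # simulated breach triggers retraining
```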