Validation and Qualification Approaches to AI/ML – 4 Critical Areas of Consideration When Applying CSV Principles

September 27, 2023
7:47 am
AI algorithm qualification, data governance, explainable AI, quality risk management

Artificial Intelligence (AI) and Machine Learning (ML) have the potential to bring significant improvements to the drug manufacturing process.

AI/ML can optimize existing drug manufacturing processes to maximize efficiency and minimise waste. Continuous, real-time sensor data enables manufacturers to detect changes or deviations during the manufacturing process that signal the need for equipment maintenance. AI can also monitor product quality.

What is Artificial Intelligence and Machine Learning?

Artificial Intelligence (AI) and Machine Learning (ML) can be described as a branch of computer science, statistics, and engineering that uses algorithms or models to perform tasks and exhibit behaviours such as learning, making decisions, and making predictions. ML is considered a subset of AI that allows models to be developed by training algorithms through analysis of data, without models being explicitly programmed [1].

What is the FDA’s Perspective on the Use of AI/ML in Drug Development?

FDA’s Center for Drug Evaluation and Research (CDER), in collaboration with the Center for Biologics Evaluation and Research (CBER) and the Center for Devices and Radiological Health (CDRH), has issued an initial discussion paper to communicate with a range of stakeholders and to explore relevant considerations for the use of AI/ML in the development of drugs and biological products. The agency will continue to solicit feedback as it advances regulatory science in this area.

AI/ML will undoubtedly play a critical role in drug development, and the FDA plans to develop and adopt a flexible risk-based regulatory framework that promotes innovation and protects patient safety [1]. Additionally, in 2021, FDA published an action plan for AI/ML-based Software as a Medical Device (SaMD) [2].

While many CSV principles apply to AI and ML systems, there are 4 critical areas of consideration that need to be addressed.

4 Critical Areas of Consideration When Applying CSV Principles

1. Risk-Based Approach:

As with traditional computer systems, a risk-based approach is crucial. However, AI and ML systems may introduce novel risks, such as biased decisions or lack of interpretability. These risks need to be thoroughly evaluated and mitigated.

There is currently no formal FDA guidance on the AI/ML approach in the development of drug and biological products. However, the FDA released two discussion papers earlier in 2023 to seek industry insight into potential new applications, obstacles to implementation, essential data management steps, and any other input that could prove valuable as the agency refines its guidance for the forthcoming GMP AI/ML revolution [3].

In theory, we can use existing guidance and tools for computer system validation (CSV). ISPE GAMP 5 (2nd Edition), Appendix D11 provides an overview of a risk-based, compliant AI/ML life cycle framework that adopts a three-phase approach: concept phase, project phase and operation phase [4]. Please refer to the graphic below:

A robust framework for quality risk management (QRM) is crucial at this stage. This is particularly important given that the expected performance criteria of AI/ML systems can vary significantly based on patient proximity, ranging from upstream process control, which generally presents lower risks, to downstream QA decision-making, associated with higher risks [5].
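One way to make this concrete is a failure mode and effects analysis (FMEA) style score, a common QRM tool. The sketch below is illustrative only: the 1 to 5 scales, the two example use cases, and the classification thresholds are assumptions made for this example, not values taken from GAMP 5 or ICH Q9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AiUseCase:
    name: str
    severity: int       # 1 (negligible patient impact) .. 5 (direct patient impact)
    occurrence: int     # 1 (rare failure) .. 5 (frequent failure)
    detectability: int  # 1 (failure always caught) .. 5 (failure rarely caught)

    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detectability."""
        for score in (self.severity, self.occurrence, self.detectability):
            if not 1 <= score <= 5:
                raise ValueError("scores must be between 1 and 5")
        return self.severity * self.occurrence * self.detectability


def risk_class(rpn: int) -> str:
    # Hypothetical thresholds; a real QRM procedure would define its own.
    if rpn >= 45:
        return "high"
    if rpn >= 15:
        return "medium"
    return "low"


# Hypothetical use cases at different distances from the patient.
use_cases = [
    AiUseCase("upstream process control advisory", severity=2, occurrence=3, detectability=2),
    AiUseCase("downstream QA release decision", severity=5, occurrence=3, detectability=4),
]
for uc in use_cases:
    print(f"{uc.name}: RPN={uc.rpn()} -> {risk_class(uc.rpn())}")
```

In this toy scoring, the downstream QA decision lands in the high class mainly because of its severity score, which mirrors the patient-proximity argument above.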

2. Data Governance – Data Validation, Quality and Integrity:

Data validation, quality and integrity are paramount for AI and ML. Ensuring the integrity, accuracy, and reliability of training and validation data is vital, as the performance of these systems depends heavily on the data they learn from. The integrity of the training data has to be verified to ensure that the data is representative of the real-world scenario, free from biases, errors, or inconsistencies, and suitable for the intended purpose of the model.

A robust data governance initiative is a prerequisite for any industry implementing AI/ML applications. The PIC/S guide emphasises the crucial role of data governance in ensuring data integrity and quality throughout pharmaceutical manufacturing processes, highlighting the need for comprehensive policies, procedures, and oversight mechanisms [6].

Below is a refresher on 3 concepts in the field of data management that are related but have distinct meanings and purposes.

We would like to emphasise that these 3 processes or concepts are crucial in developing accurate and reliable ML models:

Data Validation:

Definition: Data validation refers to the process of checking data for accuracy, completeness, and adherence to predefined rules or standards.

Purpose: The primary goal of data validation is to ensure that the data entered into a system or collected from external sources meets specific criteria or requirements. This helps in preventing erroneous or incomplete data from being used or stored.
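As a minimal sketch of what such rule-based checking could look like, the example below validates a record for completeness and adherence to predefined rules. The field names (`batch_id`, `temperature_c`, `ph`) and their limits are hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical rules for a batch record; field names and limits are assumptions.
RULES = {
    "batch_id": lambda v: isinstance(v, str) and v.strip() != "",
    "temperature_c": lambda v: isinstance(v, (int, float)) and 2.0 <= v <= 8.0,
    "ph": lambda v: isinstance(v, (int, float)) and 0.0 <= v <= 14.0,
}


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of rule violations; an empty list means the record passed."""
    problems = []
    for field, rule in RULES.items():
        if field not in record or record[field] is None:
            problems.append(f"{field}: missing")      # completeness check
        elif not rule(record[field]):
            problems.append(f"{field}: out of rule")  # adherence-to-rule check
    return problems


good = {"batch_id": "B-001", "temperature_c": 5.1, "ph": 7.2}
bad = {"batch_id": "", "temperature_c": 12.4}
print(validate_record(good))
print(validate_record(bad))
```

Records that fail would typically be rejected or quarantined at the point of entry, before they reach a training data set.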

Data Quality:

Definition: Data quality refers to the overall reliability, accuracy, and suitability of data for its intended use.

Purpose: The aim of data quality management is to maintain high-quality data throughout its lifecycle. This involves activities such as data cleansing, normalization, and monitoring to ensure that data remains accurate, consistent, and reliable.

Data Integrity:

Definition: Data integrity refers to the accuracy and consistency of data in storage, processing, and retrieval.

Purpose: Data integrity ensures that data remains unchanged and reliable over time. It involves safeguards against unauthorized or accidental alterations, deletions, or corruptions of data.
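One simple safeguard of this kind is a cryptographic fingerprint of an approved data set, recomputed later to detect any alteration. The sketch below assumes the records are JSON-serialisable dictionaries and uses hypothetical field names. It detects changes but does not prevent them, so it would complement access controls and audit trails, not replace them.

```python
import hashlib
import json


def fingerprint(records: list[dict]) -> str:
    """SHA-256 digest of a canonical JSON serialisation of the records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical training data; the digest is stored when the data set is approved.
training_data = [{"batch_id": "B-001", "ph": 7.2}, {"batch_id": "B-002", "ph": 7.4}]
approved_digest = fingerprint(training_data)

# Later, before retraining or during an audit, recompute and compare.
tampered = [{"batch_id": "B-001", "ph": 7.2}, {"batch_id": "B-002", "ph": 7.9}]
print(fingerprint(training_data) == approved_digest)  # unchanged data matches
print(fingerprint(tampered) == approved_digest)       # altered data does not
```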

While they are related, these concepts have distinct emphases:

Data Validation primarily focuses on the correctness and adherence of data to predefined criteria at the point of entry or acquisition.

Data Quality encompasses a broader set of activities aimed at maintaining high-quality data throughout its lifecycle, addressing issues beyond initial validation.

Data Integrity emphasizes the protection of data from unauthorized or accidental alterations, ensuring that data remains reliable and unchanged.

In summary, data validation is one aspect of data quality, and both contribute to maintaining data integrity. Together, they ensure that data is accurate, reliable, and suitable for its intended purpose.

3. Algorithm Qualification:

AI and ML systems learn from data and evolve over time. Traditional CSV may involve validating software code, whereas for AI and ML, the focus shifts towards qualifying the algorithms’ behaviour, training data, and model outputs.

Hence, there are several challenges associated with qualifying AI/ML algorithms. One challenge is ensuring that the algorithm is trained on adequate and accurate data. Machine learning algorithms rely on large amounts of data to learn, but it can be challenging to obtain sufficient data that accurately represents the real-world scenarios that the system will encounter. Another challenge is testing for bias, which requires a thorough understanding of the training data and the potential sources of bias [7].
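One basic bias test is to compare model accuracy across subgroups of held-out data. The sketch below is a simplified illustration: the grouping by manufacturing site, the sample results, and the 0.10 gap threshold are all assumptions, and a real qualification would choose groups and metrics based on the intended use.

```python
from collections import defaultdict


def accuracy_by_group(rows):
    """rows: iterable of (group, true_label, predicted_label) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, truth, predicted in rows:
        totals[group] += 1
        hits[group] += int(truth == predicted)
    return {g: hits[g] / totals[g] for g in totals}


def flag_bias(per_group, max_gap=0.10):
    """Flag when the best and worst group accuracy differ by more than max_gap."""
    gap = max(per_group.values()) - min(per_group.values())
    return gap > max_gap, gap


# Hypothetical held-out results, grouped by manufacturing site.
results = [
    ("site_A", 1, 1), ("site_A", 0, 0), ("site_A", 1, 1), ("site_A", 0, 0),
    ("site_B", 1, 0), ("site_B", 0, 0), ("site_B", 1, 1), ("site_B", 0, 1),
]
per_group = accuracy_by_group(results)
print(per_group, flag_bias(per_group))
```

A large gap does not prove bias on its own, but it points to groups where the training data may be thin or unrepresentative.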

By thoroughly testing and qualifying these algorithms, we can have greater confidence in their ability to produce accurate and reliable results, and ensure they’re fit for purpose.

It’s been proposed that qualification of the AI algorithm is a prerequisite before a machine learning model can even be validated. Qualified algorithms enable validation of machine learning models that can be used for process optimization.

The Parenteral Drug Association has published an AI Algorithm Qualification study. The goal of this research initiative was to provide a methodology, based on a design of experiment (DoE) approach, to agnostically qualify the Isolation Forest algorithm [8].

Finally, Algorithm Qualification ensures that risks in a GxP environment align with the algorithm’s intended performance. Compliance with these requirements during regulatory inspections for AI/ML often hinges on effective risk management.
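For readers unfamiliar with Isolation Forest, the sketch below shows the core idea in plain Python: anomalies are easier to isolate with random splits, so they sit at shorter average depths in random trees. This is a bare-bones teaching illustration with assumed defaults and synthetic data. It is not the DoE qualification methodology from the study, and a real project would more likely use a maintained implementation such as scikit-learn's `IsolationForest`.

```python
import math
import random


def build_tree(points, depth, max_depth, rng):
    """Split on a random feature and threshold until points are isolated."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def c(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(point, node, depth=0):
    if "feature" not in node:
        # Leaf: add the expected extra depth for any points left unseparated.
        return depth + c(node["size"])
    branch = "left" if point[node["feature"]] < node["split"] else "right"
    return path_length(point, node[branch], depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Scores in (0, 1]; values closer to 1 indicate more anomalous points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return [
        2 ** (-sum(path_length(p, t) for t in trees) / n_trees / c(sample_size))
        for p in points
    ]


# Synthetic data: 60 points around the origin plus one obvious outlier.
data_rng = random.Random(1)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
points = normal + [(8.0, 8.0)]
scores = anomaly_scores(points)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 3))
```

Because results depend on the number of trees, the sample size, and the random seed, these are natural candidate factors for a structured qualification exercise.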

4. Interpretability and Explainability:

AI and ML models often lack transparency, making it challenging to understand their decision-making process. Ensuring appropriate levels of interpretability and explainability is crucial, especially in regulated industries, such as Life Sciences.

Explainable AI (XAI) is crucial in healthcare AI to build trust and ensure safety. Complex models like deep neural networks can deliver state-of-the-art performance but act as “black boxes”, making it hard to understand their reasoning. This presents challenges for domains like medical diagnosis where stakeholders need to understand the basis of model predictions. Emerging methods aim to strike a balance between performance and interpretability.

Interpretability also builds trust and transparency. Understanding how a model arrives at its predictions or decisions gives stakeholders confidence in the system’s output and helps establish transparency. The Life Sciences have strict regulations regarding the use of AI/ML algorithms, and interpretability is often a requirement for compliance with them.

In critical applications, such as manufacturing a drug, it’s also important to be able to explain and justify the decisions made by AI/ML algorithms to ensure ethical practices.
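Permutation importance is one widely used model-agnostic explanation technique: shuffle one input feature at a time and measure how much the prediction error grows. The sketch below is a minimal illustration with a hypothetical toy model and synthetic data. It is not a method prescribed by the sources cited here.

```python
import random


def mean_abs_error(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, seed=0):
    """Increase in error when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only feature j replaced by its shuffled value.
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_abs_error(model, shuffled, targets) - baseline)
    return importances


# Hypothetical "black box": depends on feature 0 only and ignores feature 1.
def model(row):
    return 2.0 * row[0]


rows = [(float(i), float(i % 3)) for i in range(10)]
targets = [2.0 * r[0] for r in rows]
importances = permutation_importance(model, rows, targets)
print(importances)
```

Shuffling the ignored feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on increases the error.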

Finally, it’s important to highlight that explainable AI helps us understand why a model made a specific prediction based on the data it was trained with, but it adds limited value in managing the risks of AI/ML applications. The reason is that it explains why the model inferred what it did; it does not tell you how the model will infer incorrectly when given data that are not covered by the training or test data. This additional risk of AI/ML applications, compared with rule-based applications, cannot be mitigated by XAI [9].
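To illustrate the coverage gap described above, the sketch below flags inputs that fall outside the range seen in the training data. This is a hypothetical complementary control, not something proposed in the cited white paper, and the feature names and values are invented. A per-feature range check is deliberately simple and would miss unusual combinations of values that are individually in range.

```python
def fit_envelope(training_rows):
    """Per-feature (min, max) observed in the training data."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]


def outside_envelope(row, envelope, margin=0.0):
    """Indices of features whose value falls outside the training range."""
    flagged = []
    for j, (value, (lo, hi)) in enumerate(zip(row, envelope)):
        span = hi - lo
        if value < lo - margin * span or value > hi + margin * span:
            flagged.append(j)
    return flagged


# Hypothetical training data: (temperature in deg C, pH).
training = [(20.0, 6.8), (22.5, 7.0), (25.0, 7.4)]
envelope = fit_envelope(training)
print(outside_envelope((23.0, 7.1), envelope))  # inside the training range
print(outside_envelope((31.0, 7.1), envelope))  # temperature never seen in training
```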

Takeaway

The integration of AI/ML in the Life Sciences sector is already well underway, with at least 50 pharmaceutical and biotechnology companies actively leveraging these technologies, either internally or through outsourcing arrangements [10]. This signifies the expansive potential that AI/ML holds within the industry. As these technologies continue to evolve, their impact on drug discovery, continuous improvement, clinical research, and healthcare management is poised to be transformative, minimising costs and accelerating timelines.

With some revisions to our quality risk management framework and data governance policies, the use of key standards, and digital education and tools for our workforce, we take a significant step in the right direction and move closer to becoming a digitally mature healthcare facility.

References:

[1] Artificial Intelligence and Machine Learning (AI/ML) for Drug Development

https://www.fda.gov/science-research/science-and-research-special-topics/artificial-intelligence-and-machine-learning-aiml-drug-development

[2] Artificial Intelligence and Machine Learning in Software as a Medical Device

https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device

[3] Using Artificial Intelligence and Machine Learning in the Development of Drug and Biological Products; Availability. Federal Register. Published May 11, 2023

https://www.federalregister.gov/documents/2023/05/11/2023-09985/using-artificial-intelligence-and-machine-learning-in-the-development-of-drug-and-biological

[4] ISPE GAMP 5: A Risk-Based Approach to Compliant GxP Computerized Systems, Second Edition

[5] EMA. ICH Q9 Quality risk management – Scientific guideline. European Medicines Agency

https://www.ema.europa.eu/en/ich-q9-quality-risk-management-scientific-guideline

[6] Pharmaceutical Inspection Convention Pharmaceutical Inspection Co-Operation Scheme. Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments. Published online July 1, 2021

https://picscheme.org/docview/4234

[7] A Complete Guide to Testing AI and ML Applications

https://www.qed42.com/insights/perspectives/biztech/complete-guide-testing-ai-and-ml-applications

[8] T. Manzano, “Qualifying AI Algorithms in Pharmaceutical Manufacturing,” BioPharm International 35 (1) 30–31,37 (2022)

[9] Strategy for the GxP-validation of AI/Machine Learning solutions

https://www.inviteresearch.com/fileadmin/user_upload/Research/Publications/2019a/White_Paper_Strategy_for_the_GxP_validation_of_AI_Machine_Learning_solutions.pdf

[10] Pharma AI Readiness Index: Who’s Best Positioned for the AI Boom?

https://www.cbinsights.com/research/ai-readiness-index-pharma/