21 Regulation & Ethics
Clinical Context: Your team has developed an AI system that detects diabetic retinopathy from fundus photographs with performance matching expert ophthalmologists. The algorithm works. But can you actually deploy it? The answer depends on navigating a complex regulatory landscape—FDA clearance in the US, CE marking in Europe—and addressing ethical questions that no algorithm can answer. This chapter covers the regulatory and ethical frameworks that govern medical AI.
Building a medical AI system that works technically is only part of the challenge. Deploying it clinically requires regulatory approval, and using it responsibly demands ethical frameworks that go beyond mere compliance. Medical AI operates at the intersection of healthcare regulation, software development, and emerging AI governance—a landscape that is complex, evolving, and critically important.
This chapter provides a practical guide to regulation and ethics for medical AI. We cover FDA pathways and European requirements in enough detail to understand what’s required and which path applies. We explore ethical frameworks through case studies that illustrate real dilemmas. And we discuss responsible AI development practices that should guide your work regardless of regulatory requirements.
21.1 Why Regulation and Ethics Matter
21.1.1 The Stakes
Medical AI systems make decisions that affect patient health and lives. The potential for harm is real:
- A false negative in cancer screening means delayed diagnosis
- A biased algorithm may systematically underserve certain populations
- Over-reliance on AI predictions can lead clinicians to miss what the algorithm misses
- Deployment failures can affect thousands of patients simultaneously
Unlike a drug that affects one patient at a time, a flawed AI system deployed across a health network can cause widespread harm rapidly. This scale of potential impact justifies careful regulatory oversight.
21.1.2 Beyond Compliance
Regulation sets a floor, not a ceiling. FDA clearance means a device is safe and effective enough to market—not that it’s optimal, fair, or appropriate for every use case. Ethical practice goes beyond regulatory compliance to consider:
- Who benefits and who bears risk from this technology?
- Are we being transparent about capabilities and limitations?
- How do we handle uncertainty and errors?
- What happens to patients who can’t access AI-enhanced care?
The most successful medical AI teams embed ethical thinking throughout development, not as a checkbox at the end.
21.1.3 The Evolving Landscape
Medical AI regulation is developing rapidly. The FDA has cleared over 500 AI/ML-enabled medical devices, with the pace accelerating each year (Muehlematter, Daniore, and Vokinger 2021). Europe’s Medical Device Regulation (MDR) took full effect in 2021, significantly increasing requirements. The EU AI Act introduces new risk-based frameworks. National strategies proliferate.
What’s required today may differ from what’s required when you deploy. Staying current with regulatory developments is part of responsible practice.
21.2 FDA Regulation of AI/ML Medical Devices
Clinical Context: You’ve developed an AI algorithm that analyzes chest X-rays to detect pneumonia. Before any US hospital can use it clinically, you need FDA clearance. Understanding the pathways—and which one applies to your device—is essential.
21.2.1 Software as a Medical Device (SaMD)
The FDA regulates AI/ML systems intended for medical purposes as Software as a Medical Device (SaMD). The key word is “intended”—the same algorithm might be a medical device or not, depending on its intended use.
Is a medical device (requires FDA oversight):
- AI that diagnoses disease from medical images
- Algorithm that recommends treatment based on patient data
- Software that interprets ECGs for arrhythmia detection

Not a medical device (no FDA oversight):
- General health tracking apps (steps, sleep)
- Electronic health record systems (data storage, not decision-making)
- Administrative tools (scheduling, billing)
The FDA uses a framework from the International Medical Device Regulators Forum (IMDRF) that considers:
- Healthcare situation: How critical is the clinical context?
- Significance of information: Does the output drive or inform decisions?
Higher risk = more regulatory scrutiny.
21.2.2 Exercise: Classify These Systems
Before continuing, test your understanding. For each system below, determine: Is it SaMD? If so, what risk class?
AI scheduling tool that optimizes MRI appointment slots based on predicted no-show rates
→ Not SaMD (administrative function, no clinical decision-making)
Chest X-ray triage system that flags “critical findings” for priority radiologist review
→ SaMD, Class II (aids clinical decision, moderate risk—radiologist makes final call)
Diabetic retinopathy screener that provides autonomous diagnosis without physician review
→ SaMD (autonomous diagnosis carries higher risk, but IDx-DR was classified Class II with special controls via the De Novo pathway—see the example below)
EHR auto-complete that suggests billing codes based on clinical notes
→ Likely not SaMD (administrative) unless codes directly affect treatment decisions
Medication dosing calculator that recommends insulin doses based on glucose readings
→ SaMD, Class II or III (directly informs treatment, patient harm possible from errors)
The key question: Does the output drive or inform clinical decisions? If yes, it’s likely SaMD. How much harm could result from an error? That determines the class.
21.2.3 Risk Classification
Medical devices are classified by risk level:
Class I (Low Risk): General controls sufficient. Examples: tongue depressors, bandages. Most exempt from premarket review.
Class II (Moderate Risk): General controls plus special controls. Examples: powered wheelchairs, pregnancy tests. Require 510(k) clearance.
Class III (High Risk): General and special controls insufficient. Examples: pacemakers, high-risk diagnostics. Require Premarket Approval (PMA).
Most AI/ML medical devices are Class II, cleared through the 510(k) pathway.
21.2.4 The 510(k) Pathway
The 510(k) process requires demonstrating substantial equivalence to a legally marketed predicate device. Your device must have:
- Same intended use as the predicate
- Same technological characteristics OR different characteristics that don’t raise new safety/effectiveness questions
What’s required:
- Device description and intended use
- Comparison to predicate device
- Performance data (often clinical validation studies)
- Software documentation
- Labeling
Timeline: FDA goal is 90 days for review, but total process typically takes 6-12 months including preparation.
Example: If there’s an existing FDA-cleared AI system for detecting diabetic retinopathy, a new system with the same intended use can use 510(k) by demonstrating equivalent or better performance.
21.2.5 The De Novo Pathway
What if there’s no predicate device? The De Novo pathway is for novel devices that are low-to-moderate risk but have no substantially equivalent predicate.
When to use De Novo:
- Novel AI application with no existing cleared devices
- Low-to-moderate risk (not Class III)
- Want to establish a new device type

What’s required:
- More extensive than 510(k)
- Risk analysis and mitigation
- Clinical performance data
- Typically 150+ day review
Landmark example: IDx-DR (now Digital Diagnostics) used De Novo for the first autonomous AI diagnostic system—diabetic retinopathy detection that provides a diagnosis without physician review. FDA cleared it in 2018, creating a new regulatory pathway for autonomous AI diagnostics.
Once a De Novo device is cleared, it can serve as a predicate for future 510(k) submissions.
21.2.6 Premarket Approval (PMA)
PMA is required for high-risk (Class III) devices. It’s the most rigorous pathway:
- Requires clinical trials demonstrating safety and effectiveness
- FDA reviews scientific evidence, not just substantial equivalence
- Typically takes 1-3 years and costs millions of dollars
Most AI/ML medical devices avoid PMA by careful intended use statements that position them as Class II. An AI that “assists diagnosis” (Class II) faces different requirements than one that “diagnoses autonomously” (potentially Class III).
21.2.7 Predetermined Change Control Plans (PCCPs)
AI/ML systems learn and improve. Traditional regulation assumes devices are fixed at approval. The FDA’s Predetermined Change Control Plan framework addresses this tension.
A PCCP submitted with your initial application describes:
- What changes you anticipate (retraining on new data, performance improvements)
- How you’ll validate changes (testing protocols, performance thresholds)
- When FDA notification is required vs. changes you can make under the plan
This allows continuous improvement without repeated submissions for every model update—if changes stay within the predetermined plan.
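To make this concrete, a PCCP's acceptance criteria can be expressed as an automated gate that any retrained model must pass before release under the plan. The sketch below is illustrative only—the metric names and threshold values are hypothetical assumptions, not values from FDA guidance:

```python
# A minimal sketch of encoding PCCP acceptance criteria as an automated gate.
# All threshold values and metric names are hypothetical, not from FDA guidance.

PCCP_CRITERIA = {
    "min_auc": 0.93,               # must not fall below the cleared performance
    "min_sensitivity": 0.90,
    "min_specificity": 0.85,
    "max_subgroup_auc_gap": 0.05,  # largest tolerated AUC gap between subgroups
}

def update_allowed_under_pccp(metrics: dict) -> bool:
    """Return True if a retrained model's validation metrics stay within the
    predetermined plan; otherwise FDA notification or a new submission is needed."""
    return (
        metrics["auc"] >= PCCP_CRITERIA["min_auc"]
        and metrics["sensitivity"] >= PCCP_CRITERIA["min_sensitivity"]
        and metrics["specificity"] >= PCCP_CRITERIA["min_specificity"]
        and metrics["subgroup_auc_gap"] <= PCCP_CRITERIA["max_subgroup_auc_gap"]
    )

# Example: a candidate retrained model that clears the gate
candidate = {"auc": 0.945, "sensitivity": 0.92, "specificity": 0.88,
             "subgroup_auc_gap": 0.03}
print(update_allowed_under_pccp(candidate))  # True
```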
21.2.8 Continuously Learning Systems
Fully adaptive systems that learn from real-world data during deployment pose harder challenges:
- How do you validate a system that’s constantly changing?
- What if it learns from biased data in deployment?
- How do you ensure it doesn’t degrade for subpopulations?
The FDA is still developing frameworks for truly continuously learning systems. Current guidance recommends “locked” algorithms for initial deployment, with controlled update processes.
21.2.9 Real-World Examples
IDx-DR (2018): First FDA-cleared autonomous AI diagnostic. De Novo pathway. Detects diabetic retinopathy from fundus photos without physician oversight. Required clinical trial data demonstrating sensitivity and specificity.
Caption Health (2020): AI guidance for cardiac ultrasound that enables novice users to capture diagnostic-quality images. De Novo pathway. Demonstrated that nurses with AI guidance could acquire images meeting cardiologist quality standards.
Viz.ai (2018): AI that analyzes CT scans for large vessel occlusion stroke and alerts specialists. De Novo pathway. Notable for being a triage/notification system rather than a diagnostic.
Paige Prostate (2021): AI for prostate cancer detection in pathology. First FDA-cleared AI for pathology cancer diagnosis. De Novo pathway.
21.2.10 What the FDA Expects
Based on FDA guidance, submissions for AI/ML devices should address:
Data management:
- Training data sources and characteristics
- Patient population represented
- Data quality and labeling process

Model development:
- Algorithm description
- Feature selection rationale
- Training and tuning methodology

Performance evaluation:
- Test data separate from training
- Performance metrics appropriate to clinical use
- Subgroup analyses (demographics, disease subtypes)—see the sketch after this list

Clinical validation:
- Standalone performance vs. clinical performance
- Comparison to current standard of care
- Reader studies if applicable (AI vs. experts, AI + experts)
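As a rough illustration of the performance-evaluation piece, the sketch below computes AUC with a percentile-bootstrap 95% confidence interval on a held-out test set and breaks performance out by subgroup. The function names and data handling are placeholders, not part of any FDA template:

```python
# Illustrative sketch of the kind of performance evidence a submission reports:
# AUC with a bootstrap 95% CI on a held-out test set, plus per-subgroup AUC.
# Variable names and data are placeholders, not from any real device.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05):
    """Percentile bootstrap CI for AUC on an independent test set."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    n = len(y_true)
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:   # need both classes in the resample
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), lo, hi

def subgroup_aucs(y_true, y_score, groups):
    """AUC computed separately for each demographic or clinical subgroup."""
    y_true, y_score, groups = map(np.asarray, (y_true, y_score, groups))
    return {g: roc_auc_score(y_true[groups == g], y_score[groups == g])
            for g in np.unique(groups)}
```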
21.3 European Regulation: MDR and CE Marking
Clinical Context: Your FDA-cleared AI device cannot be sold in Europe without separate approval. The EU Medical Device Regulation (MDR) has its own requirements—often more stringent than FDA—and the process differs significantly.
21.3.1 Medical Device Regulation (MDR) Overview
The EU Medical Device Regulation (2017/745), fully effective since May 2021, replaced the previous Medical Devices Directive. It significantly strengthened requirements for medical devices, including AI/ML software.
Key features:
- Stricter classification rules (many devices moved to higher classes)
- Enhanced clinical evidence requirements
- Ongoing post-market surveillance
- Unique Device Identification (UDI) requirements
- Greater scrutiny of high-risk devices
21.3.2 Classification Under MDR
MDR classifies medical devices into four classes: I, IIa, IIb, and III (increasing risk). Software classification follows Rule 11, which considers:
- Whether the software provides information for diagnosis or therapy
- The potential impact of that information on patient health
Rule 11 classifications:
| Software Purpose | Classification |
|---|---|
| Information for diagnosis/therapy decisions, could cause death or irreversible health deterioration | Class III |
| Could cause serious health deterioration or surgical intervention | Class IIb |
| All other diagnostic/therapeutic decision support | Class IIa |
| Other medical software | Class I |
Most diagnostic AI systems fall into Class IIa or IIb under MDR—often higher than under FDA rules.
21.3.3 CE Marking Process
To sell a medical device in the EU, you need CE marking, which indicates conformity with MDR requirements.
For Class I devices:
- Self-declaration of conformity
- No Notified Body involvement (except for sterile or measuring devices)

For Class IIa, IIb, and III devices:
- Conformity assessment by a Notified Body (independent organization designated by EU member states)
- Quality management system audit
- Technical documentation review
- Clinical evaluation assessment

What’s required in technical documentation:
- Device description and specifications
- Design and manufacturing information
- Risk management file
- Clinical evaluation report
- Post-market surveillance plan
21.3.4 Clinical Evidence Under MDR
MDR requires clinical evidence demonstrating safety and performance. This can come from:
- Clinical investigations (trials with your device)
- Clinical evaluation of literature and experience
- Equivalence to another device (very restricted under MDR)
For AI/ML devices, clinical evidence typically requires:
- Validation studies on representative populations
- Comparison to current clinical practice
- Analysis of subgroup performance
- Evidence of clinical utility (does it improve outcomes?)
MDR’s requirements are often more demanding than FDA’s, particularly regarding clinical evidence and the narrow conditions under which equivalence claims are accepted.
21.3.5 Notified Bodies
Notified Bodies are organizations authorized to conduct conformity assessments. After MDR implementation, many Notified Bodies left the market or weren’t reauthorized, creating bottlenecks.
Choosing a Notified Body:
- Must be designated for your device class and type
- Capacity and timeline vary significantly
- Costs vary (typically €20,000-100,000+ for initial assessment)
The limited number of Notified Bodies with AI/ML expertise can extend timelines.
21.3.6 GDPR Implications
The General Data Protection Regulation (GDPR) affects medical AI in several ways:
Training data:
- Patient data used for training requires a legal basis (consent or legitimate interest)
- Anonymization or pseudonymization requirements
- Data minimization principles

Automated decision-making (Article 22):
- Individuals have rights regarding solely automated decisions with significant effects
- May require human oversight of AI decisions
- Right to explanation of automated decisions

Cross-border data:
- Training data from the EU may have restrictions on transfer outside the EU
- Cloud processing must comply with data localization requirements
21.3.7 Key Differences from FDA
| Aspect | FDA | EU MDR |
|---|---|---|
| Primary pathway | 510(k) (substantial equivalence) | Conformity assessment (conformity to standards) |
| Equivalence claims | Common, well-established | Very restricted |
| Classification | Generally lower for AI/ML | Often higher (Rule 11) |
| Clinical evidence | Varies by pathway | Consistently required |
| Post-market | Adverse event reporting | Comprehensive surveillance system |
| Review body | Single agency (FDA) | Multiple Notified Bodies |
21.4 Other Regulatory Frameworks
21.4.1 UK MHRA
Post-Brexit, the UK operates an independent regulatory system under the UK Medical Devices Regulations 2002 (being updated). Currently:
- UKCA marking required (CE marking remains accepted during a transition period, currently extending to 2028 or 2030 depending on the device and the legislation it was certified under)
- Similar classification to EU MDR
- MHRA (Medicines and Healthcare products Regulatory Agency) oversight
The UK is developing AI-specific guidance and may diverge from EU approaches.
21.4.2 Health Canada
Health Canada regulates SaMD under the Medical Devices Regulations:
- Classification based on risk (Classes I-IV)
- Most AI/ML devices are Class II or III
- Requires clinical evidence
- Post-market surveillance requirements
Health Canada has issued AI/ML-specific guidance aligned with FDA/IMDRF frameworks.
21.4.3 International Harmonization
IMDRF (International Medical Device Regulators Forum) develops harmonized guidance adopted by multiple jurisdictions:
- SaMD definition and classification framework
- Software lifecycle management principles
- Clinical evaluation considerations
Following IMDRF guidance can ease multi-jurisdictional submissions.
21.5 Ethical Frameworks for Medical AI
Clinical Context: Your AI system is FDA-cleared and CE-marked. It’s legal to deploy. But is it ethical? Regulatory approval addresses safety and effectiveness—it doesn’t answer questions about fairness, autonomy, or justice. Ethical frameworks provide guidance for these harder questions.
21.5.1 Traditional Medical Ethics
Medical AI inherits the ethical principles of medicine itself:
Beneficence: Act in the patient’s best interest. AI should improve outcomes, not just be technically impressive.
Non-maleficence: Do no harm. Consider not just direct harms but also opportunity costs, psychological impacts, and systemic effects.
Autonomy: Respect patient self-determination. Patients should understand when AI is involved in their care and have meaningful choices.
Justice: Distribute benefits and burdens fairly. AI should not exacerbate health disparities or preferentially benefit the privileged.
These principles apply to AI just as they do to any medical intervention—but their application can be complex.
21.5.2 AI-Specific Ethical Principles
Several additional principles have emerged for AI systems:
Transparency: Stakeholders should understand how AI systems work, their capabilities, and their limitations. This includes:
- Disclosure that AI is being used
- Explanation of what the AI does
- Communication of uncertainty and error rates
Explainability: Decisions should be understandable. When AI recommends a diagnosis or treatment, there should be some basis for understanding why. (See Chapter 18 on Interpretability.)
Accountability: Clear responsibility for AI system outcomes. Who is accountable when AI errs—the developer, the deploying institution, the clinician who relied on it?
Human oversight: Maintaining appropriate human involvement in high-stakes decisions. AI should augment, not replace, human judgment in critical situations.
Privacy: Protecting patient data used to train and operate AI systems. This goes beyond legal compliance to consider patient expectations and trust.
21.5.3 Professional Guidelines
Medical professional organizations have issued AI guidance:
American Medical Association (AMA):
- Physicians must retain authority over patient care decisions
- AI should be rigorously validated
- Systems should be designed to enhance health equity

American College of Radiology (ACR):
- AI should be validated on diverse populations
- Clear labeling of AI involvement in interpretations
- Radiologist oversight of AI-assisted diagnoses

World Health Organization (WHO):
- AI should promote autonomy and protect human agency
- Systems should be designed for equity and inclusiveness
- Data protection and privacy must be maintained
These guidelines inform ethical practice but aren’t legally binding.
21.6 Ethical Dilemmas in Practice
Abstract principles become concrete when applied to real situations. The following case studies illustrate ethical challenges in medical AI.
21.6.1 Case 1: Algorithmic Triage
Scenario: An emergency department deploys an AI triage system that predicts patient acuity and recommends wait times. The system was trained on historical data and performs well on standard metrics. However, analysis reveals that it systematically assigns lower acuity scores to Black patients presenting with chest pain, reflecting biases in historical triage decisions.
The dilemma: The system improves average triage accuracy but perpetuates racial disparities. Removing it means worse triage overall; keeping it means continued inequity for Black patients.
Considerations:
- Justice: The system’s benefits and harms are unequally distributed
- Non-maleficence: Systematic under-triage could cause serious harm
- Transparency: Should patients know an AI influenced their triage?
- Accountability: Who is responsible for the biased outcomes?

Approaches:
- Audit for subgroup performance before deployment (a minimal audit sketch follows)
- Implement human review for flagged cases
- Retrain on debiased data or with fairness constraints
- Continuous monitoring for disparate impact
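A pre-deployment audit for this triage scenario might compare under-triage rates (truly high-acuity patients scored as low acuity) across groups. The sketch below is a minimal illustration; the group labels, acuity encoding, and disparity tolerance are assumptions, not values from any guideline:

```python
# Minimal pre-deployment audit sketch for the triage scenario: compare
# under-triage rates (truly high-acuity patients scored as low acuity) across
# groups. Group labels and the disparity tolerance are illustrative assumptions.
import numpy as np

def under_triage_rate(y_true_high, y_pred_high):
    """Fraction of truly high-acuity patients the system scored as low acuity."""
    y_true = np.asarray(y_true_high, dtype=bool)
    y_pred = np.asarray(y_pred_high, dtype=bool)
    n_high = y_true.sum()
    return float((y_true & ~y_pred).sum() / n_high) if n_high else float("nan")

def audit_disparity(y_true, y_pred, groups, tolerance=0.02):
    """Per-group under-triage rates and a flag if the spread exceeds tolerance.
    (Groups with no high-acuity cases would need separate handling.)"""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    rates = {g: under_triage_rate(y_true[groups == g], y_pred[groups == g])
             for g in np.unique(groups)}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap, gap > tolerance
```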
21.6.2 Case 2: The Black Box Diagnosis
Scenario: A deep learning system detects rare cancers from pathology slides with superhuman accuracy. The model is uninterpretable—no one can explain why it makes specific predictions. A patient receives a cancer diagnosis based substantially on AI analysis. She asks why the AI thinks she has cancer; no one can answer.
The dilemma: The AI is more accurate than interpretable alternatives. Using it improves diagnostic accuracy but undermines the patient’s ability to understand and question her diagnosis.
Considerations:
- Autonomy: Can patients give truly informed consent without understanding the basis for diagnosis?
- Beneficence: Higher accuracy benefits patients
- Transparency: Patients expect and deserve explanations
- Trust: Healthcare relationships depend on understanding

Approaches:
- Develop post-hoc explanation methods (see Chapter 18)
- Use interpretable models for final decisions, AI as second reader
- Honest communication about AI limitations
- Patient choice about AI involvement
21.6.3 Case 3: Resource Allocation
Scenario: An ICU capacity model predicts which patients will benefit most from intensive care during a surge. The model was trained on survival outcomes and accurately predicts mortality. Hospital administrators propose using it to allocate scarce ICU beds. Clinicians note that “survival” was measured at hospital discharge—the model doesn’t account for quality of life or longer-term outcomes.
The dilemma: The model makes resource allocation more systematic and arguably more accurate—but its definition of “benefit” is narrow and may not align with patient or societal values.
Considerations:
- Justice: Is survival probability the right basis for allocation?
- Beneficence: Should quality of life, patient preferences, or other factors matter?
- Autonomy: Patients may have different views on acceptable outcomes
- Accountability: Are algorithmic decisions appropriate for life-or-death resource allocation?

Approaches:
- Involve ethicists and community stakeholders in defining outcomes
- Use AI as input to human decisions, not the final arbiter
- Maintain transparency about allocation criteria
- Build in appeals processes and exceptions
21.6.4 Case 4: Automation Bias
Scenario: Radiologists using AI-assisted detection for mammography screening begin missing cancers that the AI misses. Analysis reveals that radiologists have become over-reliant on AI, paying less attention to regions the AI doesn’t flag. Overall cancer detection is slightly better with AI, but radiologists’ independent skills have degraded.
The dilemma: The AI improves population-level outcomes but creates new vulnerabilities. When the AI fails, the safety net of human expertise has eroded.
Considerations:
- Non-maleficence: AI-induced skill degradation creates new harms
- Beneficence: Overall outcomes have improved
- Human oversight: Is meaningful oversight being maintained?
- System resilience: What happens if the AI fails or is unavailable?

Approaches:
- Design interfaces that maintain radiologist engagement
- Require periodic AI-free reading to maintain skills
- Train on AI limitations and failure modes
- Monitor both AI-assisted and independent performance
21.6.5 Framework for Ethical Decision-Making
When facing ethical dilemmas in medical AI:
- Identify stakeholders: Patients, clinicians, institutions, society
- Clarify the dilemma: What values are in tension?
- Gather facts: What do we actually know about impacts?
- Consider principles: Which ethical principles apply?
- Explore options: What alternatives exist?
- Evaluate tradeoffs: Who bears costs and benefits of each option?
- Decide and monitor: Make a reasoned decision and track outcomes
- Revise as needed: Ethical analysis is ongoing, not one-time
Ethical dilemmas rarely have clean solutions. The goal is thoughtful engagement with difficult questions, not false certainty.
21.7 Responsible AI Development
Clinical Context: Ethical medical AI doesn’t happen by accident. It requires deliberate practices throughout development—from data collection to deployment to monitoring. This section covers practical approaches to responsible AI development.
21.7.1 Documentation and Transparency
Model Cards: Standardized documentation of ML models, including:
- Model details (type, version, developers)
- Intended use and out-of-scope uses
- Training data description
- Evaluation metrics and results
- Ethical considerations
- Limitations and failure modes

Data Sheets: Documentation of datasets, including:
- Collection methodology
- Patient populations represented
- Labeling process and quality
- Known biases or limitations
- Privacy and consent status
This documentation serves multiple purposes: regulatory submission, clinical deployment decisions, and ongoing monitoring.
21.7.2 Stakeholder Engagement
Involve diverse perspectives throughout development:
Patients and patient advocates:
- What outcomes matter to them?
- What concerns do they have about AI?
- How should AI involvement be communicated?

Clinicians:
- How does the AI fit clinical workflow?
- What information do they need to use it appropriately?
- When should they override AI recommendations?

Frontline staff:
- What training is needed?
- How does AI change their work?
- What problems do they observe?

Ethics and equity experts:
- Are there fairness concerns?
- What populations might be harmed?
- How should difficult tradeoffs be handled?
21.7.3 Continuous Monitoring
Responsible deployment requires ongoing surveillance:
Performance monitoring:
- Track accuracy metrics over time
- Monitor for performance degradation (a minimal monitoring sketch follows this list)
- Compare real-world to validation performance

Subgroup monitoring:
- Assess performance across demographics
- Identify emerging disparities
- Track usage patterns across populations

Safety monitoring:
- Adverse event reporting and review
- Near-miss tracking
- User feedback collection

Usage monitoring:
- Are clinicians using the system appropriately?
- Are recommendations being followed, overridden, or ignored?
- Is automation bias emerging?
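One concrete way to operationalize performance monitoring is to track sensitivity over a rolling window of adjudicated cases and raise an alert when it falls below a pre-specified floor. The sketch below is illustrative; the window size, floor, and minimum case count are assumptions:

```python
# Minimal sketch of post-deployment performance monitoring: track sensitivity
# over a rolling window of adjudicated cases and alert when it falls below a
# pre-specified floor. Window size, floor, and minimum case count are assumptions.
from collections import deque

class RollingSensitivityMonitor:
    def __init__(self, window=200, floor=0.85, min_positives=20):
        self.cases = deque(maxlen=window)   # (ground_truth, model_prediction)
        self.floor = floor
        self.min_positives = min_positives

    def record(self, truth: bool, prediction: bool) -> bool:
        """Add one adjudicated case; return True if an alert should fire."""
        self.cases.append((truth, prediction))
        positives = [pred for true, pred in self.cases if true]
        if len(positives) < self.min_positives:   # not enough positive cases yet
            return False
        sensitivity = sum(positives) / len(positives)
        return sensitivity < self.floor

# Example: one missed positive case recorded by the monitor
monitor = RollingSensitivityMonitor()
monitor.record(truth=True, prediction=False)
```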
21.7.4 Incident Response
When problems occur:
- Detection: How will you know something is wrong?
- Assessment: How severe is the issue?
- Mitigation: Can you address it without shutting down?
- Communication: Who needs to know?
- Investigation: What went wrong and why?
- Remediation: How do you fix it?
- Learning: How do you prevent recurrence?
Have incident response plans before you need them.
21.7.5 Organizational Structures
Responsible AI requires organizational commitment:
AI governance committee:
- Oversees AI development and deployment decisions
- Includes clinical, technical, ethical, and legal expertise
- Reviews high-risk applications

AI ethics review:
- Evaluates AI projects for ethical concerns
- Analogous to an IRB for research
- May be integrated with existing ethics structures

Clear accountability:
- Who owns each AI system?
- Who is responsible for monitoring?
- Who can make decisions about changes or shutdown?
21.7.6 The Model Card Approach in Practice
A model card for a medical AI system might include:
MODEL CARD: Chest X-ray Pneumonia Detector
Model Details
- Developed by: [Institution/Company]
- Version: 2.1 (January 2024)
- Type: Convolutional neural network (DenseNet-121)
- Input: PA chest radiograph
- Output: Probability of pneumonia, heatmap
Intended Use
- Assist radiologists in pneumonia detection
- NOT intended for: autonomous diagnosis, pediatric patients, ICU portable films
Training Data
- Source: [Hospital system] 2015-2020
- Size: 150,000 chest X-rays
- Demographics: 52% male, mean age 58
- Limitation: 78% White patients, may not generalize to other populations
Evaluation Results
- AUC: 0.94 (95% CI: 0.93-0.95) on internal test set
- Sensitivity: 0.91, Specificity: 0.88 at default threshold
- Known weakness: Lower sensitivity (0.82) for subtle interstitial patterns
Ethical Considerations
- Validated primarily on White patients; fairness audit ongoing
- Should not be used as sole basis for treatment decisions
- Radiologist must review all positive and negative results
Limitations
- Not validated on portable X-rays
- Performance may degrade on images from different X-ray equipment
- Does not detect other pathologies
21.8 Liability and Legal Considerations
Medical AI liability is evolving and jurisdiction-specific. This brief overview highlights key issues; consult legal counsel for specific situations.
21.8.1 Who Is Liable When AI Errs?
Potential parties in AI-related malpractice:
The physician: Traditional medical malpractice applies to physician decisions. Using AI doesn’t transfer liability. If a physician relies inappropriately on AI recommendations, standard malpractice analysis applies.
The institution: Hospitals may be liable for negligent AI selection, deployment, or oversight. Corporate negligence theories may apply to inadequate AI governance.
The developer: Product liability theories may apply to defective AI systems. The learned intermediary doctrine (common in pharma) may or may not apply to AI.
The AI itself: AI is not a legal person and cannot be held liable. Liability flows to the humans and organizations involved.
21.8.2 Evolving Legal Landscape
Key uncertainties:
- How does malpractice standard of care evolve as AI becomes common?
- When does failing to use AI become negligent?
- How do courts handle “black box” causation questions?
- Does an AI-generated diagnosis constitute the practice of medicine?
These questions are being actively litigated and legislated. The legal landscape will continue to evolve.
21.8.3 Risk Mitigation
- Document AI validation and appropriateness for use case
- Maintain clear human oversight and decision authority
- Train users on AI capabilities and limitations
- Track and respond to adverse events
- Maintain appropriate insurance coverage
- Involve legal counsel in deployment decisions
21.9 Looking Forward
21.9.1 Evolving Regulatory Frameworks
Regulatory approaches continue to develop:
FDA: Expanding PCCP framework for AI/ML modifications. Developing guidance on clinical decision support. Exploring real-world evidence for regulatory decisions.
EU: AI Act creates horizontal AI regulation beyond medical devices. High-risk AI systems (including medical AI) face additional requirements for risk management, data governance, and human oversight.
International: IMDRF continues harmonization efforts. Bilateral mutual recognition agreements may ease multi-jurisdiction approval.
21.9.2 The EU AI Act
The EU AI Act (Regulation 2024/1689), which entered into force in August 2024 with phased implementation through 2027, creates a horizontal, risk-based framework for AI systems across all sectors—including healthcare. Unlike MDR, which regulates medical devices specifically, the AI Act regulates AI systems based on their risk to fundamental rights.
21.9.2.1 Risk Categories
The AI Act defines four risk tiers:
Unacceptable risk (prohibited):
- Social scoring by governments
- Real-time biometric identification in public spaces (with exceptions)
- Manipulation of vulnerable persons
- Not typically relevant to clinical AI

High risk (strict requirements):
- AI systems that are safety components of products covered by EU legislation (including medical devices)
- AI used in specific listed domains: biometric identification, critical infrastructure, education, employment, essential services, law enforcement, migration, justice
- Most clinical AI falls here due to medical device classification

Limited risk (transparency only):
- AI systems interacting with humans (chatbots)
- Emotion recognition systems
- Deepfake generators
- Requirement: inform users they’re interacting with AI

Minimal risk (no additional requirements):
- AI-enabled video games
- Spam filters
- No AI Act obligations
21.9.2.2 High-Risk Requirements for Medical AI
For high-risk AI systems (including most clinical AI), the AI Act mandates:
1. Risk Management System (Article 9):
- Continuous iterative process throughout the AI lifecycle
- Identify and analyze known and foreseeable risks
- Adopt risk mitigation measures
- Similar to but broader than MDR risk management

2. Data Governance (Article 10):
- Training, validation, and testing datasets must be relevant, representative, and free from errors
- Appropriate statistical properties for the intended purpose
- Explicit requirement for subgroup analysis

3. Technical Documentation (Article 11):
- Design specifications and general logic
- Development process choices and rationale
- Risk management documentation
- More prescriptive than MDR requirements

4. Record-keeping (Article 12):
- Automatic logging of AI system operations (a minimal logging sketch follows this list)
- Traceability of decisions
- New requirement not in MDR

5. Transparency and User Information (Article 13):
- Clear instructions for use
- Disclosure of AI nature and capabilities
- Information about known limitations
- Extends MDR labeling requirements

6. Human Oversight (Article 14):
- Design for effective oversight by natural persons
- Ability to understand, monitor, and intervene
- Stronger than “intended use” in MDR

7. Accuracy, Robustness, and Cybersecurity (Article 15):
- Appropriate levels throughout the lifecycle
- Resilience to errors and attacks
- Overlaps with MDR but adds AI-specific concerns
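To illustrate what Article 12-style record-keeping might look like in practice, the sketch below appends one traceable record per prediction to an append-only log. The record format is an assumption chosen for illustration; the AI Act does not prescribe a specific schema:

```python
# Illustrative sketch of the kind of automatic, traceable record-keeping
# Article 12 points toward: one append-only log entry per prediction.
# The record format is an assumption, not a schema prescribed by the AI Act.
import hashlib, json, datetime

def log_prediction(log_file, model_version, input_bytes, output, user_id):
    """Append a traceable record of a single AI system decision."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "output": output,                 # e.g., {"pneumonia_prob": 0.87}
        "user_id": user_id,               # who received / acted on the output
    }
    with open(log_file, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example (hypothetical path and values):
# log_prediction("audit_log.jsonl", "pneumonia-cnn-2.1", image_bytes,
#                {"pneumonia_prob": 0.87}, user_id="radiologist_042")
```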
21.9.2.3 Conformity Assessment and Compliance
For medical AI that is already a medical device under MDR:
- The MDR conformity assessment also covers AI Act requirements (no separate AI Act assessment)
- But the AI Act requirements expand what Notified Bodies must verify
- AI-specific documentation, logging, and human oversight become part of the MDR assessment
Timeline:
- August 2024: AI Act entered into force
- February 2025: Prohibitions apply
- August 2025: Governance and penalties apply
- August 2026: High-risk requirements apply
- August 2027: Full application to embedded AI systems
21.9.2.4 Interaction with MDR
Medical AI developers face layered requirements:
| Requirement | MDR | AI Act |
|---|---|---|
| Risk classification | ✓ | ✓ (different scheme) |
| Clinical evidence | ✓ | Not explicit |
| Post-market surveillance | ✓ | ✓ (enhanced) |
| Technical documentation | ✓ | ✓ (more prescriptive) |
| Human oversight | Implicit in intended use | Explicit requirement |
| Automatic logging | Not required | Required |
| Subgroup analysis | Best practice | Explicit requirement |
The AI Act doesn’t replace MDR—it adds to it. A chest X-ray classifier needs CE marking under MDR and must satisfy AI Act high-risk requirements. In practice, a well-prepared MDR submission will cover most AI Act requirements, but the logging, human oversight, and data governance requirements may require additional work.
21.9.2.5 Penalties
Non-compliance penalties under the AI Act are substantial:
- Prohibited AI practices: Up to €35 million or 7% of global annual turnover
- High-risk violations: Up to €15 million or 3% of turnover
- Incorrect information to authorities: Up to €7.5 million or 1% of turnover
These penalties apply to all parties in the AI value chain—developers, deployers, and importers.
Medical AI developers must comply with both MDR and AI Act—overlapping but distinct requirements that together create the world’s most comprehensive AI regulatory framework.
21.9.3 Building a Culture of Responsible AI
Regulation and ethics frameworks are necessary but not sufficient. Sustainable responsible AI requires:
Leadership commitment: Organizations must prioritize ethics alongside performance and efficiency.
Technical capacity: Teams need skills to implement fair, transparent, robust AI systems.
Continuous learning: The field is moving fast; staying current is essential.
Humility: Acknowledging what we don’t know and what can go wrong.
The goal is not perfect AI—which is impossible—but thoughtful, accountable, continuously improving AI that serves patients and society.
21.10 Chapter Summary
Deploying medical AI requires navigating complex regulatory and ethical landscapes.
FDA regulation:
- AI/ML systems are regulated as Software as a Medical Device (SaMD)
- Most use 510(k) (substantial equivalence) or De Novo (novel devices)
- Predetermined Change Control Plans enable controlled updates
- Real-world examples (IDx-DR, Caption Health) illustrate the pathways

EU regulation:
- MDR often classifies medical AI higher than FDA
- CE marking requires Notified Body assessment
- Clinical evidence requirements are substantial
- GDPR adds data protection considerations

Ethical frameworks:
- Traditional medical ethics (beneficence, autonomy, justice) apply
- AI-specific principles (transparency, explainability, accountability)
- Case studies illustrate real dilemmas without easy answers
- Systematic ethical analysis supports better decisions

Responsible development:
- Documentation (model cards, data sheets) enables transparency
- Stakeholder engagement throughout development
- Continuous monitoring after deployment
- Organizational structures for AI governance

Looking forward:
- Regulatory frameworks continue to evolve
- EU AI Act adds new requirements
- Legal liability remains uncertain
- Building responsible AI culture is essential
Medical AI offers tremendous potential to improve healthcare. Realizing that potential requires not just technical excellence but thoughtful engagement with regulatory requirements and ethical obligations. The work is harder than just building models—and it’s essential.
21.11 Exercises
Your team has developed an AI system that predicts sepsis onset from EHR data. Describe the FDA regulatory pathway you would pursue. What evidence would you need to submit?
The same sepsis prediction system will be deployed in both the US and Germany. What are the key differences in regulatory requirements you need to address?
Analyze the following scenario using the ethical framework presented in this chapter: An AI system for prioritizing patients for specialist appointments consistently gives lower priority scores to patients from rural areas, because historical data shows they’re less likely to show up for appointments. Is this ethical? What would you recommend?
Draft a model card for an AI system you’ve worked with or read about. What information was readily available? What was missing?
Your hospital wants to deploy a commercial AI diagnostic system. What questions should you ask the vendor before deployment? What ongoing monitoring would you implement?