Enterprise AI adoption is hitting a compliance wall. Legal, security, and compliance teams are asking questions that engineering teams often can’t answer. Data lineage, model auditing, and regulatory requirements demand new practices.
Here’s how to build AI systems that satisfy compliance requirements.
The Compliance Landscape
Emerging Requirements
```yaml
ai_compliance_landscape:
  data_regulations:
    - GDPR (EU data protection)
    - CCPA (California privacy)
    - Industry-specific (HIPAA, SOC 2, PCI DSS)
  ai_specific:
    - EU AI Act (obligations phasing in)
    - NIST AI Risk Management Framework
    - Industry guidelines
  enterprise_concerns:
    - Data residency
    - Vendor agreements
    - Audit requirements
    - Liability
```
Common Compliance Questions
```yaml
compliance_questions:
  data:
    - Where does our data go?
    - Is data used for training?
    - Can we delete user data?
    - What's the retention policy?
  model:
    - What model makes decisions?
    - Can we explain outputs?
    - How do we audit decisions?
    - What's the error rate?
  process:
    - Who approved this AI use?
    - What testing was done?
    - How do we handle incidents?
    - What's the rollback plan?
```
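One practical way to keep these questions from going stale is to track them as structured data and block sign-off on unanswered items. A minimal sketch, with illustrative names (`COMPLIANCE_QUESTIONS`, `unanswered`) rather than any real framework:

```python
# Illustrative sketch: encode the checklist as data so a release gate
# can flag questions that lack a recorded answer.
COMPLIANCE_QUESTIONS = {
    "data": [
        "Where does our data go?",
        "Is data used for training?",
        "Can we delete user data?",
        "What's the retention policy?",
    ],
    "model": [
        "What model makes decisions?",
        "Can we explain outputs?",
        "How do we audit decisions?",
        "What's the error rate?",
    ],
    "process": [
        "Who approved this AI use?",
        "What testing was done?",
        "How do we handle incidents?",
        "What's the rollback plan?",
    ],
}


def unanswered(answers: dict) -> list:
    """Return every checklist question without a non-empty answer."""
    return [
        question
        for questions in COMPLIANCE_QUESTIONS.values()
        for question in questions
        if not answers.get(question)
    ]
```

Running `unanswered` in CI against a per-system answers file turns the checklist from a document into an enforced gate.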
Building Compliant Systems
Data Governance
```python
from datetime import datetime, timezone


class AIDataGovernance:
    """Track data flowing through AI systems."""

    def __init__(self, config: GovernanceConfig):
        self.data_store = config.data_store
        self.audit_log = config.audit_log

    async def process_with_lineage(
        self,
        input_data: dict,
        user_id: str,
        purpose: str,
    ) -> ProcessingResult:
        # Create the lineage record before processing starts
        lineage_id = await self.audit_log.create_record(
            input_hash=hash_data(input_data),
            user_id=user_id,
            purpose=purpose,
            timestamp=datetime.now(timezone.utc),
            data_classification=classify_data(input_data),
        )
        # Process, recording the outcome either way
        try:
            result = await self._process(input_data)
            await self.audit_log.update_record(
                lineage_id=lineage_id,
                output_hash=hash_data(result),
                model_used=result.model,
                status="success",
            )
            return result
        except Exception as e:
            await self.audit_log.update_record(
                lineage_id=lineage_id,
                status="error",
                error=str(e),
            )
            raise

    async def handle_deletion_request(self, user_id: str) -> DeletionConfirmation:
        """GDPR right to erasure (Article 17)."""
        # Find all records tied to this user
        records = await self.audit_log.find_by_user(user_id)
        # Anonymize rather than delete, preserving aggregate audit history
        for record in records:
            await self.audit_log.anonymize(record.id)
        return DeletionConfirmation(
            user_id=user_id,
            records_affected=len(records),
            timestamp=datetime.now(timezone.utc),
        )
```
Model Auditing
```python
from datetime import datetime, timezone


class ModelAuditSystem:
    """Maintain an audit trail for model decisions."""

    def __init__(self, storage: AuditStorage):
        self.storage = storage

    async def log_inference(
        self,
        request_id: str,
        model_id: str,
        input_data: dict,
        output: dict,
        metadata: dict,
    ) -> AuditRecord:
        record = AuditRecord(
            request_id=request_id,
            timestamp=datetime.now(timezone.utc),
            model_id=model_id,
            model_version=await self.get_model_version(model_id),
            input_hash=hash_pii_safe(input_data),  # never store raw inputs
            output_summary=summarize_output(output),
            latency_ms=metadata.get("latency_ms"),
            token_count=metadata.get("tokens"),
            user_id=metadata.get("user_id"),
            session_id=metadata.get("session_id"),
        )
        await self.storage.save(record)
        return record

    async def generate_audit_report(
        self,
        start_date: datetime,
        end_date: datetime,
    ) -> AuditReport:
        records = await self.storage.query(start_date, end_date)
        return AuditReport(
            period=(start_date, end_date),
            total_requests=len(records),
            models_used=self._count_models(records),
            error_rate=self._calculate_error_rate(records),
            average_latency=self._calculate_avg_latency(records),
            data_classifications=self._count_classifications(records),
        )
```
Vendor Management
API Provider Assessment
```yaml
vendor_assessment:
  data_handling:
    questions:
      - Is data used for model training?
      - What's the data retention period?
      - Where is data processed geographically?
      - Is data encrypted in transit and at rest?
    openai:
      training: "No (API data not used for training by default)"
      retention: "30 days for abuse monitoring"
      location: "US primarily"
      encryption: "Yes"
    anthropic:
      training: "No (not by default)"
      retention: "30 days"
      location: "US"
      encryption: "Yes"
  compliance_certifications:
    check_for:
      - SOC 2 Type II
      - ISO 27001
      - GDPR compliance statement
      - BAA availability (for HIPAA)
```
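Part of the vendor review can be mechanized by diffing a vendor's evidenced certifications against the checklist. A sketch, with illustrative names; BAA is conditional on whether the workload is HIPAA-covered, so it is handled as a flag:

```python
# Baseline certifications to check for, per the list above.
REQUIRED_CERTIFICATIONS = {
    "SOC 2 Type II",
    "ISO 27001",
    "GDPR compliance statement",
}


def missing_certifications(vendor: dict, hipaa_workload: bool = False) -> set:
    """Checklist items the vendor record does not evidence."""
    required = set(REQUIRED_CERTIFICATIONS)
    if hipaa_workload:
        required.add("BAA availability")  # only required for HIPAA workloads
    return required - set(vendor.get("certifications", ()))
```

A non-empty result is a procurement blocker, not an engineering detail: surface it to legal before signing.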
Contract Requirements
```yaml
contract_checklist:
  data_protection:
    - Data processing agreement (DPA)
    - Sub-processor list
    - Breach notification terms
    - Data deletion provisions
  service_levels:
    - Uptime guarantees
    - Latency commitments
    - Support response times
  liability:
    - Indemnification terms
    - Limitation of liability
    - Insurance requirements
  exit_provisions:
    - Data portability
    - Termination notice
    - Transition assistance
```
Documentation Requirements
AI System Documentation
```yaml
required_documentation:
  system_card:
    purpose: Describe AI system and its use
    contents:
      - System description
      - Intended use cases
      - Known limitations
      - Risk assessment
      - Mitigation measures
  data_inventory:
    purpose: Track data used in AI systems
    contents:
      - Data sources
      - Data types (PII, sensitive, etc.)
      - Processing purposes
      - Retention periods
      - Access controls
  model_registry:
    purpose: Track models in use
    contents:
      - Model identifier
      - Version history
      - Training data description
      - Evaluation results
      - Deployment status
```
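The model registry spec above maps naturally onto a small typed record plus a lookup. A sketch of one possible shape (class and field names are illustrative, not a standard):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ModelRegistryEntry:
    """One registry row per (model, version) pair."""
    model_id: str
    version: str
    training_data_description: str
    evaluation_results: dict
    deployment_status: str = "staging"   # staging | production | retired
    registered_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class ModelRegistry:
    def __init__(self):
        # Dicts preserve insertion order, so registration order is kept.
        self._entries = {}

    def register(self, entry: ModelRegistryEntry) -> None:
        self._entries[(entry.model_id, entry.version)] = entry

    def latest(self, model_id: str) -> ModelRegistryEntry:
        """Most recently registered version of a model."""
        matches = [e for e in self._entries.values() if e.model_id == model_id]
        return matches[-1]
```

Even this much structure answers three of the audit questions above directly: which model made a decision, which version, and what it was evaluated on.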
Key Takeaways
- AI compliance requirements are real and growing
- Build audit trails from day one—retrofitting is painful
- Track data lineage through AI systems
- Document model decisions for explainability
- Assess vendors for compliance certifications
- Prepare for GDPR-style deletion requests
- Maintain model registries with versions
- Legal and compliance teams are stakeholders
- Build compliance into the architecture, not in as an afterthought
- EU AI Act will raise the bar—prepare now
Compliance enables AI adoption. Build it in.