The General Data Protection Regulation (GDPR) takes effect on May 25, 2018. Beyond the legal requirements, GDPR demands significant technical changes to how we handle personal data, and engineering teams need concrete implementation guidance.
This guide covers the technical aspects: what to build, how to structure systems, and common implementation patterns.
Understanding the Technical Requirements
Personal Data Scope
Personal data is broader than you might think:
Obvious personal data:
- Names, email addresses
- Physical addresses
- Phone numbers
- Government IDs
Often overlooked:
- IP addresses
- Device identifiers
- Cookie identifiers
- Location data
- User behavior patterns
- Any data that can identify a person when combined
Audit your systems for all personal data, not just the obvious fields.
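A first pass of that audit can be automated by scanning schema column names. The sketch below is illustrative only: the pattern list and the `flag_personal_fields` helper are assumptions, not part of any standard tool, and name matching will miss personal data stored in free-text or generically named columns.

```python
import re

# Hypothetical name patterns; extend these for your own schemas.
# Online identifiers come first so "ip_address" is not classed as a postal address.
PERSONAL_DATA_PATTERNS = {
    "online_identifier": re.compile(r"ip_address|device_id|cookie", re.I),
    "contact": re.compile(r"email|phone|address", re.I),
    "identity": re.compile(r"(^|_)name$|passport|national_id", re.I),
    "location": re.compile(r"latitude|longitude|location", re.I),
}


def flag_personal_fields(schema: dict[str, list[str]]) -> dict[str, dict[str, str]]:
    """Return {table: {column: category}} for columns that look personal."""
    findings: dict[str, dict[str, str]] = {}
    for table, columns in schema.items():
        for column in columns:
            for category, pattern in PERSONAL_DATA_PATTERNS.items():
                if pattern.search(column):
                    findings.setdefault(table, {})[column] = category
                    break  # first matching category wins
    return findings


schema = {
    "users": ["id", "email", "name", "created_at"],
    "page_views": ["session_id", "ip_address", "url", "viewed_at"],
}
findings = flag_personal_fields(schema)
```

Treat the output as a list of candidates for manual review, not as the inventory itself.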
Key Rights to Implement
Right of Access (Article 15): Users can request all data you hold about them.
Right to Erasure (Article 17): Users can request deletion of their data.
Right to Rectification (Article 16): Users can correct inaccurate data.
Right to Portability (Article 20): Users can receive their data in a machine-readable format.
Right to Object (Article 21): Users can object to certain processing.
Each right requires technical implementation.
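These requests also carry a deadline: Article 12(3) requires a response within one month of receipt, extendable by two further months for complex or numerous requests. A minimal sketch of tracking that deadline follows. `RequestType`, `SubjectRequest`, and `add_months` are hypothetical names, and clamping to the end of shorter months is one simple reading of "one month".

```python
import calendar
from dataclasses import dataclass
from datetime import date
from enum import Enum


class RequestType(Enum):
    ACCESS = "article_15_access"
    RECTIFICATION = "article_16_rectification"
    ERASURE = "article_17_erasure"
    PORTABILITY = "article_20_portability"
    OBJECTION = "article_21_objection"


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of shorter months."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


@dataclass
class SubjectRequest:
    user_id: str
    request_type: RequestType
    received_on: date
    extended: bool = False  # set when an extension has been justified

    @property
    def due_on(self) -> date:
        # One month by default; three in total once extended
        return add_months(self.received_on, 3 if self.extended else 1)


request = SubjectRequest("user_1", RequestType.ERASURE, date(2018, 5, 31))
```

Storing every incoming request in a structure like this makes overdue requests easy to query and alert on.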
Data Mapping
Create a Data Inventory
Document every system storing personal data:
systems:
  - name: users-service
    data_types:
      - email
      - name
      - password_hash
      - created_at
      - last_login
    retention: indefinite  # Flag for review
    purposes:
      - authentication
      - communication

  - name: analytics-db
    data_types:
      - user_id
      - ip_address
      - page_views
      - session_data
    retention: 90_days
    purposes:
      - product_improvement
      - analytics

  - name: third_party_crm
    data_types:
      - email
      - name
      - purchase_history
    purposes:
      - marketing
      - customer_support
    data_processor: true
Track Data Flow
Map how data moves:
User Input → API Gateway → User Service → Database
                                        → Analytics
                                        → Email Service (3rd party)
                                        → CRM (3rd party)
                                        → Logs
                                        → Backups
Every destination needs GDPR compliance.
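Once the flows are written down as data, listing every destination becomes a graph traversal. The sketch below assumes the branches in the diagram fan out from the User Service. The `DATA_FLOWS` map and `reachable_destinations` are hypothetical names.

```python
from collections import deque

# Hypothetical flow map: source system -> systems it sends data to
DATA_FLOWS = {
    "user_input": ["api_gateway"],
    "api_gateway": ["user_service"],
    "user_service": [
        "database", "analytics", "email_service", "crm", "logs", "backups",
    ],
}


def reachable_destinations(flows, start):
    """Breadth-first walk returning every system data can reach from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in flows.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


destinations = reachable_destinations(DATA_FLOWS, "user_input")
```

Comparing this set against the data inventory highlights systems that receive personal data but were never documented.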
Identify Legal Basis
Each processing activity needs a legal basis. Article 6 lists six; the four most common in product engineering are:
- Consent: User explicitly agreed
- Contract: Necessary for service delivery
- Legal obligation: Required by law
- Legitimate interest: Your business interest, balanced against user rights
Document which basis applies to each data use.
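One way to keep that documentation honest is to check it against the data inventory in CI. This is a sketch under assumptions: the register contents are illustrative examples rather than legal conclusions, and `LegalBasis`, `PROCESSING_REGISTER`, and `undocumented_purposes` are hypothetical names.

```python
from enum import Enum


class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal_obligation"
    LEGITIMATE_INTEREST = "legitimate_interest"


# Hypothetical register: (system, purpose) -> documented legal basis
PROCESSING_REGISTER = {
    ("users-service", "authentication"): LegalBasis.CONTRACT,
    ("users-service", "communication"): LegalBasis.CONTRACT,
    ("analytics-db", "analytics"): LegalBasis.LEGITIMATE_INTEREST,
    ("third_party_crm", "marketing"): LegalBasis.CONSENT,
}

# Purposes per system, as they would appear in the data inventory
INVENTORY = {
    "users-service": ["authentication", "communication"],
    "analytics-db": ["product_improvement", "analytics"],
}


def undocumented_purposes(inventory, register):
    """List (system, purpose) pairs from the inventory with no legal basis."""
    return [
        (system, purpose)
        for system, purposes in inventory.items()
        for purpose in purposes
        if (system, purpose) not in register
    ]


missing = undocumented_purposes(INVENTORY, PROCESSING_REGISTER)
```

A non-empty result means some processing is happening without a recorded justification.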
Consent Management
Technical Consent System
Store consent with specificity:
CREATE TABLE consent_records (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    purpose VARCHAR(100) NOT NULL,   -- e.g., 'marketing_email', 'analytics'
    consent_given BOOLEAN NOT NULL,
    consent_timestamp TIMESTAMP NOT NULL,
    consent_mechanism VARCHAR(100),  -- e.g., 'signup_form', 'preferences_page'
    consent_version VARCHAR(50),     -- Version of consent text
    ip_address VARCHAR(45),
    user_agent TEXT,
    withdrawn_at TIMESTAMP,

    UNIQUE(user_id, purpose)
);
Consent API
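from datetime import datetime

from flask import Flask, jsonify, request

app = Flask(__name__)

# consent_service, get_authenticated_user and CURRENT_CONSENT_VERSION
# are assumed to be defined elsewhere in the application.

@app.route('/api/consent', methods=['POST'])
def update_consent():
    data = request.json
    user_id = get_authenticated_user()
    now = datetime.utcnow()  # one timestamp for every purpose in this request

    for purpose, granted in data['purposes'].items():
        if granted:
            consent_service.grant(
                user_id=user_id,
                purpose=purpose,
                timestamp=now,
                mechanism='preferences_api',
                consent_version=CURRENT_CONSENT_VERSION
            )
        else:
            consent_service.withdraw(
                user_id=user_id,
                purpose=purpose,
                timestamp=now
            )

    return jsonify({'status': 'updated'})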
Pre-checked Boxes Are Not Consent
GDPR requires affirmative action:
<!-- Wrong -->
<input type="checkbox" checked> Send me marketing emails
<!-- Right -->
<input type="checkbox"> Send me marketing emails
Withdrawal Must Be Easy
If consent was given with one click, withdrawal should require one click:
@app.route('/api/consent/<purpose>', methods=['DELETE'])
def withdraw_consent(purpose):
    user_id = get_authenticated_user()

    consent_service.withdraw(user_id, purpose, datetime.utcnow())

    # Stop processing immediately
    processing_service.stop_processing(user_id, purpose)

    return jsonify({'status': 'withdrawn'})
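Recording consent only helps if processing code checks it before acting. The sketch below mirrors the columns of the `consent_records` table in a hypothetical `ConsentRecord` dataclass and treats a missing record as no consent. `has_active_consent` is an assumed helper name.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    user_id: str
    purpose: str
    consent_given: bool
    consent_timestamp: datetime
    withdrawn_at: Optional[datetime] = None


def has_active_consent(records, user_id, purpose):
    """True only if consent was given for this purpose and not withdrawn."""
    for record in records:
        if record.user_id == user_id and record.purpose == purpose:
            return record.consent_given and record.withdrawn_at is None
    return False  # no record means no consent


granted_at = datetime(2018, 5, 1, tzinfo=timezone.utc)
records = [
    ConsentRecord("u1", "marketing_email", True, granted_at),
    ConsentRecord("u1", "analytics", True, granted_at,
                  withdrawn_at=datetime(2018, 5, 10, tzinfo=timezone.utc)),
]
```

A mail job, for example, would call `has_active_consent(records, user_id, "marketing_email")` per recipient and skip anyone for whom it returns `False`.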
Right to Erasure (Data Deletion)
Deletion Strategy
Not all data can be deleted immediately:
Immediate deletion:
- Profile data
- Preferences
- Non-essential data
Anonymization (alternative):
- Analytics data (remove identifiers, keep aggregates)
- Historical records (remove personal data, keep business data)
Legal retention:
- Financial records (often 7 years, depending on jurisdiction)
- Legal documents
- Tax records
def handle_erasure_request(user_id):
    # Immediate deletion
    user_service.delete_profile(user_id)
    preferences_service.delete(user_id)
    session_service.invalidate_all(user_id)

    # Anonymization
    analytics_service.anonymize(user_id)

    # Scheduled deletion (after legal retention)
    schedule_deletion(
        service='billing_service',
        user_id=user_id,
        after=years(7)
    )

    # Third-party deletion requests
    for processor in third_party_processors:
        processor.request_deletion(user_id)

    # Backup handling
    add_to_backup_exclusion_list(user_id)
Backup Complications
Backups contain personal data that’s hard to delete:
Options:
- Exclusion lists: Track deleted users; exclude from restores
- Encrypted user keys: Encrypt user data with per-user keys; delete keys
- Backup rotation: Accept that data exists until backup expires
- Selective backup: Separate personal data from business data
Document your approach and retention periods.
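The exclusion-list option can be sketched as a filter applied during restore. This assumes restored rows are available as dictionaries with a `user_id` key and that the list of erased IDs is stored outside the backups it protects. `filter_restored_rows` is a hypothetical name.

```python
def filter_restored_rows(backup_rows, erased_user_ids):
    """Drop rows belonging to users erased after the backup was taken."""
    return [row for row in backup_rows if row["user_id"] not in erased_user_ids]


backup_rows = [
    {"user_id": "u1", "email": "a@example.com"},
    {"user_id": "u2", "email": "b@example.com"},
]
erased_user_ids = {"u2"}  # the exclusion list, kept outside the backup itself
restored = filter_restored_rows(backup_rows, erased_user_ids)
```

The exclusion list should hold only opaque identifiers, so that keeping it does not itself retain the data the user asked to have erased.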
Cascading Deletion
Personal data exists across systems:
from functools import partial

def cascade_delete(user_id):
    # Queue callables instead of calling them here, so nothing runs until
    # execute_all() and each failure can be retried on its own. This assumes
    # the helpers below delete immediately when called.
    deletion_tasks = []

    # Primary database
    deletion_tasks.append(
        partial(delete_from_db, 'users', user_id)
    )

    # Related services
    for service in ['orders', 'reviews', 'messages', 'notifications']:
        deletion_tasks.append(
            partial(service_client[service].delete_user_data, user_id)
        )

    # Search indexes
    deletion_tasks.append(
        partial(search_service.remove_user, user_id)
    )

    # Caches
    deletion_tasks.append(
        partial(cache.invalidate_user, user_id)
    )

    # Execute with error handling
    results = execute_all(deletion_tasks)

    if any_failed(results):
        alert_and_retry(user_id, failed_tasks(results))
Right to Access (Data Export)
Data Export API
Provide machine-readable export:
@app.route('/api/data-export', methods=['POST'])
def request_data_export():
    user_id = get_authenticated_user()

    # Queue export job (can be slow)
    job_id = export_service.queue_export(user_id)

    return jsonify({
        'job_id': job_id,
        'status': 'processing',
        'estimated_completion': '24 hours'
    })

@app.route('/api/data-export/<job_id>', methods=['GET'])
def get_export_status(job_id):
    user_id = get_authenticated_user()
    job = export_service.get_job(job_id)

    # Unknown job IDs get a 404 (assumes get_job returns None when not found)
    if job is None:
        abort(404)

    # Users may only see their own export jobs
    if job.user_id != user_id:
        abort(403)

    if job.status == 'complete':
        return jsonify({
            'status': 'complete',
            'download_url': job.download_url,
            'expires_at': job.expires_at
        })
    else:
        return jsonify({'status': job.status})
Export Format
JSON is typically acceptable:
{
  "export_date": "2018-05-14T10:30:00Z",
  "user": {
    "email": "user@example.com",
    "name": "Jane Doe",
    "created_at": "2017-03-15T08:00:00Z"
  },
  "orders": [
    {
      "id": "order_123",
      "date": "2018-04-01",
      "total": 99.99,
      "items": [...]
    }
  ],
  "activity": [
    {"timestamp": "2018-05-01T10:00:00Z", "action": "login"},
    {"timestamp": "2018-05-01T10:05:00Z", "action": "view_product", "product_id": "prod_456"}
  ]
}
Include all personal data across all systems.
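Collecting from every system is easier when each one registers a fetch function for its section of the export. A minimal sketch, assuming each source can be called with a user ID and returns JSON-serializable data. `build_export` and the `sources` mapping are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_export(user_id, sources, now=None):
    """Merge each source's data for user_id into one JSON document."""
    now = now or datetime.now(timezone.utc)
    export = {"export_date": now.strftime("%Y-%m-%dT%H:%M:%SZ")}
    for section, fetch in sources.items():
        export[section] = fetch(user_id)
    return json.dumps(export, indent=2)


# Stand-ins for calls to the real services
sources = {
    "user": lambda uid: {"email": "user@example.com", "name": "Jane Doe"},
    "orders": lambda uid: [{"id": "order_123", "total": 99.99}],
}
document = build_export(
    "user_1", sources, now=datetime(2018, 5, 14, 10, 30, tzinfo=timezone.utc)
)
```

Adding a new system to the export then means adding one entry to `sources`, which also gives a single place to check that nothing was forgotten.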
Privacy by Design
Data Minimization
Collect only what’s necessary:
# Before: Collect everything
user_profile = {
    'email': email,
    'name': name,
    'phone': phone,        # Not needed
    'birthday': birthday,  # Not needed
    'address': address,    # Not needed for this service
}

# After: Collect only what's needed
user_profile = {
    'email': email,
    'name': name,
}
Pseudonymization
Separate identifying data from processing data:
Users Table (restricted access):
  user_id, email, name

Activity Table (analytics):
  pseudonymous_id, page_view, timestamp

Mapping Table (very restricted):
  user_id → pseudonymous_id
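The mapping table is one way to link the two identifiers. An alternative sketch, not part of the design above, derives the pseudonymous ID with a keyed hash (HMAC-SHA256) so the same user always maps to the same pseudonym without storing each pair. The `pseudonymous_id` function and the example key are assumptions, and the key needs the same access restrictions as the mapping table would.

```python
import hashlib
import hmac


def pseudonymous_id(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from user_id using a keyed hash."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


key = b"example-key-load-from-a-secret-store"  # placeholder, never hard-code
pid = pseudonymous_id("user_123", key)
```

Without the key, analytics staff cannot recompute the pseudonym from a known `user_id`, which is what separates this from a plain unsalted hash.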
Retention Policies
Implement automatic data expiration:
-- Soft delete with retention
UPDATE users SET
    deleted_at = NOW(),
    email = 'deleted_' || id || '@deleted.invalid',
    name = '[DELETED]'
WHERE id = ?;

-- Hard delete after retention period
DELETE FROM users
WHERE deleted_at < NOW() - INTERVAL '30 days';
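The hard delete needs to run on a schedule. The sketch below shows the same logic as a Python job, using an in-memory SQLite database purely so the example runs on its own. The `purge_expired` name, the simplified table, and the ISO-8601 text timestamps are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_expired(conn, now, retention_days=30):
    """Hard-delete rows soft-deleted more than retention_days before now."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cursor = conn.execute(
        "DELETE FROM users WHERE deleted_at IS NOT NULL AND deleted_at < ?",
        (cutoff,),
    )
    conn.commit()
    return cursor.rowcount  # number of rows removed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, deleted_at) VALUES (?, ?, ?)",
    [
        (1, "active user", None),
        (2, "[DELETED]", "2018-04-01T00:00:00"),  # past retention
        (3, "[DELETED]", "2018-05-20T00:00:00"),  # still within retention
    ],
)
removed = purge_expired(conn, now=datetime(2018, 5, 25))
remaining = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
```

Passing `now` in explicitly keeps the job testable. Logging the returned row count gives evidence that the retention policy is actually being enforced.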
Encryption
Encrypt personal data at rest:
from cryptography.fernet import Fernet

class EncryptedField:
    def __init__(self, key):
        self.cipher = Fernet(key)

    def encrypt(self, value):
        return self.cipher.encrypt(value.encode())

    def decrypt(self, encrypted):
        return self.cipher.decrypt(encrypted).decode()

# Per-user encryption enables crypto-shredding
user_key = get_user_encryption_key(user_id)
encrypted_data = EncryptedField(user_key).encrypt(personal_data)

# Deletion: just delete the key
delete_user_encryption_key(user_id)  # Data now unreadable
Logging and Audit
Access Logging
Log access to personal data:
def get_user_profile(user_id, requestor_id):
    log_access(
        resource_type='user_profile',
        resource_id=user_id,
        accessor_id=requestor_id,
        action='read',
        timestamp=datetime.utcnow()
    )

    return user_service.get_profile(user_id)
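Calling the logger by hand in every accessor is easy to forget. One option is a decorator that records the access before the wrapped function runs. This is a sketch: `logs_access` and the in-memory `ACCESS_LOG` list are hypothetical stand-ins for a real audit store, and it assumes the resource ID and requestor ID are the first two arguments.

```python
import functools
from datetime import datetime, timezone

ACCESS_LOG = []  # stand-in for an append-only audit store


def logs_access(resource_type, action="read"):
    """Decorator that records who accessed which resource before the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(resource_id, requestor_id, *args, **kwargs):
            ACCESS_LOG.append({
                "resource_type": resource_type,
                "resource_id": resource_id,
                "accessor_id": requestor_id,
                "action": action,
                "timestamp": datetime.now(timezone.utc),
            })
            return func(resource_id, requestor_id, *args, **kwargs)
        return wrapper
    return decorator


@logs_access("user_profile")
def get_user_profile(user_id, requestor_id):
    return {"user_id": user_id}  # placeholder for user_service.get_profile


profile = get_user_profile("user_1", "support_agent_7")
```

Because the entry is written before the call, attempted reads are logged even when the underlying lookup fails.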
Processing Records
Maintain records of processing activities:
CREATE TABLE processing_records (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMP NOT NULL,
    data_subject_id UUID,
    processing_activity VARCHAR(100),
    legal_basis VARCHAR(50),
    purpose VARCHAR(200),
    data_categories TEXT[],
    recipients TEXT[],
    retention_period VARCHAR(50)
);
Key Takeaways
- Map all personal data: locations, flows, purposes, and legal basis
- Implement consent management with granular, withdrawable consent
- Build data export functionality returning all personal data in machine-readable format
- Implement deletion cascading across all systems, including third parties
- Handle backups through exclusion lists, encryption, or documented retention
- Apply privacy by design: minimize data, pseudonymize, set retention policies
- Log access to personal data for accountability
- Start now: the May 25th deadline requires functioning systems, not plans
GDPR compliance is an engineering challenge as much as a legal one. Build the systems now.