GDPR Technical Implementation Guide for Engineers

May 14, 2018

The General Data Protection Regulation (GDPR) becomes enforceable on May 25th. Beyond the legal requirements, it demands significant technical changes to how we handle personal data, and engineering teams need concrete implementation guidance.

This guide covers the technical aspects: what to build, how to structure systems, and common implementation patterns.

Understanding the Technical Requirements

Personal Data Scope

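Personal data is broader than you might think:

Obvious personal data: fields such as names and email addresses.

Often overlooked: identifiers such as IP addresses and session data, plus the copies that end up in logs and backups.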

Audit your systems for all personal data, not just the obvious fields.
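As a starting point for the audit, a script can flag columns whose names suggest personal data. The sketch below is a heuristic only: the pattern list, the `find_personal_data_columns` function, and the sample schema are assumptions for illustration, and name matching will miss personal data stored in free-text or JSON fields.

```python
import re

# Hypothetical name patterns that often indicate personal data.
# This is a heuristic, not a complete rule set; extend it for your schemas.
PERSONAL_DATA_PATTERNS = [
    r"e?mail", r"name", r"phone", r"address", r"birth",
    r"(^|_)ip(_|$)", r"user_agent", r"device_id", r"location",
]


def find_personal_data_columns(schema):
    """Return {table: [columns]} for columns whose names match a pattern.

    `schema` maps table names to lists of column names.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in PERSONAL_DATA_PATTERNS]
    flagged = {}
    for table, columns in schema.items():
        hits = [c for c in columns if any(rx.search(c) for rx in compiled)]
        if hits:
            flagged[table] = hits
    return flagged


# Sample schema (made up for this example)
schema = {
    "users": ["id", "email", "name", "created_at"],
    "page_views": ["id", "ip_address", "url", "viewed_at"],
    "products": ["id", "sku", "price"],
}
flagged = find_personal_data_columns(schema)
```

Treat the output as a list of candidates for manual review, not as the inventory itself.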

Key Rights to Implement

Right of Access (Article 15): Users can request all data you hold about them.

Right to Erasure (Article 17): Users can request deletion of their data.

Right to Rectification (Article 16): Users can correct inaccurate data.

Right to Portability (Article 20): Users can receive their data in a machine-readable format.

Right to Object (Article 21): Users can object to certain processing.

Each right requires technical implementation.
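One way to keep track of which rights have an implementation is a small handler registry. This is a minimal sketch: the `SubjectRight` enum, the `handles` decorator, and the placeholder handlers are hypothetical names, not part of any framework.

```python
from enum import Enum


class SubjectRight(Enum):
    ACCESS = "access"                # Article 15
    RECTIFICATION = "rectification"  # Article 16
    ERASURE = "erasure"              # Article 17
    PORTABILITY = "portability"      # Article 20
    OBJECTION = "objection"          # Article 21


HANDLERS = {}


def handles(right):
    """Register the decorated function as the handler for `right`."""
    def register(func):
        HANDLERS[right] = func
        return func
    return register


@handles(SubjectRight.ACCESS)
def handle_access(user_id):
    # Placeholder: a real handler would queue an export job
    return f"export queued for {user_id}"


@handles(SubjectRight.ERASURE)
def handle_erasure(user_id):
    # Placeholder: a real handler would start the deletion workflow
    return f"erasure queued for {user_id}"


def unimplemented_rights():
    """List rights that have no registered handler yet."""
    return [right for right in SubjectRight if right not in HANDLERS]


def dispatch(right, user_id):
    if right not in HANDLERS:
        raise NotImplementedError(f"no handler for {right.value}")
    return HANDLERS[right](user_id)
```

Running `unimplemented_rights()` in a test or at startup makes gaps visible before a user request exposes them.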

Data Mapping

Create a Data Inventory

Document every system storing personal data:

systems:
  - name: users-service
    data_types:
      - email
      - name
      - password_hash
      - created_at
      - last_login
    retention: indefinite  # Flag for review
    purposes:
      - authentication
      - communication

  - name: analytics-db
    data_types:
      - user_id
      - ip_address
      - page_views
      - session_data
    retention: 90_days
    purposes:
      - product_improvement
      - analytics

  - name: third_party_crm
    data_types:
      - email
      - name
      - purchase_history
    purposes:
      - marketing
      - customer_support
    data_processor: true

Track Data Flow

Map how data moves:

User Input → API Gateway → User Service → Database
                                       → Analytics
                                       → Email Service (3rd party)
                                       → CRM (3rd party)
                                       → Logs
                                       → Backups

Every destination needs GDPR compliance.
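If the flow map is kept as data, the full list of destinations can be derived rather than maintained by hand. The sketch below encodes the diagram above as an adjacency map and walks it breadth-first; the node names and the `reachable_destinations` function are illustrative assumptions.

```python
from collections import deque

# Edges taken from the diagram above; node names are illustrative.
DATA_FLOWS = {
    "user_input": ["api_gateway"],
    "api_gateway": ["user_service"],
    "user_service": [
        "database", "analytics", "email_service", "crm", "logs", "backups",
    ],
}


def reachable_destinations(flows, source):
    """Return every system that data entering at `source` can reach."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


destinations = reachable_destinations(DATA_FLOWS, "user_input")
```

Each entry in `destinations` is a system that needs an owner, a retention period, and a deletion path.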

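Each processing activity needs a legal basis under Article 6: consent, contract, legal obligation, vital interests, public task, or legitimate interests.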

Document which basis applies to each data use.
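The documentation can live in code so that processing an undocumented purpose fails loudly. In this sketch the enum lists the six bases from Article 6(1); the purpose-to-basis mapping is a made-up example, since choosing the right basis for each purpose is a legal decision, not an engineering one.

```python
from enum import Enum


class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal_obligation"
    VITAL_INTERESTS = "vital_interests"
    PUBLIC_TASK = "public_task"
    LEGITIMATE_INTERESTS = "legitimate_interests"


# Hypothetical mapping for illustration; confirm each entry with legal counsel.
PURPOSE_BASIS = {
    "authentication": LegalBasis.CONTRACT,
    "marketing": LegalBasis.CONSENT,
    "billing": LegalBasis.LEGAL_OBLIGATION,
}


def basis_for(purpose):
    """Look up the documented basis, failing for undocumented purposes."""
    try:
        return PURPOSE_BASIS[purpose]
    except KeyError:
        raise ValueError(
            f"no documented legal basis for purpose {purpose!r}"
        ) from None


def requires_consent(purpose):
    return basis_for(purpose) is LegalBasis.CONSENT
```

Only purposes that rest on consent need a consent record; the others rely on the documented basis alone.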

Store consent with specificity:

CREATE TABLE consent_records (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL,
    purpose VARCHAR(100) NOT NULL,  -- e.g., 'marketing_email', 'analytics'
    consent_given BOOLEAN NOT NULL,
    consent_timestamp TIMESTAMP NOT NULL,
    consent_mechanism VARCHAR(100),  -- e.g., 'signup_form', 'preferences_page'
    consent_version VARCHAR(50),     -- Version of consent text
    ip_address VARCHAR(45),
    user_agent TEXT,
    withdrawn_at TIMESTAMP,
    UNIQUE(user_id, purpose)
);

from datetime import datetime

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/api/consent', methods=['POST'])
def update_consent():
    data = request.json
    user_id = get_authenticated_user()

    for purpose, granted in data['purposes'].items():
        if granted:
            consent_service.grant(
                user_id=user_id,
                purpose=purpose,
                timestamp=datetime.utcnow(),
                mechanism='preferences_api',
                consent_version=CURRENT_CONSENT_VERSION
            )
        else:
            consent_service.withdraw(
                user_id=user_id,
                purpose=purpose,
                timestamp=datetime.utcnow()
            )

    return jsonify({'status': 'updated'})
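The endpoint above relies on a `consent_service` that is not shown. A minimal in-memory sketch of what it might look like follows; the class name, the `has_consent` method, and the storage layout are assumptions, and a real service would persist to the `consent_records` table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class ConsentRecord:
    user_id: str
    purpose: str
    consent_given: bool
    consent_timestamp: datetime
    consent_mechanism: Optional[str] = None
    consent_version: Optional[str] = None
    withdrawn_at: Optional[datetime] = None


class InMemoryConsentService:
    """Keeps one record per (user_id, purpose), like the UNIQUE constraint."""

    def __init__(self):
        self._records: Dict[Tuple[str, str], ConsentRecord] = {}

    def grant(self, user_id, purpose, timestamp, mechanism, consent_version):
        self._records[(user_id, purpose)] = ConsentRecord(
            user_id=user_id,
            purpose=purpose,
            consent_given=True,
            consent_timestamp=timestamp,
            consent_mechanism=mechanism,
            consent_version=consent_version,
        )

    def withdraw(self, user_id, purpose, timestamp):
        record = self._records.get((user_id, purpose))
        if record is None:
            # Never granted: store the refusal so the decision is on record
            self._records[(user_id, purpose)] = ConsentRecord(
                user_id, purpose, False, timestamp
            )
            return
        record.consent_given = False
        record.withdrawn_at = timestamp

    def has_consent(self, user_id, purpose):
        record = self._records.get((user_id, purpose))
        return record is not None and record.consent_given


service = InMemoryConsentService()
granted_at = datetime(2018, 5, 14, tzinfo=timezone.utc)
service.grant("u1", "marketing_email", granted_at, "signup_form", "v1")
```

Code that processes data for a consent-based purpose would call `has_consent` first and skip the user when it returns `False`.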

GDPR requires an affirmative action from the user, so pre-checked boxes do not count as consent:

<!-- Wrong -->
<input type="checkbox" checked> Send me marketing emails

<!-- Right -->
<input type="checkbox"> Send me marketing emails

Withdrawal Must Be Easy

Withdrawing consent must be as easy as giving it. If consent took one click, withdrawal should take no more than one:

@app.route('/api/consent/<purpose>', methods=['DELETE'])
def withdraw_consent(purpose):
    user_id = get_authenticated_user()
    consent_service.withdraw(user_id, purpose, datetime.utcnow())
    # Stop processing immediately
    processing_service.stop_processing(user_id, purpose)
    return jsonify({'status': 'withdrawn'})

Right to Erasure (Data Deletion)

Deletion Strategy

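Not all data can be deleted immediately:

Immediate deletion: profile, preferences, and active sessions.

Anonymization (alternative): analytics data, where the records stay useful once identifiers are removed.

Legal retention: billing records, which are deleted once the legally required retention period ends.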

def handle_erasure_request(user_id):
    # Immediate deletion
    user_service.delete_profile(user_id)
    preferences_service.delete(user_id)
    session_service.invalidate_all(user_id)

    # Anonymization
    analytics_service.anonymize(user_id)

    # Scheduled deletion (after legal retention)
    schedule_deletion(
        service='billing_service',
        user_id=user_id,
        after=years(7)
    )

    # Third-party deletion requests
    for processor in third_party_processors:
        processor.request_deletion(user_id)

    # Backup handling
    add_to_backup_exclusion_list(user_id)

Backup Complications

Backups contain personal data that’s hard to delete:

Options:

  1. Exclusion lists: Track deleted users; exclude from restores
  2. Encrypted user keys: Encrypt user data with per-user keys; delete keys
  3. Backup rotation: Accept that data exists until backup expires
  4. Selective backup: Separate personal data from business data

Document your approach and retention periods.
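Option 1 can be sketched as a filter applied during restore. The row format and the `restore_users` function are hypothetical; the point is that the exclusion list must be stored outside the backups it protects against, and should hold only opaque user IDs.

```python
def restore_users(backup_rows, erased_user_ids):
    """Split backup rows into rows to restore and IDs that were skipped.

    `backup_rows` is an iterable of dicts with a "user_id" key.
    `erased_user_ids` is the exclusion list built from erasure requests.
    """
    erased = set(erased_user_ids)
    restored = []
    skipped_ids = []
    for row in backup_rows:
        if row["user_id"] in erased:
            # Erased after the backup was taken: do not bring the data back
            skipped_ids.append(row["user_id"])
        else:
            restored.append(row)
    return restored, skipped_ids


backup = [
    {"user_id": "u1", "email": "a@example.com"},
    {"user_id": "u2", "email": "b@example.com"},
]
restored, skipped = restore_users(backup, erased_user_ids=["u2"])
```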

Cascading Deletion

Personal data exists across systems:

from functools import partial

def cascade_delete(user_id):
    # Tasks are deferred with partial() so that execute_all runs them;
    # calling them here would bypass its error handling.
    deletion_tasks = []

    # Primary database
    deletion_tasks.append(
        partial(delete_from_db, 'users', user_id)
    )

    # Related services
    for service in ['orders', 'reviews', 'messages', 'notifications']:
        deletion_tasks.append(
            partial(service_client[service].delete_user_data, user_id)
        )

    # Search indexes
    deletion_tasks.append(
        partial(search_service.remove_user, user_id)
    )

    # Caches
    deletion_tasks.append(
        partial(cache.invalidate_user, user_id)
    )

    # Execute with error handling
    results = execute_all(deletion_tasks)
    if any_failed(results):
        alert_and_retry(user_id, failed_tasks(results))
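The helpers `execute_all`, `any_failed`, and `failed_tasks` are not defined in this guide. One possible shape for them, assuming each task is a zero-argument callable that raises on failure, is sketched below; the `TaskResult` class is an invented name.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TaskResult:
    task: Callable[[], object]
    error: Optional[Exception] = None

    @property
    def ok(self):
        return self.error is None


def execute_all(tasks: List[Callable[[], object]]) -> List[TaskResult]:
    """Run every task, recording failures instead of stopping at the first."""
    results = []
    for task in tasks:
        try:
            task()
            results.append(TaskResult(task))
        except Exception as exc:
            # Keep going so one failing system does not block the others
            results.append(TaskResult(task, exc))
    return results


def any_failed(results):
    return any(not r.ok for r in results)


def failed_tasks(results):
    """Return the callables that failed, ready to be retried."""
    return [r.task for r in results if not r.ok]
```

Because deletion across systems is not atomic, the retry path matters as much as the first attempt: a request is only complete once every task has succeeded.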

Right of Access (Data Export)

Data Export API

Provide a machine-readable export:

@app.route('/api/data-export', methods=['POST'])
def request_data_export():
    user_id = get_authenticated_user()

    # Queue export job (can be slow)
    job_id = export_service.queue_export(user_id)

    return jsonify({
        'job_id': job_id,
        'status': 'processing',
        'estimated_completion': '24 hours'
    })

@app.route('/api/data-export/<job_id>', methods=['GET'])
def get_export_status(job_id):
    user_id = get_authenticated_user()
    job = export_service.get_job(job_id)

    # Assumes get_job returns None for unknown job IDs
    if job is None:
        abort(404)

    if job.user_id != user_id:
        abort(403)

    if job.status == 'complete':
        return jsonify({
            'status': 'complete',
            'download_url': job.download_url,
            'expires_at': job.expires_at
        })
    else:
        return jsonify({'status': job.status})

Export Format

JSON is a structured, machine-readable format and is typically acceptable:

{
  "export_date": "2018-05-14T10:30:00Z",
  "user": {
    "email": "user@example.com",
    "name": "Jane Doe",
    "created_at": "2017-03-15T08:00:00Z"
  },
  "orders": [
    {
      "id": "order_123",
      "date": "2018-04-01",
      "total": 99.99,
      "items": [...]
    }
  ],
  "activity": [
    {"timestamp": "2018-05-01T10:00:00Z", "action": "login"},
    {"timestamp": "2018-05-01T10:05:00Z", "action": "view_product", "product_id": "prod_456"}
  ]
}

Include all personal data across all systems.
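To avoid silently shipping a partial export, each system can register a collector and the export job can fail if any of them fails. This is a sketch under assumptions: `build_export` and the collector functions are hypothetical names, and real collectors would call the owning services.

```python
import json
from datetime import datetime, timezone


def build_export(user_id, collectors, now=None):
    """Assemble one JSON document from per-system collector functions.

    `collectors` maps a section name to a function taking `user_id`.
    Raises RuntimeError if any section could not be collected.
    """
    now = now or datetime.now(timezone.utc)
    export = {"export_date": now.strftime("%Y-%m-%dT%H:%M:%SZ")}
    failed = []
    for section, collect in collectors.items():
        try:
            export[section] = collect(user_id)
        except Exception:
            failed.append(section)
    if failed:
        # An incomplete export should be retried, not delivered
        raise RuntimeError(f"export incomplete, failed sections: {failed}")
    return json.dumps(export, indent=2, sort_keys=True)


# Stand-in collectors for illustration
collectors = {
    "user": lambda uid: {"email": "user@example.com", "name": "Jane Doe"},
    "orders": lambda uid: [{"id": "order_123", "total": 99.99}],
}
document = build_export(
    "u1", collectors, now=datetime(2018, 5, 14, 10, 30, tzinfo=timezone.utc)
)
```

Keeping the collector list next to the data inventory makes it easier to notice when a new system stores personal data but has no collector.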

Privacy by Design

Data Minimization

Collect only what’s necessary:

# Before: Collect everything
user_profile = {
    'email': email,
    'name': name,
    'phone': phone,  # Not needed
    'birthday': birthday,  # Not needed
    'address': address,  # Not needed for this service
}

# After: Collect only what's needed
user_profile = {
    'email': email,
    'name': name,
}

Pseudonymization

Separate identifying data from processing data:

Users Table (restricted access):
  user_id, email, name

Activity Table (analytics):
  pseudonymous_id, page_view, timestamp

Mapping Table (very restricted):
  user_id → pseudonymous_id
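An alternative to storing the mapping table is to derive the pseudonymous ID with a keyed hash, so the secret key takes over the role of the very restricted table. The sketch below uses HMAC-SHA256 from the standard library; the function name and the sample keys are assumptions, and the key would need the same access restrictions as the mapping table. Note that this is still pseudonymization, not anonymization: anyone holding the key can recompute the link.

```python
import hashlib
import hmac


def pseudonymous_id(user_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonymous ID from a user ID and a secret key."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Example keys for illustration only; load real keys from a secrets manager
key = b"example-key-do-not-use-in-production"
pid = pseudonymous_id("user-42", key)
```

The same user always maps to the same ID under one key, which keeps analytics joins working, while rotating or destroying the key breaks the link to the user.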

Retention Policies

Implement automatic data expiration:

-- Soft delete with retention
UPDATE users SET
  deleted_at = NOW(),
  email = 'deleted_' || id || '@deleted.invalid',
  name = '[DELETED]'
WHERE id = ?;

-- Hard delete after retention period
DELETE FROM users
WHERE deleted_at < NOW() - INTERVAL '30 days';
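The same rule can be expressed once in application code and applied to every dataset, with the retention periods kept in one place. In this sketch the dataset names and the `expired` function are illustrative; the 90-day and 30-day periods mirror the inventory and SQL examples above.

```python
from datetime import datetime, timedelta, timezone

# Retention periods per dataset; names are illustrative
RETENTION = {
    "analytics_events": timedelta(days=90),
    "soft_deleted_users": timedelta(days=30),
}


def expired(records, dataset, now):
    """Return the records in `dataset` that are past their retention period.

    Each record is a dict with a timezone-aware "timestamp" value.
    """
    cutoff = now - RETENTION[dataset]
    return [r for r in records if r["timestamp"] < cutoff]


now = datetime(2018, 5, 14, tzinfo=timezone.utc)
events = [
    {"id": 1, "timestamp": now - timedelta(days=100)},
    {"id": 2, "timestamp": now - timedelta(days=10)},
]
to_delete = expired(events, "analytics_events", now)
```

A scheduled job would run this per dataset and hard-delete whatever it returns.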

Encryption

Encrypt personal data at rest:

from cryptography.fernet import Fernet

class EncryptedField:
    def __init__(self, key):
        self.cipher = Fernet(key)

    def encrypt(self, value):
        return self.cipher.encrypt(value.encode())

    def decrypt(self, encrypted):
        return self.cipher.decrypt(encrypted).decode()

# Per-user encryption enables crypto-shredding
user_key = get_user_encryption_key(user_id)
encrypted_data = EncryptedField(user_key).encrypt(personal_data)

# Deletion: just delete the key
delete_user_encryption_key(user_id)  # Data now unreadable

Logging and Audit

Access Logging

Log access to personal data:

def get_user_profile(user_id, requestor_id):
    log_access(
        resource_type='user_profile',
        resource_id=user_id,
        accessor_id=requestor_id,
        action='read',
        timestamp=datetime.utcnow()
    )
    return user_service.get_profile(user_id)
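Calling `log_access` by hand in every function is easy to forget. A decorator can apply it consistently; in this sketch the `audited` decorator is a hypothetical name, and the in-memory `ACCESS_LOG` list stands in for a real append-only audit store.

```python
import functools
from datetime import datetime, timezone

ACCESS_LOG = []  # Stand-in for an append-only audit store


def log_access(**entry):
    ACCESS_LOG.append(entry)


def audited(resource_type, action="read"):
    """Log each call to a function taking (resource_id, requestor_id, ...)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(resource_id, requestor_id, *args, **kwargs):
            # Log before the call so attempted access is recorded too
            log_access(
                resource_type=resource_type,
                resource_id=resource_id,
                accessor_id=requestor_id,
                action=action,
                timestamp=datetime.now(timezone.utc),
            )
            return func(resource_id, requestor_id, *args, **kwargs)
        return wrapper
    return decorator


@audited("user_profile")
def get_user_profile(user_id, requestor_id):
    # Placeholder for the real profile lookup
    return {"user_id": user_id}


profile = get_user_profile("u1", "support-agent-7")
```

Keep the log entries themselves free of the personal data being accessed; IDs are enough to answer who read what and when.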

Processing Records

Maintain records of processing activities:

CREATE TABLE processing_records (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMP NOT NULL,
    data_subject_id UUID,
    processing_activity VARCHAR(100),
    legal_basis VARCHAR(50),
    purpose VARCHAR(200),
    data_categories TEXT[],
    recipients TEXT[],
    retention_period VARCHAR(50)
);

Key Takeaways

GDPR compliance is an engineering challenge as much as a legal one. Build the systems now.