Security incidents are inevitable. The organizations that handle them well have prepared: roles are defined, runbooks exist, and teams have practiced. Here’s a practical playbook for security incident response.
Incident Classification
Severity Levels
SEV-1 (Critical):
- Active data breach
- Ransomware or destructive attack
- Compromise of authentication systems
- Regulatory notification required
SEV-2 (High):
- Suspected breach, scope unknown
- Vulnerability under active exploitation
- Credential compromise
- Significant system compromise
SEV-3 (Medium):
- Vulnerability discovered, not exploited
- Phishing attempt affecting multiple users
- Suspicious activity requiring investigation
SEV-4 (Low):
- Minor policy violations
- Single-user phishing attempt (unsuccessful)
- Routine security events
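The severity table above can be encoded so that tooling and humans agree on the level. This is a minimal sketch: the indicator tags (`ransomware`, `suspected-breach`, and so on) are hypothetical names invented for illustration, not a standard taxonomy.

```shell
# Map an indicator tag to a severity level.
# Tag names are illustrative assumptions; adapt them to your own alert sources.
classify_severity() {
  case "$1" in
    active-breach|ransomware|auth-system-compromise|regulatory-notification)
      echo "SEV-1" ;;
    suspected-breach|active-exploitation|credential-compromise|system-compromise)
      echo "SEV-2" ;;
    unexploited-vulnerability|multi-user-phishing|suspicious-activity)
      echo "SEV-3" ;;
    policy-violation|single-user-phishing|routine-event)
      echo "SEV-4" ;;
    *)
      # Unknown indicators need a human decision; do not guess a level.
      echo "UNCLASSIFIED"
      return 1 ;;
  esac
}
```

Returning a non-zero status for unknown tags lets a calling script escalate to a person instead of silently picking a default.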
Trigger Criteria
Define what triggers incident response:
- Alert from security tooling
- Report from employee
- External notification (customer, researcher)
- Law enforcement contact
- Media coverage of vulnerability
Response Framework
Phase 1: Detection and Triage (First 15 Minutes)
1. Acknowledge alert/report
2. Initial assessment:
- What happened?
- What systems are affected?
- Is attack ongoing?
3. Classify severity
4. Page appropriate responders
5. Create incident channel
Triage questions:
- Is this a real incident or false positive?
- What’s the potential impact?
- Is immediate action required?
Phase 2: Containment (First Hour)
Priority: Stop the bleeding
Actions:
- Isolate affected systems
- Revoke compromised credentials
- Block malicious IPs/domains
- Disable compromised accounts
- Preserve evidence before making any of these changes (see Forensic Preservation)
Containment strategies:
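# Isolate compromised host: insert at the top so existing ACCEPT rules cannot override
# (this also drops your own SSH session; use console or out-of-band access)
iptables -I INPUT 1 -j DROP
iptables -I OUTPUT 1 -j DROP
# Or with security groups (assumes sg-isolated has no inbound or outbound rules)
aws ec2 modify-instance-attribute --instance-id i-xxx \
--groups sg-isolated
# Disable user account: remove the console password, then deactivate the access key
aws iam delete-login-profile --user-name compromised-user
aws iam update-access-key --user-name compromised-user --access-key-id AKIA... --status Inactive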
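A blanket DROP on a remote host also cuts off the responder. As a hedged sketch, the function below prints (it does not execute) iptables commands that keep SSH open to a single responder address and drop everything else. `print_isolation_rules` and its `responder_ip` parameter are hypothetical names, and the sketch covers IPv4 only.

```shell
# Print iptables commands that isolate a host but keep SSH reachable
# from one responder IPv4 address. Nothing is executed here.
print_isolation_rules() {
  local responder_ip="$1"
  # Loose IPv4 shape check; it does not validate octet ranges.
  printf '%s\n' "$responder_ip" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || {
    echo "usage: print_isolation_rules <responder IPv4>" >&2
    return 1
  }
  cat <<EOF
iptables -I INPUT 1 -s ${responder_ip} -p tcp --dport 22 -j ACCEPT
iptables -I OUTPUT 1 -d ${responder_ip} -p tcp --sport 22 -j ACCEPT
iptables -I INPUT 2 -j DROP
iptables -I OUTPUT 2 -j DROP
EOF
}
```

Printing the rules first gives the Incident Commander something to review before anyone changes the host. The ACCEPT rules go in at position 1 and the DROP rules at position 2, so both sit ahead of whatever rules were already loaded.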
Phase 3: Investigation
Questions to answer:
- How did attackers get in?
- What did they access?
- How long were they in?
- Did they move laterally?
- What data was affected?
Evidence sources:
- Application logs
- Cloud audit logs (CloudTrail, GCP Audit Logs)
- Network logs
- Authentication logs
- Database access logs
- Endpoint telemetry
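Authentication logs are often the quickest way to answer "how did attackers get in?". Here is a small sketch that counts failed SSH password attempts per source IP. It assumes OpenSSH's usual "Failed password for ... from <ip> port <n>" message format, and `failed_logins_by_ip` is a hypothetical helper name.

```shell
# Count failed SSH password attempts per source IP in an sshd-style log.
# Output: "<count> <ip>" per line, highest count first.
failed_logins_by_ip() {
  local logfile="$1"
  awk '/Failed password/ {
         # Scan from the right so a user name of "from" cannot confuse the match.
         for (i = NF - 1; i >= 1; i--)
           if ($i == "from") { count[$(i + 1)]++; break }
       }
       END { for (ip in count) print count[ip], ip }' "$logfile" |
    sort -k1,1nr -k2,2
}
```

Run it against a preserved copy of the log rather than the live file on the affected host, so the investigation does not alter evidence.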
Phase 4: Eradication
Remove attacker presence:
- Clean or rebuild affected systems
- Remove malware and backdoors
- Close vulnerability used for entry
- Rotate all potentially compromised credentials
- Review and update access controls
Don’t just patch and move on—ensure complete removal.
Phase 5: Recovery
Restore normal operations:
- Bring cleaned systems online
- Validate system integrity
- Enable monitoring
- Gradual return to service
- Monitor for reinfection
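One way to approach "validate system integrity" is to compare file hashes against a manifest taken from a known-good build. The sketch below assumes GNU coreutils and findutils (`sha256sum --check --quiet`, `sort -z`, `xargs -0 -r`); the function names and paths are hypothetical.

```shell
# Record SHA-256 hashes for every file under a directory.
# Keep the manifest outside the directory being hashed.
create_manifest() {
  local dir="$1" manifest="$2"
  # Sorted for a stable manifest; NUL separators handle unusual file names.
  (cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum) > "$manifest"
}

# Verify the directory against a manifest; non-zero exit on any mismatch.
verify_manifest() {
  local dir="$1" manifest="$2"
  # --quiet prints only the files that fail verification.
  (cd "$dir" && sha256sum --check --quiet) < "$manifest"
}
```

This catches modified or missing files only. Files an attacker added are not in the manifest, so they need a separate check.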
Phase 6: Post-Incident
Within 24-48 hours:
- Conduct blameless post-mortem
- Document timeline and actions
- Identify root cause
- Create action items
- Update playbooks
- Brief stakeholders
Roles and Responsibilities
Incident Commander
Owns the incident:
- Makes decisions
- Coordinates responders
- Manages communication
- Escalates when needed
Not necessarily the most senior—needs to be available and focused.
Security Lead
Drives technical investigation:
- Analyzes evidence
- Identifies attack vector
- Recommends containment
- Validates eradication
Communications Lead
Manages information flow:
- Internal updates
- External communication (if needed)
- Customer notification
- Regulatory notification
Scribe
Documents everything:
- Timeline of events
- Actions taken
- Decisions made
- Evidence locations
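The scribe's timeline is easier to keep when every entry gets a timestamp automatically. A minimal sketch follows; the `TIMELINE_FILE` variable, the tab-separated layout, and the `log_event` name are assumptions for illustration.

```shell
# Append a UTC-timestamped entry to an incident timeline file.
# Columns: timestamp, actor, kind (event/action/decision/evidence), detail.
TIMELINE_FILE="${TIMELINE_FILE:-./incident-timeline.tsv}"

log_event() {
  local actor="$1" kind="$2" detail="$3"
  printf '%s\t%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$actor" "$kind" "$detail" >> "$TIMELINE_FILE"
}
```

For example, `log_event @alice action "Revoked access key for compromised-user"` appends one line. Using UTC throughout avoids timezone confusion when the timeline is later merged with log evidence.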
Communication
Internal Communication
Incident channel:
#incident-2019-07-15-credential-leak
Topic: Credential leak incident - SEV-1
IC: @alice
Security Lead: @bob
Status: Containment
Last update: 10:30 AM
Regular updates:
Every 30 minutes for SEV-1
Every hour for SEV-2
As needed for SEV-3/4
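The channel naming and update cadence above are simple enough to script, for instance inside a chat-bot hook. This is a sketch under assumptions: the function names and slug rules are invented here, and a return value of 0 stands for "as needed".

```shell
# Build a channel name such as "#incident-2019-07-15-credential-leak".
incident_channel_name() {
  local day="$1" summary="$2" slug
  # Lowercase, collapse non-alphanumerics into single hyphens, trim the ends.
  slug=$(printf '%s' "$summary" | tr '[:upper:]' '[:lower:]' |
         sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf '#incident-%s-%s\n' "$day" "$slug"
}

# Minutes between status updates for a severity level (0 = as needed).
update_interval_minutes() {
  case "$1" in
    SEV-1) echo 30 ;;
    SEV-2) echo 60 ;;
    SEV-3|SEV-4) echo 0 ;;
    *) echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
```

Calling `incident_channel_name 2019-07-15 "Credential Leak"` reproduces the channel name shown in the example above.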
External Communication
When to disclose:
- Regulatory requirements (GDPR: notify the supervisory authority within 72 hours of becoming aware of the breach)
- Customer data affected
- Industry notification requirements
Who writes it:
- Security team drafts
- Legal reviews
- Communications polishes
- Executive approves
Sample template:
What happened: Brief factual description
When: Timeline
What we're doing: Response actions
What you should do: User actions (password reset, etc.)
How to contact: Support channels
Forensic Preservation
Before Making Changes
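# Capture memory (LiME is a kernel module built for the running kernel and loaded
# with insmod; ./lime.ko is a placeholder, the file name depends on your build)
sudo insmod ./lime.ko "path=/evidence/memory.raw format=raw"
# Disk image (assumes /evidence is on separate storage, not the disk being imaged)
sudo dd if=/dev/sda of=/evidence/disk.img bs=4M status=progress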
# Cloud: Snapshot before modifying
aws ec2 create-snapshot --volume-id vol-xxx --description "Incident evidence"
Log Preservation
# Export logs before they rotate
aws logs filter-log-events --log-group-name /app/api \
--start-time 1563148800000 \
--end-time 1563235200000 \
> /evidence/api-logs.json
# Copy relevant files
tar -czf /evidence/config-backup.tar.gz /etc /var/log
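The `--start-time` and `--end-time` values for `aws logs filter-log-events` are epoch milliseconds, which are easy to get wrong by hand. A small sketch for the conversion, assuming GNU `date` (the `-d` option; BSD/macOS `date` uses different flags). `utc_to_epoch_ms` is a hypothetical helper name.

```shell
# Convert a UTC timestamp such as "2019-07-15 00:00:00" to epoch milliseconds.
utc_to_epoch_ms() {
  local seconds
  # -u makes date interpret the input as UTC instead of local time.
  seconds=$(date -u -d "$1" +%s) || return 1
  echo "$((seconds * 1000))"
}
```

`utc_to_epoch_ms "2019-07-15 00:00:00"` yields 1563148800000, the start time used in the export above.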
Chain of Custody
Evidence: api-server-memory.raw
Collected by: @alice
Collected at: 2019-07-15 10:45 UTC
Hash: sha256:abc123...
Storage: s3://evidence-bucket/incident-2019-07-15/
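A record like the one above is more trustworthy when the hash is computed at collection time instead of being typed in later. The sketch below prints a record in the same layout. It assumes `sha256sum` is available; the `custody_record` name and its arguments are hypothetical.

```shell
# Print a chain-of-custody record for one evidence file.
# Arguments: file path, collector handle, storage location.
custody_record() {
  local file="$1" collector="$2" storage="$3" digest
  [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
  digest=$(sha256sum "$file" | awk '{print $1}')
  printf 'Evidence: %s\n' "$(basename "$file")"
  printf 'Collected by: %s\n' "$collector"
  printf 'Collected at: %s\n' "$(date -u '+%Y-%m-%d %H:%M UTC')"
  printf 'Hash: sha256:%s\n' "$digest"
  printf 'Storage: %s\n' "$storage"
}
```

Re-running `sha256sum` on the stored copy and comparing it with the recorded hash shows whether the evidence changed after collection.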
Playbook Examples
Compromised Credentials
1. Identify affected accounts
2. Revoke all sessions
3. Reset passwords/rotate keys
4. Review account activity for unauthorized actions
5. Notify affected users
6. Investigate how credentials were obtained
7. Remediate root cause
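For an AWS IAM user, part of this playbook can be drafted as a reviewable command list. The sketch below only prints commands for disabling the console password, deactivating access keys, and pulling CloudTrail activity per key. It does not call AWS and does not cover session revocation for roles. The function name and the key IDs passed to it are hypothetical.

```shell
# Print AWS CLI commands for one IAM user and a list of access key IDs.
# Nothing is executed; review the output before running any of it.
print_credential_response_plan() {
  local user="$1"
  shift
  # Removing the login profile blocks new console sign-ins with the old password.
  echo "aws iam delete-login-profile --user-name ${user}"
  local key_id
  for key_id in "$@"; do
    # Deactivate the key so it stops working but can still be inspected.
    echo "aws iam update-access-key --user-name ${user} --access-key-id ${key_id} --status Inactive"
    # Look up recent API activity recorded for this key.
    echo "aws cloudtrail lookup-events --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=${key_id}"
  done
}
```

The key IDs would come from `aws iam list-access-keys --user-name <user>`. Keeping the plan as text makes it easy to paste into the incident channel for a second pair of eyes.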
Ransomware
1. Isolate affected systems immediately
2. Identify ransomware variant
3. Determine encryption scope
4. Evaluate backup availability and integrity
5. Do not pay the ransom in most cases; payment does not guarantee recovery
6. Restore from known-good backups
7. Investigate entry vector
8. Report to law enforcement
Data Breach
1. Identify what data was accessed
2. Determine affected individuals
3. Assess regulatory notification requirements
4. Prepare customer notification
5. Engage legal counsel
6. Notify regulators within required timeframe
7. Notify affected individuals
8. Offer remediation (credit monitoring, etc.)
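Regulatory deadlines are easier to track when the cutoff is written down as soon as the clock starts. A sketch that adds a notification window to a UTC timestamp, assuming GNU `date`. Which moment starts the clock is a question for legal counsel; the timestamp passed in and the 72-hour default are assumptions for illustration, and `notification_deadline` is a hypothetical name.

```shell
# Print the deadline N hours (default 72) after a UTC timestamp.
notification_deadline() {
  local aware_at="$1" hours="${2:-72}" start
  start=$(date -u -d "$aware_at" +%s) || return 1
  # Work in epoch seconds to avoid ambiguous relative-date parsing.
  date -u -d "@$((start + hours * 3600))" '+%Y-%m-%d %H:%M UTC'
}
```

For a breach confirmed at 2019-07-15 10:45 UTC, `notification_deadline "2019-07-15 10:45"` gives 2019-07-18 10:45 UTC.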
Preparation
Before Incidents Happen
- Incident response plan documented
- Roles and escalation paths defined
- Communication templates prepared
- Legal counsel identified
- Forensic capabilities ready
- Backup and recovery tested
- Insurance reviewed (cyber liability)
Regular Practice
- Tabletop exercises quarterly
- Full simulation annually
- Playbook review and update
- New team member training
Key Takeaways
- Classify incidents by severity to drive appropriate response
- Contain first, investigate second—stop the bleeding
- Preserve evidence before making changes
- Define clear roles: IC, Security Lead, Comms Lead, Scribe
- Communicate regularly internally; carefully externally
- Document everything during and after
- Conduct blameless post-mortems within 48 hours
- Prepare before incidents: plans, playbooks, practice
When an incident happens, you don’t rise to the occasion—you fall to your level of preparation.