Load testing validates system behavior under stress before users experience it. But many load tests are unrealistic, poorly designed, or produce misleading results. Here’s how to do load testing effectively.
Types of Load Tests
Baseline Testing
Establish normal performance:
- Goal: Understand system behavior at expected load
- Load: Current production traffic patterns
- Duration: 1-2 hours
- Metrics: Latency, throughput, error rate, resource utilization
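Expressed as k6 stages, a baseline run is just a ramp to the expected load followed by a long hold; a minimal sketch, assuming 100 VUs approximates current production traffic:
// Baseline: ramp to expected load, then hold and record the numbers
export const options = {
  stages: [
    { duration: '10m', target: 100 },  // ramp to expected production load
    { duration: '2h', target: 100 },   // hold; this is the baseline window
  ],
};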
Stress Testing
Find breaking points:
- Goal: Discover where the system fails
- Load: Increase until failure
- Pattern: Ramp up gradually
- Metrics: When does latency spike? When do errors occur?
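The same stage mechanism expresses the ramp-to-failure; the step targets here are illustrative, and you keep extending the ramp until the system breaks:
// Stress: step load upward until latency spikes or errors appear
export const options = {
  stages: [
    { duration: '5m', target: 200 },
    { duration: '5m', target: 500 },
    { duration: '5m', target: 1000 },
    { duration: '5m', target: 2000 },  // extend until failure is observed
  ],
};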
Spike Testing
Handle sudden load increases:
- Goal: Validate autoscaling and sudden load handling
- Load: Normal → 10x → Normal
- Duration: Spike for 5-10 minutes
- Metrics: Recovery time, errors during spike
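A spike profile in the same style, assuming 100 VUs is normal load:
// Spike: jump to 10x, hold briefly, then watch recovery
export const options = {
  stages: [
    { duration: '2m', target: 100 },    // normal load
    { duration: '30s', target: 1000 },  // sudden 10x spike
    { duration: '5m', target: 1000 },   // hold the spike
    { duration: '30s', target: 100 },   // drop back
    { duration: '5m', target: 100 },    // measure recovery
  ],
};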
Soak Testing
Find issues that emerge over time:
- Goal: Identify memory leaks, connection issues
- Load: Sustained moderate load
- Duration: 12-24 hours
- Metrics: Resource trends, error rate over time
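And a soak profile; the 300 VUs and the 12-hour window are assumptions to adapt:
// Soak: moderate load held long enough for slow leaks to surface
export const options = {
  stages: [
    { duration: '15m', target: 300 },  // ramp to moderate load
    { duration: '12h', target: 300 },  // hold for the soak window
    { duration: '5m', target: 0 },
  ],
};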
Chaos + Load Testing
Combine with failure injection:
- Goal: Validate graceful degradation under load
- Load: Normal production load
- Failures: Kill pods, inject latency, network partition
- Metrics: Service behavior during failures
Realistic Load Patterns
Model Real Traffic
// Bad: Constant load
for (let i = 0; i < 1000; i++) {
  makeRequest();
}

// Good: Realistic distribution
const loadProfile = [
  { hour: 0, rps: 100 },
  { hour: 8, rps: 500 },   // Morning peak
  { hour: 12, rps: 800 },  // Lunch peak
  { hour: 18, rps: 600 },  // Evening
  { hour: 23, rps: 150 },  // Night
];
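An hourly profile like this can drive an open-model test directly. A minimal k6 sketch using the ramping-arrival-rate executor, where the one-test-minute-per-modeled-hour compression and the preallocated VU count are assumptions:
export const options = {
  scenarios: {
    daily_profile: {
      executor: 'ramping-arrival-rate',
      startRate: loadProfile[0].rps,  // begin at the hour-0 rate
      timeUnit: '1s',                 // targets are requests per second
      preAllocatedVUs: 200,           // assumed headroom for peak RPS
      stages: loadProfile.map((p) => ({ duration: '1m', target: p.rps })),
    },
  },
};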
Vary Request Types
// Production distribution
const requestMix = {
  'GET /products': 60,      // Most common
  'GET /products/:id': 25,  // Product details
  'POST /cart': 10,         // Add to cart
  'POST /checkout': 5,      // Checkout
};
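To honor those weights during a run, one simple approach is a weighted random pick per iteration (pickRequest is a hypothetical helper, not part of any tool):
// Weighted random choice over the mix above
function pickRequest(mix) {
  const total = Object.values(mix).reduce((a, b) => a + b, 0);
  let roll = Math.random() * total;
  for (const [route, weight] of Object.entries(mix)) {
    roll -= weight;
    if (roll <= 0) return route;  // e.g. 'GET /products' ~60% of the time
  }
}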
Realistic Data
// Use production-like data volumes
const testUser = getTestUser(); // Has real order history
const products = getPopularProducts(); // Actual product IDs
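In k6, production-like fixtures are best parsed once and shared read-only across VUs with SharedArray; a sketch, assuming a data/products.json file exported from production:
import { SharedArray } from 'k6/data';

// Parsed once in the init context, shared read-only across all VUs
const productCatalog = new SharedArray('products', () =>
  JSON.parse(open('./data/products.json'))
);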
Think Time
Real users pause between actions:
scenario('browse_and_buy', {
  exec: async () => {
    await viewHomepage();
    await sleep(2);  // User browses
    await viewProduct(randomProduct());
    await sleep(5);  // User reads
    await addToCart();
    await sleep(3);  // User decides
    await checkout();
  }
});
Tool Selection
k6
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up
    { duration: '5m', target: 100 },  // Sustain
    { duration: '2m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
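One detail worth knowing before wiring this into automation: k6 enforces the thresholds block itself and exits non-zero when a threshold fails, which is what makes the CI gating shown later work.
k6 run script.js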
Locust
import random

from locust import HttpUser, task, between

# Assumed fixtures for illustration
PRODUCT_IDS = [101, 102, 103]
ORDER_DATA = {"items": [{"product_id": 101, "qty": 1}]}

class WebsiteUser(HttpUser):
    wait_time = between(1, 5)

    @task(10)
    def view_products(self):
        self.client.get("/products")

    @task(5)
    def view_product(self):
        product_id = random.choice(PRODUCT_IDS)
        self.client.get(f"/products/{product_id}")

    @task(1)
    def checkout(self):
        self.client.post("/checkout", json=ORDER_DATA)
Gatling
import scala.concurrent.duration._

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class BasicSimulation extends Simulation {
  val httpProtocol = http.baseUrl("https://api.example.com")

  val scn = scenario("BasicLoad")
    .exec(http("Get Products").get("/products"))
    .pause(2)
    .exec(http("Get Product").get("/products/123"))

  setUp(
    scn.inject(
      rampUsers(100).during(2.minutes),
      constantUsersPerSec(50).during(5.minutes)
    )
  ).protocols(httpProtocol)
}
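Launching depends on your setup; with the Maven plugin, for example:
mvn gatling:test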
Infrastructure Considerations
Test Environment
Options:
- Production: Most realistic, highest risk
- Production clone: Expensive but accurate
- Scaled staging: Cost-effective, less accurate
Distributed Load Generation
Single machine has limits:
# k6 Cloud or distributed setup
k6 cloud run script.js --vus 10000
# Or self-hosted with multiple generators
k6 run --execution-segment "0:1/4" script.js # Machine 1
k6 run --execution-segment "1/4:2/4" script.js # Machine 2
k6 run --execution-segment "2/4:3/4" script.js # Machine 3
k6 run --execution-segment "3/4:1" script.js # Machine 4
Baseline Measurements
Always know what you’re comparing against:
Before test:
- Current CPU utilization
- Current memory usage
- Baseline latency
- Current connection count
Analyzing Results
Key Metrics
Response time:
- Average (less useful)
- Percentiles: p50, p90, p95, p99 (more useful)
- Max (outliers)
Throughput:
- Requests per second
- Successful requests per second
Errors:
- Error rate
- Error types
- When errors started
Resources:
- CPU utilization
- Memory usage
- Database connections
- Network I/O
Finding Bottlenecks
- High CPU → Application or algorithm issue
- High memory → Memory leak or insufficient allocation
- High DB connections → Connection pool exhaustion
- High latency with low CPU → External dependency or I/O wait
Results Documentation
## Load Test Results: 2019-08-26
### Configuration
- Target: api.example.com
- Duration: 30 minutes
- Peak load: 1000 RPS
### Results
| Metric | Target | Actual |
|--------|--------|--------|
| P95 Latency | < 500ms | 420ms |
| Error Rate | < 1% | 0.3% |
| Max Throughput | 1000 RPS | 1200 RPS |
### Observations
- Database CPU reached 80% at 800 RPS
- Connection pool exhaustion at 1100 RPS
### Recommendations
- Increase connection pool from 50 to 100
- Add read replica for query-heavy endpoints
CI/CD Integration
Automated Load Tests
# GitHub Actions
load-test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Run load test
      # k6 exits non-zero when any scripted threshold fails,
      # so a regression fails this step without extra parsing
      run: |
        k6 run --out json=results.json load-test.js || {
          echo "Performance regression detected"
          exit 1
        }
Pre-Production Gates
# Don't deploy if load test fails
stages:
  - build
  - test
  - load_test  # Must pass
  - deploy
Key Takeaways
- Different tests serve different purposes: baseline, stress, spike, soak
- Model realistic traffic patterns, request mixes, and data
- Include think time between requests
- Run from distributed load generators for high loads
- Focus on percentiles, not averages
- Analyze resource utilization to find bottlenecks
- Document results with clear metrics and recommendations
- Integrate load tests into CI/CD pipeline
Load testing prevents production surprises. Invest in realistic, regular testing.