A year ago, we migrated our primary API from REST to GraphQL. The promise was compelling: clients request exactly what they need, strong typing, and better developer experience. We’ve now processed billions of GraphQL queries in production.
Here’s what we’ve learned.
What Worked
Client Flexibility
The primary promise delivered. Clients get exactly the data they need:
# Mobile app - minimal data for list view
query ProductList {
products(first: 20) {
id
name
thumbnailUrl
price
}
}
# Web app - rich data for detail view
query ProductDetail($id: ID!) {
product(id: $id) {
id
name
description
fullImageUrl
price
inventory
reviews {
rating
comment
author { name }
}
relatedProducts {
id
name
thumbnailUrl
}
}
}
Mobile apps use less bandwidth. Web apps get rich data. Same API.
Type Safety
GraphQL’s type system caught errors early:
type Product {
id: ID!
name: String!
price: Float!
description: String
inventory: Int!
}
TypeScript clients generated from the schema keep types consistent end to end, so a breaking schema change fails the client build instead of surfacing at runtime.
Documentation
Schema is documentation:
"""
A product in the catalog.
Products can be queried by ID or searched.
"""
type Product {
"""The unique product identifier."""
id: ID!
"""Display name for the product."""
name: String!
"""Current price in USD."""
price: Float!
}
GraphQL Playground provides interactive documentation. Developers explore the API without external docs.
Frontend Developer Experience
Frontend teams report significant productivity improvements:
- No more waiting for backend to add endpoints
- Self-service data requirements
- Local development with schema mocking
- Clear contract between frontend and backend
Schema Evolution
Adding fields is non-breaking:
# Before
type Product {
id: ID!
name: String!
price: Float!
}
# After - add fields without breaking clients
type Product {
id: ID!
name: String!
price: Float!
salePrice: Float # New optional field
onSale: Boolean! # New non-null field; its resolver always returns a value
}
Old clients continue working. New clients use new fields.
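One way to back a new non-null field such as onSale without touching stored data is to derive it in a resolver. The following is a minimal sketch, not our production code: it assumes salePrice is absent or null when no sale is active, that "on sale" means salePrice is below price, and the sample rows are hypothetical.

```javascript
// Resolver map for the two fields added to Product in the "After" schema.
const productResolvers = {
  Product: {
    // salePrice is nullable, so rows stored before the change resolve to null.
    salePrice: (product) => product.salePrice ?? null,
    // onSale is Boolean!, so the resolver must always return true or false.
    // Assumption: a product is on sale when salePrice is set and below price.
    onSale: (product) =>
      typeof product.salePrice === 'number' && product.salePrice < product.price,
  },
};

// Hypothetical rows: one stored before the new field existed, one after.
const legacyRow = { id: '1', name: 'Mug', price: 12.0 };
const saleRow = { id: '2', name: 'Lamp', price: 40.0, salePrice: 29.5 };

console.log(productResolvers.Product.onSale(legacyRow));
console.log(productResolvers.Product.onSale(saleRow));
```

Because the non-null field is computed rather than stored, old rows never produce a null for a Boolean! field.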
What Didn’t Work
N+1 Query Problem
Naive implementation creates database query explosions:
query {
products(first: 100) {
id
name
category { # N queries for categories
name
}
reviews { # N queries for reviews
rating
}
}
}
Without optimization, this executes 201 database queries: 1 for the product list, plus 100 for categories and 100 for reviews.
Solution: DataLoader
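const DataLoader = require('dataloader');
// Create loaders per request so cached rows never leak across requests.
const createLoaders = () => ({
category: new DataLoader(async (categoryIds) => {
const categories = await db.categories.findByIds(categoryIds);
// Return results in the same order as the keys; null for missing rows.
const byId = new Map(categories.map((c) => [c.id, c]));
return categoryIds.map((id) => byId.get(id) ?? null);
}),
});
const resolvers = {
Product: {
// context.loaders is set to createLoaders() when the request context is built.
category: (product, _args, context) => context.loaders.category.load(product.categoryId),
},
};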
DataLoader batches and caches within a request. Essential for performance.
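To make the batching effect concrete, here is a self-contained sketch that counts queries with and without batching. It is a simplified stand-in written for illustration, not the dataloader library's implementation; createBatchLoader, fakeDb, and the sample data are all hypothetical.

```javascript
// Simplified stand-in for a batching loader: keys requested in the same tick
// are collected and resolved with a single call to batchFn.
function createBatchLoader(batchFn) {
  let queue = [];
  const cache = new Map(); // key -> promise, so repeated keys are fetched once
  return {
    load(key) {
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        if (queue.length === 1) {
          // First key of a new batch: dispatch after the current tick finishes.
          process.nextTick(async () => {
            const batch = queue;
            queue = [];
            try {
              const values = await batchFn(batch.map((item) => item.key));
              batch.forEach((item, i) => item.resolve(values[i]));
            } catch (err) {
              batch.forEach((item) => item.reject(err));
            }
          });
        }
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// Hypothetical database that counts how many queries it receives.
let queryCount = 0;
const categoryRows = [{ id: 1, name: 'Books' }, { id: 2, name: 'Games' }];
const fakeDb = {
  async findCategoryById(id) {
    queryCount += 1;
    return categoryRows.find((row) => row.id === id) ?? null;
  },
  async findCategoriesByIds(ids) {
    queryCount += 1;
    return categoryRows.filter((row) => ids.includes(row.id));
  },
};

// 100 products spread across the two categories.
const sampleProducts = Array.from({ length: 100 }, (_, i) => ({ id: i + 1, categoryId: (i % 2) + 1 }));

// One query per product: the N in N+1.
async function resolveNaively() {
  queryCount = 0;
  await Promise.all(sampleProducts.map((p) => fakeDb.findCategoryById(p.categoryId)));
  return queryCount;
}

// One batched query for all distinct category IDs.
async function resolveBatched() {
  queryCount = 0;
  const loader = createBatchLoader(async (ids) => {
    const rows = await fakeDb.findCategoriesByIds(ids);
    const byId = new Map(rows.map((row) => [row.id, row]));
    return ids.map((id) => byId.get(id) ?? null); // same order as the keys
  });
  await Promise.all(sampleProducts.map((p) => loader.load(p.categoryId)));
  return queryCount;
}
```

In this sketch the naive path issues one category query per product, while the batched path collapses them into a single query, because every load call happens in the same tick and duplicate keys share a cached promise.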
Query Complexity Attacks
Without limits, malicious queries can overwhelm servers:
query Evil {
products(first: 1000) {
reviews {
author {
products {
reviews {
author {
products {
# Exponential explosion
}
}
}
}
}
}
}
}
Solutions:
// Query complexity analysis
const complexityLimiter = graphqlComplexity({
maximumComplexity: 1000,
variables: req.body.variables,
onComplete: (complexity) => {
console.log('Query Complexity:', complexity);
},
});
// Query depth limiting
const depthLimiter = depthLimit(5);
// Rate limiting by query cost
const rateLimiter = costBasedRateLimiter({
maxCost: 10000,
window: '1 minute',
});
Implement query analysis before production.
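The idea behind cost and depth analysis can be sketched without any GraphQL library. The example below models a query as a plain tree of fields and is only an illustration: the tree shape, the multiplier property (the number of items a list field is expected to return), and the cost formula are assumptions, and real complexity libraries use their own estimators.

```javascript
// cost(field) = 1 + multiplier * (sum of child costs); a leaf field costs 1.
function estimateCost(field) {
  const children = field.children ?? [];
  const childCost = children.reduce((sum, child) => sum + estimateCost(child), 0);
  return 1 + (field.multiplier ?? 1) * childCost;
}

// Depth counts nested fields along the longest path; a leaf field has depth 1.
function measureDepth(field) {
  const children = field.children ?? [];
  return 1 + Math.max(0, ...children.map((child) => measureDepth(child)));
}

// Reject the query if either limit is exceeded.
function checkQuery(root, { maxCost, maxDepth }) {
  const cost = estimateCost(root);
  const depth = measureDepth(root);
  return { cost, depth, allowed: cost <= maxCost && depth <= maxDepth };
}

// Hypothetical model of a nested query: 1000 products, 10 reviews per product,
// one author per review, and 10 products per author.
const nestedQuery = {
  name: 'products',
  multiplier: 1000,
  children: [{
    name: 'reviews',
    multiplier: 10,
    children: [{
      name: 'author',
      children: [{ name: 'products', multiplier: 10, children: [{ name: 'id' }] }],
    }],
  }],
};

const verdict = checkQuery(nestedQuery, { maxCost: 1000, maxDepth: 5 });
console.log(verdict);
```

Here the query passes a depth limit of 5 but is rejected on cost, which is why the two limits are worth running together: depth alone does not account for list sizes.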
Caching Complexity
HTTP caches and CDNs key on GET URLs, so REST responses cache naturally while GraphQL POST requests to a single endpoint do not:
# REST - cacheable
GET /products/123
Cache-Control: max-age=3600
# GraphQL - not HTTP cacheable
POST /graphql
{"query": "{ product(id: 123) { name } }"}
Solutions:
// Persisted queries - cache by query hash
const persistedQueries = {
'abc123': 'query ProductDetail($id: ID!) { product(id: $id) { name } }',
};
# CDN caching with @cacheControl directive (GraphQL schema)
type Product @cacheControl(maxAge: 3600) {
id: ID!
name: String! @cacheControl(maxAge: 86400)
inventory: Int! @cacheControl(maxAge: 0) # Never cache
}
// Response caching
const cache = new RedisCache();
const cachePlugin = responseCachePlugin({ cache });
Caching requires more thought than REST.
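A persisted-query registry can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of any particular server: it uses a SHA-256 hex digest as the query ID, and the id and variables URL parameter names are hypothetical.

```javascript
const { createHash } = require('node:crypto');

// Stable ID for a query: the hex SHA-256 digest of its text.
function hashQuery(queryText) {
  return createHash('sha256').update(queryText).digest('hex');
}

// Build step: register every query the client applications ship with.
function buildRegistry(queries) {
  return new Map(queries.map((queryText) => [hashQuery(queryText), queryText]));
}

// Request time: the client sends only the hash. Unknown hashes return null,
// so the server can refuse queries that were never registered.
function lookupQuery(registry, hash) {
  return registry.get(hash) ?? null;
}

// A GET URL keyed by hash and variables, which HTTP caches and CDNs can store.
function persistedQueryUrl(baseUrl, hash, variables) {
  const params = new URLSearchParams({ id: hash, variables: JSON.stringify(variables) });
  return `${baseUrl}?${params}`;
}
```

Since the URL changes only when the query or its variables change, identical requests from different clients map to the same cache entry.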
Monitoring Difficulty
Traditional endpoint-based monitoring doesn’t work:
# REST - clear what's happening
GET /products - 50ms
GET /users/123 - 30ms
# GraphQL - all one endpoint
POST /graphql - 150ms (which query? what was slow?)
Solution: Per-field tracing
const server = new ApolloServer({
plugins: [
{
// Apollo Server 3+ expects these hooks to be async (willResolveField stays sync).
async requestDidStart() {
return {
async executionDidStart() {
return {
willResolveField({ info }) {
const start = Date.now();
return () => {
const duration = Date.now() - start;
if (duration > 100) {
logger.warn(`Slow field: ${info.parentType.name}.${info.fieldName} (${duration}ms)`);
}
};
},
};
},
};
},
},
],
});
Apollo Studio or similar tools provide query-level visibility.
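Field timings become more useful once they are grouped by operation name instead of by endpoint. The sketch below is a hypothetical in-memory aggregator (createFieldStats and its method names are invented for illustration); a real setup would ship these numbers to a metrics backend.

```javascript
// Aggregates field timings per "OperationName:Type.field" key.
function createFieldStats() {
  const stats = new Map();
  return {
    record(operationName, fieldPath, durationMs) {
      const key = `${operationName}:${fieldPath}`;
      const entry = stats.get(key) ?? { count: 0, totalMs: 0, maxMs: 0 };
      entry.count += 1;
      entry.totalMs += durationMs;
      entry.maxMs = Math.max(entry.maxMs, durationMs);
      stats.set(key, entry);
    },
    // Returns the fields with the highest average duration, slowest first.
    slowest(limit = 5) {
      return [...stats.entries()]
        .map(([key, entry]) => ({
          key,
          count: entry.count,
          avgMs: entry.totalMs / entry.count,
          maxMs: entry.maxMs,
        }))
        .sort((a, b) => b.avgMs - a.avgMs)
        .slice(0, limit);
    },
  };
}

// Example: timings as they might be reported from a field-tracing hook.
const fieldStats = createFieldStats();
fieldStats.record('ProductDetail', 'Product.reviews', 120);
fieldStats.record('ProductDetail', 'Product.reviews', 80);
fieldStats.record('ProductList', 'Product.name', 2);
console.log(fieldStats.slowest(1));
```

Keying on the operation name answers the "which query? what was slow?" question that a single POST /graphql latency number cannot.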
Error Handling Nuances
GraphQL servers typically return 200 OK even when a field fails to resolve, with the errors in the response body:
{
"data": {
"product": null
},
"errors": [
{
"message": "Product not found",
"path": ["product"]
}
]
}
Clients must check for partial errors. Monitoring tools expecting HTTP error codes miss GraphQL errors.
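A client-side helper can make that check explicit. The following is a minimal sketch with a hypothetical classifyResponse function; the rule that a response counts as "failed" when every top-level data field is null is an assumption chosen for this example, not part of the GraphQL specification.

```javascript
// Classifies a parsed GraphQL response body as 'ok', 'partial', or 'failed',
// regardless of the HTTP status code.
function classifyResponse(body) {
  const errors = Array.isArray(body.errors) ? body.errors : [];
  // Dotted paths such as "product.reviews" show which fields failed.
  const errorPaths = errors.map((error) => (error.path ?? []).join('.'));
  if (errors.length === 0) {
    return { status: 'ok', data: body.data, errorPaths };
  }
  // Assumption: data is usable if at least one top-level field is non-null.
  const hasUsableData =
    body.data != null && Object.values(body.data).some((value) => value !== null);
  return {
    status: hasUsableData ? 'partial' : 'failed',
    data: body.data ?? null,
    errorPaths,
  };
}

// The "Product not found" response shown above.
const notFound = classifyResponse({
  data: { product: null },
  errors: [{ message: 'Product not found', path: ['product'] }],
});
console.log(notFound.status, notFound.errorPaths);
```

The same classification can feed monitoring, so that a 200 response carrying errors is still counted as a failure or a degraded response.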
Schema Design Mistakes
Early schema decisions are hard to change:
# Our mistake: too-specific naming
type ProductSearchResult {
products: [Product!]!
totalCount: Int!
}
# Better: generic connection pattern
type ProductConnection {
edges: [ProductEdge!]!
pageInfo: PageInfo!
}
We’re still living with early design decisions.
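For readers unfamiliar with the connection pattern, here is a sketch of how a resolver might build the edges and pageInfo shape from a list. It assumes simple index-based cursors, which are easy to follow but shift when rows are inserted; the helper names are hypothetical, and production cursors are usually derived from a stable sort key.

```javascript
// Opaque cursors: base64 of "idx:<position>". Index-based for illustration only.
const encodeCursor = (index) => Buffer.from(`idx:${index}`).toString('base64');
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, 'base64').toString('utf8').slice('idx:'.length));

// Builds a connection page: `first` items starting after the `after` cursor.
function toConnection(items, { first, after } = {}) {
  const start = after ? decodeCursor(after) + 1 : 0;
  const end = first === undefined ? items.length : Math.min(start + first, items.length);
  const edges = items
    .slice(start, end)
    .map((node, offset) => ({ node, cursor: encodeCursor(start + offset) }));
  return {
    edges,
    pageInfo: {
      hasNextPage: end < items.length,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
  };
}

// Example: page through five items, two at a time.
const catalog = ['a', 'b', 'c', 'd', 'e'];
const firstPage = toConnection(catalog, { first: 2 });
const secondPage = toConnection(catalog, { first: 2, after: firstPage.pageInfo.endCursor });
console.log(secondPage.edges.map((edge) => edge.node));
```

Because every paginated field returns the same shape, clients can share one pagination helper instead of learning a new result type per query.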
What We’d Do Differently
Start with Schema Design
We built resolvers first, then designed the schema around them. A better order:
- Design schema from client perspective
- Review with frontend teams
- Iterate on design
- Then implement resolvers
Schema-first development produces better APIs.
Implement Complexity Limiting Earlier
We added query complexity limits only after production incidents. They should have been in place from day one.
Use Persisted Queries From Start
Persisted queries provide:
- Better security (clients can’t send arbitrary queries)
- Better caching
- Smaller request payloads
- Query analysis at build time
We migrated to persisted queries later; it was harder than starting with them.
Better Tooling Investment
We underinvested in:
- Schema governance tools
- Breaking change detection
- Performance monitoring
- Development tooling
Good tooling multiplies productivity.
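As an illustration of what breaking change detection checks, the sketch below compares two simplified schema snapshots. The snapshot format (type name to field name to type string) and the findBreakingChanges helper are hypothetical, and the check is deliberately conservative: it flags every type change, including ones a dedicated tool would recognize as safe.

```javascript
// Schemas are modeled as { TypeName: { fieldName: 'TypeString' } }.
function findBreakingChanges(oldSchema, newSchema) {
  const changes = [];
  for (const [typeName, oldFields] of Object.entries(oldSchema)) {
    const newFields = newSchema[typeName];
    if (!newFields) {
      changes.push(`type ${typeName} was removed`);
      continue;
    }
    for (const [fieldName, oldType] of Object.entries(oldFields)) {
      const newType = newFields[fieldName];
      if (newType === undefined) {
        changes.push(`${typeName}.${fieldName} was removed`);
      } else if (newType !== oldType) {
        changes.push(`${typeName}.${fieldName} changed from ${oldType} to ${newType}`);
      }
    }
  }
  return changes; // Added types and fields are never reported.
}

// Hypothetical snapshots: the first change only adds a field, the second removes one.
const schemaV1 = { Product: { id: 'ID!', name: 'String!', price: 'Float!' } };
const schemaV2 = { Product: { id: 'ID!', name: 'String!', price: 'Float!', salePrice: 'Float' } };
const schemaV3 = { Product: { id: 'ID!', name: 'String!' } };

console.log(findBreakingChanges(schemaV1, schemaV2));
console.log(findBreakingChanges(schemaV1, schemaV3));
```

Running a check like this in CI against the last deployed schema is what turns "adding fields is non-breaking" from a convention into an enforced rule.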
Federation Earlier
Our monolithic schema became unwieldy. Apollo Federation lets teams own their schema portions:
# Products service
type Product @key(fields: "id") {
id: ID!
name: String!
price: Float!
}
# Reviews service - extends Product
extend type Product @key(fields: "id") {
id: ID! @external
reviews: [Review!]!
}
We’re migrating to federation now; earlier adoption would have scaled better.
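On the resolver side, the reviews service only needs the product's key to attach its own fields. The sketch below assumes the resolver-map convention used by Apollo's federation libraries, where a __resolveReference function receives the entity's key fields from the gateway; the in-memory review store is hypothetical.

```javascript
// Hypothetical in-memory store standing in for the reviews database.
const reviewsByProductId = new Map([
  ['p1', [{ rating: 5, comment: 'Great' }, { rating: 3, comment: 'Okay' }]],
]);

const reviewsServiceResolvers = {
  Product: {
    // Receives the key fields sent by the gateway,
    // e.g. { __typename: 'Product', id: 'p1' }.
    __resolveReference(reference) {
      return { id: reference.id };
    },
    // reviews is [Review!]!, so products without reviews get [] rather than null.
    reviews(product) {
      return reviewsByProductId.get(product.id) ?? [];
    },
  },
};

// Example: what the reviews service does for one product reference.
const productRef = reviewsServiceResolvers.Product.__resolveReference({
  __typename: 'Product',
  id: 'p1',
});
console.log(reviewsServiceResolvers.Product.reviews(productRef).length);
```

The reviews team never queries the products database here, which is the ownership boundary that makes federation attractive for larger organizations.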
Recommendations
For New Projects
- Use GraphQL if: Multiple clients with different needs, complex data requirements, frontend teams that benefit from self-service
- Avoid GraphQL if: Simple CRUD, single client, team unfamiliar with GraphQL
Essential Practices
- DataLoader for N+1 prevention
- Query complexity limits
- Depth limits
- Field-level monitoring
- Schema testing
- Breaking change detection
Recommended Tools
- Apollo Server: Full-featured, great ecosystem
- GraphQL Code Generator: Type generation
- Apollo Studio: Monitoring and schema management
- GraphQL Inspector: Schema change detection
Key Takeaways
- Client flexibility is GraphQL’s biggest win: different clients get exactly what they need
- Type safety and self-documenting schemas improve developer experience
- N+1 queries require DataLoader or similar batching
- Implement query complexity limits before production
- Caching requires explicit strategies (persisted queries, response caching)
- Monitor at the field level, not just endpoint level
- Design schema from client perspective before implementing resolvers
- Consider federation early for team scalability
- GraphQL isn’t always the right choice, so evaluate honestly
GraphQL has been net positive for us, but it requires investment in tooling and practices. The benefits are real when implemented well.