Async Communication: Making It Work for Engineering Teams

April 13, 2020

Synchronous communication (meetings, real-time chat, tapping someone on the shoulder) doesn’t scale well, especially on remote teams. Every interruption forces a context switch. Meetings break up deep work. Time zones can make a shared meeting slot impractical.

Asynchronous communication is the answer, but it requires intentional practice.

Why Async Matters

The Cost of Sync

Synchronous interruption cost:
- Immediate: 5 minutes to answer the question
- Context switch: 15-25 minutes to regain focus
- Cascading: Others interrupted too

Total cost: 20-30+ minutes for a 5-minute question
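
The arithmetic above can be sketched in a few lines of Python. The function name and parameters are illustrative assumptions. The cascading cost to other people is left out because no number is given for it, which is where the "+" in the total comes from.

```python
def interruption_cost(answer_min=5, refocus_min=(15, 25)):
    """Return (low, high) total minutes lost to one synchronous interruption.

    Adds the time to answer to the time needed to regain focus.
    The cascading cost to others is not quantified, so it is not included.
    """
    low, high = refocus_min
    return answer_min + low, answer_min + high


low, high = interruption_cost()
print(f"A 5-minute question costs {low}-{high}+ minutes")
```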

Deep Work Requirements

Engineering requires focus:

Task complexity vs. time required:
- Bug fix: 30 min focused work
- Feature: 2-4 hours focused work
- Architecture: 4+ hours focused work

Fragmented time doesn't add up the same:
4 x 30 min ≠ 2 hours of continuous focus
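
One rough way to see why fragmented time adds up to less: assume every block of work starts with a refocus period of 15-25 minutes, the context-switch figure quoted earlier. That assumption, and the function below, are a simplified sketch for illustration, not a measured model.

```python
def effective_focus(block_min, blocks, refocus_min):
    """Minutes of real focus left when every block starts with a refocus period."""
    return blocks * max(block_min - refocus_min, 0)


for refocus in (15, 25):
    fragmented = effective_focus(30, 4, refocus)   # four 30-minute slots
    continuous = effective_focus(120, 1, refocus)  # one 2-hour slot
    print(f"refocus={refocus} min: 4 x 30 -> {fragmented} min, 1 x 120 -> {continuous} min")
```

Under this assumption, four 30-minute slots yield 20-60 minutes of focused work, while a single 2-hour slot yields 95-105 minutes.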

Time Zone Reality

Global teams can’t all be online simultaneously:

Team across US, EU, Asia:
- 2-3 hours overlap at best
- Someone is always "after hours"
- Waiting for meetings delays work

Async removes time zone bottlenecks.
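
To make the overlap problem concrete, here is a small sketch that counts shared working hours. The three cities, the 9:00-17:00 workday, and the fixed UTC offsets (valid for mid-April, when daylight saving time is in effect in the US and EU) are all assumptions chosen for illustration.

```python
from itertools import combinations

# Assumed UTC offsets for mid-April (daylight saving time in the US and EU).
OFFSETS = {"New York": -4, "Berlin": 2, "Singapore": 8}


def working_hours_utc(offset, start=9, end=17):
    """Set of UTC hours covered by a local workday running from start to end."""
    return {(hour - offset) % 24 for hour in range(start, end)}


def overlap_hours(cities):
    """Number of hours per day during which all given cities are at work."""
    windows = [working_hours_utc(OFFSETS[city]) for city in cities]
    return len(set.intersection(*windows))


for pair in combinations(OFFSETS, 2):
    print(pair, overlap_hours(pair), "h")
print("all three:", overlap_hours(OFFSETS), "h")
```

With these assumed hours, New York and Berlin share two hours, Berlin and Singapore share two hours, and there is no hour when all three locations are at work.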

Effective Async Writing

Complete Messages

Include everything the recipient needs:

## Bad async message:
"Hey, quick question about the API"

## Good async message:
Subject: Question about rate limiting in /api/orders endpoint

I'm implementing retry logic for the mobile app and need to
understand the rate limiting behavior:

1. What are the current limits? (requests/minute, per user or global?)
2. Is there a Retry-After header when limited?
3. Should we use exponential backoff?

Context: We're seeing 429s in production logs during peak hours.
The mobile app currently doesn't handle these gracefully.

No urgency - tomorrow is fine. I'll check the code in the meantime.

Structure for Scanning

Busy people scan; structure helps:

## Update: Payment service migration

**Status:** On track
**Completion:** 60%
**ETA:** April 24

### Done this week:
- Migrated sandbox environment
- Updated CI/CD pipeline
- Fixed connection pooling issue (PR #234)

### Next week:
- Production migration (Wednesday 2am UTC)
- Monitoring dashboard updates
- Documentation

### Blockers:
None currently

### Questions/Input needed:
Should we keep the legacy endpoint for 30 or 60 days post-migration?
(Need answer by Tuesday for production planning)

Explicit Response Expectations

Be clear about what you need:

Urgency/response indicators:
- "No response needed" → FYI only
- "Thoughts welcome" → Optional input
- "Need decision by Tuesday" → Required response with deadline
- "URGENT: Production issue" → Respond ASAP
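
If a team wanted to check messages for these indicators automatically (for example in a chat bot), a minimal sketch could look like the following. The regular expressions and the function name are hypothetical; only the indicator phrases and their meanings come from the list above.

```python
import re

# Indicator phrases and meanings from the list above; matching rules are assumptions.
INDICATORS = [
    (re.compile(r"^URGENT:", re.IGNORECASE), "Respond ASAP"),
    (re.compile(r"need decision by \w+", re.IGNORECASE), "Required response with deadline"),
    (re.compile(r"thoughts welcome", re.IGNORECASE), "Optional input"),
    (re.compile(r"no response needed", re.IGNORECASE), "FYI only"),
]


def response_expectation(message):
    """Return the expectation a message signals, or None if it signals nothing."""
    for pattern, meaning in INDICATORS:
        if pattern.search(message):
            return meaning
    return None


print(response_expectation("URGENT: Production issue"))
print(response_expectation("Quick question about the API"))
```

A `None` result flags a message that never says what kind of response the sender expects.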

Async Patterns

Daily Status Updates

Replace standups with written updates:

## Daily Update: April 13

### Yesterday:
- Completed PR #456 (auth refactor)
- Reviewed PRs #457, #458
- Investigated memory issue in worker service

### Today:
- Address PR feedback on #456
- Start on JIRA-123 (add caching layer)
- 1:1 with Sarah (3pm UTC)

### Blockers:
- Waiting on design review for caching approach
  (pinged in #design channel, following up tomorrow if no response)

Async Decision Making

Decisions don’t require meetings:

## RFC: Switch from Redis to PostgreSQL for job queue

### Proposal:
Replace Redis-backed job queue with PostgreSQL-based solution
using SKIP LOCKED pattern.

### Why:
1. Reduce infrastructure complexity (one less system)
2. Transactional job creation with business data
3. Simpler local development

### Tradeoffs:
+ Simplicity: One less system to manage
+ Reliability: Jobs and data atomically consistent
- Performance: ~20% slower for high throughput
- Features: Lose some Redis-specific capabilities

### Decision criteria:
- Current throughput: 100 jobs/min (PostgreSQL can handle 1000+)
- Maintenance burden: High (Redis cluster issues monthly)
- Team experience: Stronger with PostgreSQL

### Proposed decision:
Proceed with migration to PostgreSQL.

### Input requested:
- Concerns with this approach?
- Edge cases we should consider?

Deadline: April 17. Will proceed if no blocking concerns raised.

Code Review Comments

Make review comments self-contained:

## PR Comment (good):
The connection pool here is created per-request. This will
exhaust connections under load.

Suggestion: Move pool initialization to application startup
and share across requests.

Example:
```python
# In app initialization: create the pool once and share it
pool = create_pool(max_connections=20)

# In request handler: borrow a connection from the shared pool
async def handle_request(query):
    async with pool.acquire() as conn:
        return await conn.execute(query)
```

Let me know if you’d like to discuss - happy to pair on this.


Project Status

Keep stakeholders informed without meetings:

## Weekly Project Update: Authentication Redesign

**Overall status:** 🟡 Yellow (slightly behind)

### Progress:
- Week 3 of 6
- Completed: 45% (planned: 50%)
- Core OAuth flow working in dev

### Highlights:
- Successfully integrated with IdP
- Performance testing shows 50% improvement over current system

### Risks:
1. Mobile SDK integration taking longer than estimated
   - Mitigation: Brought in additional resource
   - Impact: 2-3 day delay to mobile milestone

### Upcoming milestones:
- April 20: Staging deployment
- April 27: Security review
- May 4: Production rollout

### Questions for stakeholders:
None this week.

Next update: Monday, April 20.

Tools and Practices

Right Tool for the Right Purpose

communication_channels:
  quick_questions:
    tool: Slack/Teams
    expectation: Response within 4 hours
    format: Short, can be interruptive

  decisions:
    tool: Notion/Confluence RFC
    expectation: Response within 2-3 days
    format: Structured document

  status_updates:
    tool: Project management tool
    expectation: Weekly read
    format: Standardized template

  deep_discussion:
    tool: Long-form doc with comments
    expectation: Async unless stuck
    format: Detailed writeup

  urgent:
    tool: PagerDuty/On-call system
    expectation: Immediate
    format: Critical only
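
The same mapping can be kept as data so that a bot or a handbook page can answer "where does this go?". This is a sketch only: the dictionary layout and the `where_to_post` function are illustrative assumptions, not part of any real tool.

```python
# Mirrors the communication_channels mapping above; layout and names are illustrative.
CHANNELS = {
    "quick_questions": {"tool": "Slack/Teams", "expectation": "Response within 4 hours"},
    "decisions": {"tool": "Notion/Confluence RFC", "expectation": "Response within 2-3 days"},
    "status_updates": {"tool": "Project management tool", "expectation": "Weekly read"},
    "deep_discussion": {"tool": "Long-form doc with comments", "expectation": "Async unless stuck"},
    "urgent": {"tool": "PagerDuty/On-call system", "expectation": "Immediate"},
}


def where_to_post(kind):
    """Return a one-line hint for where a message of this kind belongs."""
    try:
        channel = CHANNELS[kind]
    except KeyError:
        raise ValueError(f"unknown message kind: {kind!r}") from None
    return f"{channel['tool']} ({channel['expectation']})"


print(where_to_post("decisions"))
```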

Slack/Chat Hygiene

Chat can become a sync tool; keep it async:

practices:
  - Write complete thoughts, don't "ping first"
  - Use threads to contain discussions
  - Don't expect immediate responses
  - Set status when unavailable
  - Check messages in batches, not constantly

channels:
  #team-updates: Announcements, status (low traffic)
  #team-help: Questions, discussions (medium traffic)
  #team-random: Social, off-topic (any traffic)
  #engineering-incidents: Urgent only (rare)

Documentation as Communication

Good docs reduce repetitive questions:

## Common patterns:
- Architecture decisions → ADRs (Architecture Decision Records)
- How-to → Runbooks and guides
- Project context → Project docs
- Team processes → Team handbook

Making Sync Count

Async doesn’t mean never sync. Use sync wisely:

When to Meet Synchronously

Effective Sync Time

## Meeting structure:
1. Async prep: Share materials 24h before
2. Sync time: Discussion and decisions only
3. Async follow-up: Document outcomes, assign actions

Example:
- Before: "Read RFC, add comments to doc"
- During: "Discuss open questions, decide"
- After: "I'll update the RFC with our decision and next steps"

Async Culture

Leadership Modeling

Leaders must practice async:

Response Time Expectations

Set clear norms:

response_expectations:
  slack: Same business day (4-8 hours)
  email: Within 24 hours
  pr_review: Within 24 hours
  rfc_comments: Within 3 days

exceptions:
  - On-call: Immediate for pages
  - Blockers: Escalate if waiting > 4 hours
  - Urgent: Use appropriate channel with clear "URGENT"
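
Norms like these are easy to turn into a reminder check. The sketch below uses the upper bound of each range above and the 4-hour escalation rule for blockers. The names are hypothetical, and it counts wall-clock time rather than business hours, which is a deliberate simplification.

```python
from datetime import datetime, timedelta, timezone

# Upper bounds taken from the norms above; keys and function are illustrative.
MAX_WAIT = {
    "slack": timedelta(hours=8),
    "email": timedelta(hours=24),
    "pr_review": timedelta(hours=24),
    "rfc_comments": timedelta(days=3),
}
BLOCKER_ESCALATION = timedelta(hours=4)


def is_overdue(channel, sent_at, now, blocking=False):
    """True if a message has waited longer than the agreed norm.

    Simplification: counts wall-clock time and ignores business hours.
    """
    limit = BLOCKER_ESCALATION if blocking else MAX_WAIT[channel]
    return now - sent_at > limit


sent = datetime(2020, 4, 13, 9, 0, tzinfo=timezone.utc)
now = datetime(2020, 4, 13, 15, 0, tzinfo=timezone.utc)
print(is_overdue("slack", sent, now))                 # False: 6h is within 8h
print(is_overdue("slack", sent, now, blocking=True))  # True: 6h exceeds 4h
```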

Respect for Deep Work

Cultural norms around focus:

Key Takeaways

Async communication is a skill. It takes practice, but teams that master it are more productive and more inclusive of different time zones, and they create better documentation as a side effect.