Why Every Developer Should Understand Networking

September 19, 2016

“It works on my machine” often translates to “I don’t understand networking.” Application developers increasingly work with networked systems—microservices, APIs, cloud infrastructure—but many lack fundamental networking knowledge. When things fail, they’re helpless.

Understanding networking won’t make you a network engineer. But it will help you debug problems, design better systems, and communicate effectively with infrastructure teams. Here are the fundamentals every developer should know.

The Network Stack

Network communication happens in layers. Each layer handles specific concerns and provides services to layers above.

Physical Layer

The actual transmission medium: ethernet cables, fiber optics, wireless signals. Developers rarely interact with this layer directly, but understanding that physical constraints exist matters. Light speed limits latency. Bandwidth has physical limits.

Data Link Layer

Handles communication on a local network segment. MAC addresses identify devices. Switches forward frames between devices on the same network.

Key concepts:

Network Layer (IP)

Handles routing between networks. IP addresses identify hosts across the internet. Routers forward packets between networks.

Key concepts:

Transport Layer (TCP/UDP)

Handles end-to-end communication between applications. Port numbers identify specific services on a host.

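TCP (Transmission Control Protocol): connection-oriented; provides reliable, ordered delivery with retransmission and congestion control.

UDP (User Datagram Protocol): connectionless; sends individual datagrams with no delivery, ordering, or congestion-control guarantees, in exchange for lower overhead.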
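To make the difference between the two transports concrete, here is a minimal sketch that sends one message over each using only the loopback interface. The function names are hypothetical, and it assumes short messages that fit in a single read.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """Send one message over TCP to a loopback echo server and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo back what was received

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    # TCP: connect first (handshake), then exchange a byte stream.
    with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
        client.sendall(message)
        reply = client.recv(1024)
    thread.join(timeout=2)
    server.close()
    return reply


def udp_echo_once(message: bytes) -> bytes:
    """Send one datagram over UDP to a loopback socket and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2)
    port = server.getsockname()[1]
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    # UDP: no connection setup; each sendto() is an independent datagram.
    client.sendto(message, ("127.0.0.1", port))
    data, addr = server.recvfrom(1024)
    server.sendto(data, addr)
    reply, _ = client.recvfrom(1024)
    client.close()
    server.close()
    return reply
```

On loopback both calls return the original message. Across a real network the UDP version could lose the datagram and hit its timeout, while TCP would retransmit.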

Application Layer

Application-specific protocols built on TCP or UDP: HTTP, SMTP, FTP, PostgreSQL protocol, etc. This is where developers spend most time.

TCP Deep Dive

Most application communication uses TCP. Understanding TCP explains many application behaviors.

Three-Way Handshake

TCP connections begin with a handshake:

  1. Client sends SYN (synchronize)
  2. Server responds with SYN-ACK (synchronize-acknowledge)
  3. Client responds with ACK (acknowledge)

This takes one round-trip time (RTT) before data can flow. For distant servers, RTT might be 100-200ms—noticeable latency just to establish a connection.
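One way to see the handshake cost is to time how long a connect call takes. The sketch below is illustrative only; `measure_connect_ms` is a hypothetical helper. It assumes you pass an IP address, because a hostname would add DNS resolution time to the measurement.

```python
import socket
import time


def measure_connect_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return how long the TCP connect to host:port took, in milliseconds."""
    start = time.perf_counter()
    # create_connection() returns once the three-way handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

Against a nearby server the result should be close to one RTT. Comparing it with the time for a full request shows how much of the total is connection setup.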

Connection State

TCP connections move through states such as LISTEN, SYN_SENT, ESTABLISHED, CLOSE_WAIT, and TIME_WAIT.

Check connection states with netstat or ss:

ss -tan | grep ESTABLISHED

Congestion Control

TCP adjusts sending rate based on network conditions. New connections start slowly (slow start), increasing rate until packet loss indicates congestion.

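This means a new connection cannot use the full available bandwidth right away, short transfers may finish before the sending rate ramps up, and packet loss reduces throughput.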
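A rough model makes the cost of slow start concrete. The sketch below counts round trips in an idealized slow start. The starting window of 10 segments and the 1460-byte segment size are assumed, commonly seen values, not figures from this article. Real stacks also react to loss and to receiver and bandwidth limits, which this ignores.

```python
def round_trips_to_send(total_bytes: int, initial_cwnd_segments: int = 10,
                        mss_bytes: int = 1460) -> int:
    """Count round trips needed to send total_bytes in an idealized slow start.

    Assumes no loss and a congestion window that doubles every round trip.
    """
    cwnd = initial_cwnd_segments
    sent = 0
    round_trips = 0
    while sent < total_bytes:
        sent += cwnd * mss_bytes  # one window of data per round trip
        cwnd *= 2                 # slow start: window doubles each round trip
        round_trips += 1
    return round_trips


print(round_trips_to_send(10_000))     # 1
print(round_trips_to_send(100_000))    # 3
print(round_trips_to_send(1_000_000))  # 7
```

Under these assumptions a 1 MB response on a fresh connection needs about seven round trips no matter how much bandwidth is available, which is one reason warmed-up, reused connections perform better.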

Nagle’s Algorithm and Delayed ACK

By default, TCP batches small packets (Nagle’s algorithm) and delays acknowledgments. This improves efficiency but adds latency.

For latency-sensitive applications (interactive protocols, real-time systems), you might disable Nagle’s algorithm:

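sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # requires: import socket
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # disable Nagle's algorithm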

DNS

Domain Name System translates hostnames to IP addresses. Every networked application depends on DNS.

DNS Resolution

When you connect to api.example.com:

  1. Application calls resolver library
  2. Resolver checks local cache
  3. If not cached, queries configured DNS server
  4. DNS server may query other servers (root, TLD, authoritative)
  5. Response cached according to TTL (Time To Live)
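From application code, the first step above is usually a single call into the system resolver. Here is a minimal sketch using Python's standard library. The `resolve` helper is hypothetical, and any caching happens in the operating system or upstream DNS servers, not in this function.

```python
import socket
import time
from typing import List, Tuple


def resolve(hostname: str, port: int = 443) -> Tuple[List[str], float]:
    """Resolve hostname through the system resolver.

    Returns the unique IP addresses and the lookup time in milliseconds.
    """
    start = time.perf_counter()
    results = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in results})
    return addresses, elapsed_ms
```

Calling it twice in a row and comparing the timings gives a rough hint of whether a cache sits somewhere along the path.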

DNS Issues

Common DNS problems:

Check DNS resolution:

dig api.example.com
nslookup api.example.com

DNS and Application Design

HTTP

Most applications communicate via HTTP. Understanding HTTP’s network behavior helps debug issues.

Connection Reuse

HTTP/1.1 made persistent connections the default: multiple requests can share one TCP connection. This avoids handshake overhead per request.

HTTP/2 multiplexes many requests over one connection, further improving efficiency.

Connection reuse matters most for high-latency connections (mobile, distant servers).
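Connection reuse can be observed directly. The sketch below is a self-contained illustration, not a production setup. It starts a throwaway local server, sends three requests through one `http.client.HTTPConnection`, and records the client port the server sees for each request. The handler and variable names are made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

client_ports = []  # source port of each request, as seen by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One HTTPConnection wraps one TCP connection; reuse it for every request.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
for _ in range(3):
    conn.request("GET", "/")
    conn.getresponse().read()  # read the body fully before the next request
conn.close()
server.shutdown()
server.server_close()

# The same client port on every request means the same TCP connection was used.
reused = len(set(client_ports)) == 1
```

Creating a new `HTTPConnection` per request instead would show three different client ports, each paying for its own handshake.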

HTTP Latency Anatomy

A simple HTTP request involves:

  1. DNS resolution (if hostname not cached)
  2. TCP handshake (if new connection)
  3. TLS handshake (if HTTPS, adds 1-2 RTT)
  4. Request transmission
  5. Server processing
  6. Response transmission

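For a new HTTPS connection to a server 100ms away (taking 100ms as the round-trip time): roughly 300-400ms before the response arrives (1 RTT for TCP, 1-2 RTT for TLS, 1 RTT for the request and response), plus DNS resolution and server processing.

With connection reuse: roughly 100ms (1 RTT for the request and response), plus server processing.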

This is why connection pooling and keepalive matter.
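The arithmetic behind that comparison can be written as a small function. This is a simplified model under stated assumptions: "100ms away" is read as a 100ms round trip, transmission time is folded into one round trip, and DNS is only paid on a new connection. The function and parameter names are illustrative.

```python
def request_latency_ms(rtt_ms: float, new_connection: bool, tls_round_trips: int = 2,
                       dns_ms: float = 0.0, server_ms: float = 0.0) -> float:
    """Estimate time until the response arrives for one HTTPS request."""
    # A new connection pays 1 RTT for TCP plus the TLS round trips.
    setup_round_trips = (1 + tls_round_trips) if new_connection else 0
    dns = dns_ms if new_connection else 0.0
    # Every request pays 1 RTT for the request/response exchange itself.
    return dns + (setup_round_trips + 1) * rtt_ms + server_ms


print(request_latency_ms(100, new_connection=True, tls_round_trips=2))  # 400
print(request_latency_ms(100, new_connection=True, tls_round_trips=1))  # 300
print(request_latency_ms(100, new_connection=False))                    # 100
```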

Debugging Network Issues

When the network misbehaves, a systematic approach finds the problem faster than guessing.

Check Connectivity

Basic connectivity tests:

# Can you reach the host at all? (Some hosts and firewalls block ICMP, so a failed ping is not conclusive.)
ping api.example.com

# Can you reach the specific port?
telnet api.example.com 443
nc -zv api.example.com 443

DNS Debugging

Verify DNS resolution:

# What IP does the hostname resolve to?
dig api.example.com

# Is DNS resolution slow?
time dig api.example.com

Connection Debugging

Examine connection state:

# Active connections
ss -tan

# Connections to specific host
ss -tan dst api.example.com

# Connections in TIME_WAIT (potential port exhaustion)
ss -tan state time-wait | wc -l

Traffic Capture

For deep debugging, capture network traffic:

# Capture HTTP traffic
tcpdump -i any -A port 80

# Capture and write to file for Wireshark analysis
tcpdump -i any -w capture.pcap host api.example.com

Traffic captures reveal exactly what’s happening on the wire: malformed requests, unexpected responses, timing issues.

Application-Level Tools

Use application-level tools too:

# HTTP request with timing breakdown
curl -w "@curl-format.txt" -o /dev/null -s https://api.example.com

# Where curl-format.txt contains:
#   time_namelookup:  %{time_namelookup}\n
#   time_connect:     %{time_connect}\n
#   time_appconnect:  %{time_appconnect}\n
#   time_pretransfer: %{time_pretransfer}\n
#   time_starttransfer: %{time_starttransfer}\n
#   time_total:       %{time_total}\n

Designing for Networks

Understanding networking influences application design:

Connection Pooling

Reuse TCP connections rather than creating new ones per request. Most HTTP client libraries support connection pooling; ensure it’s enabled and configured appropriately.

Timeouts

Set explicit timeouts for all network operations, including connection establishment and reads.

Operations without timeouts can hang indefinitely when networks fail.
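As a minimal sketch of what explicit timeouts look like at the socket level, the example below bounds the connect and the read separately. The `read_with_timeouts` helper and the silent loopback server are made up for illustration. Higher-level HTTP clients expose similar settings under their own names.

```python
import socket


def read_with_timeouts(host: str, port: int, connect_timeout: float,
                       read_timeout: float) -> bytes:
    """Connect and read once, with separate connect and read timeouts."""
    # Bounds the time spent establishing the connection.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    with sock:
        # Bounds each subsequent blocking call, such as recv().
        sock.settimeout(read_timeout)
        return sock.recv(1024)


# Demonstration: a loopback server that accepts connections but never replies.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)
silent_port = silent_server.getsockname()[1]

try:
    read_with_timeouts("127.0.0.1", silent_port, connect_timeout=2.0, read_timeout=0.2)
    timed_out = False
except socket.timeout:  # alias of TimeoutError on Python 3.10+
    timed_out = True
finally:
    silent_server.close()
```

Without the `settimeout` call, the `recv` in this demonstration would block forever, which is the hang described above.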

Retries with Backoff

Networks have transient failures. Implement retries with exponential backoff:

import time

import requests


def request_with_retry(url, max_retries=3):
    for attempt in range(max_retries):
        try:
            return requests.get(url, timeout=5)
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(2 ** attempt)  # back off 1s, then 2s, doubling each retry

Graceful Degradation

Design applications to function (perhaps with reduced capability) when network dependencies fail. Cache responses, provide fallbacks, and fail fast when dependencies are unavailable.
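One possible shape for this, sketched under assumptions: wrap the network call so that a failure falls back to the last good response, and then to a default. The `CachedFallback` class is hypothetical. It treats `OSError` (which covers socket errors and timeouts) as the failure signal, so a real client library may need a different exception type.

```python
from typing import Callable, Dict, Optional, Tuple


class CachedFallback:
    """Serve the last known good value when the live fetch fails."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, str] = {}

    def get(self, key: str, default: Optional[str] = None) -> Tuple[Optional[str], str]:
        """Return (value, source) where source is 'live', 'cache', or 'default'."""
        try:
            value = self._fetch(key)
        except OSError:  # network-level failures, including timeouts
            if key in self._cache:
                return self._cache[key], "cache"
            return default, "default"
        self._cache[key] = value  # remember the last good response
        return value, "live"
```

Returning the source alongside the value lets the caller decide whether to show a "data may be stale" notice. Pairing the fetch with a short timeout keeps the fallback path fast.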

Key Takeaways