WebSocket
WebSocket (RFC 6455) provides full-duplex, bidirectional communication over a single TCP connection. Here’s a summary of the protocol.
Background & Motivation#
HTTP follows a request-response model — the client sends a request, the server replies. This works well for loading web pages, but falls short for real-time applications (chat, gaming, live dashboards) where the server needs to push data to the client at any time.
Before WebSocket, developers used workarounds:
| Technique | How It Works | Drawbacks |
|---|---|---|
| Polling | Client repeatedly sends HTTP requests at intervals | Wasteful — most responses are empty; high latency |
| Long Polling | Client sends request, server holds it open until data is available | Still one-directional; connection overhead on each response |
| SSE | Server pushes events over a long-lived HTTP connection | Unidirectional (server→client only); limited to text; 6 connections per domain under HTTP/1.1 |
All of these are hacks layered on top of HTTP’s request-response model. WebSocket solves this at the protocol level.
What Is It & How It Works#
WebSocket is a protocol that upgrades an HTTP connection into a persistent, full-duplex channel. After the initial handshake, both client and server can send messages independently at any time over the same TCP connection.
Key characteristics:
- Built on TCP: the HTTP upgrade handshake and all subsequent WebSocket frames share the same TCP connection — “upgrade” switches the application protocol, not the transport
- Full-duplex: both sides send/receive simultaneously
- Low overhead: no HTTP headers on every message — just a small frame header (2-14 bytes)
- URI schemes:
ws://(plaintext) andwss://(TLS-encrypted, analogous to HTTPS)
Traditional HTTP:
Client ──req──▶ Server
Client ◀──res── Server
Client ──req──▶ Server
Client ◀──res── Server
WebSocket (all over the same TCP connection):
Client ──HTTP Upgrade──▶ Server ← TCP + HTTP
Client ◀──101 Switching── Server ← TCP + HTTP
╔═══════════════╗
Client ◀══════════════▶ Server ← TCP + WebSocket frames
╚═══════════════╝
Protocol Details#
Opening Handshake#
The connection starts as a standard HTTP/1.1 request with an Upgrade header:
Client Request:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: http://example.com
Server Response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
The Sec-WebSocket-Accept is computed as:
Base64(SHA-1(Sec-WebSocket-Key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"))
This is not for security — it simply confirms the server understands WebSocket and prevents accidental upgrades.
Framing Protocol#
After the handshake, data is transmitted as frames. Let’s walk through the entire echo example — from handshake to close — showing the actual bytes on the wire.
Frame Structure#
Each frame has a small header followed by payload data:
Byte 0 Byte 1 Byte 2 Byte 3
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len | Extended payload length |
|I|S|S|S| (4) |A| (7) | (16/64) |
|N|V|V|V| |S| | (if payload len==126/127) |
| |1|2|3| |K| | |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
| Extended payload length continued, if payload len == 127 |
+ - - - - - - - - - - - - - - - +-------------------------------+
| |Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
| Masking-key (continued) | Payload Data |
+-------------------------------- - - - - - - - - - - - - - - - +
: Payload Data continued ... :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
| Payload Data continued ... |
+---------------------------------------------------------------+
Key fields:
- FIN (1 bit): 1 = final fragment, 0 = more fragments follow
- Opcode (4 bits):
0x1text,0x2binary,0x8close,0x9ping,0xApong,0x0continuation - MASK (1 bit): 1 = payload is masked (required for client→server)
- Payload len (7 bits): The 7-bit field can hold values 0–127. Values 126 and 127 are not actual lengths — they are sentinels signaling “read more bytes for the real length”:
| 7-bit value | Meaning | Bytes used for length |
|---|---|---|
| 0–125 | Value is the payload length. Done. | 1 (the 7-bit field itself) |
| 126 | Real length in the next 2 bytes (up to 64 KB) | 3 (7-bit field + 2 bytes) |
| 127 | Real length in the next 8 bytes (up to 2^63) | 9 (7-bit field + 8 bytes) |
This is a space optimization — most WebSocket messages are small, so they only need 1 byte for length instead of wasting 8 bytes every time.
Dry Run: Echo “hello websocket”#
The payload is "hello websocket" = 15 bytes. Assume masking key = 0x37 0xFA 0x21 0x3D.
Step 1 — Opening Handshake (HTTP)
Client → Server:
GET /ws HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Server → Client:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Connection upgraded. From now on, only WebSocket frames are exchanged over this TCP connection.
Step 2 — Client sends “hello websocket” (masked)
Client→server frames must be masked. The masking XORs each payload byte with the key:
payload: h e l l o w e b s o c k e t
hex: 0x68 0x65 0x6C 0x6C 0x6F 0x20 0x77 0x65 0x62 0x73 0x6F 0x63 0x6B 0x65 0x74
mask key: 0x37 0xFA 0x21 0x3D 0x37 0xFA 0x21 0x3D 0x37 0xFA 0x21 0x3D 0x37 0xFA 0x21
XOR: 0x5F 0x9F 0x4D 0x51 0x58 0xDA 0x56 0x58 0x55 0x89 0x4E 0x5E 0x5C 0x9F 0x55
The frame on the wire:
Byte: 0x81 0x8F 0x37 0xFA 0x21 0x3D 0x5F 0x9F 0x4D ... (15 bytes masked payload)
──── ──── ────────────────────── ─────────────────────────────────────────────────
│ │ │ └─ Masked payload data
│ │ └─ Masking key (4 bytes)
│ └─ MASK=1, Payload len=15 (0b_1_0001111 = 0x8F)
└─ FIN=1, opcode=0x1 text (0b_1000_0001 = 0x81)
Total frame size: 2 (header) + 4 (mask key) + 15 (payload) = 21 bytes
Step 3 — Server echoes “hello websocket” (unmasked)
Server→client frames are not masked, so the payload is sent in plaintext:
Byte: 0x81 0x0F 0x68 0x65 0x6C 0x6C 0x6F 0x20 0x77 ... (15 bytes raw payload)
──── ──── ─────────────────────────────────────────────────────────────────────
│ │ └─ Payload data: "hello websocket" in UTF-8
│ └─ MASK=0, Payload len=15 (0b_0_0001111 = 0x0F)
└─ FIN=1, opcode=0x1 text (0x81)
Total frame size: 2 (header) + 15 (payload) = 17 bytes. No mask key needed.
Step 4 — Close Handshake
Client initiates close with status code 1000 (normal closure):
Client → Server (masked close frame):
Byte: 0x88 0x82 0x37 0xFA 0x21 0x3D 0x34 0x12
──── ──── ────────────────────── ────────────
│ │ │ └─ Masked status code 1000
│ │ │ (0x03E8 XOR 0x37FA = 0x3412)
│ │ └─ Masking key
│ └─ MASK=1, Payload len=2
└─ FIN=1, opcode=0x8 close (0x88)
Server → Client (unmasked close frame):
Byte: 0x88 0x02 0x03 0xE8
──── ──── ────────────
│ │ └─ Status code 1000 (0x03E8)
│ └─ MASK=0, Payload len=2
└─ FIN=1, opcode=0x8 close (0x88)
After both sides exchange close frames, the underlying TCP connection is terminated.
Summary#
The full exchange:
Client Server
│ │
│──── HTTP GET (Upgrade: websocket) ──────────▶│
│◀─── HTTP 101 Switching Protocols ────────────│
│ │
│ ---- WebSocket frames over same TCP conn --- │
│ │
│──── [0x81 0x8F ...] text "hello websocket" ─▶│ (masked, 21 bytes)
│◀─── [0x81 0x0F ...] text "hello websocket" ──│ (unmasked, 17 bytes)
│ │
│──── [0x88 0x82 ...] close (1000) ───────────▶│ (masked)
│◀─── [0x88 0x02 ...] close (1000) ────────────│ (unmasked)
│ │
│──── TCP FIN ────────────────────────────────▶│
│◀─── TCP FIN ─────────────────────────────────│
Why mask client→server only? The mask prevents cache-poisoning attacks where a malicious client crafts payloads that look like valid HTTP responses to confuse intermediary proxies. Since servers are trusted, server→client frames don’t need masking.
Common Close Status Codes#
| Code | Meaning |
|---|---|
| 1000 | Normal closure |
| 1001 | Going away (e.g. server shutdown, page navigated away) |
| 1002 | Protocol error |
| 1003 | Unsupported data type |
| 1006 | Abnormal closure (no Close frame received) |
| 1011 | Unexpected server error |
Closing Handshake#
Either side can initiate closure by sending a Close frame (opcode 0x8):
1. Client sends Close frame (status code + optional reason)
2. Server responds with Close frame
3. TCP connection is terminated
Common status codes:
| Code | Meaning |
|---|---|
| 1000 | Normal closure |
| 1001 | Going away (e.g. server shutdown, page navigated away) |
| 1002 | Protocol error |
| 1003 | Unsupported data type |
| 1006 | Abnormal closure (no Close frame received) |
| 1011 | Unexpected server error |
Usage & Use Cases#
When to Use WebSocket#
- Chat / messaging: bidirectional, low-latency message exchange
- Online gaming: real-time game state synchronization
- Collaborative editing: multiple users editing the same document
- Live dashboards: streaming metrics, stock tickers, sports scores
- AI agent workflows: multi-turn interactions with human-in-the-loop approval
When NOT to Use WebSocket#
| Scenario | Better Alternative | Why |
|---|---|---|
| Server pushes updates, client only listens | SSE | Simpler, auto-reconnect, works with HTTP/2 multiplexing |
| Standard request-response API | HTTP REST | No need for persistent connection |
| File upload / download | HTTP | WebSocket has no built-in content-type or caching |
| One-off data fetch | HTTP | Connection setup overhead not justified |
Comparison#
| Feature | WebSocket | SSE | HTTP Polling |
|---|---|---|---|
| Direction | Bidirectional | Server → Client | Client → Server (simulated) |
| Protocol | ws:// / wss:// |
HTTP | HTTP |
| Overhead per message | 2-14 bytes (frame header) | ~50 bytes (event format) | Full HTTP headers each time |
| Binary support | Yes | No (text only) | Yes |
| Auto-reconnect | No (manual) | Yes (built-in) | N/A |
| Connection limit (HTTP/1.1) | Browser-dependent | 6 per domain | 6 per domain |
| Proxy/firewall friendly | Sometimes blocked | Yes | Yes |
Hands-On Demo#
A minimal echo server and client in Go using gorilla/websocket:
Server (server.go):
package main
import (
"fmt"
"log"
"net/http"
"github.com/gorilla/websocket"
)
var upgrader = websocket.Upgrader{
CheckOrigin: func(r *http.Request) bool { return true },
}
func echo(w http.ResponseWriter, r *http.Request) {
conn, err := upgrader.Upgrade(w, r, nil)
if err != nil {
log.Println("upgrade:", err)
return
}
defer conn.Close()
for {
msgType, msg, err := conn.ReadMessage()
if err != nil {
log.Println("read:", err)
break
}
fmt.Printf("recv: %s\n", msg)
if err := conn.WriteMessage(msgType, msg); err != nil {
log.Println("write:", err)
break
}
}
}
func main() {
http.HandleFunc("/ws", echo)
log.Println("server started on :8080")
log.Fatal(http.ListenAndServe(":8080", nil))
}
Client (client.go):
package main
import (
"fmt"
"log"
"github.com/gorilla/websocket"
)
func main() {
conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:8080/ws", nil)
if err != nil {
log.Fatal("dial:", err)
}
defer conn.Close()
// Send a message
msg := "hello websocket"
if err := conn.WriteMessage(websocket.TextMessage, []byte(msg)); err != nil {
log.Fatal("write:", err)
}
fmt.Println("sent:", msg)
// Read the echo
_, reply, err := conn.ReadMessage()
if err != nil {
log.Fatal("read:", err)
}
fmt.Println("recv:", string(reply))
}
Run it:
# Terminal 1
go run server.go
# Terminal 2
go run client.go
# Output:
# sent: hello websocket
# recv: hello websocket
You can also test with wscat (a CLI WebSocket client):
# Install
npm install -g wscat
# Connect to the echo server
wscat -c ws://localhost:8080/ws
> hello
< hello