WebSocket Protocol

A full-duplex communication protocol that runs over a single TCP connection. WebSocket enables real-time, bidirectional data exchange between browsers and servers without the overhead of repeated HTTP requests.

Type: Application Layer
Port: 80 (ws) / 443 (wss)
Transport: TCP
Standard: RFC 6455

What is WebSocket?

WebSocket is a communication protocol defined in RFC 6455, published by the IETF in 2011. It provides full-duplex communication channels over a single TCP connection, allowing both the client and the server to send data at any time without waiting for the other side to initiate. The W3C standardized the WebSocket API for browsers alongside the protocol itself, giving JavaScript direct access to persistent, low-latency connections.

Before WebSocket existed, web applications that needed real-time updates had to rely on workarounds. HTTP polling sends repeated requests on a timer, wasting bandwidth when nothing has changed. Long polling holds a request open until the server has data, then immediately reopens another connection. Comet and streaming hacks pushed data through chunked HTTP responses, but they were fragile and inconsistent across browsers. All of these approaches fight against HTTP's request-response design. WebSocket was built from the ground up for bidirectional, persistent communication.

WebSocket is used in production by some of the largest platforms on the internet. Slack and Discord use it for real-time messaging. Google Docs and Figma rely on it for collaborative editing. Bloomberg Terminal and Robinhood use WebSocket connections to stream live financial quotes. Multiplayer games, live sports scoreboards, and notification systems all depend on it. The protocol runs on ports 80 and 443 (the same as HTTP and HTTPS), and it starts life as a normal HTTP request before upgrading. This makes it firewall-friendly and compatible with existing web infrastructure, including load balancers and reverse proxies like Nginx and HAProxy.

How WebSocket Works: The Upgrade Handshake

Every WebSocket connection begins as a standard HTTP request. The client sends an HTTP GET with two special headers: Upgrade: websocket and Connection: Upgrade. It also includes a Sec-WebSocket-Key header containing a base64-encoded 16-byte random value. This key is not for security. It exists to confirm that the server actually understands WebSocket and is not just a generic HTTP server that happens to return 101 by accident.
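As a rough sketch of the client side of this step, the following Python builds the key and the request. The function names `make_client_key` and `build_upgrade_request` are hypothetical helpers for illustration, not part of any library, and the request carries only the headers a minimal handshake needs.

```python
import base64
import os


def make_client_key() -> str:
    """Return a fresh Sec-WebSocket-Key: 16 random bytes, base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Assemble a minimal opening handshake request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # Header lines end with CRLF; an empty line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

A real client would generate a new key for every connection attempt and usually add an Origin header when running in a browser context.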

The server validates the upgrade request, concatenates the client's key with the magic GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, computes the SHA-1 hash, and base64-encodes the result. It sends this back as the Sec-WebSocket-Accept header in its 101 Switching Protocols response. Once the client verifies the accept value matches what it expects, the HTTP connection is upgraded to the WebSocket protocol. From this point on, both sides communicate using WebSocket binary frames instead of HTTP.
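The accept-value computation described above is short enough to sketch in Python. The name `compute_accept` is an illustrative choice; the GUID is the one quoted in the paragraph, and the sample key in the usage note is the one from the RFC 6455 example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(client_key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    # Concatenate key and GUID, hash with SHA-1, then base64-encode the digest.
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Calling `compute_accept("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. A client runs the same computation on the key it sent and compares the result with the header it received.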

The TCP connection established during the initial HTTP handshake stays open for the entire WebSocket session. There is no need to reconnect or re-authenticate for each message. Either side can send a message at any time, making the communication truly full-duplex. This is fundamentally different from HTTP, where only the client can initiate an exchange.

Figure: WebSocket connection lifecycle. The client sends GET /chat with Upgrade: websocket and the server answers HTTP/1.1 101 Switching Protocols; both sides then exchange text and binary frames in full duplex, with ping (opcode 0x9) and pong (opcode 0xA) as keepalive; the session ends with a close frame (code 1000), a close frame in response, and the TCP connection closing.

WebSocket Frame Structure

After the upgrade handshake, all data travels in WebSocket frames. Each frame has a compact binary header followed by the payload. The minimum header size is just 2 bytes for server-to-client frames, or 6 bytes for client-to-server frames (which include a 4-byte masking key). This is dramatically smaller than HTTP headers, which typically add hundreds of bytes per request.

The first byte of the frame contains the FIN bit and the opcode. The FIN bit indicates whether this is the final fragment of a message. If FIN is 0, more fragments will follow. The opcode identifies the frame type: 0x1 for text (UTF-8), 0x2 for binary, 0x0 for continuation frames (subsequent fragments), 0x8 for close, 0x9 for ping, and 0xA for pong. Three reserved bits (RSV1, RSV2, RSV3) are available for protocol extensions like permessage-deflate compression.
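To make the bit layout of the first byte concrete, here is a small Python sketch that splits it into its fields. `decode_first_byte` and the `OPCODES` table are illustrative names, and opcodes not listed in the paragraph above are simply labelled "reserved".

```python
OPCODES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


def decode_first_byte(b: int) -> dict:
    """Split the first header byte into FIN, RSV1-3, and opcode."""
    opcode = b & 0x0F  # low four bits
    return {
        "fin": bool(b & 0x80),   # highest bit
        "rsv1": bool(b & 0x40),
        "rsv2": bool(b & 0x20),
        "rsv3": bool(b & 0x10),
        "opcode": opcode,
        "type": OPCODES.get(opcode, "reserved"),
    }
```

For example, the byte 0x81 decodes to FIN=1 with opcode 0x1 (a complete text message in a single frame), while 0x01 would be the first fragment of a text message with more frames to follow.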

The second byte starts with the MASK bit, followed by a 7-bit payload length field. If the length value is 0-125, that is the actual payload size. If it is 126, the next 2 bytes contain the actual length as a 16-bit unsigned integer. If it is 127, the next 8 bytes contain the length as a 64-bit unsigned integer whose most significant bit must be 0. This encoding keeps small frames compact while supporting payloads up to 2^63 - 1 bytes. The MASK bit must be set to 1 for all client-to-server frames. When set, a 4-byte masking key follows the length field, and the payload is XOR-masked with this key. Masking exists to prevent proxy cache poisoning attacks where an attacker could trick intermediary caches into storing malicious content.
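The three length encodings can be sketched as a single Python function. `encode_length` is a hypothetical helper name; it returns the second header byte plus any extended length bytes, and leaves the masking key and payload to the caller.

```python
import struct


def encode_length(payload_len: int, masked: bool) -> bytes:
    """Encode the MASK bit and payload length part of a frame header."""
    if payload_len < 0 or payload_len >= 2**63:
        raise ValueError("payload length out of range")
    mask_bit = 0x80 if masked else 0x00
    if payload_len <= 125:
        # Short form: the 7-bit field holds the length itself.
        return bytes([mask_bit | payload_len])
    if payload_len <= 0xFFFF:
        # 126 signals a 16-bit big-endian length in the next 2 bytes.
        return bytes([mask_bit | 126]) + struct.pack("!H", payload_len)
    # 127 signals a 64-bit big-endian length in the next 8 bytes.
    return bytes([mask_bit | 127]) + struct.pack("!Q", payload_len)
```

A 5-byte client payload therefore encodes as the single byte 0x85, while a 65,536-byte payload needs nine bytes for this part of the header.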

Figure: WebSocket frame structure defined in RFC 6455, showing control bits, opcode, masking, payload length encoding, and the opcode values for text, binary, and control frames.

Header fields, in order:
FIN: 1 bit
RSV1, RSV2, RSV3: 1 bit each
Opcode: 4 bits
MASK: 1 bit
Payload Length: 7 bits
Extended Payload Length: 16 or 64 bits, present if payload length = 126 or 127
Masking Key: 32 bits, present only if MASK = 1 (client-to-server only)
Payload Data: variable length

Opcode reference: 0x0 Continuation, 0x1 Text, 0x2 Binary, 0x8 Close, 0x9 Ping, 0xA Pong.

Minimum frame size: 2 bytes (server-to-client) or 6 bytes (client-to-server, with masking key).

WebSocket Request and Response Examples

The WebSocket connection starts with an HTTP upgrade exchange. Below is a real-world example of the handshake headers, followed by a binary frame breakdown showing how a text message is encoded on the wire.

HTTP Upgrade Handshake

Client Request

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: http://example.com

Upgrade: websocket signals intent to switch protocols

Sec-WebSocket-Key is a random base64 value for handshake verification

Sec-WebSocket-Version: 13 identifies RFC 6455

Server Response

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

101 Switching Protocols confirms the upgrade was accepted

Sec-WebSocket-Accept is the SHA-1 hash of the client key + magic GUID, base64-encoded

Binary Frame Example: Text Message "Hello"

After the upgrade, messages are sent as binary frames. Here is a client-to-server text frame carrying the payload "Hello" (5 bytes), shown in hexadecimal.

Client Frame (Masked)

81 85 37 FA 21 3D 7F 9F 4D 51 58

Frame Header:

81 = FIN=1, opcode=0x1 (text frame)
85 = MASK=1, payload length=5

Masking Key:

37 FA 21 3D = 4-byte XOR masking key

Masked Payload:

7F 9F 4D 51 58 = "Hello" XOR-masked
Unmasked: 48 65 6C 6C 6F = H e l l o
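The masking step in this frame can be reproduced with a few lines of Python. `apply_mask` is an illustrative name; byte i of the payload is XORed with byte i mod 4 of the key, so the same function both masks and unmasks.

```python
def apply_mask(payload: bytes, key: bytes) -> bytes:
    """XOR each payload byte with the masking key, cycling through its 4 bytes."""
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))
```

With the key 37 FA 21 3D, `apply_mask(b"Hello", key)` yields 7F 9F 4D 51 58, and applying the function again to that result restores the original bytes.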

Server-to-Client Frame (Unmasked)

Server-to-client frames do not use masking, so they are simpler. A server sending "Hello" back would produce this frame:

Server Frame (Unmasked)

81 05 48 65 6C 6C 6F
81 = FIN=1, opcode=0x1 (text)
05 = MASK=0, length=5
48 65 6C 6C 6F = "Hello" in plain UTF-8
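Both example frames can be decoded by one routine. The sketch below is a minimal single-frame parser, assuming the whole frame is already in memory; `parse_frame` is a hypothetical name, and a production parser would also validate reserved bits, control-frame rules, and fragmentation.

```python
import struct
from typing import Tuple


def parse_frame(data: bytes) -> Tuple[bool, int, bytes]:
    """Return (fin, opcode, unmasked payload) for one complete frame."""
    fin = bool(data[0] & 0x80)
    opcode = data[0] & 0x0F
    masked = bool(data[1] & 0x80)
    length = data[1] & 0x7F
    offset = 2
    if length == 126:
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    key = b""
    if masked:
        key = data[offset:offset + 4]
        offset += 4
    payload = data[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    if masked:
        # Undo the XOR masking applied by the client.
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return fin, opcode, payload
```

Fed the masked client frame or the unmasked server frame shown above, the function returns the same result: FIN set, opcode 0x1, payload "Hello".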

WebSocket vs HTTP Polling, Long Polling, and SSE

WebSocket is one of several approaches for real-time communication between browsers and servers. Each has different trade-offs in terms of complexity, overhead, and browser support. The following table compares the four most common approaches.

Feature | WebSocket | HTTP Polling | Long Polling | SSE
Connection | Persistent, single TCP | New request each interval | Held open, then reconnects | Persistent, single HTTP
Direction | Full-duplex (both ways) | Client-initiated only | Client-initiated only | Server-to-client only
Overhead per Message | 2-14 byte frame header (2-6 bytes for payloads up to 125 bytes) | Full HTTP headers each time | Full HTTP headers each time | Small (event stream format)
Latency | Very low (instant) | High (polling interval) | Medium (reconnect delay) | Low (server push)
Browser Support | All modern browsers | Universal | Universal | All except IE (polyfill available)
Best Use Case | Chat, gaming, trading | Simple status checks | Infrequent updates | News feeds, notifications
Complexity | Moderate (connection mgmt) | Simple | Moderate | Simple

Ping, Pong, and Connection Health

WebSocket includes a built-in keepalive mechanism using ping and pong control frames. Either the client or the server can send a ping frame (opcode 0x9) at any time. The receiving side must respond with a pong frame (opcode 0xA) containing the same payload data. This confirms that the connection is still alive and the remote endpoint is responsive.
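As a small illustration of the echo rule, this Python sketch builds the pong a server would send in reply to a ping. `build_pong` is a hypothetical helper; it produces an unmasked frame, so a client would additionally have to set the MASK bit and mask the payload. The 125-byte limit reflects the RFC 6455 rule that control frames carry at most 125 payload bytes and are never fragmented.

```python
def build_pong(ping_payload: bytes) -> bytes:
    """Build an unmasked pong frame that echoes the ping's payload."""
    if len(ping_payload) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    # 0x8A = FIN=1 with opcode 0xA (pong); second byte = MASK=0 plus length.
    return bytes([0x8A, len(ping_payload)]) + ping_payload
```

An empty ping therefore produces the two-byte pong 8A 00.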

In practice, many WebSocket servers send pings on a regular interval, often every 30 to 60 seconds. If no pong comes back within a reasonable timeout, the server assumes the connection is dead and closes it. Browsers answer pings with pongs automatically, and the browser WebSocket API does not expose ping or pong frames, so client-side JavaScript does not need to implement pong logic. Periodic pings also generate traffic on otherwise idle connections, which keeps proxy servers and load balancers from terminating them on an idle timeout.
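The server-side timing logic can be sketched independently of any networking library. The `HeartbeatMonitor` class below is a hypothetical illustration: the caller passes in the current time and acts on the returned decision, and the 30-second interval and 10-second timeout are assumed defaults, not values from the specification.

```python
from typing import Optional


class HeartbeatMonitor:
    """Decide when to ping a peer and when to give up on it."""

    def __init__(self, now: float, interval: float = 30.0, timeout: float = 10.0) -> None:
        self.interval = interval
        self.timeout = timeout
        self.next_ping_at = now + interval
        self.pong_deadline: Optional[float] = None  # set while a ping is outstanding

    def tick(self, now: float) -> str:
        """Return 'ping', 'close', or 'wait' for the given time."""
        if self.pong_deadline is not None and now >= self.pong_deadline:
            return "close"  # the pong never arrived in time
        if self.pong_deadline is None and now >= self.next_ping_at:
            self.pong_deadline = now + self.timeout
            return "ping"
        return "wait"

    def on_pong(self, now: float) -> None:
        """Record a pong and schedule the next ping."""
        self.pong_deadline = None
        self.next_ping_at = now + self.interval
```

Passing the clock in as an argument keeps the logic easy to test; a real server would call `tick` from its event loop with a monotonic clock reading.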

The close handshake uses opcode 0x8. Either side can initiate it by sending a close frame with an optional status code and reason string. The other side responds with its own close frame, and then the underlying TCP connection is terminated. Common close codes include 1000 (Normal Closure), 1001 (Going Away, for example when a server is shutting down), 1002 (Protocol Error), and 1003 (Unsupported Data). Code 1006 (Abnormal Closure) is reserved: it is never sent in a close frame and is only reported locally when the connection drops without one. The close handshake ensures both sides have a chance to clean up resources gracefully.
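The body of a close frame is a 2-byte big-endian status code followed by an optional UTF-8 reason. The Python sketch below encodes and decodes that body; the function names are illustrative, and the set of rejected codes reflects the values RFC 6455 reserves for local reporting only (1005, 1006, and 1015).

```python
import struct
from typing import Tuple

# Reserved codes that must never appear on the wire in a close frame.
UNSENDABLE_CODES = {1005, 1006, 1015}


def build_close_payload(code: int = 1000, reason: str = "") -> bytes:
    """Encode a close frame body: status code plus optional reason."""
    if code in UNSENDABLE_CODES:
        raise ValueError(f"close code {code} is reserved for local use")
    body = struct.pack("!H", code) + reason.encode("utf-8")
    if len(body) > 125:
        raise ValueError("control frame payloads are limited to 125 bytes")
    return body


def parse_close_payload(payload: bytes) -> Tuple[int, str]:
    """Decode a close frame body into (code, reason)."""
    if not payload:
        return 1005, ""  # no status code was present in the frame
    if len(payload) < 2:
        raise ValueError("close payload must be empty or at least 2 bytes")
    (code,) = struct.unpack_from("!H", payload)
    return code, payload[2:].decode("utf-8")
```

A normal closure with no reason encodes to the two bytes 03 E8.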

WebSocket Subprotocols and Extensions

The WebSocket protocol itself only defines how to transport messages. It does not specify the format or meaning of the data being sent. For applications that need a structured message format, WebSocket supports subprotocol negotiation through the Sec-WebSocket-Protocol header. The client lists the subprotocols it supports in the upgrade request, and the server picks one and includes it in the 101 response. Common subprotocols include graphql-ws and graphql-transport-ws for GraphQL subscriptions, mqtt for MQTT over WebSocket, stomp for the STOMP messaging protocol, and wamp for the Web Application Messaging Protocol.
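Server-side selection can be sketched as a simple lookup. In this Python example, `select_subprotocol` is a hypothetical helper that takes the raw Sec-WebSocket-Protocol header value and a list of names the server supports; it assumes the client lists its subprotocols in order of preference and returns the first one the server also supports.

```python
from typing import Optional, Sequence


def select_subprotocol(offered_header: str, supported: Sequence[str]) -> Optional[str]:
    """Pick the first client-offered subprotocol that the server supports."""
    offered = [name.strip() for name in offered_header.split(",") if name.strip()]
    for name in offered:
        if name in supported:
            return name
    # No overlap: the server omits Sec-WebSocket-Protocol from its response.
    return None
```

For a client offering "graphql-transport-ws, graphql-ws" to a server that only supports graphql-ws, the function returns "graphql-ws".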

WebSocket also supports extensions through the Sec-WebSocket-Extensions header. The most widely used extension is permessage-deflate (RFC 7692), which compresses message payloads using the DEFLATE algorithm. This can reduce bandwidth usage significantly for text-heavy messages like JSON payloads. The extension is negotiated during the handshake, and once agreed upon, both sides can compress outgoing messages and decompress incoming ones transparently. Extensions modify the framing layer, which is why RSV bits are reserved in the frame header for extension use; permessage-deflate uses RSV1 to mark a compressed message.
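The payload transformation behind permessage-deflate can be approximated with Python's zlib module. This is a simplified sketch with hypothetical function names: it uses a fresh compressor per message (as if context takeover were disabled) and ignores negotiated window-size parameters. Per RFC 7692, the sender ends each message with a sync flush and strips the trailing 00 00 FF FF, and the receiver appends those four bytes again before inflating.

```python
import zlib

# Bytes emitted by a DEFLATE sync flush; removed on send, restored on receive.
SYNC_TAIL = b"\x00\x00\xff\xff"


def deflate_message(message: bytes) -> bytes:
    """Compress one message payload as raw DEFLATE without the sync tail."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)  # negative = raw DEFLATE
    data = compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)
    if not data.endswith(SYNC_TAIL):
        raise RuntimeError("unexpected DEFLATE output")
    return data[:-len(SYNC_TAIL)]


def inflate_message(data: bytes) -> bytes:
    """Restore the sync tail and decompress one message payload."""
    decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    return decompressor.decompress(data + SYNC_TAIL)
```

Repetitive JSON tends to compress well under this scheme, while very short messages may see little or no benefit.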

Key Features of WebSocket

  • Full-duplex over a single TCP connection: both client and server can send messages independently at any time, with no request-response pairing required.
  • Low overhead: the minimum frame header is just 2 bytes (server-to-client), compared to hundreds of bytes for HTTP headers on every request.
  • Binary and text message support: frames can carry UTF-8 text or raw binary data, making WebSocket suitable for everything from chat messages to video streams.
  • Firewall-friendly: WebSocket runs on ports 80 and 443, the same ports as HTTP and HTTPS, so it works through most firewalls and proxies without special configuration.
  • Subprotocol negotiation: the handshake can negotiate application-level protocols like graphql-ws, mqtt, or stomp on top of the WebSocket transport.
  • Extension support: features like permessage-deflate compression can be negotiated at the protocol level, reducing bandwidth without application code changes.
  • Built-in ping/pong keepalive: control frames for connection health monitoring are part of the protocol specification.
  • TLS encryption via wss://: WebSocket Secure (wss) runs over TLS on port 443, providing the same encryption guarantees as HTTPS.

Common Use Cases for WebSocket

  • Real-time chat: Slack, Discord, and Microsoft Teams use WebSocket to deliver messages instantly without polling.
  • Collaborative editing: Google Docs, Figma, and Notion use WebSocket connections to synchronize edits across multiple users in real time.
  • Live dashboards and monitoring: operations teams use WebSocket to stream server metrics, application logs, and alerting data to dashboards without page refreshes.
  • Financial trading platforms: Bloomberg Terminal, Robinhood, and cryptocurrency exchanges stream real-time price quotes and order book updates over WebSocket.
  • Multiplayer gaming: browser-based games use WebSocket for low-latency player position updates, game state synchronization, and real-time interactions.
  • Live sports scores and betting: sportsbook platforms push score updates, odds changes, and play-by-play data to thousands of concurrent users.
  • IoT device control panels: web-based interfaces for smart home devices and industrial equipment use WebSocket for responsive, bidirectional control.
  • Notification systems: web applications push alerts, badges, and toast notifications to users in real time without polling an API endpoint.

Frequently Asked Questions About WebSocket

What is the difference between WebSocket and HTTP?

HTTP is a request-response protocol where the client must initiate every exchange. The server cannot send data to the client unless the client asks for it first. WebSocket is a full-duplex protocol where both sides can send data independently after the initial handshake. HTTP creates a new connection (or reuses one via keep-alive) for each request-response pair, while WebSocket maintains a single persistent connection for the entire session. WebSocket also has much lower per-message overhead: a 2-byte frame header versus hundreds of bytes of HTTP headers.

Is WebSocket secure?

WebSocket itself does not mandate encryption, but the wss:// scheme runs the protocol over TLS, providing the same level of encryption as HTTPS. In production, you should always use wss:// instead of ws://. Beyond transport encryption, WebSocket does not define authentication or authorization mechanisms. Applications typically authenticate during the HTTP upgrade handshake using cookies or tokens in query parameters. Non-browser clients can also send custom headers, which the browser WebSocket API does not allow. Once the connection is established, the application layer is responsible for authorization of individual messages.

Do all browsers support WebSocket?

Yes. WebSocket has been supported in all major browsers since around 2012. Chrome, Firefox, Safari, Edge, and Opera all support the WebSocket API. Internet Explorer added support in version 10. Mobile browsers on iOS and Android also support WebSocket fully. There is no need for polyfills in modern web development. Outside the browser, Node.js has libraries like ws (and recent Node.js releases include a built-in WebSocket client), and native WebSocket support exists in Deno and Bun.

What is the difference between ws:// and wss://?

The ws:// scheme is unencrypted WebSocket, analogous to HTTP. The wss:// scheme is WebSocket over TLS, analogous to HTTPS. The ws:// scheme uses port 80 by default, while wss:// uses port 443. In practice, you should always use wss:// in production. Many browsers block mixed content, meaning a page served over HTTPS cannot open a ws:// connection. Using wss:// also prevents intermediary proxies from interfering with WebSocket traffic.
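The scheme-to-port mapping is easy to express in code. This Python sketch uses the standard library's `urlsplit`; `websocket_endpoint` is a hypothetical helper that returns the host, the effective port, and whether TLS is required.

```python
from typing import Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ws": 80, "wss": 443}


def websocket_endpoint(url: str) -> Tuple[str, int, bool]:
    """Return (host, port, use_tls) for a ws:// or wss:// URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url}")
    # An explicit port in the URL overrides the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname or "", port, parts.scheme == "wss"
```

For "wss://example.com/chat" this gives ("example.com", 443, True), while "ws://example.com:8080/chat" keeps its explicit port 8080.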

Can WebSocket work through proxies and firewalls?

WebSocket runs on ports 80 and 443, so it passes through most firewalls that allow standard web traffic. However, some older HTTP proxies that do not understand the Upgrade mechanism may break the handshake. Using wss:// (TLS) typically solves this problem because the proxy cannot inspect the encrypted traffic and simply tunnels it through. Modern reverse proxies like Nginx, HAProxy, and Cloudflare all support WebSocket natively. If you are behind a corporate proxy that strips Upgrade headers, wss:// through a CONNECT tunnel is usually the fix.

When should I use SSE instead of WebSocket?

Server-Sent Events (SSE) is a simpler choice when you only need server-to-client data flow. SSE uses a plain HTTP connection, supports automatic reconnection with event IDs, and works with standard HTTP/2 multiplexing. It is ideal for news feeds, stock tickers, and notification streams where the client only receives data. WebSocket is the better choice when you need bidirectional communication, such as chat applications, collaborative editing, or gaming. If your server needs to receive frequent messages from the client, WebSocket is the right tool. If the client only listens, SSE is simpler to implement and easier to scale behind standard HTTP infrastructure.

Related Protocols

  • HTTP: the request-response protocol that WebSocket upgrades from during the initial handshake
  • HTTPS: HTTP with TLS encryption, analogous to the wss:// WebSocket scheme
  • TCP: the transport protocol that WebSocket runs on, providing reliable, ordered delivery
  • TLS: the encryption layer used by wss:// for secure WebSocket connections
  • MQTT: a publish-subscribe messaging protocol that can run over WebSocket for browser-based IoT dashboards