IPsec Protocol Explained: Tunnel Mode, ESP, AH, IKEv2 & VPN Security

What is IPsec?

IPsec (Internet Protocol Security) is a suite of protocols designed to secure IP communications by authenticating and encrypting every packet in a data stream. Unlike application-layer security protocols such as TLS, IPsec operates at Layer 3 (the network layer) of the OSI model. This means it can protect all traffic flowing between two endpoints transparently — applications do not need to be modified or even aware that encryption is taking place.

The IPsec framework was developed by the IETF (Internet Engineering Task Force) throughout the 1990s. The architecture was first specified in RFC 1825 (1995), revised in RFC 2401 (1998), and later replaced by RFC 4301 (2005). IPsec support was originally mandatory for IPv6 implementations and optional for IPv4; RFC 6434 (2011) relaxed the IPv6 requirement to a recommendation, and in practice IPsec is used extensively with both. The protocol suite consists of two core sub-protocols: AH (Authentication Header) for integrity and authentication, and ESP (Encapsulating Security Payload) for encryption plus authentication. In modern deployments, ESP is used almost exclusively because it provides both confidentiality and integrity.

IPsec uses a concept called Security Associations (SAs) to track the state of each secured connection. An SA is a unidirectional agreement between two peers that defines the cryptographic algorithms, keys, and parameters to use for protecting traffic. In the original architecture, each SA is uniquely identified by the triple of Security Parameters Index (SPI), destination IP address, and security protocol (AH or ESP); under RFC 4301, the SPI alone is usually sufficient for unicast SAs. Because SAs are unidirectional, a typical bidirectional IPsec connection requires a pair of SAs, one for each direction.
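As a rough illustration of the SA pair described above, the following sketch models an SA table keyed by the (SPI, destination, protocol) triple. The class name, field names, addresses, and key values are all hypothetical; a real implementation stores far more state per SA (lifetimes, sequence counters, mode, and so on).

```python
from dataclasses import dataclass
from typing import Dict, Tuple

ESP = 50  # IP protocol number for ESP
AH = 51   # IP protocol number for AH


@dataclass(frozen=True)
class SecurityAssociation:
    """One direction of a protected flow (field names are illustrative)."""
    spi: int        # Security Parameters Index
    dst: str        # destination IP address of this SA
    protocol: int   # ESP (50) or AH (51)
    cipher: str     # e.g. "AES-256-GCM"
    key: bytes      # keying material for this direction only


# The table is keyed by the identifying triple: (SPI, destination, protocol).
SaKey = Tuple[int, str, int]


def install_pair(table: Dict[SaKey, SecurityAssociation],
                 outbound: SecurityAssociation,
                 inbound: SecurityAssociation) -> None:
    """Install the two unidirectional SAs that make up one connection."""
    for sa in (outbound, inbound):
        table[(sa.spi, sa.dst, sa.protocol)] = sa


# Example values are placeholders (documentation address ranges, dummy keys).
table: Dict[SaKey, SecurityAssociation] = {}
install_pair(
    table,
    SecurityAssociation(0x1001, "203.0.113.2", ESP, "AES-256-GCM", b"k-out"),
    SecurityAssociation(0x2002, "198.51.100.1", ESP, "AES-256-GCM", b"k-in"),
)
```

Each direction has its own SPI and its own key, which is why one bidirectional connection shows up as two entries.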

IPsec is the foundation of most VPN (Virtual Private Network) implementations. It powers site-to-site VPNs connecting corporate offices, remote access VPNs for mobile workers, and cloud interconnects linking on-premises data centers to public cloud providers. Major platforms including Cisco IOS, strongSwan, Juniper Junos, AWS VPN Gateway, and Azure VPN Gateway all rely on IPsec for their VPN services.

IPsec Modes: Transport vs Tunnel

IPsec can operate in two distinct modes, each suited to different network architectures. The choice of mode determines which parts of the original IP packet are protected and how the packet is structured on the wire.

Transport Mode

In transport mode, IPsec protects only the payload of the original IP packet. The original IP header is left intact and unencrypted, with the IPsec header (AH or ESP) inserted between the IP header and the upper-layer protocol data (such as TCP or UDP). This mode is used for host-to-host communication where both endpoints are the actual source and destination of the traffic. A common example is securing communication between two servers on the same network segment.

Because transport mode preserves the original IP header, routers along the path can read the source and destination addresses normally. This makes transport mode slightly more efficient than tunnel mode since there is no overhead from an additional IP header. However, it also means that an observer can see which two hosts are communicating, even though the payload is encrypted.

Tunnel Mode

In tunnel mode, the entire original IP packet — including its header — is encrypted and encapsulated inside a new IP packet with a new outer header. The outer header contains the addresses of the IPsec tunnel endpoints (typically VPN gateways), while the inner header contains the actual source and destination addresses of the communicating hosts. This is the mode used for site-to-site VPNs and remote access VPNs.

Tunnel mode provides stronger privacy because an observer on the network between the two gateways can only see traffic flowing between the gateway IP addresses. The actual endpoints, port numbers, and protocol types of the inner traffic are all hidden inside the encrypted payload. This is why tunnel mode is the default and most widely deployed mode in IPsec VPN configurations.

In Transport Mode, only the payload is encrypted while the original IP header remains visible. In Tunnel Mode, the entire original packet is encapsulated and encrypted, with a new IP header prepended for routing.
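The difference between the two modes can be made concrete by listing the order of fields on the wire. The sketch below is purely descriptive: the function names and field labels are made up for illustration, and ESP is assumed as the protocol in both cases.

```python
from typing import List


def esp_transport_layout(upper: str = "TCP") -> List[str]:
    """Field order on the wire for ESP in transport mode."""
    return ["orig IP header", "ESP header", upper, "payload",
            "ESP trailer", "ESP ICV"]


def esp_tunnel_layout(upper: str = "TCP") -> List[str]:
    """Field order on the wire for ESP in tunnel mode."""
    return ["new IP header", "ESP header", "orig IP header", upper,
            "payload", "ESP trailer", "ESP ICV"]


def encrypted_fields(layout: List[str]) -> List[str]:
    """Everything between the ESP header and the ICV is encrypted."""
    start = layout.index("ESP header") + 1
    end = layout.index("ESP ICV")
    return layout[start:end]


for name, layout in (("transport", esp_transport_layout()),
                     ("tunnel", esp_tunnel_layout())):
    print(f"{name:9s} cleartext: {layout[0]} | encrypted: {encrypted_fields(layout)}")
```

In tunnel mode the original IP header falls inside the encrypted span, which is what hides the real endpoints from an observer between the gateways.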

IPsec Protocols: AH and ESP

IPsec defines two distinct protocols for protecting traffic, each identified by its own IP protocol number. While both provide authentication and integrity, they differ significantly in their capabilities, particularly regarding encryption and NAT compatibility.

Authentication Header (AH)

AH (IP protocol 51) provides data integrity and origin authentication for IP packets, but it does not provide encryption. AH computes an Integrity Check Value (ICV) over the IP header fields (excluding mutable fields like TTL), the AH header itself, and the payload. This means AH can detect any modification to the packet, including changes to the IP header — a property that ESP does not offer for the outer IP header.

However, AH's protection of the IP header is also its greatest weakness in modern networks. Because NAT (Network Address Translation) modifies the IP header, any packet protected by AH will fail integrity verification after passing through a NAT device. This fundamental incompatibility with NAT has made AH rarely used in practice.
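The NAT problem can be demonstrated with a simplified model of the AH computation. The sketch below zeroes the mutable IPv4 header fields and computes an HMAC over the result plus the payload. It is an approximation under stated assumptions: real AH also covers the AH header itself (with the ICV field zeroed), the key and addresses are placeholders, and HMAC-SHA-256 truncated to 16 bytes is just one possible integrity algorithm.

```python
import hashlib
import hmac
import socket
import struct


def ipv4_header(src: str, dst: str, ttl: int,
                proto: int = 51, total_len: int = 60) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left as zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len,      # version/IHL, DSCP/ECN, total length
        0x1234, 0,               # identification, flags/fragment offset
        ttl, proto, 0,           # TTL, protocol, header checksum
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def zero_mutable(header: bytes) -> bytes:
    """Zero the fields that routers may legitimately change in transit."""
    h = bytearray(header)
    h[1] = 0                     # DSCP/ECN
    h[6:8] = b"\x00\x00"         # flags and fragment offset
    h[8] = 0                     # TTL
    h[10:12] = b"\x00\x00"       # header checksum
    return bytes(h)


def ah_style_icv(key: bytes, header: bytes, payload: bytes) -> bytes:
    """Simplified AH-like ICV: HMAC over immutable header fields + payload."""
    mac = hmac.new(key, zero_mutable(header) + payload, hashlib.sha256)
    return mac.digest()[:16]


key, payload = b"demo-key", b"hello"
sent = ah_style_icv(key, ipv4_header("10.0.0.5", "198.51.100.1", ttl=64), payload)
# A router decrementing the TTL does not disturb the ICV...
after_hop = ah_style_icv(key, ipv4_header("10.0.0.5", "198.51.100.1", ttl=63), payload)
# ...but a NAT device rewriting the source address does.
after_nat = ah_style_icv(key, ipv4_header("203.0.113.7", "198.51.100.1", ttl=63), payload)
print(sent == after_hop, sent == after_nat)
```

The source and destination addresses are immutable from AH's point of view, so the rewritten address yields a different ICV and the receiver rejects the packet.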

Encapsulating Security Payload (ESP)

ESP (IP protocol 50) is the workhorse of IPsec. It provides confidentiality (encryption), data integrity, and origin authentication. ESP encrypts the payload (and the original IP header in tunnel mode) using symmetric ciphers like AES-CBC or AES-GCM, and appends an authentication tag (ICV) that covers the ESP header and the encrypted payload. Unlike AH, ESP does not protect the outer IP header, which is precisely what makes it compatible with NAT when used with NAT Traversal (NAT-T).

NAT-T (defined in RFC 3948) encapsulates ESP packets inside UDP port 4500, allowing them to pass through NAT devices that would otherwise drop IP protocol 50 packets. Virtually all modern IPsec implementations support NAT-T, and it is negotiated automatically during the IKE exchange when NAT is detected between the peers.
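A minimal sketch of the UDP encapsulation step follows. The helper names are hypothetical; the sketch assumes IPv4 (where the UDP checksum may be sent as zero) and relies on the convention that IKE messages sharing port 4500 are prefixed with four zero bytes, which can never be a valid ESP SPI.

```python
import struct

NAT_T_PORT = 4500
NON_ESP_MARKER = b"\x00\x00\x00\x00"  # prefix that marks IKE traffic on port 4500


def udp_encapsulate_esp(esp_packet: bytes,
                        src_port: int = NAT_T_PORT,
                        dst_port: int = NAT_T_PORT) -> bytes:
    """Prepend an 8-byte UDP header (checksum 0) to an ESP packet."""
    length = 8 + len(esp_packet)
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + esp_packet


def udp_decapsulate(datagram: bytes) -> bytes:
    """Strip the UDP header and return the inner bytes."""
    src, dst, length, _checksum = struct.unpack("!HHHH", datagram[:8])
    if NAT_T_PORT not in (src, dst):
        raise ValueError("not a NAT-T datagram")
    return datagram[8:length]


def classify(inner: bytes) -> str:
    """Tell IKE and ESP apart by the first four bytes after the UDP header."""
    return "IKE" if inner[:4] == NON_ESP_MARKER else "ESP"


# Placeholder ESP packet: SPI 0x1001, sequence number 1, dummy ciphertext.
esp = struct.pack("!II", 0x1001, 1) + b"ciphertext"
wire = udp_encapsulate_esp(esp)
print(classify(udp_decapsulate(wire)))
```

Because a NAT device only needs to rewrite the outer IP and UDP headers, the ESP packet inside is untouched and its ICV still verifies.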

Feature	AH (Protocol 51)	ESP (Protocol 50)
Encryption	No	Yes
Authentication	Yes	Yes
Outer IP Header Protection	Yes	No
NAT Compatible	No	Yes (with NAT-T)
Common Usage	Rare	Standard

IKE: Internet Key Exchange

Before IPsec can protect traffic, the two peers must agree on cryptographic algorithms, exchange keying material, and authenticate each other. This negotiation is handled by the Internet Key Exchange (IKE) protocol. IKEv2, defined in RFC 7296, is the current version and has largely replaced the older IKEv1 (RFC 2409). IKEv2 is simpler, more efficient, and more robust, requiring only two exchanges (four messages total) to establish an IPsec tunnel, compared to IKEv1's six or nine messages depending on the mode used.

IKE_SA_INIT Exchange

The first exchange consists of a single round trip (two messages). The initiator and responder negotiate cryptographic algorithms for protecting the IKE session itself, exchange Diffie-Hellman public values to generate a shared secret, and exchange nonces (random values) to ensure freshness and prevent replay attacks. After this exchange, both sides derive the IKE SA (Security Association) keys that will encrypt all subsequent IKE messages. Notably, the IKE_SA_INIT messages themselves are sent in the clear and carry no integrity protection of their own. Tampering is detected afterwards: in IKE_AUTH, each peer signs (or MACs) data that includes its own IKE_SA_INIT message, so any modification of the first exchange causes authentication to fail.
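The key agreement in this exchange can be sketched in a few lines. The example below is a toy under explicit assumptions: the prime and generator are NOT a real IKE Diffie-Hellman group (real deployments use standardized groups of 2048 bits or more, or elliptic curves), HMAC-SHA-256 stands in for the negotiated PRF, and the function names are invented. It follows the IKEv2 rule that the seed key is computed as SKEYSEED = prf(Ni | Nr, g^ir).

```python
import hashlib
import hmac
import secrets

# Toy parameters for illustration ONLY; far too small for real use.
P = 2**127 - 1   # a Mersenne prime, not a standardized IKE group
G = 3


def dh_keypair() -> tuple:
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 3) + 2      # value in [2, P - 2]
    return private, pow(G, private, P)


def skeyseed(ni: bytes, nr: bytes, shared: int) -> bytes:
    """SKEYSEED = prf(Ni | Nr, g^ir), with HMAC-SHA-256 as the assumed prf."""
    secret_bytes = shared.to_bytes((P.bit_length() + 7) // 8, "big")
    return hmac.new(ni + nr, secret_bytes, hashlib.sha256).digest()


# Each side generates a key pair and a nonce, then sends the public parts.
i_priv, i_pub = dh_keypair()
r_priv, r_pub = dh_keypair()
ni, nr = secrets.token_bytes(32), secrets.token_bytes(32)

# Both sides compute the same shared secret from the other's public value.
initiator_seed = skeyseed(ni, nr, pow(r_pub, i_priv, P))
responder_seed = skeyseed(ni, nr, pow(i_pub, r_priv, P))
print(initiator_seed == responder_seed)
```

Only the public values and nonces cross the wire; the fresh nonces ensure that even a reused Diffie-Hellman key pair produces different session keys.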

IKE_AUTH Exchange

The second exchange is another single round trip, but this time it is fully encrypted under the IKE SA established in the previous step. In IKE_AUTH, the peers authenticate their identities using X.509 certificates, pre-shared keys (PSK), or EAP. They also negotiate the parameters for the first Child SA, which is the actual IPsec SA that will protect user traffic. The Child SA negotiation fixes the encryption algorithm (e.g., AES-256-GCM), the integrity algorithm (omitted when an AEAD cipher such as GCM is used), and the traffic selectors (which subnets or hosts to protect). Unlike in IKEv1, SA lifetimes are not negotiated: each peer enforces its own lifetime policy and initiates rekeying when its limit is reached.

The IPsec architecture (RFC 4301) defines two important databases that IKEv2 works with: the Security Policy Database (SPD), which defines which traffic should be protected and how, and the Security Association Database (SAD), which stores the active SAs with their keys and parameters. When an outbound packet matches a policy in the SPD, the corresponding SA from the SAD is used to apply the configured protection; if no SA exists yet, IKE is asked to negotiate one. Additional Child SAs can be created within an existing IKE SA using the CREATE_CHILD_SA exchange, and SAs are rekeyed periodically to limit the amount of data encrypted under a single key.
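The outbound lookup described above can be sketched as a first-match search over an ordered policy list. Everything here is a simplified assumption: the `Policy` class and `outbound_decision` function are hypothetical, selectors are reduced to source and destination prefixes (real selectors also cover ports and protocols), and the SAD is a plain dictionary keyed by SPI.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Policy:
    src_net: str
    dst_net: str
    action: str                    # "PROTECT", "BYPASS", or "DISCARD"
    sa_spi: Optional[int] = None   # SA to use when the action is PROTECT


def lookup_policy(spd: List[Policy], src: str, dst: str) -> Policy:
    """Return the first matching policy; unmatched traffic is discarded."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for p in spd:
        if s in ipaddress.ip_network(p.src_net) and d in ipaddress.ip_network(p.dst_net):
            return p
    return Policy("0.0.0.0/0", "0.0.0.0/0", "DISCARD")


def outbound_decision(spd: List[Policy], sad: Dict[int, str],
                      src: str, dst: str) -> str:
    """Decide what to do with an outbound packet."""
    policy = lookup_policy(spd, src, dst)
    if policy.action != "PROTECT":
        return policy.action
    if policy.sa_spi in sad:
        return f"PROTECT with SA 0x{policy.sa_spi:x}"
    return "PROTECT (no SA yet: ask IKE to negotiate one)"


# Placeholder policies: protect site-to-site traffic, bypass everything else
# that originates from the local site.
spd = [
    Policy("10.1.0.0/16", "10.2.0.0/16", "PROTECT", sa_spi=0x1001),
    Policy("10.1.0.0/16", "0.0.0.0/0", "BYPASS"),
]
sad = {0x1001: "AES-256-GCM keys (placeholder)"}

print(outbound_decision(spd, sad, "10.1.5.5", "10.2.9.9"))
print(outbound_decision(spd, sad, "10.1.5.5", "8.8.8.8"))
```

Policy order matters in this sketch: the more specific PROTECT rule must precede the broader BYPASS rule, or site-to-site traffic would leave unencrypted.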

IKEv2 handshake: the IKE_SA_INIT exchange negotiates cryptographic parameters, then the IKE_AUTH exchange authenticates both parties and establishes the IPsec tunnel.

ESP Packet Structure

The ESP packet format is carefully designed to provide both confidentiality and integrity while supporting a variety of encryption algorithms and modes. Understanding the structure of an ESP packet helps clarify how IPsec protects data on the wire.

The packet begins with the Security Parameters Index (SPI), a 4-byte field that identifies which Security Association should be used to process the packet. The receiving host looks up the SPI in its SAD to find the correct decryption key and algorithm. Following the SPI is the Sequence Number, also 4 bytes, which provides anti-replay protection. The receiver maintains a sliding window of accepted sequence numbers and rejects any packet with a duplicate or out-of-window sequence number, preventing an attacker from capturing and retransmitting packets.
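The sliding-window check can be sketched with a bitmap, as below. This is a minimal model, assuming a 64-packet window and 32-bit sequence numbers without wraparound handling; the class and method names are illustrative. A real receiver would also only advance the window after the packet's ICV has been verified.

```python
class ReplayWindow:
    """Bitmap-based anti-replay window for ESP sequence numbers."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set means (highest - i) was already seen

    def check_and_update(self, seq: int) -> bool:
        """Return True if the packet is acceptable, and record it."""
        if seq == 0:
            return False                       # ESP sequence numbers start at 1
        if seq > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False                       # too old: left of the window
        if self.bitmap & (1 << offset):
            return False                       # duplicate: replayed packet
        self.bitmap |= 1 << offset             # late but new: accept and mark
        return True


window = ReplayWindow()
for seq in (1, 3, 2, 2, 100, 36, 37):
    print(seq, window.check_and_update(seq))
```

Out-of-order packets inside the window (such as 2 arriving after 3) are accepted once, which is what lets the scheme tolerate reordering on the network without accepting replays.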

Next comes the Initialization Vector (IV), whose length depends on the negotiated cipher. For AES-CBC, the IV is 16 bytes; for AES-GCM, it is 8 bytes (combined with a 4-byte implicit salt from the SA). The IV ensures that identical plaintext blocks produce different ciphertext, preventing pattern analysis. The Payload Data follows the IV and contains the actual encrypted content — either the original transport-layer data (in transport mode) or the entire original IP packet (in tunnel mode).
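For AES-GCM, the way the 8-byte IV and the 4-byte salt combine can be shown directly. The sketch below assumes the common practice of using a per-SA counter as the IV (any scheme works as long as an IV is never reused under the same key); the function names and the salt value are placeholders.

```python
import struct


def next_iv(counter: int) -> bytes:
    """Encode a per-SA counter as the 8-byte explicit IV sent in each packet."""
    return struct.pack("!Q", counter)


def gcm_nonce(salt: bytes, iv: bytes) -> bytes:
    """Build the 12-byte GCM nonce: 4-byte implicit salt + 8-byte explicit IV."""
    if len(salt) != 4 or len(iv) != 8:
        raise ValueError("salt must be 4 bytes and IV must be 8 bytes")
    return salt + iv


salt = b"\xde\xad\xbe\xef"   # placeholder; the real salt comes from the SA's keying material
nonce = gcm_nonce(salt, next_iv(1))
print(nonce.hex())
```

Only the 8-byte IV travels in the packet; the salt is known to both peers from the SA, so the full nonce never appears on the wire.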

The encrypted portion ends with the ESP trailer. Padding bytes align the plaintext to the cipher's block size (16 bytes for AES-CBC) and ensure that the Pad Length and Next Header fields end on a 4-byte boundary. The Pad Length field (1 byte) indicates how many padding bytes were added, allowing the receiver to strip them after decryption. The Next Header field (1 byte) identifies the protocol type of the encapsulated data (e.g., 6 for TCP, 17 for UDP, or 4 for an encapsulated IPv4 packet in tunnel mode). Padding, Pad Length, and Next Header are encrypted together with the payload. Finally, the Integrity Check Value (ICV) is appended at the very end, unencrypted. This authentication tag (12 bytes for HMAC-SHA1-96, 16 bytes for HMAC-SHA-256-128 or AES-GCM with a full-length tag) covers the ESP packet from the SPI through the Next Header field, enabling the receiver to verify that the packet has not been tampered with before attempting decryption.
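The padding arithmetic is easy to get wrong by one or two bytes, so a small worked sketch may help. The helper names are hypothetical; the code assumes a 16-byte block cipher by default and uses the default padding pattern of 1, 2, 3, and so on. Encryption itself is left out: the functions operate on the plaintext that would be fed to the cipher.

```python
from typing import Tuple


def build_esp_trailer(payload_len: int, next_header: int,
                      block_size: int = 16) -> bytes:
    """Return Padding + Pad Length + Next Header for the given payload size."""
    align = max(block_size, 4)
    # Payload + padding + the two 1-byte fields must be a multiple of `align`.
    pad_len = (-(payload_len + 2)) % align
    padding = bytes(range(1, pad_len + 1))     # default pattern 1, 2, 3, ...
    return padding + bytes([pad_len, next_header])


def strip_esp_trailer(plaintext: bytes) -> Tuple[bytes, int]:
    """Split decrypted ESP plaintext into (payload, next_header)."""
    pad_len, next_header = plaintext[-2], plaintext[-1]
    payload = plaintext[: len(plaintext) - 2 - pad_len]
    return payload, next_header


payload = b"A" * 20                           # e.g. a 20-byte TCP segment
trailer = build_esp_trailer(len(payload), next_header=6)
plaintext = payload + trailer                 # this is what gets encrypted
print(len(trailer) - 2, len(plaintext))       # padding bytes, total length
```

For the 20-byte payload, 10 padding bytes are needed so that 20 + 10 + 2 = 32 bytes, a whole number of AES blocks.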

IPsec vs TLS

IPsec and TLS are both widely deployed security protocols, but they operate at different layers of the network stack and serve different use cases. IPsec secures traffic at the network layer, making it invisible to applications, while TLS secures individual connections and sits between the transport layer and the application. The choice between them depends on the scope of protection needed and the network architecture.

Feature	IPsec	TLS
OSI Layer	Layer 3 (Network)	Above Layer 4 (between Transport and Application)
Scope	All IP traffic on tunnel	Per-connection
Transparency	Apps need no changes	Apps must use TLS API
Performance	Hardware offload common	Mostly software, accelerated by CPU AES instructions
NAT Traversal	Requires NAT-T encapsulation	Works naturally
Authentication	Certificates or PSK via IKE	Certificates via handshake
Primary Use Case	VPNs and site-to-site	Web, APIs, and email

In practice, the two protocols are often complementary rather than competing. A remote worker might connect to a corporate network over an IPsec VPN, and then access internal web applications over HTTPS (which uses TLS). The IPsec tunnel protects the network path, while TLS provides end-to-end encryption for sensitive application data. Some organizations use both layers simultaneously for defense in depth.

Common Use Cases for IPsec

Site-to-site VPN: connecting two or more office networks over the public internet with encrypted tunnels between gateway devices, allowing seamless communication between private subnets
Remote access VPN: enabling individual users to securely connect to a corporate network from any location using IKEv2-based VPN clients on laptops and mobile devices
Cloud interconnect: establishing encrypted tunnels between on-premises data centers and cloud providers using services like AWS VPN Gateway, Azure VPN Gateway, and GCP Cloud VPN, all of which use IPsec under the hood
GRE over IPsec: combining Generic Routing Encapsulation (GRE) for multicast and routing protocol support with IPsec for encryption, commonly used in enterprise WAN architectures
IPv6 networks: IPsec support was designed into IPv6 from the start (originally as a mandatory-to-implement feature, now a recommendation) and can secure host-to-host communication natively, without the NAT complications common in IPv4 environments
DMVPN (Dynamic Multipoint VPN): a Cisco technology that uses IPsec with GRE and NHRP to create dynamic, scalable VPN topologies where spoke sites can communicate directly without routing traffic through a central hub

Frequently Asked Questions About IPsec

What is the difference between IPsec and TLS/SSL?

IPsec operates at the network layer (Layer 3) and secures all IP traffic between two endpoints, regardless of the application. TLS operates at the transport/application layer and secures individual connections on a per-application basis. IPsec is typically used for VPNs where you need to protect all traffic flowing between two networks or between a user and a network. TLS is used to secure specific application protocols like HTTPS, SMTPS, and database connections. IPsec requires no changes to applications but needs system-level or gateway-level configuration, while TLS must be integrated into each application but works seamlessly through NAT and firewalls.

What is a Security Association (SA)?

A Security Association is a one-way relationship between two IPsec peers that defines how traffic should be protected. Each SA specifies the encryption algorithm (e.g., AES-256-GCM), the authentication method, the cryptographic keys, the SA lifetime, and the mode (transport or tunnel). SAs are identified by three values: the Security Parameters Index (SPI), the destination IP address, and the protocol (AH or ESP). Because SAs are unidirectional, a pair of SAs is needed for bidirectional communication. SAs are stored in the Security Association Database (SAD) and are created and managed by the IKE protocol.

Does IPsec work with NAT?

IPsec ESP can work through NAT devices using NAT Traversal (NAT-T), defined in RFC 3948. NAT-T detects the presence of NAT during the IKE negotiation and automatically encapsulates ESP packets inside UDP on port 4500. This allows the packets to pass through NAT devices that would otherwise drop them because they cannot process IP protocol 50 (ESP) directly. AH, on the other hand, is fundamentally incompatible with NAT because it authenticates the IP header, which NAT modifies. This is one of the main reasons AH is rarely used in modern networks.

What is IKEv2 and how is it different from IKEv1?

IKEv2 (RFC 7296) is the current version of the Internet Key Exchange protocol used to negotiate IPsec Security Associations. Compared to IKEv1, it is significantly simpler and more efficient. IKEv2 establishes a tunnel in just four messages (two exchanges). IKEv1 needed six messages in Main Mode or three in Aggressive Mode (which sacrificed identity protection) for Phase 1 alone, plus three more Quick Mode messages for Phase 2, for a total of nine or six. IKEv2 also has built-in support for NAT-T, MOBIKE (allowing VPN clients to change IP addresses without re-establishing the tunnel), and EAP (Extensible Authentication Protocol) for flexible user authentication. IKEv2 is more reliable in handling network failures and supports automatic rekeying and dead peer detection natively.

Is IPsec used in IPv6?

Yes. IPsec was designed as an integral part of IPv6, and the original IPv6 node requirements made IPsec support mandatory to implement; RFC 6434 (2011) later relaxed this to a recommendation. AH and ESP are carried as extension headers in the IPv6 header chain. In practice, IPv6 networks benefit from IPsec because they typically do not use NAT (since the vast address space eliminates the need for address translation), removing one of the biggest deployment challenges IPsec faces in IPv4 environments. However, even when the capability was mandatory, IPv6 never required that all traffic be encrypted; that remains a policy decision for network administrators.

What is the difference between transport mode and tunnel mode?

Transport mode encrypts only the payload of the IP packet while leaving the original IP header intact. It is used for direct host-to-host communication where both endpoints are the actual source and destination. Tunnel mode encrypts the entire original IP packet and wraps it in a new IP header with the tunnel endpoint addresses. It is used for VPN scenarios where traffic between two private networks is routed through encrypted gateways. Tunnel mode hides the original source and destination addresses from anyone observing traffic between the gateways, providing stronger privacy. The vast majority of IPsec deployments use tunnel mode.

Related Protocols

TLS: the transport-layer security protocol, commonly compared to IPsec for securing network communication
SSL: the predecessor to TLS, now deprecated but historically significant in the evolution of network security
TCP: a transport protocol whose traffic is protected when it flows through IPsec tunnels; IKE itself normally runs over UDP (ports 500 and 4500), not TCP
UDP: used by IKE for key exchange on port 500 and by NAT-T for ESP encapsulation on port 4500
OSPF: a routing protocol frequently deployed over IPsec tunnels to exchange routes between sites in VPN architectures