This page explains the technical reasons behind the performance differences observed in OpenSSL version 3.2 compared to 3.1. Understanding these changes helps system administrators and developers make informed decisions about which OpenSSL version to deploy.
How to Use This Page: Start with the Executive Summary for key takeaways, then dive into the technical details if you want to understand the root causes. The benchmark data table shows actual measurements from our test environment.
Version Analysis Series
This is part of a series analyzing performance changes across OpenSSL versions:
This page: OpenSSL 3.2 vs 3.1 — The performance recovery story
OpenSSL 3.2.0 (released November 23, 2023) delivered significant performance improvements over the 3.1.x series, marking a turning point in recovering the performance lost during the 3.0 architecture transition. This analysis examines the key code changes behind those gains: in our benchmarks, roughly 2.5× for RSA-based TLS handshakes and roughly 1.5× for ECDSA-based handshakes.
Commits between 3.1.0 and 3.2.0: ~2,500
TLS handshake improvement: +50% (ECDSA) to +150% (RSA)
OpenSSL 3.2.0 release: Nov 2023
Key Finding: The dramatic performance improvement comes from provider architecture optimizations that reduced per-operation overhead. OpenSSL 3.0/3.1 introduced significant dispatch overhead with the provider model; version 3.2 addressed the most critical bottlenecks.
The Primary Cause: Provider Architecture Optimization
OpenSSL 3.0 introduced a new "provider" architecture that added flexibility but also introduced significant per-operation overhead. The 3.2 release included focused optimizations to reduce this overhead.
The Core Improvement: Reduced per-operation provider dispatch overhead through better caching, algorithm fetching optimization, and streamlined context management.
What Was Slow in OpenSSL 3.0/3.1
The Provider Overhead Problem
Algorithm Fetching: Every cryptographic operation required looking up the algorithm implementation through the provider framework
Context Creation: Creating EVP contexts had significant overhead due to provider dispatch
Property Queries: The property-based algorithm selection system added string parsing overhead
Lock Contention: Provider operations required locking that hurt multi-threaded performance
What OpenSSL 3.2 Fixed
Improved Algorithm Caching: Better caching of fetched algorithm implementations
Major optimization effort, approaching 1.1.1w levels
3.4.x | ~100-110% | Continued optimization, often exceeds 1.1.1w
Key Insight: The ~2.5× improvement in RSA handshake throughput from 3.1 to 3.2 is primarily due to the accumulation of performance fixes addressing the provider overhead introduced in 3.0. This was a focused effort by the OpenSSL team in response to community performance concerns.
Data Source: These results are from our local benchmark runs comparing the 3.1.x and 3.2.x series.
Metric | OpenSSL 3.1.x | OpenSSL 3.2.x | Change
TLS 1.3 RSA Handshakes/sec | ~2,500 | ~6,200 | +148%
TLS 1.3 ECDSA Handshakes/sec | ~7,500 | ~11,500 | +53%
TLS 1.2 RSA Handshakes/sec | ~2,600 | ~6,600 | +154%
TLS 1.2 ECDSA Handshakes/sec | ~7,600 | ~11,300 | +49%
RSA-2048 Sign/sec | ~8,250 | ~8,350 | +1.2%
ECDSA P-256 Sign/sec | ~43,000 | ~43,500 | +1.2%
Analysis: Handshake throughput improves sharply (roughly +50% for ECDSA and +150% for RSA), while raw signing operations (RSA-2048, ECDSA P-256) change by only about 1%. This indicates that the gains come from the TLS stack and provider dispatch path rather than from the underlying crypto primitives.
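The Change column can be reproduced from the approximate throughput figures in the table. The short sketch below does that arithmetic; the helper names `percent_change` and `speedup` are ours, and the inputs are the rounded values shown above, so the results are approximate.

```python
# Approximate ops/sec copied from the benchmark table: (3.1.x, 3.2.x).
RESULTS = {
    "TLS 1.3 RSA handshakes": (2500, 6200),
    "TLS 1.3 ECDSA handshakes": (7500, 11500),
    "TLS 1.2 RSA handshakes": (2600, 6600),
    "TLS 1.2 ECDSA handshakes": (7600, 11300),
    "RSA-2048 sign": (8250, 8350),
    "ECDSA P-256 sign": (43000, 43500),
}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def speedup(before, after):
    """Throughput ratio, e.g. 2.0 means twice as many operations per second."""
    return after / before


for metric, (v31, v32) in RESULTS.items():
    print(f"{metric:26s} {percent_change(v31, v32):+7.1f}%  ({speedup(v31, v32):.2f}x)")
```

Note that a +148% change corresponds to a speedup of about 2.5×, since the ratio is 1 plus the fractional change.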
Major Feature Changes in OpenSSL 3.2
Beyond performance, OpenSSL 3.2 introduced several significant features:
Client-Side QUIC Support (RFC 9000)
Full client-side QUIC implementation with multiple streams
New APIs for QUIC connection and stream management
Foundation for future server-side QUIC (added in 3.5)
New in 3.2: OpenSSL 3.2 introduced client-side QUIC support (RFC 9000), marking the beginning of OpenSSL's QUIC journey. Server-side QUIC was later added in OpenSSL 3.5.
What is QUIC?
QUIC is a transport layer protocol originally designed by Google, where the name stood for "Quick UDP Internet Connections", and standardized by the IETF as RFC 9000, which treats QUIC as a name rather than an acronym. It fundamentally reimagines how secure connections work by combining the transport layer (like TCP) with the encryption layer (like TLS) into a single, optimized protocol.
API: Use SSL_accept_stream(), SSL_get_accept_stream_queue_len()
Why the phased approach? Server-side QUIC is significantly more complex:
Connection multiplexing: Servers must handle thousands of concurrent connections
Address validation: Servers need to prevent IP spoofing attacks (Retry tokens)
0-RTT anti-replay: Servers must track 0-RTT tickets to prevent replay attacks
Migration handling: Servers must gracefully handle clients changing IP addresses
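To make the address validation item more concrete, here is a minimal sketch of a stateless token that binds a client address to an issue time with an HMAC. RFC 9000 leaves the token format to the implementation, so everything here is an assumption for illustration: the secret, the 10-second lifetime, the token layout, and the function names `make_token` and `validate_token`. It is not how OpenSSL builds its Retry tokens.

```python
import hashlib
import hmac
import struct
import time

SERVER_SECRET = b"demo-secret-rotate-in-practice"  # hypothetical key
TOKEN_LIFETIME = 10  # seconds; illustrative value


def _mac(timestamp_bytes, client_addr):
    """HMAC-SHA256 over the issue time and the client's address."""
    return hmac.new(SERVER_SECRET, timestamp_bytes + client_addr.encode(),
                    hashlib.sha256).digest()


def make_token(client_addr, now=None):
    """Issue a token: 8-byte big-endian issue time followed by a 32-byte MAC."""
    issued = int(time.time() if now is None else now)
    ts = struct.pack("!Q", issued)
    return ts + _mac(ts, client_addr)


def validate_token(token, client_addr, now=None):
    """Accept only an unexpired token whose MAC matches the claimed address."""
    if len(token) != 8 + 32:
        return False
    ts, mac = token[:8], token[8:]
    if not hmac.compare_digest(mac, _mac(ts, client_addr)):
        return False
    issued = struct.unpack("!Q", ts)[0]
    current = time.time() if now is None else now
    return 0 <= current - issued <= TOKEN_LIFETIME


token = make_token("192.0.2.1:4433", now=1000)
print(validate_token(token, "192.0.2.1:4433", now=1005))     # same address, in time
print(validate_token(token, "198.51.100.7:4433", now=1005))  # different address
```

Because the server can recompute the MAC from the packet's source address, it does not need to keep per-client state until the address has been proven reachable.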
What is QLOG?
Added in OpenSSL 3.3: QLOG support enables deep protocol inspection and debugging for QUIC connections.
QLOG (QUIC Logging) is a structured, JSON-based logging format designed for QUIC and HTTP/3 and specified through the IETF QUIC working group. It captures detailed protocol events that are essential for debugging, performance analysis, and interoperability testing.
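As a rough illustration of what working with such a log looks like, the sketch below builds a few qlog-style events and counts them by name. The field names (`time`, `name`, `data`) and event names follow our reading of the qlog drafts and should be treated as assumptions; the sample events are invented and are not output from OpenSSL.

```python
import json
from collections import Counter

# Invented sample events, serialized one JSON object per line.
SAMPLE_LINES = [
    json.dumps({"time": 0.0, "name": "transport:packet_sent",
                "data": {"header": {"packet_type": "initial"}}}),
    json.dumps({"time": 31.5, "name": "transport:packet_received",
                "data": {"header": {"packet_type": "initial"}}}),
    json.dumps({"time": 32.1, "name": "transport:packet_sent",
                "data": {"header": {"packet_type": "handshake"}}}),
]


def count_events(lines):
    """Count events by their 'name' field, skipping blank lines."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[json.loads(line)["name"]] += 1
    return counts


counts = count_events(SAMPLE_LINES)
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")
```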
⚠️ The QUIC + Post-Quantum Challenge: MTU and Large Keys
Critical Issue: Post-quantum cryptography creates significant complications for QUIC due to UDP packet size constraints. This is an active area of research and concern in the cryptographic community.
QUIC was designed with specific assumptions about key sizes that post-quantum cryptography fundamentally challenges:
The Problem: Key Size Explosion
Algorithm | Public Key Size | Ciphertext Size | Total Key Share
X25519 (classical) | 32 bytes | 32 bytes | 64 bytes
P-256 (classical) | 65 bytes | 65 bytes | 130 bytes
ML-KEM-768 (PQC) | 1,184 bytes | 1,088 bytes | 2,272 bytes
X25519MLKEM768 (hybrid) | 1,216 bytes | 1,120 bytes | 2,336 bytes
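The hybrid row is the sum of its two components, which the following sketch checks using the sizes from the table. In a TLS 1.3 handshake the client sends the public key and the server replies with the ciphertext, so the "total" spans both directions. The dictionary layout and the helper name `combine` are ours.

```python
# Sizes in bytes, taken from the table above.
SIZES = {
    "X25519": {"public_key": 32, "ciphertext": 32},
    "P-256": {"public_key": 65, "ciphertext": 65},
    "ML-KEM-768": {"public_key": 1184, "ciphertext": 1088},
}


def combine(a, b):
    """Concatenating two key shares adds their sizes field by field."""
    return {field: a[field] + b[field] for field in a}


def total(share):
    """Client share (public key) plus server share (ciphertext)."""
    return share["public_key"] + share["ciphertext"]


hybrid = combine(SIZES["X25519"], SIZES["ML-KEM-768"])
print("X25519MLKEM768 client share:", hybrid["public_key"])
print("X25519MLKEM768 server share:", hybrid["ciphertext"])
print("X25519MLKEM768 total:", total(hybrid))
```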
QUIC's MTU Constraint
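QUIC Initial packets: Clients must send Initial packets in UDP datagrams of at least 1,200 bytes, and 1,200 bytes is the only datagram size an endpoint can assume the path supports (RFC 9000)
Typical UDP payload limit: ~1,472 bytes on a 1,500-byte Ethernet MTU over IPv4 (1,500 - 20 IP - 8 UDP header bytes)
Problem: The client's hybrid key share alone (1,216 bytes) is larger than a 1,200-byte datagram, and the client and server shares together total 2,336 bytes
Impact: The ClientHello must be split across multiple Initial packets, which can add latency and, in some cases, extra round trips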
Why This Matters for Performance
Classical QUIC Handshake
ClientHello + X25519 key share (~300 bytes)
↓ Fits in 1 packet ✓
1 RTT handshake
PQC QUIC Handshake (Challenge)
ClientHello + ML-KEM-768 key share (~1,450+ bytes)
↓ Exceeds MTU! Multiple packets needed
Potential fragmentation overhead
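A back-of-the-envelope estimate shows why the larger key share changes the packet count. The sketch below assumes a 1,200-byte datagram budget, about 50 bytes of per-packet QUIC overhead, and a base ClientHello of about 268 bytes (the ~300-byte classical figure minus its 32-byte X25519 share). The overhead and base size are illustrative assumptions, not measurements.

```python
import math

DATAGRAM_BUDGET = 1200     # bytes; minimum datagram size QUIC can rely on
PER_PACKET_OVERHEAD = 50   # assumed: packet header, CRYPTO frame header, AEAD tag
BASE_CLIENT_HELLO = 268    # assumed: ClientHello without any key share

# Client key share sizes in bytes (public keys from the table above).
CLIENT_KEY_SHARES = {
    "X25519": 32,
    "ML-KEM-768": 1184,
    "X25519MLKEM768": 1216,
}


def client_hello_size(key_share_bytes):
    """Estimated ClientHello size for a single offered key share."""
    return BASE_CLIENT_HELLO + key_share_bytes


def datagrams_needed(message_bytes):
    """How many budget-sized datagrams are needed to carry the message."""
    usable = DATAGRAM_BUDGET - PER_PACKET_OVERHEAD
    return math.ceil(message_bytes / usable)


for group, share in CLIENT_KEY_SHARES.items():
    size = client_hello_size(share)
    print(f"{group:16s} ClientHello ~{size} bytes -> {datagrams_needed(size)} datagram(s)")
```

Under these assumptions the classical ClientHello fits in one datagram, while either post-quantum option needs two.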
Current Solutions and Research
Approaches Being Explored
QUIC coalescing: Multiple QUIC packets in a single UDP datagram
Deferred key exchange: Send PQC key share in a subsequent packet
Smaller PQC algorithms: ML-KEM-512 is smaller but provides less security
Path MTU discovery: Negotiate larger MTU when available
Compression techniques: Research into compressible key formats
OpenSSL's Approach (3.5+): OpenSSL handles this by allowing multiple key shares and supporting HelloRetryRequest (HRR) fallback. If the client's initial key share is for a group the server does not accept, the server can request a different one. However, the retry costs an extra round trip, potentially negating QUIC's speed benefits.
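The latency trade-off of the HRR fallback can be seen in a simplified model of TLS 1.3 group negotiation. This is a sketch under stated assumptions: the server picks its most preferred mutually supported group, and a missing key share for that group costs one extra round trip. The function name `negotiate` is hypothetical, and real servers may apply different selection policies.

```python
def negotiate(client_supported, client_key_shares, server_preference):
    """Return (chosen_group, handshake_round_trips) for a simplified TLS 1.3 model.

    client_supported:  groups the client lists as supported
    client_key_shares: groups for which the client sent a key share up front
    server_preference: server's groups, most preferred first
    """
    mutual = [g for g in server_preference if g in client_supported]
    if not mutual:
        return None, 1  # no common group: the handshake fails
    chosen = mutual[0]
    if chosen in client_key_shares:
        return chosen, 1  # share already present: no retry needed
    return chosen, 2      # HelloRetryRequest: client resends with the right share


SERVER_PREFERENCE = ["X25519MLKEM768", "X25519"]

# Client sends only the small classical share but supports the hybrid group.
print(negotiate(["X25519MLKEM768", "X25519"], {"X25519"}, SERVER_PREFERENCE))

# Client sends both shares up front: larger ClientHello, but no retry.
print(negotiate(["X25519MLKEM768", "X25519"], {"X25519MLKEM768", "X25519"}, SERVER_PREFERENCE))
```

Sending several key shares avoids the retry at the price of a larger ClientHello, which is the same size pressure described in the MTU discussion.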
Testing QUIC: OpenSSL provides QUIC testing capabilities through its s_server and s_client tools with the -quic flag. For benchmarking, we measure the cryptographic operations that underpin QUIC connections (key exchange, packet protection) rather than network-dependent metrics.
Recommendation: For production systems requiring maximum performance, OpenSSL 3.2+ provides the best balance of modern features and performance. Organizations still on 3.0.x or 3.1.x should prioritize upgrading to capture these gains.
Why This Matters for Your Deployment
Understanding the performance changes between OpenSSL versions helps you make informed decisions: