PRELIMINARY RESULT Benchmark Suite v1.0.15 - Results subject to change

OpenSSL Performance Benchmark

Post-Quantum Cryptography

3 iterations per version
Last run: 2026-01-05 • Iterations: 1.1.1w (3x), 3.0.18 (3x), 3.1.8 (3x), 3.2.6 (3x), 3.3.5 (3x), 3.4.3 (3x), 3.5.4 (3x), 3.6.0 (3x)

Quantum-Resistant vs Classical Key Exchange

What This Chart Shows

ML-KEM-768 (purple bars) is a post-quantum cryptographic algorithm designed to resist attacks from quantum computers. It's compared against ECDH P-256 and P-384 (green/yellow bars), which are the classical algorithms used today but vulnerable to quantum attacks.

Key Insight: Higher bars = more key exchanges per second. ML-KEM provides quantum resistance with competitive performance!

Important: This chart measures key exchange operations (establishing shared secrets for encryption). This is different from digital signature operations (signing/verifying) shown on the Schmatz page. While both ECDH and ECDSA use elliptic curves, they perform fundamentally different cryptographic operations and their performance metrics are not directly comparable.

QUANTUM RESISTANT

ML-KEM-768 (Post-Quantum)

Security: Resistant to quantum computer attacks

Algorithm: Lattice-based cryptography (CRYSTALS-Kyber)

Standard: NIST FIPS 203 (August 2024)

Key Size: 1,184 bytes public key

Use Case: Future-proof key exchange

QUANTUM VULNERABLE

ECDH P-256 (Classical)

Security: Secure today, vulnerable to quantum

Algorithm: Elliptic curve cryptography

Standard: NIST P-256 curve

Key Size: 65 bytes public key (uncompressed point; 32-byte coordinates)

Use Case: Current standard for TLS

QUANTUM VULNERABLE

ECDH P-384 (Classical)

Security: Secure today, vulnerable to quantum

Algorithm: Elliptic curve cryptography

Standard: NIST P-384 curve

Key Size: 97 bytes public key (uncompressed point; 48-byte coordinates)

Use Case: High-security applications today

Key Takeaways

Performance is Production-Ready

ML-KEM-768 is competitive with classical ECDH algorithms. The performance overhead is minimal—often faster than ECDH P-384 and comparable to P-256.

Bottom line: You can adopt post-quantum cryptography without significant performance penalties.

The Real Tradeoff: Bandwidth, Not Speed

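ML-KEM-768 adds ~2 KB per TLS handshake: a 1,184-byte public key from the client plus a 1,088-byte ciphertext from the server (2,272 bytes), vs. a 65-byte uncompressed point in each direction (130 bytes) for ECDH P-256.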

Let's do the math for different scenarios:

Scenario | New Connections/sec | Extra Bandwidth | Per Day
[HIGH] E-commerce Peak (Black Friday, major sale) | 40,000 | 640 Mbps (80 MB/sec) | 6.9 TB
[HIGH] Busy CDN Edge (Major content distributor) | 10,000 | 160 Mbps (20 MB/sec) | 1.7 TB
[MEDIUM] Popular Website (News site, SaaS platform) | 1,000 | 16 Mbps (2 MB/sec) | 173 GB
[LOW] Typical Website (Small business, blog) | 100 | 1.6 Mbps (200 KB/sec) | 17 GB
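
The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes that "~2 KB" means 2,000 bytes (decimal units), which is what the table's numbers imply; the function and constant names are illustrative only.

```python
# Assumption: ~2 KB of extra handshake data is taken as 2,000 bytes,
# and MB/GB/TB are decimal units, matching the table's figures.
EXTRA_BYTES_PER_HANDSHAKE = 2_000
SECONDS_PER_DAY = 86_400


def extra_bandwidth(new_connections_per_sec: int) -> dict:
    """Extra sustained bandwidth caused by the larger key exchange."""
    bytes_per_sec = new_connections_per_sec * EXTRA_BYTES_PER_HANDSHAKE
    return {
        "mbps": bytes_per_sec * 8 / 1e6,          # megabits per second
        "mb_per_sec": bytes_per_sec / 1e6,        # megabytes per second
        "tb_per_day": bytes_per_sec * SECONDS_PER_DAY / 1e12,
    }


for rate in (40_000, 10_000, 1_000, 100):
    r = extra_bandwidth(rate)
    print(f"{rate:>6} conn/s: {r['mbps']:.1f} Mbps, "
          f"{r['mb_per_sec']:.1f} MB/s, {r['tb_per_day']:.3f} TB/day")
```

Only new connections count here; requests served over an existing connection add nothing to these totals.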

When ML-KEM Bandwidth Becomes a Problem:

  • High-traffic sites: Extra 100s of Mbps to multi-Gbps bandwidth costs real money (at $0.05-0.15/GB for cloud egress, 6.9 TB/day = $345-1,035/day)
  • DDoS amplification: Connection floods now consume roughly 17x more key-exchange bytes per handshake (2,272 bytes vs. 130 bytes for ECDH P-256)
  • Mobile networks: 2G/3G connections with limited bandwidth budgets
  • Satellite/IoT: Expensive per-byte costs (satellite can be $5-50/MB)
  • Geographic regions: Countries with expensive or limited internet infrastructure

Context: For most sites, ML-KEM handshake overhead is still <1% of total bandwidth (images, videos, and application data dominate). But for the busiest sites processing tens of thousands of new connections per second, this is hundreds of Mbps to Gbps of additional sustained bandwidth cost.

Latency Impact: How 2 KB Affects Page Load Times

Bandwidth isn't just about cost—it's about user experience. That extra 2 KB must be transmitted during the TLS handshake, adding latency before your application data can flow.

Transmission time for 2 KB by network type:

Network Type | Typical Speed | Extra Latency | Impact
[FAST] Fiber/Cable Broadband | 100 Mbps | +0.16 ms | Imperceptible
[FAST] 5G (Sub-6 GHz) | 150 Mbps | +0.11 ms | Imperceptible
[MODERATE] LTE / Fast 4G | 30 Mbps | +0.5 ms | Barely noticeable
[MODERATE] Average 4G | 10 Mbps | +1.6 ms | Minor
🟠 Slow 4G / Rural | 3 Mbps | +5.5 ms | Noticeable on slow sites
[SLOW] 3G / HSPA+ | 1.5 Mbps | +11 ms | Noticeable delay
[SLOW] 2G / EDGE | 250 Kbps | +65 ms | Significant delay
[SLOW] Satellite / High Latency | 2 Mbps | +8 ms | Adds to existing latency (500-700ms typical)
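
The extra latency column is the transmission time of the additional bytes at each link speed. A minimal sketch, assuming 2 KB is taken as 2,048 bytes (which matches the table's rounding) and counting serialization delay only; extra round trips, TCP slow start, and packet loss are not modeled.

```python
# Assumption: 2 KB = 2,048 bytes; link speeds use decimal megabits.
EXTRA_BYTES = 2 * 1024


def extra_latency_ms(link_mbps: float, extra_bytes: int = EXTRA_BYTES) -> float:
    """Time in milliseconds to transmit extra_bytes over a link of link_mbps."""
    return extra_bytes * 8 / (link_mbps * 1e6) * 1000


for name, mbps in [
    ("Fiber/Cable Broadband", 100),
    ("Average 4G", 10),
    ("3G / HSPA+", 1.5),
    ("2G / EDGE", 0.25),
]:
    print(f"{name:<22} {mbps:>6} Mbps  +{extra_latency_ms(mbps):.2f} ms")
```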

Real-World Latency Impact by Region:

  • Developing markets: Where 2G/3G is still common (parts of Africa, rural Asia, Latin America), the extra 10-65ms is felt on every new connection
  • Mobile-first regions: In markets where mobile is primary internet access (India, Southeast Asia), slow 4G/3G means 5-11ms added latency
  • Rural areas globally: Limited infrastructure = slower connections = more noticeable delays
  • Network congestion: When towers are overloaded (concerts, stadiums, emergencies), effective bandwidth drops and latency multiplies

Mobile User Experience Impact:

For users on fast connections (LTE+, broadband), the <2ms latency is imperceptible. But for the 2+ billion users still on 2G/3G or slow 4G connections, ML-KEM adds meaningful delay to every new connection. Combined with typical mobile latency (50-200ms), this compounds the "slow web" problem in bandwidth-constrained regions.

Mitigation Strategies:

  • TLS session resumption: Reuse sessions to avoid repeated handshakes (latency hit only on first connection)
  • Connection pooling: Keep connections alive longer (HTTP keep-alive, i.e. the Connection: keep-alive header)
  • CDN edge nodes: Place content closer to users to reduce base latency
  • Hybrid mode (X25519+MLKEM768): Slightly larger, but the connection stays secure as long as at least one of the two algorithms remains unbroken
  • Gradual rollout: Start with fast-connection markets, delay adoption for 2G/3G-heavy regions

Why This Matters: The Quantum Threat

"Harvest Now, Decrypt Later" attacks: Adversaries are capturing encrypted traffic today to decrypt once quantum computers become available.

Timeline: While large-scale quantum computers don't exist yet, the cryptographic community recommends migrating now, as infrastructure changes take years.

Your data: If it needs protection beyond 2030, consider post-quantum cryptography today.

Real-World Impact: When Does Key Exchange Happen?

Understanding "Operations Per Second"

Key exchange is NOT performed on every HTTP request. It only happens during the initial TLS handshake when establishing a new connection:

  • First connection to a server (full handshake with key exchange)
  • After session expiration (typically hours or days later)
  • New browser tab/window (sometimes, depends on browser session cache)

What happens on every request: Only fast symmetric encryption (AES-256-GCM) using the keys established during the initial handshake. Modern browsers reuse TLS connections for multiple HTTP requests (HTTP keep-alive), so one key exchange can secure hundreds of requests.

How TLS Works (with ML-KEM or ECDH)

Initial TLS Handshake (happens once per connection):

  1. Key Exchange: ML-KEM-768 (or ECDH) establishes a shared secret between client and server
  2. Key Derivation: Both sides derive symmetric encryption keys from that shared secret
  3. Handshake Complete: Secure connection is ready for application data

Every HTTP Request After That:

  • Only symmetric encryption (AES-256-GCM, ChaCha20-Poly1305, etc.)
  • Uses the keys established in step 2 above
  • No more ML-KEM operations - key exchange is done
  • Very fast (symmetric crypto is ~1000x faster than asymmetric)

The Key Point: ML-KEM vs ECDH only affects the initial key exchange. Once you have symmetric keys, there's zero difference in ongoing performance between a connection established with ML-KEM vs ECDH.

So when we say "31,600 ML-KEM operations per second," we're talking about:

  • 31,600 new TLS connections per second
  • Each connection can then serve hundreds or thousands of requests using symmetric crypto
  • The ongoing requests are all the same speed regardless of whether ML-KEM or ECDH was used

This is why the performance overhead of ML-KEM is so minimal - it only affects the initial handshake, which is a tiny fraction of overall traffic!

Who Needs High Key Exchange Performance?

ML-KEM's performance matters most for high-traffic servers processing many NEW connections per second:

  • E-commerce sites during peak sales (thousands of new shoppers/second)
  • News sites during breaking events (traffic spikes)
  • API gateways and load balancers (many service-to-service connections)
  • CDN edge servers (serving millions of unique users)

Example: A server handling 10,000 concurrent users might only need 100-500 key exchanges per second (for new arrivals and expired sessions), while serving 50,000+ HTTP requests per second using existing connections. Both ML-KEM (31K ops/sec) and ECDH (16K ops/sec) easily handle this load.
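
To put the example in numbers, the sketch below computes what share of a single core's key exchange capacity such a load would use. The throughput figures are the rounded values quoted above (31K and 16K ops/sec); the function name is illustrative, and a real handshake also includes signature operations that are not counted here.

```python
# Rounded throughput figures quoted in the text (operations per second).
ML_KEM_OPS_PER_SEC = 31_000
ECDH_OPS_PER_SEC = 16_000


def utilization_percent(required_per_sec: float, capacity_per_sec: float) -> float:
    """Share of key exchange capacity consumed by the required rate."""
    return required_per_sec / capacity_per_sec * 100


for needed in (100, 500):
    print(f"{needed} key exchanges/sec: "
          f"ML-KEM {utilization_percent(needed, ML_KEM_OPS_PER_SEC):.1f}% busy, "
          f"ECDH {utilization_percent(needed, ECDH_OPS_PER_SEC):.1f}% busy")
```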

For typical websites: The ML-KEM performance is more than sufficient. The 1-2ms added to initial connection time is negligible compared to network latency (50-200ms) and is barely noticeable to end users.

Migration Recommendations

Recommended Approach: Hybrid Mode

OpenSSL 3.5+ supports hybrid key exchange that combines classical + post-quantum:

  • X25519MLKEM768 - X25519 + ML-KEM-768
  • SecP256r1MLKEM768 - ECDH P-256 + ML-KEM-768

Why hybrid? The connection stays secure as long as at least one of the two algorithms remains unbroken. ML-KEM provides the quantum resistance, while the battle-tested classical algorithm guards against an undiscovered flaw in the newer scheme.
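
The idea behind a hybrid exchange can be illustrated with a concatenation combiner: both shared secrets are joined and passed through a key derivation step, so an attacker has to recover both to learn the result. This is a conceptual sketch only, not how OpenSSL or TLS implements it internally. The two secrets are random placeholder bytes rather than real X25519 or ML-KEM outputs, and the function names and the all-zero salt are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """HKDF-Extract step (RFC 5869) using HMAC-SHA256."""
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def combine_hybrid(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive one key from two fixed-length (32-byte) shared secrets."""
    if len(classical_secret) != 32 or len(pq_secret) != 32:
        raise ValueError("this sketch expects two 32-byte secrets")
    # Fixed lengths keep the concatenation unambiguous.
    return hkdf_extract(b"\x00" * 32, pq_secret + classical_secret)


# Placeholders standing in for the outputs of the two key exchanges.
classical = os.urandom(32)
post_quantum = os.urandom(32)
print(combine_hybrid(classical, post_quantum).hex())
```

Because the derived key depends on every byte of both inputs, knowing only one of the two secrets does not reveal it.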

When to Migrate

Use Case | Recommendation | Timeline
Government/Defense | Start migration now | 2025-2026
Financial Services | Plan migration, test now | 2026-2027
Healthcare/Long-term Data | Evaluate and pilot | 2026-2028
General Web Services | Monitor and prepare | 2027-2030

🔬 ML-DSA (Dilithium) Rejection Sampling Analysis

What is Rejection Sampling?

Unlike classical signature algorithms like ECDSA, ML-DSA (Dilithium) uses rejection sampling during signature generation. The algorithm may need to retry internally if certain mathematical conditions aren't met. This is a critical security feature: it ensures that the signatures themselves do not leak information about the private key.

Why it matters for stress testing: Under high load, the retry mechanism can cause timing variance. Operations that require multiple retries take longer, potentially causing latency spikes.

🔄 How Dilithium Signing Works

  1. Generate random masking: The signer generates a random masking polynomial y
  2. Compute candidate: Calculate z = y + c·s (where c is the challenge, s is the secret key)
  3. Rejection check: If any coefficient of z is too large (accepting it would leak secret key information), REJECT and restart from step 1
  4. Success: On average, this takes roughly 4-5 attempts to produce a valid signature, depending on the parameter set
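
The loop in the steps above can be sketched with a toy, single-integer version of the scheme. This is not ML-DSA: real signing works on vectors of polynomials, derives the challenge from the message, and performs additional checks. GAMMA, BETA, and toy_sign are hypothetical names and values chosen only to make the retry behavior visible.

```python
import random

GAMMA = 1000  # hypothetical bound on the random mask y (toy value)
BETA = 200    # hypothetical bound on |c * secret| (toy value)


def toy_sign(secret: int, rng: random.Random) -> tuple[int, int]:
    """Return (z, attempts) for a toy scalar rejection-sampling loop."""
    if abs(secret) > BETA:
        raise ValueError("secret must satisfy |secret| <= BETA")
    attempts = 0
    while True:
        attempts += 1
        y = rng.randint(-GAMMA, GAMMA)   # step 1: random mask
        c = rng.choice((-1, 1))          # toy challenge (random here)
        z = y + c * secret               # step 2: candidate
        # Step 3: accept only if z lies in a range that every secret can
        # reach equally often, so accepted values do not depend on secret.
        if abs(z) <= GAMMA - BETA:
            return z, attempts           # step 4: success


rng = random.Random(2026)
counts = [toy_sign(150, rng)[1] for _ in range(10_000)]
print(f"mean attempts: {sum(counts) / len(counts):.2f}, max: {max(counts)}")
```

With these toy parameters roughly four out of five candidates are accepted, so the mean is far lower than in the real algorithm; the point is only that the attempt count varies from one signature to the next.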

This retry mechanism is fundamental to Dilithium's security—it ensures signatures don't leak information about the private key.

📊 Understanding the Metrics

Metric | Description | Why It Matters
CV% | Coefficient of Variation (stddev/mean × 100) | Normalized variance metric. >10% = significant retry activity
Outliers | Operations taking >2× the mean time | Multi-retry scenarios causing latency spikes
P99 | 99th percentile latency | Tail latency for capacity planning
P99.9 | 99.9th percentile (1 in 1,000) | Extreme tail latency for high-traffic systems
P99.99 | 99.99th percentile (1 in 10,000) | Worst-case latency for SLA guarantees
Max/Min | Ratio of slowest to fastest operation | Large ratio = high variance in retry counts

Interpreting Results

CV < 5%: Exceptionally stable
CV 5-10%: Normal variance
CV 10-20%: Moderate variance
CV > 20%: High variance - investigate
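
As an illustration of how these metrics relate to raw timings, the sketch below computes them from a list of per-operation latencies and applies the CV bands above. It is not the benchmark suite's own code. The use of the population standard deviation, the nearest-rank percentile method, and the handling of values that fall exactly on a band boundary are all assumptions.

```python
import math
import statistics


def summarize(latencies: list[float]) -> dict:
    """Compute CV%, outlier count, tail percentiles, and max/min ratio."""
    ordered = sorted(latencies)
    mean = statistics.fmean(ordered)
    stdev = statistics.pstdev(ordered)  # assumption: population stddev

    def percentile(p: float) -> float:
        # Nearest-rank method; rounding guards against float noise.
        rank = max(1, math.ceil(round(p * len(ordered) / 100, 9)))
        return ordered[rank - 1]

    return {
        "cv_percent": stdev / mean * 100,
        "outliers": sum(1 for x in ordered if x > 2 * mean),
        "p99": percentile(99),
        "p99_9": percentile(99.9),
        "p99_99": percentile(99.99),
        "max_min": ordered[-1] / ordered[0],
    }


def classify(cv_percent: float) -> str:
    """Map CV% to the bands listed above (boundary handling is assumed)."""
    if cv_percent < 5:
        return "Exceptionally stable"
    if cv_percent <= 10:
        return "Normal variance"
    if cv_percent <= 20:
        return "Moderate variance"
    return "High variance - investigate"


# Made-up sample: 99 fast operations and one slow one (microseconds).
sample = [100.0] * 99 + [300.0]
stats = summarize(sample)
print(stats, classify(stats["cv_percent"]))
```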

✅ Expected Behavior

ML-DSA (Dilithium) signing needs roughly 4-5 attempts per signature on average, depending on the parameter set. This is normal and expected. The benchmark measures whether this variability causes problematic latency spikes under real workloads.

Note: Verification is deterministic—it doesn't use rejection sampling, so verification timing should be very consistent (low CV%).
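
A rough model helps relate the average retry count to the tail percentiles. If each signing attempt is assumed to succeed independently with the same probability (a simplification), the number of attempts follows a geometric distribution, and the chance of needing more than n attempts is (1 - 1/mean)^n. The function name and the example mean of 4 attempts are illustrative.

```python
def prob_more_than(n_attempts: int, mean_attempts: float) -> float:
    """P(attempts > n) under a geometric model with the given mean."""
    p_accept = 1.0 / mean_attempts
    return (1.0 - p_accept) ** n_attempts


# Example: a mean of 4 attempts per signature (illustrative value).
for n in (8, 16, 32):
    print(f"P(more than {n} attempts) = {prob_more_than(n, 4.0):.6f}")
```

Under this model, rare signatures needing several times the average number of attempts are expected, which is why the P99.9 and P99.99 columns matter more for signing than for verification.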

🆚 ML-DSA vs ECDSA: Timing Characteristics

Algorithm | Signing Behavior | Timing Variance | Security Note
ML-DSA (Dilithium) | Rejection sampling with retries | Higher variance (expected) | Variance is a security feature
ECDSA (RFC 6979) | Deterministic, no retries | Very consistent timing | Constant-time implementation

🧪 Stress Testing Recommendations

To further analyze the retry mechanism under stress:

  • CPU contention: Run concurrent signing operations to see if contention affects retry rates
  • Extended duration: Longer test runs can catch rare high-retry edge cases
  • Compare security levels: ML-DSA-44 vs ML-DSA-65 vs ML-DSA-87 (the expected number of retries differs by parameter set)
  • Memory pressure: Test under memory constraints to see if it affects variance

Run the standalone test: ./scripts/test-mldsa-retry.sh 3.6.0

Licensed under Apache 2.0 • Community-driven development • v1.0.15