In my discussion of per-packet versus per-destination load sharing, I've relied on the "accepted wisdom" that out-of-order TCP packets reduce session performance (as a side note, out-of-order UDP packets are a true performance killer; just try running NFS with out-of-order packets).
Today I've discovered another huge show-stopper: stateful firewalls (read: almost everything in use today) might just drop out-of-order packets, resulting in TCP timeouts and retransmissions (and repeated timeouts will totally wreck the session throughput). Here's how Cisco devices handle this problem:
- PIX allows three out-of-order packets per TCP session (the limit cannot be changed, but it should be enough).
- You can configure out-of-order packet handling on ASA with the queue-limit parameter of a tcp-map.
- Cisco IOS firewall (formerly known as CBAC) dropped out-of-order packets until release 12.4(11)T, which introduced the ip inspect tcp reassembly configuration command (it looks like the zone-based firewall configuration is not yet supported).
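
To make the difference between these behaviors concrete, here is a minimal Python sketch of a firewall that forwards TCP segments strictly in order and holds a limited number of out-of-order segments. It is a toy model, not how any Cisco device is actually implemented: the function name `inspect_tcp_stream`, the `queue_limit` parameter and the 100-byte segments are all assumptions made for illustration.

```python
def inspect_tcp_stream(segments, queue_limit, initial_seq=0):
    """Toy model of in-order TCP forwarding with a bounded reorder queue.

    segments:    list of (seq, length) tuples in arrival order
    queue_limit: how many out-of-order segments may be held at once
                 (hypothetical knob; 0 means "drop anything out of order")
    Returns (forwarded, dropped) as lists of sequence numbers.
    """
    expected = initial_seq   # next sequence number we can forward
    held = {}                # seq -> length of buffered out-of-order segments
    forwarded, dropped = [], []

    for seq, length in segments:
        if seq == expected:
            forwarded.append(seq)
            expected += length
            # Release buffered segments that have now become in-order.
            while expected in held:
                forwarded.append(expected)
                expected += held.pop(expected)
        elif seq > expected and len(held) < queue_limit:
            held[seq] = length
        else:
            # Queue is full (or the segment is older than expected):
            # in this simplified model the segment is dropped.
            dropped.append(seq)

    return forwarded, dropped


# Six 100-byte segments; the second one (seq 100) arrives last.
arrivals = [(0, 100), (200, 100), (300, 100), (400, 100), (500, 100), (100, 100)]

for limit in (0, 3, 4):
    fwd, drop = inspect_tcp_stream(arrivals, queue_limit=limit)
    print(f"queue_limit={limit}: forwarded={fwd} dropped={drop}")
```

With a queue limit of 0 every segment that arrives ahead of the missing one is dropped and has to be retransmitted by the sender; with a limit of 3 only the fourth early segment is lost; with a limit of 4 nothing is dropped in this particular arrival pattern.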