This is an embarrassing response. The second lesson you should’ve learned as a systems engineer, long before any distributed stuff, is “turn off Nagle’s algorithm.” (The first being “it’s always DNS”.)
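For anyone who hasn't done it before, "turning off Nagle's algorithm" usually means setting the `TCP_NODELAY` socket option. A minimal sketch in Python, assuming a plain TCP client socket (the helper name `make_nodelay_socket` is made up for illustration):

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes right away
    # instead of holding them back to coalesce with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back to confirm it took effect (nonzero means enabled).
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```

With Nagle off, the application decides how writes are grouped, so lots of tiny `send()` calls can turn into lots of tiny packets.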
When the network is unreliable larger TCP packets ain’t gonna fix it.
Usually you control only one of the two: the network or what you send over it. If you run the whole network, sure, fix that instead. But if you don't, sending fewer, larger packets can actually improve the situation even if it doesn't fix it.
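One way to get "fewer, larger packets" from the application side is to coalesce small writes into a single send. This is only a rough sketch: the `BatchedWriter` class is hypothetical, and the 1400-byte threshold is an assumption chosen to sit under a typical 1500-byte Ethernet MTU, not a recommendation. The kernel still decides the actual segmentation.

```python
import socket


class BatchedWriter:
    """Buffer small writes and hand them to the socket in larger chunks (hypothetical)."""

    def __init__(self, sock: socket.socket, max_bytes: int = 1400) -> None:
        self._sock = sock
        self._max_bytes = max_bytes  # assumed threshold, tune for your path
        self._buf = bytearray()
        self.sends = 0  # number of sendall() calls, for illustration only

    def write(self, data: bytes) -> None:
        # Flush first if appending would push the buffer past the threshold.
        if self._buf and len(self._buf) + len(data) > self._max_bytes:
            self.flush()
        self._buf += data

    def flush(self) -> None:
        # Send everything buffered so far in one call.
        if self._buf:
            self._sock.sendall(bytes(self._buf))
            self.sends += 1
            self._buf.clear()
```

Usage would look like calling `write()` for each small message and `flush()` at the end of a logical batch, which trades a little latency for fewer trips through the network.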