UDPspeeder, optimised
A fork of UDPspeeder, an FEC tunnel that masks loss on lossy, high-latency links by sending redundant packets so the receiver can rebuild what gets dropped. The exercise here is to push the per-packet cost towards the theoretical floor: the unavoidable data movement plus the irreducible Reed-Solomon arithmetic, with as little else in between as possible. The hot paths are rewritten with SIMD (x86_64 AVX2/SSE4.2, ARM64 NEON, MIPS, PowerPC e500v2, RISC-V), selected at runtime, with a scalar fallback. On Linux it uses io_uring on receive and GSO batching on send.
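To make the idea of redundant packets concrete, here is a minimal sketch of the simplest possible scheme: one XOR parity packet over k data packets, which lets the receiver rebuild any single lost packet. This is not the Reed-Solomon code UDPspeeder uses (which tolerates several losses per group), and the names xor_into, make_parity and recover_one are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR len bytes of src into dst. */
static void xor_into(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] ^= src[i];
}

/* Sender side: build one parity packet over k equal-length data packets. */
static void make_parity(uint8_t *parity, const uint8_t *const *data,
                        size_t k, size_t len)
{
    memset(parity, 0, len);
    for (size_t j = 0; j < k; j++)
        xor_into(parity, data[j], len);
}

/* Receiver side: rebuild the one missing packet (index lost) from the
 * k-1 packets that arrived plus the parity packet. data[lost] is never read. */
static void recover_one(uint8_t *out, const uint8_t *const *data,
                        size_t k, size_t lost,
                        const uint8_t *parity, size_t len)
{
    assert(lost < k);
    memcpy(out, parity, len);
    for (size_t j = 0; j < k; j++)
        if (j != lost)
            xor_into(out, data[j], len);
}
```

The cost is one extra packet per group of k. Reed-Solomon generalises this to m redundant packets that together recover any m losses, at the price of finite-field multiplications instead of plain XOR.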
The numbers, on Linux: 970 Mbps without FEC and 690 Mbps with FEC+GSO. rs_encode is roughly 14x faster than the original, and addmul1 up to 15x. Windows has no GSO equivalent and reaches 193 Mbps without FEC and 122 Mbps with FEC via IOCP. Static binaries for Linux, OpenWrt musl and Windows x86_64 are available on the GitHub releases page.
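For readers unfamiliar with the kernel being measured, the following is a rough sketch of what a scalar "add-multiply" over GF(2^8) looks like, together with a runtime-selected function pointer and a scalar fallback. It is an illustration only, not this fork's code: the names gf256_mul, gf256_addmul_scalar, cpu_has_avx2 and select_addmul are hypothetical, the field polynomial 0x11d is an assumption, and no SIMD kernel is included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two GF(2^8) elements bit by bit, reducing by
 * x^8 + x^4 + x^3 + x^2 + 1 (0x11d). The polynomial is an assumption. */
static uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;                 /* addition in GF(2^8) is XOR */
        uint8_t carry = a & 0x80;
        a = (uint8_t)(a << 1);      /* multiply a by x */
        if (carry)
            a ^= 0x1d;              /* reduce modulo the field polynomial */
        b >>= 1;
    }
    return p;
}

/* dst[i] ^= c * src[i] for every byte: the inner loop of RS encoding. */
static void gf256_addmul_scalar(uint8_t *dst, const uint8_t *src,
                                uint8_t c, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (c == 0)
        return;                     /* multiplying by zero adds nothing */
    for (size_t i = 0; i < len; i++)
        dst[i] ^= gf256_mul(src[i], c);
}

typedef void (*addmul_fn)(uint8_t *, const uint8_t *, uint8_t, size_t);

/* Runtime CPU check; returns 0 on anything that is not x86_64 GCC/Clang. */
static int cpu_has_avx2(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* Pick a kernel once at startup. Pass NULL when no SIMD kernel was built;
 * the scalar version is always a valid fallback. */
static addmul_fn select_addmul(addmul_fn simd_kernel)
{
    if (simd_kernel != NULL && cpu_has_avx2())
        return simd_kernel;
    return gf256_addmul_scalar;
}
```

A SIMD version of this loop would typically replace the per-byte multiply with table lookups done on 16 or 32 bytes at a time, which is where speedups of the kind quoted above tend to come from.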
Built with Claude Opus 4.6.
Project page: https://slartibardfast.github.io/UDPspeeder/