A high-performance async Rust implementation of KCP - A Fast and Reliable ARQ Protocol built on top of Tokio.

https://github.com/leihuxi/rust-kcp A high-performance async Rust implementation of KCP - A Fast and Reliable ARQ Protocol built on top of Tokio. Features Async-First Design: Built from ground up for async/await with Tokio integration Zero-Copy: Efficient buffer management using the bytes crate Lock-Free Buffer Pool: High-performance memory management with crossbeam Connection-Oriented: High-level connection abstractions (KcpStream, KcpListener) Protocol Compatible: Compatible with original C KCP implementation Observability: Integrated tracing and metrics support Memory Efficient: Object pooling and buffer reuse Multiple Performance Modes: Normal, Fast, Turbo, Gaming presets Installation Add this to your Cargo.toml: [dependencies] kcp-tokio = "0.4" tokio = { version = "1.0", features = ["full"] } Quick Start Client use kcp\_tokio::{KcpConfig, KcpStream}; use tokio::io::{AsyncReadExt, AsyncWriteExt}; #[tokio::main] async fn main() -> Result<(), Box> { let config = KcpConfig::new().fast\_mode(); let mut stream = KcpStream::connect("127.0.0.1:12345".parse()?, config).await?; ``` // Send data stream.write_all(b"Hello, KCP!").await?; // Receive response let mut buffer = [0u8; 1024]; let n = stream.read(&mut buffer).await?; println!("Received: {}", String::from_utf8_lossy(&buffer[..n])); Ok(()) ``` } Server use kcp\_tokio::{KcpConfig, KcpListener}; use tokio::io::{AsyncReadExt, AsyncWriteExt}; #[tokio::main] async fn main() -> Result<(), Box> { let config = KcpConfig::realtime(); let mut listener = KcpListener::bind("127.0.0.1:12345".parse()?, config).await?; ``` println!("Server listening on 127.0.0.1:12345"); while let Ok((mut stream, addr)) = listener.accept().await { println!("New connection from {}", addr); tokio::spawn(async move { let mut buf = [0u8; 1024]; while let Ok(n) = stream.read(&mut buf).await { if n == 0 { break; } let _ = stream.write_all(&buf[..n]).await; } }); } Ok(()) ``` } Architecture ┌─────────────────────────────────────────────────────────────┐ │ Application Layer │ │ (User code using KcpStream/KcpListener) │ ├─────────────────────────────────────────────────────────────┤ │ High-Level API Layer │ │ KcpStream KcpListener │ │ (AsyncRead/AsyncWrite, TCP-like interface) │ ├─────────────────────────────────────────────────────────────┤ │ Protocol Core Layer │ │ KcpEngine │ │ (ARQ logic, congestion control, retransmission) │ ├─────────────────────────────────────────────────────────────┤ │ Common Layer │ │ KcpSegment, KcpHeader, BufferPool, Constants │ ├─────────────────────────────────────────────────────────────┤ │ Transport Layer │ │ Generic Transport trait (UDP default) │ └─────────────────────────────────────────────────────────────┘ Configuration Performance Presets // Gaming - ultra-low latency (3ms update interval) let config = KcpConfig::gaming(); // Real-time communication (8ms update interval) let config = KcpConfig::realtime(); // File transfer - high throughput let config = KcpConfig::file\_transfer(); // Testing with packet loss simulation let config = KcpConfig::testing(0.1); // 10% packet loss Performance Modes Mode Update Interval Resend Congestion Control Use Case Normal 40ms 0 Yes General purpose Fast 8ms 2 Yes Low latency Turbo 4ms 1 No Maximum speed Gaming 3ms 1 No Real-time games Custom Configuration use std::time::Duration; let config = KcpConfig::new() .fast\_mode() .window\_size(128, 128) .mtu(1400) .connect\_timeout(Duration::from\_secs(10)) .keep\_alive(Some(Duration::from\_secs(30))) .stream\_mode(true); Examples # Run performance test server cargo run --example perf\_test\_server -- 127.0.0.1:12345 gaming # Run performance test client cargo run --example perf\_test\_client -- 127.0.0.1:12345 # Run simple echo example cargo run --example simple\_echo Testing # Run all tests cargo test # Run resilience tests (packet loss, reorder, concurrent connections) cargo test --test resilience\_test # Run benchmarks cargo bench # Run with logging RUST\_LOG=debug cargo test -- --nocapture # Run clippy cargo clippy --all-targets -- --deny clippy::all Documentation Detailed documentation is available in the doc/ directory: Document Description ARCHITECTURE.md System architecture and design MODULES.md Module reference and APIs USAGE.md Usage guide and examples TESTING.md Testing guide Performance KCP provides significant latency improvements over TCP: 30-40% lower latency in typical network conditions Better performance on lossy networks Configurable trade-offs between latency and bandwidth Optimizations in this Implementation Actor-based lock-free architecture: KcpEngine runs in a single dedicated tokio task, eliminating Arc> contention Generic Transport trait: Associated Addr type with RPITIT — zero heap allocation on hot path (no Pin) DashMap for packet routing: Listener uses lock-free concurrent hashmap on the hot path Lock-free buffer pools: crossbeam::queue::ArrayQueue for zero-allocation fast path BTreeMap receive buffer: O(log n) insertion for out-of-order packets (vs O(n) linear scan) Zero-copy segment encoding: Flush avoids cloning segments, encodes by reference Cached timestamps: Single syscall per input() call instead of 3+ Pre-allocated buffers: VecDeque::with\_capacity based on window sizes, avoiding grow overhead on send burst Zero-copy packet handling with bytes crate Grouped state structs for better cache locality Configurable update intervals (3-40ms) Batch ACK processing Use Cases Gaming: Ultra-low latency for real-time multiplayer VoIP/Video: Real-time communication Live Streaming: Low-latency data delivery File Transfer: Reliable bulk data transfer IoT: Efficient communication for constrained devices Compatibility Protocol: Compatible with original C KCP Rust: Edition 2021, stable toolchain Tokio: 1.0+ License MIT License - see LICENSE file. Contributing Contributions are welcome! Please feel free to submit a Pull Request. Resources Original KCP Protocol KCP Protocol Documentation Tokio Documentation Benchmarks Criterion benchmarks measure engine-level throughput and latency: cargo bench Benchmark Description engine\_throughput 10/100/500 x 1KB messages engine\_small\_messages 1000 x 64B messages engine\_large\_message Single 16KB/64KB message fragmentation + reassembly Version History v0.4.0: Extract kcp-core as standalone protocol crate, restructure source layout (src/ → kcp/, flatten async\_kcp/) v0.3.7: Fix ACK window/UNA fields, generic Transport trait with RPITIT, resilience tests, criterion benchmarks v0.3.4: Engine refactoring, lock-free buffer pools, documentation v0.3.3: Performance optimizations, sub-millisecond latency v0.3.1: Full async support, comprehensive configuration v0.2.x: Performance improvements and bug fixes v0.1.x: Initial implementation