Rust's async support has not progressed far beyond a minimum viable product (MVP) state, failing to deliver truly scalable concurrency despite years of development. The ecosystem remains fragmented across competing runtimes such as async-std and tokio, and the lack of a unified async API has stalled adoption in high-performance systems programming.
Overview
Rust's async model was designed to provide zero-cost abstractions for concurrent I/O, but in practice, it introduces significant binary bloat, particularly on resource-constrained platforms like microcontrollers. The compiler generates overly complex state machines for async functions, leading to inefficiencies that are less noticeable on desktops or servers but critical in embedded systems. These issues stem from fundamental design choices in how futures and state machines are implemented.
Key Problems and Optimizations
State Machine Bloat

Every async function in Rust compiles into a state machine with at least three default states: Unresumed, Returned, and Panicked. The Returned state panics if the future is polled after completion, adding overhead. A proposed optimization replaces this panic with returning Poll::Pending in release builds, reducing binary size by 2-5% in embedded firmware.
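For intuition, here is a rough hand-written analogue of such a state machine. Demo, DemoState, and the returned value are invented for illustration; the real compiler-generated type also tracks a Panicked state and one state per suspend point.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// Hypothetical stand-in for a compiler-generated async state machine.
enum DemoState {
    Unresumed, // created but never polled
    Returned,  // completed; polling again currently hits a panic path
}

struct Demo {
    state: DemoState,
}

impl Future for Demo {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<i32> {
        let this = self.get_mut(); // Demo holds no self-references, so it is Unpin
        match this.state {
            DemoState::Unresumed => {
                this.state = DemoState::Returned;
                Poll::Ready(42) // illustrative result
            }
            // This panic (and its message-formatting machinery) is compiled
            // into every async fn today; the proposal would return
            // Poll::Pending here in release builds, shrinking the binary.
            DemoState::Returned => panic!("future polled after completion"),
        }
    }
}
```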
Unnecessary States for Simple Futures

Async blocks without await still generate full state machines, even when they could simply return Poll::Ready on every poll. This adds roughly 0.2% binary size overhead. A compiler optimization to eliminate these states for trivial futures could yield modest but worthwhile improvements.
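As a sketch of the gap, compare an await-free async block with the stateless future it could lower to (make_answer and Answer are hypothetical names, not compiler output):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// An async block with no `await` point still lowers to a full state machine...
fn make_answer() -> impl Future<Output = u32> {
    async { 40 + 2 }
}

// ...even though a stateless hand-written equivalent just returns
// Poll::Ready on every poll, with no bookkeeping at all.
struct Answer;

impl Future for Answer {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        Poll::Ready(40 + 2)
    }
}
```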
Lack of Future Inlining

Futures are never inlined by the compiler, leading to nested state machines that degrade performance. For example:

```rust
async fn foo(blah: bool) -> i32 { /* ... */ }

async fn bar(input: u32) -> i32 {
    let blah = input > 10;
    foo(blah).await * 2
}
```
The current compiler generates separate state machines for foo and bar, even though bar could reuse foo's state. Manual implementations show this pattern can reduce complexity significantly, as sketched below.
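One way to visualize the potential win is to inline foo by hand at the source level. The body given to foo here is invented purely for illustration, since the example above elides it:

```rust
// Invented body for `foo`, purely for illustration.
async fn foo(blah: bool) -> i32 {
    if blah { 1 } else { 0 }
}

// Nested: the state machine for `bar` must embed `foo`'s state machine.
async fn bar(input: u32) -> i32 {
    let blah = input > 10;
    foo(blah).await * 2
}

// Manually inlined: a single, flatter state machine with no nesting.
async fn bar_inlined(input: u32) -> i32 {
    let blah = input > 10;
    (if blah { 1 } else { 0 }) * 2
}
```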
Duplicate States in Match Arms

Code like this:

```rust
match get_command() {
    CommandId::A => send_response(123).await,
    CommandId::B => send_response(456).await,
}
```
generates identical states for each await branch. Refactoring to compute the response first collapses these states, reducing MIR (Mid-Level IR) output from 456 to 302 lines.
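A minimal sketch of that refactor, with stub definitions standing in for the elided get_command and send_response:

```rust
enum CommandId { A, B }

// Stubs standing in for the elided originals.
fn get_command() -> CommandId { CommandId::A }
async fn send_response(_code: u32) {}

async fn handle() {
    // Compute the value first so there is only one `await` point;
    // the per-arm duplicate suspend states collapse into a single state.
    let response = match get_command() {
        CommandId::A => 123,
        CommandId::B => 456,
    };
    send_response(response).await;
}
```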
LLVM Limitations

While LLVM can optimize simple futures at opt-level=3, it struggles with complex or deeply nested async code, especially when optimizing for size (e.g., on embedded or WASM targets). The compiler's reliance on LLVM to clean up inefficient MIR is unreliable.
Proposed Compiler Improvements
The author has submitted a Project Goal to address these issues, including:
- Removing panics in the Returned state in release builds.
- Eliminating state machines for async blocks without await.
- Implementing future inlining for single-await futures.
- Collapsing duplicate states in match arms.
Early tests show these changes could improve performance by ~3% on x86 and reduce binary size by 2-5% on embedded systems. However, real-world benchmarks are needed to validate the impact.
Tradeoffs
- Debug vs. Release Builds: Some optimizations (e.g., removing panics) would only apply to release builds to preserve debuggability.
- Executor Compliance: Optimizations like always returning Poll::Ready for trivial futures could break non-compliant executors, though such cases are rare.
- Funding: The proposed work requires €30k in funding, with flexible scope for partial implementation.
When to Use It
These optimizations are most relevant for:
- Embedded systems or WASM targets where binary size is critical.
- High-performance applications where nested async code degrades performance.
- Projects using async Rust for abstraction-heavy patterns (e.g., trait implementations).
For now, developers can mitigate bloat by manually refactoring code to avoid duplicate states or unnecessary await points (as in the match-arm refactor shown above), but compiler-level fixes are needed for systemic improvements.
Bottom Line
Rust's async support remains a work in progress, with significant room for optimization. While the current MVP state suffices for many use cases, addressing these inefficiencies could unlock Rust's potential in performance-critical domains. The proposed compiler improvements offer a path forward, but they require community and financial support to materialize.