ML Wiki
Tag: selective-batching
1 item with this tag.
May 09, 2026
Orca: A Distributed Serving System for Transformer-Based Generative Models
source
inference-serving
continuous-batching
iteration-level-scheduling
selective-batching
systems