ML Wiki

Tag: flash-attention

1 item with this tag.

  • Apr 10, 2026

    FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

    • source
    • flash-attention
    • attention
    • systems
    • inference-efficiency
    • gpu