turbo-attn
Optimized, CUDA graph-enabled kernels and an attention backend for vLLM, SGLang, and more, based on TurboQuant near-lossless KV cache compression. State-of-the-art performance with Gemma 4, Qwen 3.6, and other modern LLMs.