Metadata
Title
SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression
Category
general
UUID
1ecb59f2372041d9ad439b7d23ba4042
Source URL
https://wsai.iitm.ac.in/preprints/swan-sparse-winnowed-attention-for-reduced-inf...
Parent URL
https://wsai.iitm.ac.in/preprints/
Crawl Time
2026-03-23T19:07:58+00:00
Rendered Raw Markdown
# SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression

**Source**: https://wsai.iitm.ac.in/preprints/swan-sparse-winnowed-attention-for-reduced-inference-memory-via-decompression-free-kv-cache-compression-31/
**Parent**: https://wsai.iitm.ac.in/preprints/

<https://doi.org/10.48550/arXiv.2511.18936>

Authors

S, Santhosh G; Prakash, Saurav; Ravindran, Balaraman

Preprint Server

arXiv

Santhosh G S, Saurav Prakash, Balaraman Ravindran. "SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression."

Preprint link: <https://arxiv.org/abs/2511.18936>