Metadata
Title
SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression
Category
general
UUID
1ecb59f2372041d9ad439b7d23ba4042
Source URL
https://wsai.iitm.ac.in/preprints/swan-sparse-winnowed-attention-for-reduced-inf...
Parent URL
https://wsai.iitm.ac.in/preprints/
Crawl Time
2026-03-23T19:07:58+00:00
Rendered Raw Markdown

SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression

Source: https://wsai.iitm.ac.in/preprints/swan-sparse-winnowed-attention-for-reduced-inference-memory-via-decompression-free-kv-cache-compression-31/

Parent: https://wsai.iitm.ac.in/preprints/

https://doi.org/10.48550/arXiv.2511.18936

Authors

S, Santhosh G; Prakash, Saurav; Ravindran, Balaraman

Preprint Server

arXiv

Santhosh G S, Saurav Prakash, Balaraman Ravindran. SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression.

Preprint link: https://arxiv.org/abs/2511.18936