Links
Clean code implementation of Mamba (a minimal recurrence sketch follows this list)
Gated linear attention (transformer)
New Mamba model (12th December): 3B parameters, trained on 600B tokens
Mamba, Memory, and the SSM Moment (Cog Rev Podcast)
Sparse Notes Mamba walkthrough
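
For readers following the implementation and walkthrough links above, here is a minimal sketch of the selective state-space recurrence at the heart of Mamba, reduced to a single input channel in plain NumPy. The parameter names A, B, C, and Δ follow the Mamba paper; the projection weights (W_B, W_C, W_delta) and the Euler step for B̄ are illustrative simplifications, not the reference implementation.

```python
import numpy as np

def selective_ssm(x, A, W_B, W_C, W_delta):
    """Minimal single-channel selective SSM scan (illustrative only).

    x: (T,) input sequence; A: (N,) diagonal state matrix (negative entries);
    W_B, W_C, W_delta: projections that make B, C, and the step size delta
    functions of the input -- the "selective" part of Mamba.
    """
    T, N = x.shape[0], A.shape[0]
    h = np.zeros(N)            # hidden state
    y = np.empty(T)
    for t in range(T):
        delta = np.log1p(np.exp(W_delta * x[t]))  # softplus keeps delta > 0
        B = W_B * x[t]                            # input-dependent B_t
        C = W_C * x[t]                            # input-dependent C_t
        A_bar = np.exp(delta * A)                 # ZOH discretization of A
        B_bar = delta * B                         # Euler step for B (common simplification)
        h = A_bar * h + B_bar * x[t]              # h_t = A_bar h_{t-1} + B_bar x_t
        y[t] = C @ h                              # y_t = C_t h_t
    return y

# Toy usage: random projections, 4-dim state, 10-step sequence
rng = np.random.default_rng(0)
N = 4
out = selective_ssm(rng.standard_normal(10),
                    -np.abs(rng.standard_normal(N)),
                    rng.standard_normal(N), rng.standard_normal(N),
                    rng.standard_normal())
```

Because A, B, and C are diagonal or vector-valued here, each step is O(N); the real model vectorizes this scan across channels and uses a hardware-aware parallel scan rather than a Python loop.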
Papers Inspired by Mamba
Is Mamba Capable of In-Context Learning?
The Hidden Attention of Mamba Models
Theoretical Foundations of Deep Selective State-Space Models
Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy
A multi-cohort study on prediction of acute brain dysfunction states
Universality of Linear Recurrences Followed by Non-linear Projections
Large Window-based Mamba UNet for Medical Image Segmentation
Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
Activating Wider Areas in Image Super-Resolution
On the low-shot transferability of [V]-Mamba
Is Mamba Effective for Time Series Forecasting?
Music to Dance as Language Translation using Sequence Models
Uncovering Selective State Space Model’s Capabilities in Lifelong Sequential Recommendation
State Space Models as Foundation Models
Proprioception Is All You Need
Locating and Editing Factual Associations in Mamba
Does Transformer Interpretability Transfer to RNNs?
A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion
State Space Model for New-Generation Network Alternative to Transformers
Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting