Talks

Invited talks and presentations, in reverse chronological order.

2025-08
Modern Position Encoding in Transformers: RoPE/YaRN and PaTH
Remote
An introduction to RoPE/YaRN and PaTH for modern position encoding in transformers.
2025-05
Toward More Expressive yet Scalable RNNs: DeltaNet and Its Variants
Remote
An introduction to DeltaNet and its variants for more expressive yet scalable RNNs.
2025-02
Linear Attention and Beyond, or What's Next for Mamba? Towards More Expressive Recurrent Update Rules
Remote
A discussion of recent developments in linear attention and its variants.
2024-08
Linear Transformers for Efficient Sequence Modeling
HazyResearch @ Stanford
An introduction to linear transformers for efficient sequence modeling.
2024-04
Gated Linear Recurrence for Efficient Sequence Modeling
Cornell Tech
An introduction to gated linear recurrence for efficient sequence modeling.