Talks

Invited talks and presentations, in reverse chronological order.

2025-08
Modern Position Encoding in Transformers: RoPE/YaRN and PaTH
Remote
An introduction to RoPE/YaRN and PaTH for modern position encoding in transformers.
2025-05
Toward More Expressive yet Scalable RNNs: DeltaNet and Its Variants
Remote
An introduction to DeltaNet and its variants for more expressive yet scalable RNNs.
2025-02
Linear Attention and Beyond, or What's Next for Mamba? Towards More Expressive Recurrent Update Rules
Remote
A discussion of recent developments in linear attention and its variants.
2024-08
Linear Transformers for Efficient Sequence Modeling
HazyResearch @ Stanford
An introduction to linear transformers for efficient sequence modeling.
2024-04
Gated Linear Recurrence for Efficient Sequence Modeling
Cornell Tech
An introduction to gated linear recurrence for efficient sequence modeling.