Explore tweets tagged as #mLSTM
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
1 year
xLSTM: Extended Long Short-Term Memory. abs: Leveraging the latest techniques from modern LLMs, mitigating known limitations of LSTMs (introducing sLSTM and mLSTM memory cells that form the xLSTM blocks), and scaling up results in a highly competitive …
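For reference, here is one reading of the two cell updates the abstract alludes to, written out from the xLSTM paper's formulation: the sLSTM keeps a scalar cell state with a normalizer, while the mLSTM keeps a matrix state updated with a covariance rule. Treat the notation below as a summary, not the authoritative equations.

```latex
% sLSTM: scalar cell state c_t with normalizer n_t (memory mixing enters
% through recurrent connections in the gate pre-activations, omitted here)
c_t = f_t\, c_{t-1} + i_t\, z_t, \qquad
n_t = f_t\, n_{t-1} + i_t, \qquad
h_t = o_t \odot \frac{c_t}{n_t}

% mLSTM: matrix cell state C_t with a covariance (key-value) update rule
C_t = f_t\, C_{t-1} + i_t\, v_t k_t^{\top}, \qquad
n_t = f_t\, n_{t-1} + i_t\, k_t, \qquad
h_t = o_t \odot \frac{C_t q_t}{\max\{\lvert n_t^{\top} q_t\rvert,\, 1\}}
```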
@benediktalkin
Benedikt Alkin
1 year
Excited to introduce Vision-LSTM (ViL): a new backbone for vision built on the xLSTM architecture. ViL creates patch tokens from an image and processes them with alternating bi-directional mLSTM blocks, where odd blocks process the sequence from the opposite direction. 🧵
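A rough sketch of that alternating-direction idea: even-indexed blocks read the patch sequence in the usual order and odd-indexed blocks read it reversed. The `mlstm_block` modules here are placeholders for illustration, not the released ViL code.

```python
import torch

def vil_backbone(patch_tokens, mlstm_blocks):
    """Alternate the scan direction across a stack of mLSTM blocks.

    patch_tokens: (batch, seq_len, dim) tensor of image patch embeddings.
    mlstm_blocks: list of sequence-to-sequence modules (placeholders here).
    """
    x = patch_tokens
    for i, block in enumerate(mlstm_blocks):
        if i % 2 == 1:               # odd block: flip the patch sequence
            x = torch.flip(x, dims=[1])
        x = block(x)                 # residual mLSTM block (assumed interface)
        if i % 2 == 1:               # flip back so token positions line up
            x = torch.flip(x, dims=[1])
    return x
```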
@jo_brandstetter
Johannes Brandstetter
1 year
Introducing Vision-LSTM - making xLSTM read images 🧐 It works ... pretty, pretty well 🚀🚀 But convince yourself :) We are happy to share code already! 📜: 🖥️: All credits to my stellar PhD @benediktalkin
@fly51fly
fly51fly
1 year
[CV] Vision-LSTM: xLSTM as Generic Vision Backbone. B Alkin, M Beck, K Pöppel, S Hochreiter, J Brandstetter [ELLIS Unit Linz] (2024). - Vision-LSTM (ViL) is an adaptation of the xLSTM architecture to computer vision tasks. It uses alternating mLSTM blocks …
@itsandrewgao
andrew gao
1 year
6/ They did a lot of complicated evals, but this one I felt was the clearest. They trained several models (Transformer LLM, RWKV, and xLSTM) on 15B tokens of text and measured the perplexity (how good each model is at predicting the next token). The xLSTM[1:0] (1 mLSTM, 0 sLSTM …
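For readers unfamiliar with the metric in that comparison: perplexity is just the exponential of the average next-token negative log-likelihood, so lower means the model assigns higher probability to the actual text. A tiny self-contained example with made-up probabilities:

```python
import math

# toy example: probabilities the model assigned to the actual next tokens
token_probs = [0.25, 0.10, 0.50, 0.05]

# average negative log-likelihood in nats per token
nll = sum(-math.log(p) for p in token_probs) / len(token_probs)

perplexity = math.exp(nll)
print(f"avg NLL = {nll:.3f} nats/token, perplexity = {perplexity:.2f}")
```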
@itsandrewgao
andrew gao
1 year
7/ A very important thing to take note of is that the xLSTM architecture is composed of a flexible ratio of mLSTM and sLSTM blocks. mLSTM is the matrix-memory, parallelizable LSTM (like Transformers, it can operate over all tokens at once). sLSTM is the LSTM that is NOT …
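The xLSTM[m:s] notation from the previous tweet just counts how many mLSTM vs. sLSTM blocks go into the residual stack. A hypothetical sketch of that wiring, with `MLSTMBlock`/`SLSTMBlock` as stand-ins for the real block implementations and an even-spacing heuristic that is mine, not the paper's:

```python
import torch.nn as nn

class MLSTMBlock(nn.Module):
    """Stand-in for a parallelizable matrix-memory mLSTM block."""
    def forward(self, x):
        return x  # placeholder

class SLSTMBlock(nn.Module):
    """Stand-in for a sequential scalar-memory sLSTM block."""
    def forward(self, x):
        return x  # placeholder

def build_xlstm_stack(num_mlstm: int, num_slstm: int) -> nn.Sequential:
    """e.g. xLSTM[7:1] -> 7 mLSTM blocks and 1 sLSTM block in the stack."""
    blocks = [MLSTMBlock() for _ in range(num_mlstm)]
    # heuristic: spread the (usually rarer) sLSTM blocks through the stack
    for j in range(num_slstm):
        pos = (j * (num_mlstm + num_slstm)) // max(num_slstm, 1)
        blocks.insert(min(pos, len(blocks)), SLSTMBlock())
    return nn.Sequential(*blocks)
```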
@itsandrewgao
andrew gao
1 year
No code/weights shared yet for #xLSTM, so I tried implementing mLSTM myself! Colab notebook 👇👇 🔗 in comments. LMK if you want the sLSTM as well.
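That notebook isn't reproduced here, but a minimal recurrent-form mLSTM step in the spirit of the paper's equations looks roughly like this; gate pre-activations are taken as given and exponential-gate stabilization is omitted for brevity.

```python
import torch

def mlstm_step(C, n, q, k, v, i_gate, f_gate, o_gate):
    """One recurrent-form mLSTM step (single head, single batch element).

    C: (d_v, d_k) matrix memory    n: (d_k,) normalizer state
    q, k: (d_k,)   v: (d_v,)       i/f/o gates: scalars in this sketch
    """
    C = f_gate * C + i_gate * torch.outer(v, k)      # covariance update
    n = f_gate * n + i_gate * k                      # normalizer update
    denom = torch.clamp(torch.abs(n @ q), min=1.0)   # stabilized readout denominator
    h = o_gate * (C @ q) / denom                     # (d_v,) hidden output
    return C, n, h
```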
@itsandrewgao
andrew gao
1 year
🔔 The guy who invented the LSTM just dropped a new LLM architecture! (Sepp Hochreiter). The major component is a new parallelizable LSTM. ⚠️ One of the major weaknesses of prior LSTMs was their sequential nature (processing can't be done all at once). Everything we know about the xLSTM: 👇👇🧵
@MikeE_3_14
Mike Erlihson, Math PhD, AI
1 year
🤷‍♂️ Another architecture Mike isn't excited about 🤷‍♂️. Some upgrade of LSTM called xLSTM came out and there is a lot of excitement around it. I don't really understand the big enthusiasm: they combined two ideas from 5-6 years ago, sLSTM and mLSTM, in a ResNet-like form with a bit of gating between them and supposedly got something better. And there is also another memory-update mechanism …
@KorbiPoeppel
Korbinian Poeppel
3 months
Ever wondered how linear RNNs like #mLSTM (#xLSTM) or #Mamba can be extended to multiple dimensions? Check out "pLSTM: parallelizable Linear Source Transition Mark networks". #pLSTM works on sequences, images, and (directed acyclic) graphs. Paper link:
@MonikaVenck
Monika AI
1 year
#LSTM is making a huge comeback with a revamped architecture. #sLSTM introduces memory mixing, #mLSTM is parallelizable with a matrix memory and a covariance update rule.
@TheMLSofChoice
TheMLS.com
2 years
Another productive week for the win! Stay cozy this weekend, Southern California. Happy Friday from the MLS™ 🌻
@TheMLSofChoice
TheMLS.com
2 years
Love is in the air ❤️ May it be shared with all those dearest to your heart! Happy Valentine's Day from The MLS™.
@cedric_chee
cedric
1 year
xLSTM explores the limits of LMs by scaling LSTMs to billions of params, addressing their limitations and integrating modern techniques:
- exponential gating w/ stabilization
- modified memory structures
yielding:
- sLSTM: scalar mem
- mLSTM: parallelizable matrix mem
Perf & scalability: …
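On the "exponential gating w/ stabilization" bullet: as I read the paper, the exponential input/forget gates are kept numerically safe by tracking a running max in log space and rescaling both gates by it, which cancels out in the normalized readout. A hedged sketch:

```python
import torch

def stabilized_exp_gates(i_preact, f_preact, m_prev):
    """Stabilize exponential gates in log space (one reading of the paper's trick).

    i_preact, f_preact: gate pre-activations for the current step
    m_prev: running stabilizer state from the previous step
    Returns rescaled gates plus the new stabilizer; the common rescaling
    factor cancels in the normalized hidden-state readout.
    """
    log_f = torch.nn.functional.logsigmoid(f_preact)  # or f_preact for a pure exp gate
    m_new = torch.maximum(log_f + m_prev, i_preact)   # new stabilizer state
    i_gate = torch.exp(i_preact - m_new)              # stabilized input gate
    f_gate = torch.exp(log_f + m_prev - m_new)        # stabilized forget gate
    return i_gate, f_gate, m_new
```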
@HochreiterSepp
Sepp Hochreiter
1 year
I am so excited that xLSTM is out. LSTM is close to my heart - for more than 30 years now. With xLSTM we close the gap to existing state-of-the-art LLMs. With NXAI we have started to build our own European LLMs. I am very proud of my team.
@TheMLSofChoice
TheMLS.com
2 years
This brisk Southern California weather is calling for a nice hike to admire one of our many breathtaking views. 😍 Happy Friday from The MLS™ ✨.
@vestaplus1
VESTAPLUS™
2 years
Join the discussion with CEO of VestaPlus™ and The MLS™, Annie Ives! Discover strategies to bridge the gap between the role of MLS and its perception among real estate professionals, ensuring a clear understanding of the impact MLS organizations have on the RE industry.
@TheMLSofChoice
TheMLS.com
2 years
Congratulations to Christie Thomas on her new position as 2024 President of The MLS™! 🎉 We are excited and looking forward to a fantastic year under Christie's leadership and guidance. ✨
@TheMLSofChoice
TheMLS.com
2 years
We are excited to have The MLS™ Executive team represented at this year's Clareity24 Workshop in Scottsdale, Arizona! #themls #realestate
@123wimi
Happy
1 year
🚀 xLSTM: A Leap in Long Short-Term Memory Technology 🚀
1. Introducing xLSTM: Enhanced Gating and Memory Structures 🧠
- xLSTM revolutionizes LSTM with exponential gating and innovative memory structures.
- Features include scalar memory in sLSTM and matrix memory in mLSTM …
@gm8xx8
gm8xx8
6 months
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels. Tiled Flash Linear Attention (TFLA) optimizes linear RNNs by enabling larger chunk sizes, reducing memory and IO costs for long-context training. Applied to xLSTM and mLSTM, it introduces a faster variant …
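For intuition on why larger chunk sizes help: chunkwise kernels for linear RNNs process tokens inside a chunk in parallel and carry only a small state between chunks, so bigger chunks mean fewer state reads/writes. Below is a bare-bones, ungated version of that recurrence for plain causal linear attention; it shows the chunked structure only and is not TFLA itself (no gates, no normalizer, no tiling).

```python
import torch

def chunkwise_linear_attention(q, k, v, chunk_size=64):
    """Ungated causal linear attention computed chunk by chunk.

    q, k: (seq_len, d_k)   v: (seq_len, d_v)
    Tokens within a chunk are handled in parallel; only a (d_k, d_v)
    state is carried between chunks, which is what chunked kernels tile.
    """
    seq_len, d_k = q.shape
    d_v = v.shape[1]
    state = torch.zeros(d_k, d_v, dtype=q.dtype)
    outputs = []
    for start in range(0, seq_len, chunk_size):
        qc, kc, vc = (t[start:start + chunk_size] for t in (q, k, v))
        inter = qc @ state                      # contribution from earlier chunks
        scores = torch.tril(qc @ kc.T)          # causal attention within the chunk
        intra = scores @ vc
        outputs.append(inter + intra)
        state = state + kc.T @ vc               # update the carried state
    return torch.cat(outputs, dim=0)
```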