Bo Liu

@cranialxix

Followers
354
Following
167
Media
1
Statuses
32

Research Scientist @Meta FAIR | CS PhD @UT Austin | Former Research Intern @DeepMind, @Nvidia, @Baidu

Mountain View, CA
Joined January 2018
@cranialxix
Bo Liu
1 year
How to design State Space Models (SSMs) from first principles? We propose to view an SSM's recurrence as the per-step closed-form solution to an online learning problem. To this end, we present Longhorn, a novel SSM that achieves 1.8x better sample efficiency than Mamba.
6
37
179
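To make "recurrence as a per-step closed-form solution" concrete, here is a minimal sketch in plain Python. It assumes a regularized online regression objective, min over s of ||s - s_prev||^2 + beta * (s.k - x)^2, which is an illustrative choice; the function name `online_step` and the toy values are hypothetical, and the actual Longhorn objective and parameterization may differ.

```python
def online_step(s_prev, k, x, beta):
    """Closed-form minimizer of ||s - s_prev||^2 + beta * (dot(s, k) - x)^2.

    Setting the gradient to zero and applying the Sherman-Morrison identity
    gives s = s_prev + eps * (x - dot(s_prev, k)) * k,
    with eps = beta / (1 + beta * dot(k, k)).
    """
    kk = sum(ki * ki for ki in k)
    eps = beta / (1.0 + beta * kk)
    err = x - sum(si * ki for si, ki in zip(s_prev, k))
    return [si + eps * err * ki for si, ki in zip(s_prev, k)]


# Toy usage (assumed values): repeatedly store the association k -> x in state s.
s = [0.0, 0.0, 0.0]
k, x, beta = [1.0, 2.0, -1.0], 3.0, 0.7
for _ in range(20):
    s = online_step(s, k, x, beta)
recalled = sum(si * ki for si, ki in zip(s, k))  # approaches x as steps accumulate
```

In this sketch the step size eps falls out of the objective itself, so no separate forget gate has to be parameterized; each step shrinks the recall error by a factor of 1 / (1 + beta * k.k).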
@cranialxix
Bo Liu
9 days
RT @qiwang067: 🚀 Excited to announce our workshop “Embodied World Models for Decision Making” at #NeurIPS2025! 🎉 Keynote speakers, panelis…
0
13
0
@cranialxix
Bo Liu
3 months
RT @TheOfficialACM: 🙌 Meet the 2024 ACM Technical Awards Recipients! We’re proud to honor this year’s innovators in autonomous systems, cry…
0
14
0
@cranialxix
Bo Liu
6 months
If you are interested in learning/using flow/diffusion models, please check this thread from the original author of rectified flow (RF). It contains: 1. a tutorial blog (to quickly get a sense of what RF is and some interesting findings we had lately) 2. a codebase (a minimal.
@lqiang67
Qiang Liu
6 months
🚀 New Rectified Flow materials (WIP)! 📖 Tutorials: 💻 Code: 📜 Notes: Contributions from @RunlongLiao, @XixiHu12, @cranialxix, and many others! 🔥 Let us know your thoughts! 🚀
0
3
8
@cranialxix
Bo Liu
7 months
For imitation learning in robotics: as cheap as behavioral cloning, as expressive as diffusion policy. From the original group that designed the rectified flow.
@XixiHu12
Xixi Hu
7 months
🚀 Excited to share AdaFlow at #NeurIPS2024! A fast, adaptive method for training robots to act with one-step efficiency, no distillation needed! 🌟 Come check out our poster! 📍 West Ballroom A-D #7309 📅 Today 11 am - 2 pm. #MachineLearning #Robotics
0
0
12
@cranialxix
Bo Liu
8 months
RT @wightmanr: One of the last minute papers I added support for that delayed this release was 'Cautious Optimizers' As I promised, I pushe…
huggingface.co
0
7
0
@cranialxix
Bo Liu
8 months
RT @wightmanr: I was going to publish a new timm release yesterday with significant Optimizer updates: Adopt, Big Vision Adafactor, MARS, a…
0
12
0
@cranialxix
Bo Liu
8 months
One line of code for improved training by ensuring the update aligns with the gradient. Note that there is no need to tune hyperparameters; just use those from AdamW or Lion.
@KyleLiang5
Kaizhao Liang
8 months
TLDR: 1️⃣ line modification, satisfaction (theoretically and empirically) guaranteed 😀😀😀 Core idea: 🚨 Do not update if you are not sure. 👨‍💻🤗📚 @cranialxix @lqiang67 @Tim38463182
0
4
16
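The "do not update if you are not sure" idea from the post above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the released implementation: `cautious_update` is a hypothetical name, `update` stands for the direction proposed by a base optimizer such as AdamW or Lion (applied as `param -= lr * update`), and any rescaling of the surviving components is omitted.

```python
def cautious_update(update, grad):
    """Keep only update components whose sign agrees with the current gradient.

    A component is kept when update_i * grad_i > 0 and zeroed otherwise,
    so the masked update never points against the gradient in any coordinate.
    """
    return [u if u * g > 0 else 0.0 for u, g in zip(update, grad)]


# Toy usage (assumed values): only the first coordinate agrees in sign.
proposed = [0.5, -0.2, 0.3]
gradient = [1.0, 1.0, -2.0]
masked = cautious_update(proposed, gradient)

# Apply the masked update with a learning rate borrowed from the base optimizer.
lr = 0.1
params = [1.0, 1.0, 1.0]
params = [p - lr * m for p, m in zip(params, masked)]
```

Because the mask only filters an update the base optimizer already computed, the sketch introduces no new hyperparameters, which is consistent with reusing the AdamW or Lion settings.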
@cranialxix
Bo Liu
10 months
RT @JiahengHu1: 🚀 Despite efforts to scale up Behavior Cloning for Robots, large-scale BC has yet to live up to its promise. How can we bre…
0
37
0
@cranialxix
Bo Liu
10 months
RWKV-7's update is pretty similar to the Longhorn model's update, which is derived explicitly from solving online associative recall in closed form. The Householder transform used in RWKV-7, (diag(w) - a \alpha^\top \beta), stems from optimizing a.
@BlinkDL_AI
BlinkDL
10 months
RWKV-7 "Goose" 🪿 preview rc2 => Peak RNN architecture? 😃 Will try to squeeze more performance for the final release. Preview code:
0
2
17
@cranialxix
Bo Liu
10 months
RT @yzhang_cs: 🍾🍾🍾 Excited to introduce our latest work: Gated Slot Attention (GSA), a new linear attention model inspired by ABC @haopeng_n…
huggingface.co
0
35
0
@cranialxix
Bo Liu
11 months
RT @KyleLiang5: SVD in Galore is an OVERKILL! Lyapunov analysis says any reasonable projection matrix works. Here comes Online Subspace De…
0
7
0
@cranialxix
Bo Liu
11 months
Interested in the continual adaptation of large AI models? Join us by submitting your work to our NeurIPS workshop :) This is a great opportunity to engage with experts and advance the dialogue on how foundation models can be dynamically updated. Deadline is Sept 9th AoE.
@arslan_mac
Arslan Chaudhry
11 months
[1/4] Happy to announce that we are organizing a workshop on continuous development of foundation models at NeurIPS’24. Website:
0
1
12
@cranialxix
Bo Liu
1 year
The HF paper page link:
huggingface.co
2
1
10
@cranialxix
Bo Liu
1 year
Unlike previous SSMs, Longhorn shares the same architecture as Mamba, except that we replace the Mamba SSM with Longhorn's SSM. We believe this makes for a fair comparison. Moreover, because Longhorn's update is derived rather than hand-designed, it does not require a separately parameterized forget gate.
1
0
5
@cranialxix
Bo Liu
1 year
We release our code following the NanoGPT style for the convenience of the community. It contains one-file implementations of LLaMA, RetNet, GLA, RWKV, Mamba, and Longhorn, and uses the OpenWebText dataset. Code:
github.com
Official PyTorch Implementation of the Longhorn Deep State Space Model - Cranial-XIX/longhorn
1
0
19
@cranialxix
Bo Liu
2 years
RT @Ar_Douillard: We release the async extension of DiLoCo shared in November, led by our amazing intern @cranialxix! 👀 TL;DR: we do distr…
0
8
0
@cranialxix
Bo Liu
2 years
RT @_akhaliq: Google Deepmind present Asynchronous Local-SGD Training for Language Modeling. paper page: Local sto…
0
29
0
@cranialxix
Bo Liu
2 years
RT @RL_Conference: Thrilled to announce the first annual Reinforcement Learning Conference @RL_Conference, which will be held at UMass Amhe…
0
87
0
@cranialxix
Bo Liu
2 years
RT @konstmish: Constrained optimization perspective on what Lion optimizer is doing. They also generalize Lion to operations other than sig…
0
14
0