
Amrith Setlur
@setlur_amrith
Followers: 808 · Following: 401 · Media: 41 · Statuses: 128
This work was an amazing collaboration 🤝 with an awesome set of co-authors ✨: @matthewyryang @ianwu97 @sea_snell @JeremyGreerOumi @gingsmith @max_simchowitz and @aviral_kumar2! Big thanks to @gneubig and @XiongChenyan for generous compute support 🙏.
In summary, we have the power of three 🔌 in e3:
- Asymmetries in the base LLM.
- Negative gradients in RL.
- Coupled task & budget curriculum.

What do we get?
✅ In-context exploration.
✅ Extrapolation of test-time compute.
😉 Also a pun on the popular RL algorithm for exploration, E3 @mkearnsupenn.
#3 Coupled budget & task curriculum, where we jointly increase the token budget and task difficulty to incentivize in-context exploration: first train on easy problems at a lower budget (8k tokens), then train on harder problems at a longer budget of 16k tokens!
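The coupled curriculum in the tweet above can be sketched in a few lines. This is a hypothetical illustration, not code from the e3 paper: the function and field names (`coupled_curriculum`, `rl_step`, `difficulty`, `token_budget`) are made up, and the RL update is left abstract.

```python
# Hypothetical sketch of a coupled task & budget curriculum: train on easy
# problems with a short token budget first, then on harder problems with a
# longer budget, raising both together. Names are illustrative only.

def coupled_curriculum(problems, rl_step):
    """Run RL training in two stages, jointly raising difficulty and budget."""
    stages = [
        {"difficulty": "easy", "token_budget": 8_000},   # stage 1: 8k tokens
        {"difficulty": "hard", "token_budget": 16_000},  # stage 2: 16k tokens
    ]
    for stage in stages:
        batch = [p for p in problems if p["difficulty"] == stage["difficulty"]]
        for problem in batch:
            rl_step(problem, max_tokens=stage["token_budget"])

# Usage with a stub RL step that just records what it was asked to do:
log = []
coupled_curriculum(
    [{"difficulty": "easy", "q": "2+2"}, {"difficulty": "hard", "q": "IMO P6"}],
    rl_step=lambda p, max_tokens: log.append((p["q"], max_tokens)),
)
print(log)
```

The point of the ordering is that the easy/8k stage teaches short, reliable solutions before the hard/16k stage asks the model to spend its extra budget on in-context exploration.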
#1 Chaining ⛓️ asymmetric capabilities in the base LLM. When the base LLM has a bias to chain verification (easy) with generation (hard) and exploits the verification-generation (VG) gap, RL amplifies this chaining of asymmetries to discover new strategies. This differs from sharpening, as it composes useful primitives!
Attend our Scaling Self-Improvement workshop @iclr_conf (Garnet 214-215) for some amazing talks and a fiery panel discussion (5-6pm) 🔥.
With a stellar lineup of speakers and panelists, including Yoshua Bengio 🙀, the Scaling Self-Improving Foundation Models workshop at @iclr_conf promises to be 🔥. ⏰ Sunday, April 27. 📍 Garnet 214-215
I couldn't be at @iclr_conf, but if you are interested in process verifiers that can boost exploration and get LLMs to solve hard problems, check out our spotlight poster on PAVs at 3pm, Hall 3+2B #548. Also chat with the amazing @ianwu97, who will be presenting on our behalf!
It's easy to (pre-)train LLMs by imitating discrete actions (next tokens). Surprisingly, imitating *continuous* actions (e.g. in robots 🤖) is "exponentially" hard for *any* algorithm 🤯 that only uses expert data, even when the expert is deterministic 🙀! Check out this cool work:
There’s a lot of awesome research about LLM reasoning right now. But how is learning in the physical world 🤖 different from learning in language 📚? In a new paper, we show that imitation learning in continuous spaces can be exponentially harder than in discrete state spaces, even when
This was a cool collaboration led by Kevin Kuo, with @AdtRaghunathan and @gingsmith. Questions and feedback are always welcome 🙏. For details, check out: