
Pulkit Agrawal
@pulkitology
Followers: 13K
Following: 446
Media: 110
Statuses: 358
Presenting Visual Dexterity: Object re-orientation in its full generality! Single camera. Novel objects. Any orientation. Downward-facing hand that fights gravity. Real-time dynamic control. Open source setup. Learn more: Led by @taochenshh #robots #rl
5
60
317
DexWrist is built for compliance and force control, and is easy to simulate. It uses quasi-direct-drive actuators, is backdrivable, and has high control bandwidth. Work led by @martinpeticco @JohnMarangola and Gabriella. @MIT_CSAIL @csail_alliances @MIT
1
1
3
RT @martinpeticco: What’s keeping robot arms from working like human arms? They're big, slow, have the wrong joints, and can't conform to…
0
53
0
What if an LLM could decide what data to use, potentially generate its own data, and decide how to update itself? 👇
What if an LLM could update its own weights? Meet SEAL: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
2
5
19
Llama 4 (@Meta) results are consistent with what we hypothesized will unleash the next generation of AI reasoning. A new paradigm for pre-training is around the corner
arxiv.org
Large Language Models (LLMs) have demonstrated impressive real-world utility, exemplifying artificial useful intelligence (AUI). However, their ability to reason adaptively and robustly -- the...
Llama 4 (@Meta) shows too much SFT limits RL exploration — something we also found in our recent work! A new and superior pretraining paradigm is around the corner to unleash a new era of reasoning. Check out our paper: Thread:
1
6
23
Initial tests, even on pre-trained language models, suggest that directly doing reward-based fine-tuning and skipping supervised fine-tuning works better! Joint work with @seungwookh, @jyo_pari and @gershbrain.
2
0
18