
Pratyush Ranjan Tiwari (@PratyushRT)
1K Followers · 7K Following · 79 Media · 1K Statuses
Building privacy-preserving personal AI @eternisai, lots of RL lots of reward hacking, prev. PhD @JohnsHopkins, 3X EF cryptography grantee, built @ketlxyz
Joined November 2018
We introduce a better recipe for collecting post-training data when using GRPO. Collecting samples from experts is expensive, and annotation budgets are limited. Which examples are actually worth paying for? We find that focusing on hard samples results in a 30-40% improvement. 1/7
3 replies · 51 reposts · 329 likes
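The thread itself has no code, but the recipe it describes, spending a limited annotation budget on the prompts the current model finds hardest before running GRPO, is simple enough to sketch. The snippet below is a hypothetical illustration, not the authors' released code: `model_solves_once` stands in for whatever verifier or judge decides a single rollout succeeded, and the rollout count and budget are arbitrary assumptions.

```python
def empirical_success_rate(model_solves_once, prompt, n_rollouts=8):
    """Fraction of rollouts the current policy already gets right.
    `model_solves_once` is a hypothetical callable (prompt -> bool),
    e.g. a verifier or LLM judge applied to one sampled completion."""
    return sum(model_solves_once(prompt) for _ in range(n_rollouts)) / n_rollouts


def select_hard_samples(model_solves_once, candidate_prompts, annotation_budget):
    """Rank candidate prompts by how rarely the model solves them and keep
    only the hardest `annotation_budget` of them; those are the examples
    worth paying experts to annotate for GRPO post-training."""
    scored = [(empirical_success_rate(model_solves_once, p), p)
              for p in candidate_prompts]
    scored.sort(key=lambda sp: sp[0])  # lowest success rate = hardest first
    return [p for _, p in scored[:annotation_budget]]
```

The 30-40% figure in the tweet refers to downstream gains from training on data selected this way; the sketch only covers the selection step, not the GRPO run itself.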
RT @PratyushRT: New blogpost: Reinforcement Learning for Privacy. We post-train small language models (SLMs) to be as good as GPT 4.1 at re….
0 replies · 12 reposts · 0 likes
Everything will be open-source for our models here.
- Models released on huggingface already ✅
- Dataset live on huggingface already ✅
- Codebase for RL/GRPO with LLM Judge will soon be live 🚧
I'll answer any questions here or in DMs about using this model or learning RL.
1 reply · 3 reposts · 11 likes
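The RL/GRPO-with-LLM-judge codebase is still marked 🚧 above, so the sketch below only guesses at its general shape using Hugging Face TRL's GRPOTrainer: a reward function that asks a judge to score each sampled completion. The base model id, the dataset, and the `judge_score` stub are placeholder assumptions for illustration, not the actual Eternis release.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer


def judge_score(completion: str) -> float:
    # Hypothetical stub: a real setup would call an LLM judge (an API model
    # or a local grader) and map its verdict to a scalar reward.
    return 1.0 if completion.strip() else 0.0


def llm_judge_reward(completions, **kwargs):
    # TRL passes the sampled completions for a batch; return one reward each.
    return [judge_score(c) for c in completions]


# Placeholder prompt dataset and small base model.
dataset = load_dataset("trl-lib/tldr", split="train")

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    reward_funcs=llm_judge_reward,
    args=GRPOConfig(output_dir="grpo-llm-judge", logging_steps=10),
    train_dataset=dataset,
)
trainer.train()
```

A learned judge as the reward signal is also where the "reward hacking" mentioned in the bio shows up: the policy can learn to exploit quirks of the judge rather than solve the task, which is why the scoring step usually needs more care than a bare stub like this.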
RT @freysa_ai: 1/ Enchanted Mobile ✨. Your AI. On your phone. Private by default. Not one controlled by a big lab. Not one that hoards y….
0 replies · 21 reposts · 0 likes
Link to the full blogpost: Models coming to @huggingface in the next hour! 8/8.
freysa.ai
Enabling sovereign AI and self-owned cognition at global scale.
1 reply · 0 reposts · 6 likes
RT @rohanpaul_ai: Training Group Relative Policy Optimization (GRPO) models on the hardest problems delivers the biggest gains when annotat….
0 replies · 5 reposts · 0 likes