Tu Vu Profile
Tu Vu

@tuvllms

Followers
4K
Following
3K
Media
53
Statuses
1K

Research Scientist @GoogleDeepMind & Assistant Professor @VT_CS. PhD from @UMass_NLP. Google FLAMe/FreshLLMs/Flan-T5 Collection/SPoT #NLProc

California, USA
Joined April 2017
@tuvllms
Tu Vu
1 year
🚨 New @GoogleDeepMind paper 🚨 We trained Foundational Large Autorater Models (FLAMe) on extensive human evaluations, achieving the best RewardBench perf. among generative models trained solely on permissive data, surpassing both GPT-4 & 4o. 📰: 🧵:👇
28
98
565
@tuvllms
Tu Vu
1 day
RT @denny_zhou: Slides for my lecture “LLM Reasoning” at Stanford CS 25: Key points: (1) Reasoning in LLMs just m…
0
3
0
@tuvllms
Tu Vu
3 days
RT @MistralAI: In our continued commitment to open-science, we are releasing the Voxtral Technical Report: The rep…
0
186
0
@tuvllms
Tu Vu
4 days
RT @iScienceLuvr: Kimi K2 paper dropped! describes: - MuonClip optimizer - large-scale agentic data synthesis pipeline that systematically…
0
172
0
@tuvllms
Tu Vu
4 days
RT @demishassabis: Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced ver…
deepmind.google
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world’s most prestigious competition for young...
0
766
0
@tuvllms
Tu Vu
4 days
RT @quocleix: Excited to share that a scaled up version of Gemini DeepThink achieves gold-medal standard at the International Mathematical…
deepmind.google
Our advanced model officially achieved a gold-medal level performance on problems from the International Mathematical Olympiad (IMO), the world’s most prestigious competition for young...
0
51
0
@tuvllms
Tu Vu
4 days
RT @lmthang: This year was a major paradigm shift, where we can solve problems end to end in natural language. With novel reinforcement lea…
0
21
0
@tuvllms
Tu Vu
4 days
RT @GoogleDeepMind: An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International…
0
774
0
@tuvllms
Tu Vu
5 days
RT @danielhanchen: My Reinforcement Learning (RL) & Agents 3 hour workshop is out! I talk about: 1. RL fundamentals & hacks 2. "Luck is al…
0
225
0
@tuvllms
Tu Vu
7 days
RT @alexwei_: 1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI…
0
1K
0
@tuvllms
Tu Vu
7 days
Free 1-year Google Colab Pro subscriptions for verified US students and faculty.
@GoogleColab
Colaboratory
7 days
Big news for data science in higher ed! 🚀Colab now offers 1-year Pro subscriptions free of charge for verified US students/faculty, interactive Slideshow Mode for lectures, & an AI toggle per notebook. Enhance teaching & learning in the upcoming academic year! Read all about it.
0
0
9
@tuvllms
Tu Vu
8 days
RT @MohitIyyer: Excited to talk about long-context models / eval at this panel on Saturday! I'm also looking for a postdoc / PhD students t…
0
11
0
@tuvllms
Tu Vu
8 days
RT @SanghaniCtrVT: Students from @SanghaniCtrVT are working as interns across the country on projects that run the gamut of AI, data analyt…
0
1
0
@tuvllms
Tu Vu
10 days
RT @_jasonwei: Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an importa…
0
324
0
@tuvllms
Tu Vu
10 days
RT @soumithchintala: considering Muon is so popular and validated at scale, we've just decided to welcome a PR for it in PyTorch core by de…
0
62
0
@tuvllms
Tu Vu
10 days
RT @sundarpichai: New from our security teams: Our AI agent Big Sleep helped us detect and foil an imminent exploit. We believe this is a f…
0
880
0
@tuvllms
Tu Vu
10 days
RT @Yong18850571: (1/4)🚨 Introducing Goedel-Prover V2 🚨 🔥🔥🔥 The strongest open-source theorem prover to date. 🥇 #1 on PutnamBench: Solves 6…
0
82
0
@tuvllms
Tu Vu
11 days
RT @omarsar0: One Token to Fool LLM-as-a-Judge. Watch out for this one, devs! Semantically empty tokens, like “Thought process:”, “Solutio…
0
121
0
@tuvllms
Tu Vu
11 days
RT @TuhinChakr: Honored to get the outstanding position paper award at @icmlconf :) Come attend my talk and poster tomorrow on human center…
0
16
0
@tuvllms
Tu Vu
12 days
Our independent evaluation on reasoning over conflicting evidence with SEAL-0 shows that Grok 4 is a strong model, though its performance gaps with other frontier models like Gemini-2.5-Pro and o3-pro are small.
@thinhphp_vt
Thinh
12 days
We just evaluated Grok 4 on our SEAL-0 dataset.👍Try it:
0
2
20
@tuvllms
Tu Vu
12 days
RT @hardmaru: There’s a secret code if you observe the authors’ first initials in the order of authorship: “GEMINI MODELS CAN THINK AND GE…
0
29
0