Kalpesh Krishna Profile
Kalpesh Krishna

@kalpeshk2011

Followers
3K
Following
7K
Media
39
Statuses
1K

Staff Research Scientist @GoogleDeepMind (Gemini team), PhD from @umass_nlp. Former undergrad @IITBombay and intern @GoogleDeepMind, @Allen_AI, @TTIC_Connect.

New York, USA
Joined May 2013
@kalpeshk2011
Kalpesh Krishna
1 year
Check out our new @GoogleAI paper: we curate a mixture of 5M human judgments to train general-purpose foundational autoraters. Strong LLM-as-judge scores on RewardBench (87.8%), and highest perf among baselines on LLMAggreFact + 6 other benchmarks! 📰 https://t.co/oUN4hDeNWx 👇
arxiv.org
As large language models (LLMs) advance, it becomes more challenging to reliably evaluate their output due to the high costs of human evaluation. To make progress towards better LLM autoraters, we...
@tuvllms
Tu Vu
1 year
🚨 New @GoogleDeepMind paper 🚨 We trained Foundational Large Autorater Models (FLAMe) on extensive human evaluations, achieving the best RewardBench perf. among generative models trained solely on permissive data, surpassing both GPT-4 & 4o. 📰: https://t.co/FIPFiHwXyt 🧵:👇
6
21
124
@sundarpichai
Sundar Pichai
10 days
Our TPUs are headed to space! Inspired by our history of moonshots, from quantum computing to autonomous driving, Project Suncatcher is exploring how we could one day build scalable ML compute systems in space, harnessing more of the sun’s power (which emits more power than 100
830
2K
17K
@sundarpichai
Sundar Pichai
16 days
Just posted Q3 earnings. We delivered our first-ever $100B quarter driven by double-digit growth across every major part of our business. (Five years ago, our quarterly revenue was at $50B🚀) Our full-stack approach to AI is driving real momentum and we’re shipping at speed.
617
1K
15K
@joshwoodward
Josh Woodward
15 days
It's 4:43am PT on a Thursday, and it's launch time on @GeminiApp! So excited and thankful for our partnership with Reliance - we've just started rolling out FREE Google AI Pro plans across India for eligible Jio customers! (total value ₹35,100) Grab it in the MyJio app or
178
246
3K
@FlynonymousWX
Tropical Cowboy of Danger
18 days
A thread of videos from today’s flight into Hurricane Melissa In this first one we are entering from the southeast just after sunrise and the bright arc on the far northwest eye wall is the light just beginning to make it over the top from behind us.
356
6K
24K
@FlynonymousWX
Tropical Cowboy of Danger
18 days
Third pass through Melissa. GoPro in side window as different camera looking forward shooting in ultra high res 8k. Not sure when that might get processed as the file turned out ridiculous. Barely had HD space for it and MacBook Pro promptly choked when I tried to edit it
91
2K
10K
@CBP
CBP
2 months
Let’s set the record straight: President Trump’s updated H-1B visa requirement applies only to new, prospective petitions that have not yet been filed. Petitions submitted prior to September 21, 2025 are not affected. Any reports claiming otherwise are flat-out wrong and should
3K
2K
9K
@GoogleDeepMind
Google DeepMind
2 months
An advanced version of Gemini 2.5 Deep Think has achieved gold-medal level performance at the ICPC 2025 - one of the world’s most prestigious programming contests. 🏅 Building on the model's success in math at the IMO, this marks another historic milestone for advanced AI. 🧵
123
346
2K
@arena
lmarena.ai
3 months
🚨🍌Breaking News: Gemini-2.5-Flash-Image-Preview (“nano-banana”) by @GoogleDeepMind now ranks #1 in Image Edit Arena. In just two weeks: 🟡“nano-banana” has driven over 5 million community votes in the Arena 🟡Record-breaking 2.5M+ votes cast for this model alone 🟡It has
@GoogleDeepMind
Google DeepMind
3 months
Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model. 🤯 From photorealistic masterpieces to mind-bending fantasy worlds, you can now natively produce, edit and refine visuals with new levels of reasoning,
36
155
1K
@GoogleDeepMind
Google DeepMind
3 months
Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model. 🤯 From photorealistic masterpieces to mind-bending fantasy worlds, you can now natively produce, edit and refine visuals with new levels of reasoning,
184
539
3K
@53rdWRS
Hurricane Hunters
3 months
Last night, the 53rd Weather Reconnaissance Squadron flew into the eye of Hurricane Erin—and captured imagery of the breathtaking stadium effect. These missions provide critical data to the NHC to improve forecasts, helping keep communities safe before the storm makes
182
3K
10K
@scaling01
Lisan al Gaib
3 months
Gemini 2.5 Pro has a 67% winrate against GPT-5 Thinking
58
58
1K
@demishassabis
Demis Hassabis
3 months
One word: relentless. just in the past two weeks, we’ve shipped: 🌐 Genie 3 - the most advanced world simulator ever 🤔 Gemini 2.5 Pro Deep Think available to Ultra subs 🎓 Gemini Pro free for uni students & $1B for US ed 🌍 AlphaEarth - a geospatial model of the entire planet
504
972
10K
@demishassabis
Demis Hassabis
3 months
Genie 3 is here - it can generate an entire world simulation that you can interact with in real-time, just from a text prompt! It's pretty mind-blowing really when you stop to think about it, and it's rapidly improving - one day we will be able to build the Holodeck for real!
251
735
5K
@GoogleDeepMind
Google DeepMind
3 months
For researchers, scientists, and academics tackling hard problems: Gemini 2.5 Deep Think is here. 🤯 It doesn't just answer, it brainstorms using parallel thinking and reinforcement learning techniques. We put it into the hands of mathematicians who explored what it can do ↓
140
491
3K
@yixiao_song
Yixiao Song
4 months
🎉 BearCubs 🐻 is headed to #COLM2025! We benchmarked the newly released ChatGPT Agent and uncovered some exciting insights. Looking forward to sharing our findings in Montreal this October and giving a sneak peek at REALM @ACL2025NLP in Vienna! 🎻
@MohitIyyer
Mohit Iyyer
4 months
ChatGPT Agent is a huge step up on BearCubs, esp on multimodal/interactive tasks (e.g., playing web games)! It gets 65.8% accuracy vs Deep Research's 36% and Operator's 23%. Humans are at ~85%, and clearly better/faster at fine control & complex filtering.
0
1
17
@GoogleDeepMind
Google DeepMind
4 months
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
157
740
4K
@ChelseaFC
Chelsea FC
4 months
THE TROPHY IS OURS!!! 🔵
3K
22K
152K
@deedydas
Deedy
5 months
AI now beats every single human in the hardest college entrance exam in India, the IIT JEE. Bytedance silently published this result this week. The top scorer was Rajit Gupta with 332/360, but Google's Gemini 2.5 Pro was at rank 1 with 336/360.
151
349
4K
@GoogleDeepMind
Google DeepMind
5 months
Hot Gemini updates off the press. 🚀 Anyone can now use 2.5 Flash and Pro to build and scale production-ready AI applications. 🙌 We’re also launching 2.5 Flash-Lite in preview: the fastest model in the 2.5 family to respond to requests, with the lowest cost too. 🧵
48
134
1K
@joshwoodward
Josh Woodward
5 months
🔥Veo 3 keeps growing like crazy. To keep up, we’re introducing Veo 3 Fast in @GeminiApp and Flow. It’s >2x faster, has the same 720p resolution, and a bunch of serving optimizations. The big headline: we can serve more of it, even for the Yetis! How to get started: 1) Get a
96
111
849