Rajat Monga Profile
Rajat Monga

@rajatmonga

Followers
13K
Following
1K
Media
5
Statuses
624

Inference @ Microsoft, Past: Founder TensorFlow, Inference IO

Joined November 2008
@rajatmonga
Rajat Monga
1 year
LSTMs were the early language models we scaled up with DistBelief back in 2013, well before TensorFlow, great to see a retake on that combining it with newer ideas. Evolution at work!
@hardmaru
hardmaru
1 year
Sepp Hochreiter giving a keynote talk at #NeurIPS2024 about xLSTM having key structural advantages such as very fast inference speed and high parameter efficiency compared to flash attention transformers and state-space models. xLSTM resources: https://t.co/jgMY8j2xLe
0
0
10
@rajatmonga
Rajat Monga
2 years
Great work by ONNX Runtime Web team enabling Whisper in the browser!
@xenovacom
Xenova
2 years
It's finally possible: real-time in-browser speech recognition with OpenAI Whisper! 🤯 The model runs fully on-device using Transformers.js and ONNX Runtime Web, and supports multilingual transcription across 100 different languages! 🔥 Check out the demo (+ source code)! 👇
0
0
5
@rajatmonga
Rajat Monga
2 years
Each language has a place and time. Java brought value in speeding up project times, and enabling more developers. Spark in C++ (Photon) is a lot more performant, but not many devs can do that well. Now folks want to rewrite golang (new Java) code in Rust (new C++)!
@lemire
Daniel Lemire
2 years
When Java became popular, people (me included) claimed that it was massively better than C/C++. This was highly controversial and people mocked me for using Java. I was hammered by the referees during my first grant application for picking Java as my language of choice. In some
2
2
6
@rajatmonga
Rajat Monga
2 years
Is the era of massive AI model growth over? We got the last 1000X from better compute & smaller number formats. The path to the next 1000X isn't so clear... https://t.co/kHChv8cgIB
0
2
2
@rajatmonga
Rajat Monga
3 years
Awesome result with #AlphaTensor. Anything we can gamify, DeepRL is ready to go.
@GoogleDeepMind
Google DeepMind
3 years
Today in @Nature: #AlphaTensor, an AI system for discovering novel, efficient, and exact algorithms for matrix multiplication - a building block of modern computations. AlphaTensor finds faster algorithms for many matrix sizes: https://t.co/E18DezRPTL & https://t.co/SvHgsa0SNV 1/
0
1
5
@rajatmonga
Rajat Monga
3 years
Small changes => big returns. Find the right leverage points.
@Neeva
Neeva
3 years
1/ Expensive A100 GPUs being underutilized due to CPU bottlenecks? 💰 TensorRT speedup being held back by a busy Python thread? 🐍 Learn more about our journey getting a 20-40% boost by removing CPU as a bottleneck when applying LLMs to millions of pages in our web index. 🧵
0
1
7
@rajatmonga
Rajat Monga
3 years
Next are AR and VR but they have to wait for the next decade.
@rajatmonga
Rajat Monga
3 years
AI enables entirely new experiences. Just as Cloud and Mobile did over the last two decades. This is the decade of AI.
1
1
6
@rajatmonga
Rajat Monga
3 years
AI enables entirely new experiences. Just as Cloud and Mobile did over the last two decades. This is the decade of AI.
@byersblake
Blake Byers
3 years
At Google Venture a decade ago we searched for AI enabled companies and came up dry. That has changed. AI is going to eat software companies. Primarily because it creates entire new UX that incumbents can’t adopt without breaking their product. 10 year hypercycle just started.
0
1
5
@rajatmonga
Rajat Monga
3 years
We thought we were onto something when we were building DistBelief and writing this paper and we were. Amazing looking back a decade later. Great working with @JeffDean @AndrewYNg @quocleix and the whole Brain Team.
@JeffDean
Jeff Dean
3 years
Honored that our 2012 paper "Building High-level Features Using Large Scale Unsupervised Learning" received an @icmlconf Test of Time Award honorable mention! Joint work with @quocleix, @MarcRanzato, @RajatMonga, Matthieu Devin, Kai Chen, @greg_corrado, myself, & @AndrewYNg.
2
0
34
@rajatmonga
Rajat Monga
3 years
The next bump up ↗️ in *maternal mortality* is here. Time to bring it down ↘️
0
0
1
@rajatmonga
Rajat Monga
4 years
SGD is the worst type of optimizer — except for all the others that have been tried.
@ylecun
Yann LeCun
4 years
I've been trying to convince many of my more theory-oriented colleagues of the unbelievable power of gradient descent for close to 4 decades. 1/2
0
0
4
@rajatmonga
Rajat Monga
4 years
"You have to be willing to open black boxes" True for all systems as you scale. Glad @Neeva is talking about what's under the hood.
@Neeva
Neeva
4 years
1/ Building a search index requires processing lots of documents. Systems like @ApacheSpark are great, but require love and attention to detail at scale. You have to be willing to open black boxes to run the engines smoothly. A few Learnings 🧵
0
0
1
@rajatmonga
Rajat Monga
4 years
TIL.
0
0
0
@rajatmonga
Rajat Monga
4 years
We need solutions, not just "no gun control"
nytimes.com
U.S.
0
1
4
@rajatmonga
Rajat Monga
4 years
Data is the attractor here. Apps will be where the data is, not because it is the best solution, but because of the layers of governance and security that people get comfortable with. All data apps will run on one of 5 data platforms - 3 clouds + Snowflake + Databricks.
@sarahcat21
Sarah Catanzaro
4 years
There’s still chatter of building apps on the data warehouse - but the DW still provides a suboptimal experience when doing so many forms of analyses (ML, scenario planning, graph analysis, causal inference).
0
0
1
@rajatmonga
Rajat Monga
4 years
Thoughtful and elucidating. Ideas are indeed Powerful, and hence, can also be Dangerous.
@yishan
Yishan
4 years
I've now been asked multiple times for my take on Elon's offer for Twitter. So fine, this is what I think about that. I will assume the takeover succeeds, and he takes Twitter private. (I have little knowledge/insight into how actual takeover battles work or play out) (long 🧵)
0
0
3
@olliecarroll
Oliver Carroll
4 years
As Russians look to finally take Mariupol - amid unconfirmed claims of chemical weapons use - it's worth reflecting that a Facebook group for relatives looking for missing loved ones now has 140k members.
104
3K
8K
@rajatmonga
Rajat Monga
4 years
Agree with @levie on this one wholeheartedly. Call me a slow learner but I don't see enough real issues being solved with Web3 for all the hype.
@levie
Aaron Levie
4 years
@mmasnick @clamentjohn @jack @alphabreacher I just don’t agree with the main premise. We have more protocols than ever before. Maybe there’s an issue of identity portability, but I don’t think that practically solves that much (without introducing a new set of issues).
0
1
4
@rajatmonga
Rajat Monga
4 years
"Two Refugees, Both on Poland’s Border. But Worlds Apart." https://t.co/N300ohnJFa Albagir was punched in the face, called racial slurs ... Katya wakes up every day to a stocked fridge and fresh bread on the table ...
0
0
7
@rajatmonga
Rajat Monga
4 years
Nailed it.
@esammer
Eric Sammer
4 years
Being an engineer-turned-CEO is jealously watching engineers hack on stream processing systems while you edit a privacy policy.
0
1
4