Delip Rao e/σ

@deliprao

Followers
46,493
Following
4,915
Media
4,629
Statuses
50,736

Busy inventing the shipwreck. @Penn . Past: @johnshopkins , @UCSC , @Amazon , @Twitter ||Art: #NLProc , Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

NYC, 🇺🇸🇮🇳🇹🇼🏳️‍🌈
Joined October 2008
@deliprao
Delip Rao e/σ
2 years
At 2100+ (not a typo) pages 🤯, this is almost all the Math you need for Machine Learning without dumbing it down! PDF link:
Tweet media one
169
2K
11K
@deliprao
Delip Rao e/σ
9 months
lol at @Reuters for labeling India as “Other” 🖕
Tweet media one
138
847
7K
@deliprao
Delip Rao e/σ
5 months
NEWS: Apple just entered the AI open source arena by quietly releasing their new DL framework called MLX! It runs code natively on Apple Silicon with a single pip install and no other dependencies. Sharing what I discovered from this initial release:
Tweet media one
105
993
7K
@deliprao
Delip Rao e/σ
6 months
Humans seeing sparks of AGI in LLMs
121
971
7K
@deliprao
Delip Rao e/σ
20 days
Label the green and purple ovals.
Tweet media one
@BronskiJoseph
Joseph Bronski
7 months
Can you predict IQ based on grades? One meta-analysis suggests the answer is yes. It found that the correlation between IQ and grades was r = 0.54. That means a scatterplot of the data would look like this:
Tweet media one
234
111
1K
987
248
5K
@deliprao
Delip Rao e/σ
4 months
AI researcher tuning hyperparameters of their LLM
68
672
5K
@deliprao
Delip Rao e/σ
2 years
How much of math is “scary” because people are primed to think that way? I taught some really advanced math concepts to my pre-teen nephew like we were learning the next letter of the English alphabet and it went swimmingly.
Tweet media one
116
465
4K
@deliprao
Delip Rao e/σ
5 months
ML researcher vs. AI researcher
Tweet media one
32
175
4K
@deliprao
Delip Rao e/σ
9 months
Because division is AI? 🧐
Tweet media one
88
137
3K
@deliprao
Delip Rao e/σ
3 months
When Linus tried GPT-4 for coding for the first and only time:
Tweet media one
50
207
3K
@deliprao
Delip Rao e/σ
5 months
I have been testing mistral-medium & GPT-4’s code generation abilities for non-trivial problems. These are problems even experienced engineers will take time to work out. I am summarizing some examples and my overall impression in this thread: 🧶
97
313
3K
@deliprao
Delip Rao e/σ
2 years
Don't confuse these two
Tweet media one
22
498
3K
@deliprao
Delip Rao e/σ
4 years
Going from Python ASTs to LaTeX!
Tweet media one
Tweet media two
Tweet media three
28
629
3K
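The screenshots from the tweet above are not available here, so the following is only a minimal sketch of the general idea, not the tool shown in the tweet. It assumes the approach is to walk a tree produced by Python's standard `ast` module and emit a LaTeX string for each node. The helper names (`to_latex`, `expr_to_latex`) are hypothetical, and only a small subset of expression nodes is handled.

```python
import ast


def _paren(text: str) -> str:
    """Wrap a rendered sub-expression in LaTeX parentheses."""
    return rf"\left({text}\right)"


def _is_sum(node: ast.AST) -> bool:
    """True for a + b or a - b, which need parentheses inside a product."""
    return isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub))


def to_latex(node: ast.AST) -> str:
    """Recursively render a small subset of Python expression nodes as LaTeX."""
    if isinstance(node, ast.Expression):
        return to_latex(node.body)
    if isinstance(node, ast.Constant):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp):
        left, right = to_latex(node.left), to_latex(node.right)
        op = node.op
        if isinstance(op, ast.Div):
            # \frac groups its arguments, so no extra parentheses are needed.
            return rf"\frac{{{left}}}{{{right}}}"
        if isinstance(op, ast.Pow):
            if isinstance(node.left, ast.BinOp):
                left = _paren(left)
            return f"{left}^{{{right}}}"
        if isinstance(op, ast.Mult):
            if _is_sum(node.left):
                left = _paren(left)
            if _is_sum(node.right):
                right = _paren(right)
            return rf"{left} \cdot {right}"
        if isinstance(op, (ast.Add, ast.Sub)):
            if isinstance(op, ast.Sub) and _is_sum(node.right):
                right = _paren(right)
            symbol = "+" if isinstance(op, ast.Add) else "-"
            return f"{left} {symbol} {right}"
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "sqrt"
        and len(node.args) == 1
    ):
        return rf"\sqrt{{{to_latex(node.args[0])}}}"
    # Anything outside the supported subset is rejected explicitly.
    raise ValueError(f"unsupported node: {ast.dump(node)}")


def expr_to_latex(source: str) -> str:
    """Parse a Python expression string and render it as LaTeX."""
    return to_latex(ast.parse(source, mode="eval"))


print(expr_to_latex("(a + b) / (2 * c)"))  # \frac{a + b}{2 \cdot c}
```

Unary minus, comparisons, and function calls other than `sqrt` raise `ValueError` in this sketch. A fuller converter would add one branch per node type.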
@deliprao
Delip Rao e/σ
3 years
We finally have a word for people who are experts in AI, immunology, and Afghanistan all at once.
Tweet media one
102
810
2K
@deliprao
Delip Rao e/σ
8 months
People who believe in this graph have not actually spent significant time coding (with and without copilot).
@mustafasuleyman
Mustafa Suleyman
8 months
the most important graph in software rn. perhaps the most important graph for human productivity in decades.
Tweet media one
127
314
2K
121
125
2K
@deliprao
Delip Rao e/σ
9 months
All coding projects have two parts: 1. The fun part: where you get to "create" 2. The pain part: where you have to debug Code LLMs are "automating" the fun parts while introducing bugs and not helping much with debugging. As a developer, you’re left with more pain to deal with.
147
249
2K
@deliprao
Delip Rao e/σ
3 years
China using AI models of 3D spatial data to give feedback to their athletes. Not sure if this is the reason for their victories, but when you are competing at those levels, every bit makes a huge difference!
48
529
2K
@deliprao
Delip Rao e/σ
1 year
Would’ve been cool if the dx was not under the radical. Maybe the password is “nonsense”?
Tweet media one
77
104
2K
@deliprao
Delip Rao e/σ
10 months
Tweet media one
@elonmusk
Elon Musk
10 months
𝕏 monthly users reach new high in 2023
Tweet media one
12K
13K
195K
16
122
2K
@deliprao
Delip Rao e/σ
5 months
This is huge! Now watch the LLM API costs dropping even further. [.cn PDF link]
Tweet media one
45
229
2K
@deliprao
Delip Rao e/σ
4 months
Crazy AF. Paper studies @_akhaliq and @arankomatsuzaki paper tweets and finds those papers get 2-3x higher citation counts than control. They are now influencers 😄 Whether you like it or not, the TikTokification of academia is here!
Tweet media one
65
284
2K
@deliprao
Delip Rao e/σ
9 months
India's lunar mission cost approximately $90 million*, considerably less than what some AI startups have raised as "seed" rounds. Let that sink in.
52
115
2K
@deliprao
Delip Rao e/σ
4 years
overfitting
Tweet media one
11
366
1K
@deliprao
Delip Rao e/σ
1 year
the success of ChatGPT has led to a cottage industry of thoughtbois making vacuous hype threads
Tweet media one
45
63
1K
@deliprao
Delip Rao e/σ
9 months
Drop everything you are doing!! Alex Graves pushed a paper on arXiv, so nothing could be more important than reading it. First thing I did was go look for any comments in the TeX file. Unfortunately, it’s all been scrubbed.
Tweet media one
24
162
1K
@deliprao
Delip Rao e/σ
4 years
Interesting story from the interwebs. 26yo Stephen Wolfram was unhappy at being treated badly at the IAS. He wanted to start his own institute to continue his research and asked his ex-colleague, Feynman, for advice. Feynman’s reply:
Tweet media one
21
218
1K
@deliprao
Delip Rao e/σ
10 months
One way to disclose you never studied physics seriously is to recommend “Feynman’s lectures” as a good physics book. Those who are the real deal know it is, and always will be, Landau-Lifshitz.
115
78
1K
@deliprao
Delip Rao e/σ
3 years
You all are doing machine learning wrong. @OScharenborg shows how it’s really done!
Tweet media one
22
212
1K
@deliprao
Delip Rao e/σ
1 year
OAI will discontinue support for Codex models starting March 26. And just like that, all papers and ideas built atop codex (> 200 on ArXiv) will not be replicable or usable as is. Do we still think openness doesn’t matter?
Tweet media one
Tweet media two
48
222
1K
@deliprao
Delip Rao e/σ
6 months
🚨 BREAKING: Introducing a new language model with ZERO hallucinations! Best of all, it’s easy to implement, easy to deploy, safe, and low resource!
Tweet media one
78
68
1K
@deliprao
Delip Rao e/σ
2 years
@LongFormMath I think it implies you’ve cultivated a good level of safety with the students for them to be comfortable sharing this with you.
2
4
1K
@deliprao
Delip Rao e/σ
5 months
What current students think of AI literature published before 2018.
37
107
1K
@deliprao
Delip Rao e/σ
1 year
What an absurd line of reasoning here. I think he left out Nvidia for providing the GPUs, the power companies for supplying power, and the construction company for building the data center.
Tweet media one
103
48
1K
@deliprao
Delip Rao e/σ
2 years
When you understand the technique/algorithm/modeling, but have no clue about the domain.
Tweet media one
24
140
1K
@deliprao
Delip Rao e/σ
9 months
imagine replying “ship something better” to the person who leads Pixel’s computational photography ("AI" if you prefer) that's raising the bar of what to expect from a camera.
@paulg
Paul Graham
9 months
@docmilanfar Sam is crushing you guys at AI. Ship something better and then you can make fun of him.
59
130
5K
37
46
996
@deliprao
Delip Rao e/σ
2 years
Introducing Elon-sampling
Tweet media one
47
78
995
@deliprao
Delip Rao e/σ
4 years
LaTeX is amazing and fun, but it hit me today why I love it. It’s productive procrastination. As you work on your document, you feel like you are making progress but it’s not the progress you want and you’re okay with that as you have no idea about the other kind of progress.
24
79
970
@deliprao
Delip Rao e/σ
2 months
I have long maintained LLMs make the poor performers mediocre, the average slightly above average, but do not change, and maybe hinder, the performance of top performers. Here’s a result from a university-level physics coding task.
Tweet media one
41
157
934
@deliprao
Delip Rao e/σ
1 year
History class: This 2015 paper is the mother of all LM-based pre-training approaches, including GPT, but few are aware of it. GPT (Radford et al 2018) was a direct application of Transformers (Vaswani et al 2017) to the result in this paper (w/ lot of work & insight ofc).
Tweet media one
13
125
931
@deliprao
Delip Rao e/σ
2 years
One small patch from Intel, and all your scikit-learn operations will run anywhere between 10-100x faster. Just make sure to import and run it before any scikit-learn imports. Details:
Tweet media one
8
136
913
@deliprao
Delip Rao e/σ
3 years
In generative deep learning, if your model works it’s a “product”. If it doesn’t, it’s “art”.
9
81
891
@deliprao
Delip Rao e/σ
20 days
@demaria_michael Pit the greens and purples for your problem, and select winners.
7
5
882
@deliprao
Delip Rao e/σ
7 months
First time realizing Good Will Hunting whiteboard stuff was graph theory 101 homework problems.
@PhysInHistory
Physics In History
7 months
Guess the movie ✍️
Tweet media one
173
84
966
9
25
861
@deliprao
Delip Rao e/σ
1 month
Startups are hiring influencers to peddle bullshit. This is a new kind of misinformation to peddle snake oil AI products with no backing. When you dig in, you see the growth rate has been hovering around 8.5%, with leading and lagging indicators for March showing a -16% drop.
Tweet media one
@8teAPi
Ate-a-Pi
1 month
I just heard that Perplexity is growing at 40% per month. If they keep it up they’re going to start making a real dent by the end of the year.
23
11
146
57
60
860
@deliprao
Delip Rao e/σ
2 years
What really is going on with Twitter’s topic models?
Tweet media one
68
60
837
@deliprao
Delip Rao e/σ
27 days
@jennsun I want to know how this plot was made
Tweet media one
9
11
847
@deliprao
Delip Rao e/σ
2 months
Reminder: this was (part of) the team that thought GPT-2 was too dangerous to release, and now they are making models stronger than GPT-4 available on AWS for anyone with an Amazon account to use. This is why I have little trust in “AI safety” claims by Anthropic/OpenAI. It all…
@ajassy
Andy Jassy
2 months
Congrats to Dario and the @AnthropicAI team on their new Claude 3 family of models. Very impressive benchmarks, and excited to have all of them coming to Amazon Bedrock (w/ Sonnet avail today). Many AWS customers are already building with Anthropic’s foundation models, and…
20
77
697
27
82
827
@deliprao
Delip Rao e/σ
9 months
“Affordable RLHF for all” ❤️ It’s almost like an openly rebellious group at MSR have decided to subvert Microsoft’s investments in ClosedAI.
Tweet media one
14
129
792
@deliprao
Delip Rao e/σ
10 months
PSA: Please don’t do this in your code for _any_ keys, especially if you use 3rd-party libraries. I just discovered some 3rd-party code exporting all environment vars for telemetry and debugging. The increasing adoption of DevOps tools/libs can lead to key leaks with this code.
Tweet media one
33
75
781
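The screenshot from the tweet above is not available here, so the exact pattern being warned against is an assumption: the sketch below assumes it is placing an API key in the process environment, where any library in the same process can read it. It shows two hedged mitigations using only the standard library: reading a secret from a file instead of `os.environ`, and building a scrubbed copy of the environment to hand to child processes. The names `SENSITIVE_MARKERS`, `scrubbed_environ`, `load_secret`, and `EXAMPLE_API_KEY` are hypothetical.

```python
import os
from pathlib import Path

# Substrings that suggest a variable holds a secret (an assumed heuristic, not a standard).
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def scrubbed_environ(environ=None):
    """Return a copy of the environment without variables whose names look like secrets."""
    source = os.environ if environ is None else environ
    return {
        name: value
        for name, value in source.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


def load_secret(path):
    """Read a secret from a file so it never has to be stored in os.environ."""
    return Path(path).read_text(encoding="utf-8").strip()


example = {"PATH": "/usr/bin", "EXAMPLE_API_KEY": "not-a-real-key"}
print(scrubbed_environ(example))  # {'PATH': '/usr/bin'}
```

A scrubbed mapping like this can be passed as the `env` argument of `subprocess.run`, and the loaded secret can be handed to a client as an explicit argument. Code running in the same process can still read `os.environ` directly, which is why keeping keys out of it matters in the first place.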
@deliprao
Delip Rao e/σ
1 year
I cannot understand why folks are selling $GOOG because of the Bard “demo mistake”. ChatGPT and other equivalent models make similar mistakes a million times a day. A quick day trade here would be buy & hold GOOG until it bounces back in a couple days once this hysteria is over.
60
26
742
@deliprao
Delip Rao e/σ
8 months
So let me get this straight. It’s dangerous to have powerful open source models but okay to make the similar or more powerful models available via AWS?
48
48
726
@deliprao
Delip Rao e/σ
4 years
Stroke to stroke translation model using YouNet.
3
157
730
@deliprao
Delip Rao e/σ
3 months
So @karpathy leaves OpenAI, which means we are getting some nice YouTube videos?
23
24
729
@deliprao
Delip Rao e/σ
2 months
Meanwhile, Sergey is out talking about debugging Gemini looking like someone who actually spent the night debugging. Google rumors are overstated.
Tweet media one
@annabelstrauss
Annabel Strauss
2 months
Zuck’s personal stylist and PR team are doing the lord’s work. Give these people a raise.
Tweet media one
Tweet media two
Tweet media three
25
28
675
19
39
704
@deliprao
Delip Rao e/σ
5 months
The Untold History of Deep Learning is yet to be written. After reading this Mikolov post, I will never again teach the Seq2Seq paper the same way. Also, doubt people and organizations when they make proclamations about working for the benefit of humanity.…
Tweet media one
19
101
702
@deliprao
Delip Rao e/σ
2 years
Just discovered @rasbt’s handy watermark module. Just pip install watermark, and you can do this! Very useful for filing GitHub issues and other bug reports.
Tweet media one
3
77
700
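The screenshot from the tweet above is not available here. As a rough illustration of the kind of information that helps in a bug report, the sketch below builds a similar environment summary using only the standard library. It is not the watermark package and does not reproduce its output format; `environment_report` is a hypothetical helper.

```python
import platform
from importlib import metadata


def environment_report(packages=()):
    """Summarize the interpreter, OS, and selected package versions as text."""
    lines = [
        f"Python implementation: {platform.python_implementation()}",
        f"Python version       : {platform.python_version()}",
        f"OS                   : {platform.system()} {platform.release()}",
        f"Machine              : {platform.machine()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing the whole summary.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report(["pip"]))
```

Pasting the returned text into an issue gives maintainers the interpreter and package versions without a round of follow-up questions.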
@deliprao
Delip Rao e/σ
1 month
If this were a science paper, you would expect a country that picks its science workforce at random as a “weak baseline” and a leading nation like the US to actively experiment towards state-of-the-art, or at least beat the baseline. Not providing a guaranteed path for…
@rdesh26
Desh Raj
1 month
H1B lottery ❌ It was less than a 1 in 3 chance, but sucks anyway!
116
47
1K
34
123
686
@deliprao
Delip Rao e/σ
5 years
Just days away before the book hits the press. Finalizing the cover was one of the last few things to do. #NLProc
Tweet media one
Tweet media two
22
101
676
@deliprao
Delip Rao e/σ
1 year
OpenAI released their ChatGPT API today. Here’s a deep dive: 1. It’s not only a new model, but a new endpoint. Notice the model name says, “gpt-3.5-turbo”. Turbo model is something the paid ChatGPT users (“PLUS”) got a preview a week or so ago.
Tweet media one
16
101
672
@deliprao
Delip Rao e/σ
2 years
that urge to drop everything and go do a math PhD on an esoteric topic in a quiet and beautiful countryside campus.
10
41
665
@deliprao
Delip Rao e/σ
3 years
Personal news: I am beyond thrilled to start as an Entrepreneur in Residence at @allen_ai beginning today, where I will be building magical consumer products incorporating cutting-edge speech research and NLP!
42
9
666
@deliprao
Delip Rao e/σ
1 year
ChatGPT pricing: between $0 and $42, there exists a power law of customers willing to pay something in that range. That’s a big gap this pricing creates. The bigger this gap, the more incentive for competitors (they exist) to accelerate their efforts to fill it, and OpenAI will be forced to reprice.
Tweet media one
65
67
645
@deliprao
Delip Rao e/σ
10 months
This is another piece of ill-thought, fear-mongering scientific disinformation about LLMs, and I will explain why in this long thread. 🧶
@_aidan_clark_
Aidan Clark
10 months
I flip-flop on how bad releasing model weights is, but what is clear to me is that we're in a honeymoon period before something bad happens like mass social manipulation and surely Meta is gonna regret making "we let anyone use our great models for anything" a selling point.
60
12
169
6
158
649
@deliprao
Delip Rao e/σ
4 months
walking into 2024 be like
Tweet media one
15
51
638
@deliprao
Delip Rao e/σ
4 years
arXiv: "The World as a Neural Network -- We discuss a possibility that the entire universe on its most fundamental level is a neural network." Me: Yeah, and 2020 is the year of NaNs
Tweet media one
26
129
625
@deliprao
Delip Rao e/σ
3 years
“Must have 4+ years of applied NLP experience using RoBERTa or BERT” — #nlproc jobs as seen on LinkedIn
Tweet media one
19
68
628
@deliprao
Delip Rao e/σ
2 years
Mathematics Wikipedia is endless fun
Tweet media one
16
59
616
@deliprao
Delip Rao e/σ
4 months
I absolutely love this idea of putting negative results in gray. This is a gift you can give to readers for the low effort of throwing in a few annotations while writing. Also, the gray with black is not jarring at all.
Tweet media one
17
68
588
@deliprao
Delip Rao e/σ
30 days
From @chris_j_paxton . Apparently OpenAI is hitting content farms hard. This is why being open about what is going into your models is so important.
Tweet media one
17
61
618
@deliprao
Delip Rao e/σ
1 year
Where my Indian friends at!? I have missed all the sweet nepo deals! Also where do I go to collect my minority perks?? Feeling livid I missed out on whatever memos my Indian friends are sending themselves while I’ve been slugging it out for the past 15+ years and paying fat taxes
@LanaLokteff
Lana
1 year
Indians leave their country of 1.38 billion, come to America/the west and claim minority status to get all the perks. Then they practice nepotism and get ahead on the shoulders of Europeans while we get attacked. Isn't 'diversity' great?
2K
660
4K
23
36
605
@deliprao
Delip Rao e/σ
5 years
Recommendation: If you use #Pandas for pre-processing data before model training, a minimal-effort way to parallelize your .apply calls is to use the swifter lib. Swifter automatically splits your data, runs the work with multiprocessing, and returns the result. #Python
Tweet media one
8
120
603
@deliprao
Delip Rao e/σ
6 months
If you see people flaunting ⏸️ or ⏹️ emojis in their profile, keep in mind that this is the level of their understanding of AI, Biology, and the interaction of the two.
Tweet media one
79
45
608
@deliprao
Delip Rao e/σ
3 months
Academia: Linus Torvalds:
Tweet media one
4
48
598
@deliprao
Delip Rao e/σ
5 months
at this point we have moved away from science to a d*ck measuring contest.
Tweet media one
42
36
602
@deliprao
Delip Rao e/σ
4 years
#deepfakes for cats. Cats can’t trust what’s real anymore.
7
138
587
@deliprao
Delip Rao e/σ
8 months
What’s happening?
Tweet media one
40
56
593
@deliprao
Delip Rao e/σ
5 months
I’ve tested many codegen models and GPT-4 was a clear winner until now. Congratulations to @MistralAI for bringing in mistral-medium as a strong competitor for code generation tasks.
4
20
585
@deliprao
Delip Rao e/σ
4 months
A new LLM leaderboard from @allen_ai dropped! CommonGen tests a model’s commonsense reasoning by generating sentences from everyday concepts over 30K concept-sets via crowd-sourcing and caption data. Look at the gap between human and GPT-4! 😱
Tweet media one
26
81
576
@deliprao
Delip Rao e/σ
3 months
Closed science companies like OpenAI and Anthropic parasitically extract value from open science and open source without giving credit to people or organizations building them. Open science with citations would’ve addressed that, but alas that’s too much to ask.
@jainprateek_
Prateek Jain
3 months
Pace of progress in AI is lightning! @OpenAI released MRL style text embeddings, mirroring our NeurIPS '22 paper (w/ awesome folks from UW and Harvard). However, as an advocate of open science, I am a bit disappointed with rebranding to "shortening embs" without ref to MRL 1/n
11
100
660
25
64
563
@deliprao
Delip Rao e/σ
2 years
Finally, a tree all computer science folks can relate to.
@histories_arch
ArchaeoHistories
2 years
Upside-down fig tree in Bacoli, Italy. "No one is quite sure how the tree ended up there or how it survived, but year after year it continues to grow downwards and bear figs." #archaeohistories
Tweet media one
319
6K
38K
4
78
558
@deliprao
Delip Rao e/σ
5 months
Mistral-Medium: “Write cuda-optimized code for generating a PyTorch Dataset of Fibonacci primes” Verdict: No-nonsense, full code, ✅
Tweet media one
Tweet media two
Tweet media three
13
30
559
@deliprao
Delip Rao e/σ
7 months
This is like insisting on baby-proofing every power tool. If anything, this is an endorsement of how versatile and unlobotomized the mistral model is as a *base model*. If you are capable of making an inference on an LLM, it is your responsibility to use it safely.
@paul_rottger
Paul Röttger
8 months
After spending just 20 minutes with the @MistralAI model, I am shocked by how unsafe it is. It is very rare these days to see a new model so readily reply to even the most malicious instructions. I am super excited about open-source LLMs, but this can't be it! Examples below 🧵
217
107
768
22
42
542
@deliprao
Delip Rao e/σ
2 years
I love my partner to death, but dear god, I made the mistake of peeking at some simulation source code he was writing. Never look at source code written by Physicists.
19
19
542
@deliprao
Delip Rao e/σ
2 years
Graduate school hack: become somebody’s first PhD student. Not joking.
@ShaiBiran
Shai Biran 🎗️
2 years
Young PI and their first #PhD student
57
373
4K
7
20
541
@deliprao
Delip Rao e/σ
5 months
I cannot wait to share my deepest secrets with my toaster. Also, Apple ML research team is putting out one banger after another. 😽
Tweet media one
14
52
537
@deliprao
Delip Rao e/σ
8 months
LLM-based company is using Upwork to hire Ph.D.-level Physicists to RLHF-train their models.
Tweet media one
23
48
538
@deliprao
Delip Rao e/σ
1 year
I have a very niche use of ChatGPT -- knock out some code, stick it in ChatGPT, and ask it to generate Python docstrings. I want to write docstrings, but I'm too lazy. ChatGPT is very good at understanding code and summarizing it, and docstring generation is a subset of that.
22
24
534
@deliprao
Delip Rao e/σ
3 months
Turns out you cannot replace decades of painstaking work optimizing around every kind of web content with an LLM and retrieval augmentation. Who could’ve guessed. 🙃
@wenquai
zachary
3 months
Perplexity only beats Google at answering questions. for general search (e.g. looking up restaurants, movies, celebrities etc) Google is still unparalleled i also much prefer Google’s instant results for quick searches
16
12
481
30
31
524
@deliprao
Delip Rao e/σ
3 years
True story. Startups pay high salaries to hire competent employees, and disempower them by not taking their suggestions. Then they invite consultants like me, who sometimes end up giving similar suggestions and get paid for it. Founders, start trusting the people you hired!
10
54
515
@deliprao
Delip Rao e/σ
8 months
Looking at TIME 100 for AI, it looks like a list of AI “influencers” with Hinton and a few other actual AI research folks randomly thrown in to give it legitimacy?
37
29
514
@deliprao
Delip Rao e/σ
4 years
For my next word sense disambiguation class #nlproc
Tweet media one
1
73
519
@deliprao
Delip Rao e/σ
3 months
Don’t join a friend’s startup. It will almost always ruin friendships. Instead, join a startup you like and make a new friend. Never fails.
Tweet media one
14
17
502
@deliprao
Delip Rao e/σ
3 years
Ha! This job title doesn’t mince words on what machine learning research has become.
Tweet media one
15
34
486
@deliprao
Delip Rao e/σ
22 days
Price for 26 minutes transcription: Humans: $39 to $130 (*) Rev AI: $6.5 (closed model) Whisper on Mac: pennies Open Source AI will upset so many company business models, but it will enable more company business models and, most importantly the individuals, leaning on it to…
@argmaxinc
argmax
22 days
WhisperKit-v0.6.0 dropped yesterday! In the demo, 200 audio files (~26 minutes) are transcribed in ~13 seconds on an M2 Ultra Mac using whisper-base. WhisperKit harnesses all available compute, roughly 60 TFlops on Mac (GPU + 2xANE). This release is our first step towards…
13
75
516
9
39
483
@deliprao
Delip Rao e/σ
4 years
Something incredible is happening at this company. 🙇‍♂️ @Twitter
Tweet media one
4
80
479
@deliprao
Delip Rao e/σ
2 years
What’s the difference between CBOW and Skip-gram, in terms of performance? Turns out nothing if you fix bugs in popular implementations. I love this genre of papers.
Tweet media one
8
58
477
@deliprao
Delip Rao e/σ
2 years
I am absolutely in love with this mouse-over stuff the @huggingface API documentation has. It clearly shows they understand their users' pain -- ML APIs have too many arguments. If you are wondering why 🤗 is so popular, little things like these go a long way in reducing friction.
1
40
478
@deliprao
Delip Rao e/σ
1 year
Anyone underestimating the power of autoregressive models has not fully figured out autoregressive models. Even people actively building autoregressive models have not figured them out fully either. Hat-tip @natfriedman
Tweet media one
16
37
474