Aditya Kusupati

@adityakusupati

Followers
2,825
Following
1,523
Media
95
Statuses
1,478

🔬PhD.. @uwcse : @RAIVNLab ; Been places..... Done things....

Seattle
Joined March 2012
@adityakusupati
Aditya Kusupati
7 months
Announcing MatFormer - a nested🪆(Matryoshka) Transformer that offers elasticity across deployment constraints. MatFormer is an architecture that lets us use 100s of accurate smaller models that we never actually trained for! 1/9
Tweet media one
7
110
618
@adityakusupati
Aditya Kusupati
2 years
Introducing🪆Matryoshka Representations for Adaptive Deployment🪆 TL;DR: up to 14× lower real-world classification & retrieval costs at web-scale at no loss in accuracy & w/o any overhead across setups. Paper: Code: [1/11]
Tweet media one
6
101
493
@adityakusupati
Aditya Kusupati
11 months
Introducing💃AdANNS: A Framework for Adaptive Semantic Search🕺 TL;DR: Up to 90× faster nearest neighbor retrieval and 2× lower memory cost for web-scale search. Applies to vector search at scale & improves all "retrieval" augmented models! [1/8]
Tweet media one
5
97
487
@adityakusupati
Aditya Kusupati
3 months
🤯WOW🪆Matryoshka Representation Learning enables "native support for shortening embs" &"very flexible usage" Jokes aside, excited that @OpenAI serves MRL by default in v3 embedding API for retrieval & RAG! Other models & services should catch-up soon😄
Tweet media one
6
48
371
@adityakusupati
Aditya Kusupati
3 months
To be clear, we are very happy that @OpenAI adopted it & now even more people will continue to innovate on it. Products should use academic research! However, from the pov of a grad student, it would have meant so much to us if there was attribution. Thanks for all the love 🩷
Tweet media one
@jainprateek_
Prateek Jain
3 months
Pace of progress in AI is lightning! @OpenAI released MRL style text embeddings, mirroring our NeurIPS '22 paper (w/ awesome folks from UW and Harvard). However, as an advocate of open science, I am a bit disappointed with rebranding to "shortening embs" without ref to MRL 1/n
11
101
660
7
15
263
@adityakusupati
Aditya Kusupati
4 years
PSA(2) - prospective PhD applicants in CS. Right time to share the "One-stop" destination for all grad application related stuff; maintained by the amazing @kalpeshk2011 & will cover 99% of your bases/Qs. Overwhelming? I will highlight the unmissables👇
4
49
192
@adityakusupati
Aditya Kusupati
3 months
Wohooooooooooooooooo! Thank you all!!!!
@owencm
Owen Campbell-Moore ✪
3 months
@jainprateek_ @OpenAI Hey Prateek! We did train this based on MRL - I was responsible for the blog post and it’s my bad not thinking / remembering to cite. We’re updating the blog post to add a citation now!
11
19
268
7
4
189
@adityakusupati
Aditya Kusupati
5 months
📢📢At the last minute, I decided to go on the job market this year!!! Grateful for RTs & promotion at your univ.😇 CV & Statements: Will be at #NeurIPS2023 ! presenting AdANNS, Priming, Objaverse & MADLAD. DM if you are around, would love to catch up👋
2
49
183
@adityakusupati
Aditya Kusupati
3 years
Multi-class classification of *L* classes traditionally needs *L* classifiers. 1st thought in 2016: Can we do this w/ *log L* bits? I finally acted on it, we propose LLC that solves 1000 class ImageNet classification with **20-bits**. A thread🧵(1/8)
3
20
153
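To make the *log L* bits idea concrete, here is a minimal sketch of classifying with binary codes instead of one classifier per class. It is an illustration under stated assumptions, not the LLC method itself: the codebook is random rather than learned, and the names `min_bits`, `make_codebook`, and `decode` are hypothetical.

```python
import random

def min_bits(num_classes):
    """Fewest bits that can give every class a distinct binary code."""
    return (num_classes - 1).bit_length()

def make_codebook(num_classes, num_bits, seed=0):
    """Assign each class a distinct random code (a stand-in for a learned codebook)."""
    rng = random.Random(seed)
    return rng.sample(range(2 ** num_bits), num_classes)

def hamming(a, b):
    """Number of bit positions where two integer codes differ."""
    return bin(a ^ b).count("1")

def decode(predicted_code, codebook):
    """Return the class whose code is nearest to the predicted bits."""
    return min(range(len(codebook)), key=lambda c: hamming(predicted_code, codebook[c]))

# 1000 classes need at least 10 bits; the tweet reports using 20.
codebook = make_codebook(num_classes=1000, num_bits=20)
lower_bound = min_bits(1000)
predicted_class = decode(codebook[7], codebook)
```

A model in this setup would predict 20 bits per input, and the extra bits beyond the 10-bit lower bound leave room for nearest-code decoding to absorb some bit errors.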
@adityakusupati
Aditya Kusupati
3 months
Updated blog from OpenAI: Glad this was resolved, last couple of days have been pretty interesting! Thank you everyone for the support. Back to making my job talk, so I can actually create more things like these when I get kicked out of PhD soon😜
@adityakusupati
Aditya Kusupati
3 months
Wohooooooooooooooooo! Thank you all!!!!
7
4
189
0
7
149
@adityakusupati
Aditya Kusupati
5 months
Now that OpenAI's hierarchy is resolved, time to see if "Hierarchical" representations capture any hierarchy! @ethnlshn investigated this to find that "hierarchical" embeddings don't capture hierarchy any better than regular embeddings 🤯 Paper: 1/2
5
15
142
@adityakusupati
Aditya Kusupati
2 years
This is just amazing!!! Check out this seminar @uwcse about thriving in grad school. Great resources and a step in the right direction 🥳 Great initiative by Anna Karlin, Mike Ernst and @_pratyush_patel
1
34
143
@adityakusupati
Aditya Kusupati
3 months
🤯I still cannot believe that a runaway para (pg 9) in our old paper (LLC, NeurIPS'21) led to all the Matryoshka works🪆 At that time, I did it as I didn't have compute to train 3 models 🤣 Last couple of weeks have been surreal, thanks everyone! LLC:
Tweet media one
4
8
123
@adityakusupati
Aditya Kusupati
6 months
📢🪆MatViT-B/16 & L/16 model checkpoints & code are public - drop-in replacements that enable elastic compute for free!🔥 Try them out; let us know😉 Shout out to @kfrancischen for the release; @anuragarnab & @m__dehghani for the amazing Scenic library.
Tweet media one
@adityakusupati
Aditya Kusupati
7 months
Announcing MatFormer - a nested🪆(Matryoshka) Transformer that offers elasticity across deployment constraints. MatFormer is an architecture that lets us use 100s of accurate smaller models that we never actually trained for! 1/9
Tweet media one
7
110
618
0
23
109
@adityakusupati
Aditya Kusupati
2 years
I am visiting @berkeley_ai & @GoogleAI for the summer working on fun ideas!! I often frequent SF and south bay while staying in Berkeley. Happy to meet people who might be around. 🌉 DM or email me to grab coffee/gelato or talk research 🔥
Tweet media one
0
0
108
@adityakusupati
Aditya Kusupati
4 years
Can we learn pruning thresholds and, in-turn, layer-wise sparsity distribution? Introducing STR, Soft Threshold Reparameterization, a simple trick on weight tensors to achieve SOTA accuracy in sparse DNNs while reducing inference FLOPs significantly! 1/n
Tweet media one
Tweet media two
4
23
100
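A minimal sketch of what a soft-threshold reparameterization of a weight tensor could look like. The tweet does not give the formula, so this assumes the common form sign(w) * max(|w| - g(s), 0) with g a sigmoid of a learnable scalar `s`; the function names are hypothetical and the weights are a flat list rather than a real tensor.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def soft_threshold(weights, s):
    """Shrink every weight toward zero by sigmoid(s); weights below the threshold become 0."""
    threshold = sigmoid(s)
    return [math.copysign(max(abs(w) - threshold, 0.0), w) for w in weights]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

# With s = 0 the threshold is 0.5, so the middle weight is pruned.
layer = [1.0, -0.3, 0.75]
pruned = soft_threshold(layer, s=0.0)
layer_sparsity = sparsity(pruned)
```

Because `s` would be trained per layer alongside the weights, each layer can settle on its own threshold, which is one way a layer-wise sparsity distribution could emerge rather than being set by hand.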
@adityakusupati
Aditya Kusupati
4 years
Inspired by the other similar initiatives, the students @uwcse started Pre-Application Review Service (PARS) to provide feedback on the CS PhD application materials. Underrepresented students are strongly encouraged to submit. Deadline: 8th Nov 2020.
1
49
100
@adityakusupati
Aditya Kusupati
4 years
I love reviewing and I am glad that the conferences continue to recognize my efforts. My 3 conferences till now. #NeurIPS2020 - Top 10% #ICML2020 - Top 33% #NeurIPS2019 - Top 400 The best tier in each of them and it just gives enough energy to do it again diligently.
1
1
75
@adityakusupati
Aditya Kusupati
4 years
@thegautamkamath Didn't know having high GPA in undergrad mattered in PhD apps more than being at the "best" school. Had 1 constricted admit in '17. GPA haunted even in '19; Stanford called to ask why was it so low? IT WAS NOT! Compete with AIR top 50 to know what 8.63 is at IITB on a bell curve
3
5
67
@adityakusupati
Aditya Kusupati
29 days
📢🔥Next gen. Google text embeddings, Gecko🦎, leveraging knowledge distilled from LLMs & of course with🪆Matryoshka between 256-d and 768-d🪆! Very strong MTEB results 🥳 Google Cloud API 👉… Paper: Colab:
@leejnhk
Jinhyuk Lee
30 days
Introducing Gecko 🦎, a new text embedding model from Google DeepMind! Distilled from LLMs, Gecko offers powerful embeddings for various NLP tasks. Gecko is now available in Google Cloud API 👉 Paper: Colab:
Tweet media one
10
80
353
0
7
68
@adityakusupati
Aditya Kusupati
24 days
Gecko elastic embeddings powered by Matryoshka🪆 announced today at @Google CloudNext! Available through Cloud (Vertex) and Google AI Studio.
0
9
62
@adityakusupati
Aditya Kusupati
1 month
Nice to see the application of🪆Matryoshka representations as part of @PinterestEng 's LinkSage model used for AI systems across @Pinterest !!!
0
10
58
@adityakusupati
Aditya Kusupati
1 year
I did a thing (MRL & AdANNS -, TBD); & @GoogleAI thought it was useful & interesting to be in the blog articles on Google Research, 2022 & beyond Read the last para before conclusion. @jainprateek_ , @RAIVNLab , @ShamKakade6 , @wregss , @uwcse , @BhattGantavya
@GoogleAI
Google AI
1 year
As #DeepLearning models become more widely used, it is increasingly important that they be both robust and efficient. Today we summarize some of our many efforts to improve #ML efficiency through algorithms research. →
Tweet media one
28
119
656
2
5
58
@adityakusupati
Aditya Kusupati
3 years
. @GoogleAI India's pre-doc program is a must for anyone wanting to figure out what excites them. If it were not for MSR India's RF program, I probably would never have found my research interests. Of course @jainprateek_ is amazing, and maybe you get to work with me as well 🤪
@jainprateek_
Prateek Jain
3 years
Applications now open for pre-doctoral positions at @GoogleAI India! #predoc Come spend 2 years with our ML and Optimization team working on cutting edge problems in DL understanding, efficient ML, RL, privacy! @ManishGuptaMG1 , @divy93t , @PNetrapalli
2
32
125
3
1
58
@adityakusupati
Aditya Kusupati
3 years
Great thread!! Would have been amazing if it were out 3 months ago, but still an amazing thread for people eyeing a PhD in the long run. Apart from a master's, I would also recommend RAships, long research internships and residency programs to figure out research inclinations!
@fadeladib
Fadel Adib
3 years
I've been serving on grad admissions committees at MIT for 5 years - in EECS and Media Lab If you want to get into a PhD at a place like MIT, here's a thread with some advice based on my observations 1/13
34
552
2K
0
4
54
@adityakusupati
Aditya Kusupati
2 years
Been a safe 1.8 yrs and finally the virus found me even after the booster. Mild symptoms!
4
1
51
@adityakusupati
Aditya Kusupati
2 months
🪆embeddings play very well with quantization! MRL and quantization are complementary and users need not pick just one (a common misconception)! See more in cc: @wregss
@neumll
NeuML
2 months
Would you trade 1% of accuracy to only have to store 1% of the data? With Matryoshka Embeddings, we can drastically reduce vector dimensionality while maintaining strong accuracy. Check out this example that combines Matryoshka Embeddings with Faiss 4-bit scalar quantization 🚀
Tweet media one
3
7
36
3
7
48
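A rough sketch of why shortening and quantization stack: one cuts the number of dimensions, the other cuts the bits per dimension. This is plain Python rather than the Faiss API, the 768-d and 128-d sizes are hypothetical, and the uniform 4-bit quantizer is only an assumed stand-in for a real scalar quantizer.

```python
def quantize_4bit(vec, lo, hi):
    """Map each value to an integer code in 0..15, clipping to [lo, hi]."""
    scale = (hi - lo) / 15
    return [min(15, max(0, round((x - lo) / scale))) for x in vec]

def dequantize_4bit(codes, lo, hi):
    """Approximate the original values from their 4-bit codes."""
    scale = (hi - lo) / 15
    return [lo + c * scale for c in codes]

def storage_bytes(dim, bits_per_value):
    """Bytes needed to store one vector."""
    return dim * bits_per_value / 8

full_cost = storage_bytes(768, 32)    # full float32 embedding
small_cost = storage_bytes(128, 4)    # shortened prefix, 4-bit codes
codes = quantize_4bit([-1.0, 1.0, 2.0], lo=-1.0, hi=1.0)
```

Under these assumed sizes a vector drops from 3072 bytes to 64 bytes; how much accuracy survives depends on the model and data, which this sketch does not measure.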
@adityakusupati
Aditya Kusupati
28 days
Gecko embeddings on MTEB leaderboard🔥🔥! Best in-class for 768-d & 256-d (thanks to🪆) with a single 1B model! Read more on how it is done in the paper: And try it out on Colab:
Tweet media one
@adityakusupati
Aditya Kusupati
29 days
📢🔥Next gen. Google text embeddings, Gecko🦎, leveraging knowledge distilled from LLMs & of course with🪆Matryoshka between 256-d and 768-d🪆! Very strong MTEB results 🥳 Google Cloud API 👉… Paper: Colab:
0
7
68
0
5
48
@adityakusupati
Aditya Kusupati
2 years
Our latest work!! I will share a tweet thread soon🪆 Work co-led w/ @BhattGantavya @wregss Collab between @uwcse / @RAIVNLab & @GoogleAI Co-authors: Matt Wallingford, @adityaasinha , @RamanujanVivek , William, Kaifeng, @ShamKakade6 , @jainprateek_ and Ali Farhadi.
@_akhaliq
AK
2 years
Matryoshka Representations for Adaptive Deployment abs: flexibility within the learned Matryoshka Representations offer: (a) up to 14× smaller embedding size for ImageNet-1K classification at the same level of accuracy
Tweet media one
1
15
69
0
6
46
@adityakusupati
Aditya Kusupati
5 months
📢📢We are releasing MatFormer based OLMo ( @allen_ai ) checkpoints (up to 1.3B params & 160B tokens) and code to further open research powered by @KempnerInst cluster🔥🔥 Check out our new article summarizing MatFormer as part of the Deeper Learning blog!
@KempnerInst
Kempner Institute at Harvard University
5 months
In our latest Deeper Learning blog post, the authors introduce an algorithmic method to elastically deploy large models, the #MatFormer . Read more: #KempnerInstitute @adityakusupati @snehaark @Devvrit_Khatri @Tim_Dettmers
Tweet media one
0
12
21
1
12
43
@adityakusupati
Aditya Kusupati
3 months
Thanks for the kind words Andrew! Glad people are appreciating utility of MRL.. Been a hard battle convincing people in retrieval world to use it and improve things beyond current scale!
@ZhaiAndrew
Andrew Zhai
4 months
(1) Matryoshka embeddings (): TLDR in training given an embedding of full dimensionality M (e.g. 2048), learn N different distance functions for each prefix of the embedding (e.g. l2_norm(embedding[:32]), l2_norm(embedding[:64]), l2_norm(embedding[:128]),…
Tweet media one
2
19
172
2
1
41
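The prefix idea in the quoted tweet can be sketched at inference time as: keep the first few dimensions, re-normalize, and compare. This is a minimal illustration in plain Python; `shorten` and `cosine` are hypothetical helper names, and it assumes the embedding came from a model trained so that prefixes are meaningful.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def shorten(embedding, dim):
    """Keep the first `dim` dimensions and re-normalize the prefix."""
    if dim > len(embedding):
        raise ValueError("dim exceeds embedding size")
    return l2_normalize(embedding[:dim])

def cosine(a, b):
    """Cosine similarity of two unit-length vectors of equal size."""
    return sum(x * y for x, y in zip(a, b))

short = shorten([3.0, 4.0, 12.0], dim=2)
similarity = cosine(short, shorten([4.0, 3.0, 0.0], dim=2))
```

Re-normalizing after slicing matters because a prefix of a unit vector is generally no longer unit length.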
@adityakusupati
Aditya Kusupati
10 months
The code for AdANNS💃 is out at (primarily led by @wregss ) and has integrations with Faiss NN search library alongside native GPU/CPU implementations. Check it out and let us know!
@adityakusupati
Aditya Kusupati
11 months
Introducing💃AdANNS: A Framework for Adaptive Semantic Search🕺 TL;DR: Up to 90× faster nearest neighbor retrieval and 2× lower memory cost for web-scale search. Applies to vector search at scale & improves all "retrieval" augmented models! [1/8]
Tweet media one
5
97
487
1
10
38
@adityakusupati
Aditya Kusupati
3 months
A beautiful walk through how to enable first pass adaptive retrieval with Matryoshka Representations 🪆 Of course, you can further improve QPS if you can make ANNS take all the granularities, so check out AdANNS that makes it even faster! cc. @wregss
@ggrdson
Greg Richardson
3 months
Matryoshka embeddings allow you to "shorten" their dimensions (eg. OpenAI's v3 embeddings). But if you're like me, you probably want to know: ‣ how does shortening actually work? ‣ how are these models trained differently? ‣ can we take advantage of this in vector search?
4
18
56
2
3
36
@adityakusupati
Aditya Kusupati
2 years
Congrats to @uday_kusupati (his 1st paper in Ph.D.) and his co-authors @GCM_EPFL for winning an inaugural "Best Paper Honorable Mention" @siggraph for their work on "Umbrella Meshes: Elastic Mechanisms for Freeform Shape Deployment" 🥳⛱️☂️🌂
Tweet media one
2
1
35
@adityakusupati
Aditya Kusupati
3 years
My reviewing load started to get to me and I decided to reach out to both @CVPR and @iclr_conf (not submitting) asking for a reduced load. Both of them graciously accommodated my request (Within 5 mins of emailing). Thanks! Always ask when you might not be able to give your 100%
0
1
34
@adityakusupati
Aditya Kusupati
1 year
Happening now!! Hall J #640 MRL!!! 🪆🪆🪆🪆🪆 FTW
Tweet media one
@adityakusupati
Aditya Kusupati
1 year
Excited to be back for an in-person @NeurIPSConf conference after Feb'20!! I will be around the whole week. Feel free to DM me for catch up or a chat. Drop by MRL's 🪆 poster (Tue 11a Hall J #640 ) w/ @wregss , @adityaasinha & @ShamKakade6 . Also at @GoogleAI booth on Tuesday!
1
6
23
0
1
34
@adityakusupati
Aditya Kusupati
2 months
Had fun chatting with @CShorten30 @ZainHasan6 @zach_nussbaum on everything 🪆🪆 Do check out the podcast for the tidbits!
@CShorten30
Connor Shorten
2 months
Matryoshka embeddings are one of the most recent and powerful innovations in Vector Search! 🪆 The core idea is to learn multiple vectors of increasing lengths, which opens up all sorts of new opportunities for the mechanics of nearest neighbor search! I am SUPER excited to…
Tweet media one
7
42
160
2
4
32
@adityakusupati
Aditya Kusupati
2 years
Dear ML twitter, @wregss , a master's student at UW ECE had his @NeurIPSConf grant application denied. Does anyone have a complimentary registration that could be transferred to him? He will be applying for PhD this cycle & is terrific at getting things to work!! Thanks.
1
3
31
@adityakusupati
Aditya Kusupati
4 years
After a hectic quarter ending w/ a robotic @icmlconf video recording. Yeah, STR got in. Excited to explore new things @NVIDIAAI w/ @FidlerSanja & Antonio Torralba. A great summer in Toronto, Austria & India? - Looking forward to recreating it from Seattle
1
1
30
@adityakusupati
Aditya Kusupati
4 years
PSA - prospective PhD applicants in CS Looking for a clean CV template? I have open-sourced mine and people seem to like it😅 Template: Built this over a span of 2 years, by editing and adding to an amazing .cls file. Has most blocks we typically need.
0
4
29
@adityakusupati
Aditya Kusupati
3 years
MSR RF program is the reason I am doing an ML PhD today. It was 2 beautiful years filled with amazing learning experiences and great mentors!! I strongly recommend RF program to anyone wanting to venture into research. Thanks to @jainprateek_ @SriramRajamani among many others!!
@indrani_mthies
Indrani Medhi Thies
3 years
Microsoft Research India invites applications for its 2-year Research Fellow program. Recent graduates and Bachelors/Masters students majoring in CS or a related area may apply. Deadline Jan 15, 2021. Please RT.
6
74
219
0
2
29
@adityakusupati
Aditya Kusupati
4 years
GRE score sometimes was the reason for not making a PhD offer😕 It doesn't factor PPP & there is nothing worth INR 14K (Indian PhD student stipend: 30K for comparison) in those stupid under-equipped noisy halls. Persnickety is a GRE word which describes it so well #AbolishGRE
@mynkgoel
Mayank Goel
4 years
This is especially true for PhD programs. I don't buy the argument that it is a good initial filter. I hope @SCSatCMU removes this as a requirement (and not just for this year) and does not wait for each phd program to do this separately.
4
2
58
2
3
27
@adityakusupati
Aditya Kusupati
2 months
Go use🪆in Sentence Transformers!
@tomaarsen
tomaarsen
2 months
🔥Sentence Transformers v2.4.0 is released! It introduces Matryoshka Embedding models (training & inference), 2 new state-of-the-art loss functions, prompt templates, instructor model support & more. See the🧵
6
93
377
1
1
26
@adityakusupati
Aditya Kusupati
4 years
@gautamcgoel That deep learning should not be the immediate solution for anything.
0
1
25
@adityakusupati
Aditya Kusupati
4 years
This is the truth, the whole truth, and nothing but the truth. This whole thread is the reason that show can't be looked at like a joke. The trailer was enough to give me jitters. Things are still the same for most of the girls in my hometown (a tier 3 town in India).
2
1
24
@adityakusupati
Aditya Kusupati
4 years
Adding to the meltdown. Just got a desk reject with default feedback (as @sineadwilliamso mentioned) from NeurIPS for a paper we are super proud of. I reviewed 6 papers last week & was fully convinced that desk rejects didn't go out looking at a couple of papers in my pool 🤷‍♂️!
1
0
25
@adityakusupati
Aditya Kusupati
4 years
We will be presenting STR at #ICML2020 this Thursday at noon and again at midnight PDT. Paper: Code: Video: Please reach out here or in the rocket chat to discuss anything!!🥳🥳🥳
2
2
25
@adityakusupati
Aditya Kusupati
1 year
You're meant to say "Clippy is back".
Tweet media one
@tunguz
Bojan Tunguz
1 year
As one of the most unsurprising moves in tech, @Microsoft has announced that they will incorporate all of @OpenAI tools into their products. In particular, this means a wide availability of ChatGPT in various Office products. 1/9
45
205
2K
1
0
23
@adityakusupati
Aditya Kusupati
2 years
Starting the day with a #NeurIPS rant!! A beautiful pair of uninformative yet highly confident and low scoring reviews 🫠
0
1
22
@adityakusupati
Aditya Kusupati
1 year
Excited to be back for an in-person @NeurIPSConf conference after Feb'20!! I will be around the whole week. Feel free to DM me for catch up or a chat. Drop by MRL's 🪆 poster (Tue 11a Hall J #640 ) w/ @wregss , @adityaasinha & @ShamKakade6 . Also at @GoogleAI booth on Tuesday!
@adityakusupati
Aditya Kusupati
2 years
Introducing🪆Matryoshka Representations for Adaptive Deployment🪆 TL;DR: up to 14× lower real-world classification & retrieval costs at web-scale at no loss in accuracy & w/o any overhead across setups. Paper: Code: [1/11]
Tweet media one
6
101
493
1
6
23
@adityakusupati
Aditya Kusupati
1 year
Thanks @ducha_aiki for sharing our new work led by Matt Wallingford! We will share a tweet thread with webpage and code soon!! TL;DR: Transferable/shared object-centric representations across scenes using NeRF -- generalizes well from synthetic to real-world datasets!
@ducha_aiki
Dmytro Mishkin 🇺🇦
1 year
Neural Radiance Field Codebooks Matthew Wallingford, Aditya Kusupati, Alex Fang, Vivek Ramanujan, Aniruddha Kembhavi, Roozbeh Mottaghi, Ali Farhadi tl;dr: discovered 3D assets for NERFs?
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
15
69
1
0
22
@adityakusupati
Aditya Kusupati
3 years
I am honored to be a student (still am) of this superstar!! Congrats @jainprateek_ 🥳🥳
@DoRA_IITK
Dean of Resources & Alumni, IITK
3 years
#YAA Dr. Prateek Jain ( @jainprateek_ ) received his bachelor's degree in Computer Science & Engineering from @IITK in 2004, and subsequently went on to earn his doctorate degree from the University of Texas in Computer Science & Engineering in 2009.
Tweet media one
4
4
64
0
0
22
@adityakusupati
Aditya Kusupati
6 months
Funny finding these "Matryoshka Transformer" dolls. 🪆🪆
Tweet media one
@adityakusupati
Aditya Kusupati
7 months
Announcing MatFormer - a nested🪆(Matryoshka) Transformer that offers elasticity across deployment constraints. MatFormer is an architecture that lets us use 100s of accurate smaller models that we never actually trained for! 1/9
Tweet media one
7
110
618
1
0
21
@adityakusupati
Aditya Kusupati
5 months
Krishna was and is a guiding light for me whenever I sought guidance in research and life! Super excited that he is going back to Indian academia -- @iitmadras is lucky!! Go work with him and def talk to him at NeurIPS!
@KrishnaPillutla
Krishna Pillutla
5 months
I’m thrilled to announce that I'll be joining @iitmadras as an Assistant Professor in April 2024! I’m immensely grateful to my amazing mentors, family, and friends for their unwavering support. (1/4)
49
58
2K
0
0
21
@adityakusupati
Aditya Kusupati
11 months
Co-led w/ @wregss (headed for a PhD @WisconsinCS ) Joint work w/ Sharan, Alan, @sysnlp , @ShamKakade6 , @jainprateek_ & Ali Farhadi. @uwcse ( @RAIVNLab , @uw_wail & @uwnlp ) & @GoogleAI Find out more about💃AdANNS🕺in the🧵below. [2/8]
Tweet media one
1
1
21
@adityakusupati
Aditya Kusupati
2 years
WHATTT!!!!
@SoundersFC
Seattle Sounders FC
2 years
IT'S HAPPENING! The @FIFAWorldCup is coming to Seattle in 2026!
Tweet media one
110
1K
6K
1
0
21
@adityakusupati
Aditya Kusupati
4 years
🎉🎉🎉🎉. I got reimbursed for @icmlconf 😋, same $25.
@thegautamkamath
Gautam Kamath
4 years
My (incoming) student wanted to register for #STOC2020 . I told him to go ahead and I'll pay for it with my grants. He said not to worry about it, since it's only $25 for students. But I told him this (see gif). Message: Your advisor is there to support all your academic expenses.
11
27
572
0
1
19
@adityakusupati
Aditya Kusupati
2 years
As always @ak92501 beat us in posting about this to the Twitterati!! Thanks😅 Please do comment or reach out to us in case you have any suggestions or comments on the work!!
@_akhaliq
AK
2 years
Matryoshka Representations for Adaptive Deployment abs: flexibility within the learned Matryoshka Representations offer: (a) up to 14× smaller embedding size for ImageNet-1K classification at the same level of accuracy
Tweet media one
1
15
69
1
2
20
@adityakusupati
Aditya Kusupati
2 months
Visit days change your preferences and opinions a lot! Have fun and make a not-so-obvious but important decision!
@Tim_Dettmers
Tim Dettmers
2 months
It is currently PhD visit days at UW. Choosing among schools for a PhD is a tough choice. I wrote a blog post about some ways to think about this choice to make it easier and to find the school that is the best fit for you:
0
19
108
0
1
19
@adityakusupati
Aditya Kusupati
3 years
See y'all at @NeurIPSConf (virtually 🥲)
@adityakusupati
Aditya Kusupati
3 years
Multi-class classification of *L* classes traditionally needs *L* classifiers. 1st thought in 2016: Can we do this w/ *log L* bits? I finally acted on it, we propose LLC that solves 1000 class ImageNet classification with **20-bits**. A thread🧵(1/8)
3
20
153
2
0
19
@adityakusupati
Aditya Kusupati
7 months
These smaller MatFormer submodels are way more consistent with the largest model than independently trained models. This (1) enables consistent deployment across scales and (2) boosts inference optimization techniques like speculative decoding (6% speedup over baseline) 4/9
Tweet media one
1
0
19
@adityakusupati
Aditya Kusupati
4 years
1/5 Happy to share our recent work: RNNPool, an RNN based pooling op/layer to replace memory-hungry Conv blocks without losing accuracy while reducing working RAM in CNNs w/ Oindrila Saha, Harsha Simhadri, Manik Varma & @jainprateek_ from @MSFTResearch
Tweet media one
2
3
18
@adityakusupati
Aditya Kusupati
1 year
Looks like @elonmusk 's first PR was a success! print("[Test Page 1]")
Tweet media one
0
3
19
@adityakusupati
Aditya Kusupati
3 years
Amazing initiative, I am a volunteer and I am looking forward to the event and learning the perspectives. Extraordinary opportunity for the Indian ML ecosystem.
@PNetrapalli
Praneeth Netrapalli
3 years
Delighted to announce #MLinIndia social at #NeurIPS2021 . Do attend if you are interested in exploring long term career opportunities in ML in India. More details: RSVP: @iKDD_News @sandeepjuneja66 @amt_shrma @shouryaroy
1
51
193
0
1
18
@adityakusupati
Aditya Kusupati
3 months
The speed at which Nomic is shipping is insane! More 🪆Matryoshka🪆embeddings to play with -- now all public! Check out their visualization, there are some things I couldn't explain, maybe the community can help us with that 🔥
@nomic_ai
Nomic AI
3 months
Announcing Nomic Embed v1.5 🪆🪆🪆 - Variable sized embeddings with matryoshka learning and an 8192 context. - Outperforms OpenAI text-embedding-3-small across output sizes. - Open source, open training code, open data. Day 0 in @LangChainAI , @llama_index and @MongoDB
11
84
477
1
3
18
@adityakusupati
Aditya Kusupati
3 years
So this is the way I become part of an Ivy? Congrats @ShamKakade6 !!!
@boazbaraktcs
Boaz Barak
3 years
1/21 Banner year for Harvard CS! New hires include Sham Kakade @ShamKakade6 and Fernanda Viegas @viegasf (joining @wattenberg ), as well as David Alvarez-Melis, Anurag Anshu @AnuragAnshu4 , Sitan Chen, and Jonathan Frankle @jefrankle
Tweet media one
5
10
198
1
0
18
@adityakusupati
Aditya Kusupati
2 years
When pretraining for say 2048-d representation, MRL🪆also optimizes for a few (O(log(d))) lower dimensions (8, 16, 32..) to solve the same task w/ equal weighting. That's it! That is all the algorithm. Works across setups, models, tasks, modalities w/ default hparams! [4/11]
1
2
17
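The training recipe described above (one task loss per nested prefix, summed with equal weights) can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes a separate linear classifier per prefix size, uses plain Python lists instead of tensors, and the names `nested_dims` and `mrl_loss` are hypothetical.

```python
import math
import random

def nested_dims(full_dim, smallest=8):
    """Doubling schedule of prefix sizes: 8, 16, 32, ..., full_dim."""
    dims = []
    d = smallest
    while d < full_dim:
        dims.append(d)
        d *= 2
    dims.append(full_dim)
    return dims

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def mrl_loss(embedding, label, classifiers):
    """Equal-weighted sum of classification losses over all nested prefixes.

    `classifiers` maps a prefix size d to a weight matrix with one row of length d per class.
    """
    total = 0.0
    for d, weights in classifiers.items():
        prefix = embedding[:d]
        logits = [sum(w * x for w, x in zip(row, prefix)) for row in weights]
        total += cross_entropy(logits, label)
    return total

rng = random.Random(0)
dims = nested_dims(2048)   # 9 prefix sizes, i.e. O(log d) of them
num_classes = 5
embedding = [rng.gauss(0, 1) for _ in range(2048)]
classifiers = {d: [[rng.gauss(0, 0.01) for _ in range(d)] for _ in range(num_classes)]
               for d in dims}
loss = mrl_loss(embedding, label=2, classifiers=classifiers)
```

In a real training loop the same summed loss would be backpropagated through the encoder, so that the first 8, 16, 32, ... dimensions each learn to solve the task on their own.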
@adityakusupati
Aditya Kusupati
4 years
@MSFTResearch 's v3 of EdgeML repo: is released. It contains scalable CUDA supported custom RNNCells along with the #NeurIPS19 paper on ShaRNN and #BuildSys19 paper on MSC-RNN. A combined effort of many people over 2.5 years 🤩
1
4
18
@adityakusupati
Aditya Kusupati
7 months
It’s simple - we jointly optimize for just n=4 transformers of different sizes nested in the same weight space. With a trained MatFormer model, we can Mix’n’Match model capacities across layers to get 100s of accurate models. 2/9
1
0
18
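A toy sketch of how nesting plus Mix'n'Match can yield many models from a few trained sizes. It assumes the nesting happens in the feed-forward hidden width, with smaller models using the first hidden units of the larger one; the widths, the function names, and the three-layer setup are hypothetical, not taken from the thread.

```python
import itertools

def nested_ffn(x, w_in, w_out, hidden_units):
    """Feed-forward block that uses only the first `hidden_units` hidden neurons."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(w_in[j], x)))
              for j in range(hidden_units)]
    return [sum(row[j] * hidden[j] for j in range(hidden_units)) for row in w_out]

def mix_n_match(num_layers, granularities):
    """Enumerate every per-layer choice of hidden width."""
    return list(itertools.product(granularities, repeat=num_layers))

# One shared set of weights; smaller sub-blocks are prefixes of the larger one.
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
w_out = [[1.0, 1.0, 1.0, 1.0]]
small_out = nested_ffn([1.0, 2.0], w_in, w_out, hidden_units=2)
full_out = nested_ffn([1.0, 2.0], w_in, w_out, hidden_units=4)

# n = 4 trained widths across 3 layers already gives 4**3 combinations.
configs = mix_n_match(num_layers=3, granularities=[2, 4, 8, 16])
```

The count grows as n to the power of the number of layers, which is how a handful of jointly trained sizes can translate into hundreds of extractable models.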
@adityakusupati
Aditya Kusupati
3 months
@owencm @jainprateek_ @OpenAI Thanks for being candid with us Owen! Glad this could be served to the users globally 😄🪆🪆🪆
1
0
17
@adityakusupati
Aditya Kusupati
7 months
MatFormer opens up many exciting directions in elastic deployment while being cheaper to train than all the baselines! Co-led w/ amazing @Devvrit_Khatri & @snehaark Collab w/ @Tim_Dettmers @kfrancischen @inderjit_ml Yulia @HannaHajishirzi @ShamKakade6 Ali & @jainprateek_ 8/9
1
0
17
@adityakusupati
Aditya Kusupati
3 years
Check out this awesome work by @Tim_Dettmers ! Tim's papers are always so grounded for real-world utility!! Very few people have the eye for identifying the actual problem and then solving it.
@Tim_Dettmers
Tim Dettmers
3 years
I am excited to share my latest work: 8-bit optimizers – a replacement for regular optimizers. Faster 🚀, 75% less memory 🪶, same performance📈, no hyperparam tuning needed 🔢. 🧵/n Paper: Library: Video:
Tweet media one
18
285
1K
0
0
17
@adityakusupati
Aditya Kusupati
4 years
@Mitchnw from @uwcse presented their work on Discovering Neural Wirings @NeurIPSConf . The poster was a crowd puller. The paper provides a new way of thinking about connections in deep neural networks. Their accompanying demo was a super cool visualization!!!
Tweet media one
2
1
17
@adityakusupati
Aditya Kusupati
3 years
. @ssgrn & I have been seeing some very interesting results through this survey at @uwcse . We would like to see if things hold the same at scale!! Survey: RTs appreciated 🥳🤓😇
@ssgrn
Suchin Gururangan
3 years
Twitter-verse! @adityakusupati & I are doing a user study to understand how people perceive Internet text, possibly generated by machines. We'd appreciate if you could take our fun 6-question survey, which should take about 15 mins. Thanks a ton! Survey:
2
12
21
0
6
16
@adityakusupati
Aditya Kusupati
2 years
Co-led w/ @BhattGantavya , @wregss Joint work w/ Matt Wallingford, @adityaasinha , @RamanujanVivek , William, Kaifeng, @ShamKakade6 , @jainprateek_ , & Ali Farhadi. @uwcse ( @RAIVNLab & @uw_wail ) & @GoogleAI Find out more about🪆Matryoshka Representations🪆in the🧵below. [2/11]
Tweet media one
1
1
16
@adityakusupati
Aditya Kusupati
3 months
Check this accessible blog post by Aniket all about 🪆🪆
@wregss
Aniket Rege
3 months
With all the recent hype around 🪆Matryoshka Representation Learning 🪆(Thanks @OpenAI !), I finally put my longstanding plan of writing a detailed blog about MRL to action This blog is NOT a paper walkthrough (see @RitvikRastogi19 for that!) (1/7)
6
24
168
0
1
16
@adityakusupati
Aditya Kusupati
3 months
This paper changed some of my thoughts on how to think about LLMs! Highly recommend it to everyone 🔥🔥
@vishalmisra
Vishal Misra
3 months
New work by us on Large Language Models - how and why they work, and what is “in-context-learning” We show that (to quote Dave Blei when he saw our work) “ICL is not magic: it is (consistent with) LM smoothing and Bayesian statistics!” (1/n)
3
21
123
2
0
16
@adityakusupati
Aditya Kusupati
3 years
AI residencies are newer versions of the RF program!! Amazing program and great people :D
@Sridhar2k
Sridhar Vedantham
3 years
How can young students prepare for a career in research? Join me as I speak to @shrutirij and Vivek Seshadri about MSR India’s Research Fellow program.
1
15
58
0
1
15
@adityakusupati
Aditya Kusupati
2 months
The good old Adaptive Retrieval (two-phase) using 🪆 implemented in @LangChainAI Use it for more savings and better performance!
@LangChainAI
LangChain
2 months
🪆Matryoshka Retriever A recent blog post by @supabase described a new technique for higher performance retrieval, without compromising accuracy. So we've just released a new retriever in LangChain.js 🦜🔗 which implements this exactly! Use any vector store, and two embedding…
Tweet media one
6
102
462
0
1
15
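Two-phase adaptive retrieval can be sketched as a cheap shortlist pass on a short prefix followed by a re-rank on the full embedding. This is a brute-force illustration in plain Python, not the LangChain retriever; `adaptive_retrieval` and its parameters are hypothetical names, and dot product stands in for whatever similarity a real system uses.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adaptive_retrieval(query, database, short_dim, shortlist_size, top_k):
    """Shortlist with the first `short_dim` dimensions, then re-rank with all of them."""
    # Phase 1: cheap scoring on the low-dimensional prefix.
    coarse = sorted(
        range(len(database)),
        key=lambda i: -dot(query[:short_dim], database[i][:short_dim]),
    )[:shortlist_size]
    # Phase 2: exact scoring on the full embedding, only for the shortlist.
    fine = sorted(coarse, key=lambda i: -dot(query, database[i]))
    return fine[:top_k]

database = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.0, 1.0, 0.0],
]
query = [1.0, 0.0, 0.0, 1.0]
best = adaptive_retrieval(query, database, short_dim=1, shortlist_size=2, top_k=1)
```

Here the prefix pass ranks item 0 first, but the full-dimensional re-rank promotes item 1, which is the kind of correction the second phase exists to make.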
@adityakusupati
Aditya Kusupati
7 months
We find that MatFormer scales as reliably as vanilla Transformers and that we're able to fit a single scaling law for all MatFormer submodels. 5/9
Tweet media one
Tweet media two
1
0
15
@adityakusupati
Aditya Kusupati
11 months
In conclusion, AdANNS achieves SOTA accuracy-compute trade-off for the two main ANNS building blocks: search data structures (AdANNS-IVF) and quantization (AdANNS-OPQ). We shall release the code shortly at Paper:
1
0
15
@adityakusupati
Aditya Kusupati
7 months
For a 2.6B decoder-only MatFormer Language Model (MatLM), the optimized smaller models are as accurate as baselines with even more "free" models that improve predictably with scale! 3/9
Tweet media one
1
0
15
@adityakusupati
Aditya Kusupati
3 years
@aravindr93 . @aravindr93 how did this go so viral? 😋😅 But isn't it the same feeling every year though? So much imposter syndrome on looking at these apps!!
1
0
15
@adityakusupati
Aditya Kusupati
7 months
MatFormer can be further extended to Vision Transformer-based encoders (MatViT), where we also see that we can use Mix’n’Match to extract models that span the accuracy-vs-compute curve. 6/9
1
0
15
@adityakusupati
Aditya Kusupati
2 years
He was a great friend and I am still unable to process what happened. Please consider contributing to the fund?!
@dakshita10
Dakshita Khurana
2 years
We will miss his infectious smile, and the warmth and energy he brought to our research family. We are raising funds in personal capacity to support his family and to contribute to a future memorial objective, here:
3
26
197
0
1
15
@adityakusupati
Aditya Kusupati
4 years
UW needs serious disambiguation University of: 1) Washington - @UW 2) Wisconsin - @UWMadison 3) Waterloo - @UWaterloo 4) Wyoming - @UWyonews 5) Windsor - @UWindsor 6) Warwick - @warwickuni 7) Worcester - @worcester_uni Too much confusion. Add if I missed any!!
1
1
14
@adityakusupati
Aditya Kusupati
4 years
TIL, Mercury is the mostest closest planet to Earth😱. Also, Mercury is the mostest closest planet to every significant object in the solar system. 🤯 Mostest closest == on average. Apparently, this wasn't published until 2019!!
2
0
13
@adityakusupati
Aditya Kusupati
7 months
Finally, MatFormer enables truly elastic adaptive retrieval for the first time. MatFormer allows for elastic query-side encoders while having a fixed database index for large-scale retrieval. Why? MatFormer preserves the metric space, unlike independently trained models. 7/9
1
0
14
@adityakusupati
Aditya Kusupati
2 years
Acknowledge the safety net, aim high, be brave and just break the limits!! Congrats @jainprateek_ !!!
@DoRA_IITK
Dean of Resources & Alumni, IITK
2 years
#YAA2021 #iitk Dr. Prateek Jain ( @jainprateek_ ) was awarded the prestigious IITK Young Alumnus Award 2021 by @IITKanpur on its Foundation Day, 02 Nov. 2021.
1
4
35
0
0
14
@adityakusupati
Aditya Kusupati
7 months
Across @UTAustin @RAIVNLab @uwnlp @GoogleDeepMind @GoogleAI @hseas w/ support from @orf_bnw @tduerig @_arohan_ @LukeZettlemoyer @ManishGuptaMG1 Rahul Sukthankar & @JeffDean There is so much more that can be done by enabling elastic compute for generation and search! 9/9
1
0
14
@adityakusupati
Aditya Kusupati
5 months
Something very natural, and something quite a few students have done for their deep learning course projects in the past few years! Good to see this used in the real world, and especially in Seattle!
@GoogleAI
Google AI
5 months
Post-event traffic congestion around stadiums is a problem worldwide. Today, in partnership with the Seattle Dept. of Transportation, we demonstrate a potential simulation-based plan calibrated for the specific dynamics of the area for better traffic flow.
14
61
279
0
0
14
@adityakusupati
Aditya Kusupati
3 years
Openreview crashed!!! 😂😂😂😂😂
2
0
14
@adityakusupati
Aditya Kusupati
3 years
I was meaning to say the "emeritus" thing somewhere in the tweetstorm, glad @uwcse made it public. I think the discussion went off the rails really fast and started rubbing on old but healing wounds. The language was insensitive and the logic was non-existent!!
@uwcse
Allen School
3 years
#UWAllen leadership is aware of recent “discussions” involving Pedro Domingos, a professor emeritus (retired) in our school. We do not condone a member of our community engaging in a Twitter flame war belittling individuals and downplaying valid concerns over ethics in AI. 1/11
30
217
2K
0
0
13
@adityakusupati
Aditya Kusupati
11 months
Approximate nearest neighbor search (ANNS) powers web-scale vector-based retrieval ( @pinecone , @weaviate_io , @vespaengine ). However, ANNS pipelines use a fixed high-dimensional "rigid" representation for all components (768d embedding)➡️computationally expensive retrieval. [3/8]
1
0
13
@adityakusupati
Aditya Kusupati
8 months
Check out our recent dataset release -- 2.8T covering 419 languages led by @snehaark !! It was fun understanding the sources of these datasets and auditing them.
@snehaark
Sneha Kudugunta
8 months
Excited to announce MADLAD-400 - a 2.8T token web-domain dataset that covers 419 languages(!). Arxiv: Github:   1/n
24
138
809
0
1
13
@adityakusupati
Aditya Kusupati
4 years
Woww!! Our #BuildSys19 paper on low-resource RNNs for radar classification won the "Best Paper Runner-Up" Award. Real-time on-device deep learning on ARM Cortex-M3. 🤩
@adityakusupati
Aditya Kusupati
5 years
Excited to share our work on "Multi-Scale Cascaded RNNs for Radar Classification" (oral & demo at BuildSys19). Collab @MSFTResearch + @OhioState + @iitdelhi to enable DL-based noise rejection and target classification on resource-constrained radars.
0
0
1
1
0
13
@adityakusupati
Aditya Kusupati
1 month
Really cool and blazing fast!! Great work @jefrankle and @DbrxMosaicAI team! P.S: With all the meme game on folks knowing about best open weights model in advance -- I kinda knew 😅😜
@jefrankle
Jonathan Frankle
1 month
Meet DBRX, a new sota open llm from @databricks . It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.
34
267
1K
1
0
13
@adityakusupati
Aditya Kusupati
4 years
Will be attending @NeurIPSConf this year as well, not as an author but as a "Best Reviewer" 😅😝. Hit me up if you want your next big idea to be reviewed early. Let's grab some coffee and chat about anything and everything ✌️✌️☃️. Be ready for an inquisitive visitor at your posters.
1
0
13