
Tanay Mehta
@serious_mehta
Followers
10K
Following
39K
Media
440
Statuses
6K
AI eng @Aleph__Alpha | prev: Postgraduate @UniofBath | @Kaggle Notebooks Grandmaster | in the CUDA trenches | opinions are my own
Joined July 2020
@Samhanknr Oh that I agree with, but people selling $NVDA because they think Deepseek somehow bypassed the GPU requirement is hilarious. If anything, it shows that $NVDA should be valued more highly because, even with the export ban, Chinese companies are able to get their hands on Nvidia chips.
65
43
2K
@stats_feed 1. Parts of China were colonies of several European nations. 2. Nepal and Bhutan, though not directly controlled by Britain, were British protectorates. 3. Liberia was an American colony (so technically not Europe, but they were colonised). 4. Mongolia was under the control of.
85
31
1K
I scored 7 marks in JEE Mains in 2018 and ended up going to a tier-3 private university no one had ever heard of. Most people around me thought I would end up in a 7 LPA job somewhere, if I am lucky. I just kept solving problems, one at a time. Never kill yourself.
I failed JEE badly; my rank was something like 3 lakh. I haven't achieved much till now but have surely done quite well (SIH24 win, 2x internships), life is going pretty well, and I know I will figure everything out eventually. This JEE race has completely hollowed out the Indian youth.
30
46
991
@UpdatingOnRome Oh I miss those times when the Germanic people fought with the Romans over the control of Moscow.
2
2
506
@O42nl Imagine multiplying matrices like hell, left and right, only for the results to start with, “As a Large Language Model trained by OpenAI, I cannot…”.
9
12
481
What I read: Globalisation is making Indian youth realise they deserve more which is causing considerable concern among the Indian companies who now can’t exploit the gullible for laughable pay, even by Indian standards.
Interviewing candidates for our firm, most of them demanding very high salaries. Now that is not an issue, if these people were extraordinarily brilliant, or some IIT, NIT passout. Most of them are from ordinary engineering college, and forget about being extraordinary, they are.
0
52
474
@hkproj I am sorry Umar, but you are wrong here. I can live all I want for my "ideals" until a terrorist comes to my home, asks me to remove my pants and then shoot me because I am a Hindu in front of my loved ones. My maternal great grandfather was killed the same way by Pakistani.
4
8
454
@Ashwin_S18 Quite the opposite, no? If even one independent company replicates Deepseek-r1 and verifies a training cost of under $6M, this will mean even less well-funded startups can compete. They may even be able to distill the RL-trained models too (not the SFT models like Deepseek has done).
17
4
302
This morning, I became a Kaggle Grandmaster ✨🥺. Looking forward to continuing my work on making informative training and inference kernels in different competitions and datasets and helping the community 🚀. Also, I don't think someone has ever become a GM at #69th rank 😅
26
4
288
@TheAnkurTyagi No amount of experience justifies a 3 LPA job in India in 2023. A roadside pani puri seller earns way more than that. For the love of god please don’t justify these low paying exploitative jobs and toxic work culture as “great potential to grow”.
6
4
231
Here you go! I have published my GPT training notebook on Kaggle. It features a *new* way of data loading using PyTorch data loaders and is powered by @LightningAI for quick, clean and elegant model training along with @weights_biases logging!
4
35
241
Super happy to announce that I just became an Open Source contributor @huggingface transformers🤗. I have added the Poolformer model (from paper: "Metaformer is actually all you need for Vision") by Sea AI Labs. Example snippet down below 👇
16
22
230
Just found out that @kaggle has actually included one of my notebooks that used JAX + @huggingface transformers + @weights_biases tracking for Sentiment Classification as one of the example notebooks for the upcoming Google Open-Source Expert Prize!
14
7
202
When fine-tuning an LM with newly added tokens, set the new token embedding to be the average of existing embeddings, which bounds the KL-divergence. This will make fine-tuning smoother. I added this in my last @huggingface transformers contribution.
11
13
196
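A minimal sketch of the mean-initialisation idea from the post above, using a toy embedding matrix held in plain Python lists. The helper name `mean_init_new_rows` and the toy sizes are hypothetical and chosen only for illustration; this is not the code from the transformers contribution. In a real model the same averaging would be applied to the rows that are added when the vocabulary is resized.

```python
import random


def mean_init_new_rows(embeddings, num_new_tokens):
    """Return a copy of `embeddings` with `num_new_tokens` rows appended.

    Each appended row is the column-wise mean of the existing rows,
    so new tokens start at the "average" token instead of random noise.
    (Hypothetical helper for illustration only.)
    """
    if not embeddings:
        raise ValueError("need at least one existing embedding row")
    vocab_size = len(embeddings)
    dim = len(embeddings[0])
    # Column-wise mean over all existing token embeddings.
    mean_row = [sum(row[j] for row in embeddings) / vocab_size for j in range(dim)]
    # Copy the old rows unchanged, then append one copy of the mean per new token.
    return [list(row) for row in embeddings] + [list(mean_row) for _ in range(num_new_tokens)]


# Toy setup (assumed sizes): 4 existing tokens, 3 embedding dimensions.
random.seed(0)
old = [[random.gauss(0.0, 0.02) for _ in range(3)] for _ in range(4)]
resized = mean_init_new_rows(old, num_new_tokens=2)
```

Here `resized` has six rows: the four original rows are left untouched and the two new rows are identical copies of their mean.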
@therealnaomib Dumbest take I have read this week. “Have high pollution, stop growing” is like saying “Have trouble understanding maths, start solving easier problems”.
1
1
171
When you have no clue what LLMs actually are or what we mean by parameters, but you still tweet:
13
8
179
@latestinspace Just imagine how big a star that once was if it "collapsed" and can still fit 30 billion suns!
8
3
171
Announcing the LLM Adventures Notebook series on @kaggle, where I will be making notebooks on various interesting use-cases of LLMs and RAG pipelines using Open LLMs and datasets from Kaggle ✨. Check it out:
1
26
164
The notebook that followed my talk in London is now out! If you want to understand, code and train your own GPT, take a look at it! Modify it, pull it apart and change it as you see fit 🚀. The data loading part is now chill thanks to @lancedb!
2
35
149
I've published yet another PyTorch training notebook in the AI4Code competition on #Kaggle. This one's using Microsoft's CodeBERT model (thanks to @huggingface🤗). It includes optimizations, a trainer module and @weights_biases logging & exp. tracking 🚀.
1
17
146
Update on an experiment I did a month ago: I asked here on Twitter if I should add a separate "Open Source Contributions" section on my CV, which I did. Of the 3 companies I had applied to with this new CV, I received an initial interview call from 2. Success! cc @amuldotexe
5
4
145
Thrilled to announce that I will be joining @tuBraunschweig as a Master's student in Data Science for the Summer Semester of 2023! Can't wait to move to Germany and embark on this new journey 🚀
22
0
142
@Ravisutanjani I don’t know what’s worse, people doing the actual thing or the ones justifying it in this comment section.
3
6
129
My Pull Request for adding the Hinge Loss function to @DeepMind's Optax has been merged today! Going to add many more loss functions to Optax (for all you JAX geeks out there 😉)
6
3
131
What's common is that they all left India to make a better life for themselves because our terrible politics, workplace exploitation, casteism and misogyny won't let them build one here.
What’s common??? CEO of Google. CEO of Microsoft. CEO of Adobe. CEO of YouTube. CEO of Mastercard. CEO of Pepsi. CEO of IBM. CEO of Netapp. CEO of Nokia. CEO of Novartis. CEO of Deloitte.
7
6
118
Sundays are for creating datasets using @lancedb on @LightningAI Studio so they can be released on Monday 🚀⚡️
4
7
107
@cmizzy1 Well for starters, we weren’t going through major historical events like every other week.
3
0
95
@Txz67 I was working for US and Israeli startups when I was in India during my undergrad. And I paid for much of my foreign college tuition with the money I saved from working for the startups during Undergrad and Postgrad. Try harder.
3
3
98