Behnam Neyshabur

@bneyshabur

Followers
26K
Following
925
Media
120
Statuses
805

Research @AnthropicAI 💼 Past: Gemini @GoogleDeepMind (Co-led Blueshift team) 🧠 LLM Reasoning / AI Scientist 🎒Traveling & Backpacking -- All views are my own!

Joined May 2014
@bneyshabur
Behnam Neyshabur
22 days
@ethansdyer and I have started a new team at @AnthropicAI — and we’re hiring!. Our team is organized around the north star goal of building an AI scientist: a system capable of solving the long-term reasoning challenges and core capabilities needed to push the scientific.
6
19
413
@bneyshabur
Behnam Neyshabur
6 months
Thrilled to share that I’m joining @AnthropicAI! After 5.5 amazing years at Alphabet, including working on Gemini’s reasoning over the past 2 years, I’m looking forward to advancing Claude’s ability to tackle complex reasoning challenges across a diverse range of domains!
61
23
1K
@bneyshabur
Behnam Neyshabur
4 years
Some people say that one shouldn't care about publication and the quality matters. However, the job market punishes those who don’t have publications in top ML venues. I empathize with students and newcomers to ML whose good papers are not getting accepted. #ICLR2021 1/
17
181
1K
@bneyshabur
Behnam Neyshabur
2 years
Excited to announce that the entire Blueshift team has joined @DeepMind! We will be working with @OriolVinyalsML and others to advance capabilities of LLMs developed by DM / Alphabet! We hope to continue to grow DM's presence in Bay Area and New York in the coming months :-)
Tweet media one
32
53
1K
@bneyshabur
Behnam Neyshabur
2 years
My favorite Gemini demo:
31
217
1K
@bneyshabur
Behnam Neyshabur
2 years
These days, many people are interested in getting a PhD in ML. I think you should think really hard before committing to a PhD program in ML. Why? I'm going to summarize some thoughts in this thread: 1/10
@ylecun
Yann LeCun
2 years
The main author of DALL-E at OpenAI, Aditya Ramesh, has no graduate degree. He has a bachelor from NYU. He worked on a couple of research projects in my lab in his last years. He wanted to do a PhD after graduating. But he did a summer internship at OpenAI, and they kept him.
24
144
1K
@bneyshabur
Behnam Neyshabur
3 years
Totally agree! Anyone screening applications and any applicant thinking their CV is not representative of their skills/potentials, I think you might want to read the story of my own PhD application in this thread: 1/
@docmilanfar
Peyman Milanfar
3 years
Any document claiming an easy way to gauge grad school applicants needs to be challenged. To wit: While 2 or 3 Unis in Iran are far more selective than others, the # of outstanding candidates far exceeds their enrollment. The given ranking is thus opinion not fact & is misleading
Tweet media one
Tweet media two
29
164
1K
@bneyshabur
Behnam Neyshabur
5 years
💡💡What is the best acc an MLP can get on CIFAR10❓ 65%❓ No, 85%‼️ Trying to understand convolutions, we look at MDL and come up with a variant of LASSO that, when applied to MLPs, learns local connections and achieves amazing accuracy! Paper: 1/n
Tweet media one
11
160
777
@bneyshabur
Behnam Neyshabur
3 years
Very excited to announce a significant milestone in expanding reasoning capabilities of language models! 🎉🎉 We introduce #Minerva🦉: a language model that can solve mathematical questions using step-by-step natural language reasoning: 🧵 1/
Tweet media one
Tweet media two
Tweet media three
@alewkowycz
alewkowycz
3 years
Very excited to present Minerva🦉: a language model capable of solving mathematical questions using step-by-step natural language reasoning. Combining scale, data and others dramatically improves performance on the STEM benchmarks MATH and MMLU-STEM.
Tweet media one
11
123
612
@bneyshabur
Behnam Neyshabur
3 years
Looking back, I think the moment that I was asked "How can you prove that you are hardworking and a fast learner?" was an absolutely pivotal event in my life and I am forever grateful for that opportunity. 13/.
12
9
532
@bneyshabur
Behnam Neyshabur
3 years
We often prefer collaborating with people we know or those of high status. That makes it very difficult for hardworking and motivated junior researchers to get enough support to flourish. Is it possible to reduce this barrier? I've been running some experiments to find out! 1/6
8
52
442
@bneyshabur
Behnam Neyshabur
3 years
You think the RNN era is over? Think again! We introduce "Block-Recurrent Transformer", which applies a transformer layer in a recurrent fashion & beats transformer XL on LM tasks. Paper: W. DeLesley Hutchins, Imanol Schlag, @Yuhu_ai_ & @ethansdyer. 1/
Tweet media one
5
66
439
@bneyshabur
Behnam Neyshabur
3 years
It turns out that it is possible to get this right with minimal change on the original prompt!
Tweet media one
Tweet media two
@GaryMarcus
Gary Marcus
3 years
🙄 @GoogleAI, “a deep level understanding”? Seriously?! Your system can’t distinguish “a horse riding an astronaut” from “an astronaut riding a horse”. 🙄
Tweet media one
12
43
401
@bneyshabur
Behnam Neyshabur
4 years
Excited to announce an internship opportunity for summer or fall 2021🔥 The research will explore qualitatively new behaviors of massive (100B+ params🦾) transformers! If you are interested & have related experience please reach me at neyshabur@google.com. Please retweet & share!
5
83
393
@bneyshabur
Behnam Neyshabur
5 years
I got a one-way ticket and wasn’t sure if things would work out but working from Alaska turned out to be a great idea! #WorkFromAlaska #WorkFromAnywhere
Tweet media one
Tweet media two
Tweet media three
Tweet media four
13
5
359
@bneyshabur
Behnam Neyshabur
4 years
My wife and I love traveling & backpacking. A year ago today, we broke our lease & moved our belongings to storage to try the #workation lifestyle. We decided to give it a shot for a few weeks & come back if it doesn't work out. Amazingly, we are not back yet and we love it this way!
Tweet media one
6
1
337
@bneyshabur
Behnam Neyshabur
6 years
Very excited to join Google today as a research scientist! I am forever grateful for the opportunity to learn from my postdoc advisors @ylecun, @prfsanjeevarora and my PhD advisor Nati Srebro.
21
5
326
@bneyshabur
Behnam Neyshabur
4 years
I wish visiting parents regularly was possible for everyone. I wasn’t able to go to Iran for 8 years and could only see my parents once in those 8 years. This is a typical case for Iranian students in the US. Two friends of mine lost their dads and could not visit for the funeral.
3
22
308
@bneyshabur
Behnam Neyshabur
7 years
Our recent work on the role of over-parametrization in generalization of neural nets: This helps us to understand the phenomenon we reported more than 3 years ago (also observed by some people before deep learning era):
Tweet media one
3
82
298
@bneyshabur
Behnam Neyshabur
4 years
🆕 📰: Deep Learning Through the Lens of Example Difficulty. We introduce a measure of computational difficulty and show its surprising relationships with different deep learning phenomena. Paper: with @Robert_Baldock & Hartmut Maennel. 1/
Tweet media one
5
60
289
@bneyshabur
Behnam Neyshabur
3 years
If there is a (senior) researcher whose work you like & you'd love to chat with them: SHUT UP that grumpy voice in your head that tells you that you are not worthy of their time. Send them a short intro email, request a 20min meeting & briefly tell them how you can benefit from it. 1/2
2
16
279
@bneyshabur
Behnam Neyshabur
2 years
Very excited to share what we have been working on in the last several months: Gemini 1.0! Google Blogpost: DeepMind Blogpost: Technical Report:
@sundarpichai
Sundar Pichai
2 years
Introducing Gemini 1.0, our most capable and general AI model yet. Built natively to be multimodal, it’s the first step in our Gemini-era of models. Gemini is optimized in three sizes - Ultra, Pro, and Nano. Gemini Ultra’s performance exceeds current state-of-the-art results on
Tweet media one
5
29
284
@bneyshabur
Behnam Neyshabur
4 years
If you believe you are at a disadvantage in ML community (e.g. because of your race, gender, nationality, background or other circumstances) and need guidance & help, I'd love to meet you! Just pick a time from here: Please RETWEET to spread the word!
9
87
275
@bneyshabur
Behnam Neyshabur
1 year
2.5 years ago, our team decided to improve reasoning capabilities of LLMs & Hendrycks MATH has been a valuable benchmark for tracking progress. It's mind blowing to see the progress since then, from our Minerva paper all the way to this recent update. MATH is now the new MNIST!
Tweet media one
@bneyshabur
Behnam Neyshabur
1 year
I'm excited about this! Our team has been working really hard to improve Gemini 1.5 capabilities significantly on multiple fronts and in particular MATH/STEM! Please see the report here: .
12
33
274
@bneyshabur
Behnam Neyshabur
3 months
Excited to hear that Nicholas Carlini is joining Anthropic! He wrote a blogpost explaining why he decided to leave GDM and join Anthropic:
7
7
247
@bneyshabur
Behnam Neyshabur
2 years
I'm only asking people to think hard before committing to an ML PhD program. But an ML PhD could still end up working great for many! Also, I covered particular less discussed cons and did not intend to provide the full picture. 🧵My own PhD was truly a roller coaster 🎢:. 1/n.
@bneyshabur
Behnam Neyshabur
2 years
These days, many people are interested in getting a PhD in ML. I think you should think really hard before committing to a PhD program in ML. Why? I'm going to summarize some thoughts in this thread: 1/10
4
18
233
@bneyshabur
Behnam Neyshabur
3 years
ML twitter is amazing! At this point, arxiv papers and even blogposts are too slow for ML and "twitter papers" can go a long way (particularly when the code is released like this one)😅.
@kellerjordan0
Keller Jordan
3 years
Along with many others, I find the results of Git Re-Basin by @SamuelAinsworth, J. Hayase & .@siddhss5 highly interesting. But I believe there is a crucial detail which deserves attention: The authors replace BatchNorm with LayerNorm in their ResNet and VGG implementations. 1/14.
2
10
222
@bneyshabur
Behnam Neyshabur
6 years
Paper on role of over-parametrization in generalization of neural nets is accepted to #ICLR2019: … We have also released our code: … This is joint work with Zhiyuan Li, Srinadh Bhojanapalli, @ylecun and Nati Srebro.
Tweet media one
4
61
217
@bneyshabur
Behnam Neyshabur
4 years
Same here. During my PhD (6 years), I wasn't able to go outside of the US to attend conferences or visit my family because of my single entry visa. #MultipleEntryVisa should be granted to all international students!
@FeiziSoheil
Soheil Feizi
4 years
While we are at it, can we grant international students #MultipleEntryVisa for the duration of their studies (instead of single-entry)? It may sound like a minor issue for many but it is actually a big deal for many international students. I explain it below. 👇.
0
26
211
@bneyshabur
Behnam Neyshabur
5 years
Pheeew! Thanks for practicing social distancing! @LakeClarkNPS #alaska #bear #SocialDistancing #COVID19
12
5
199
@bneyshabur
Behnam Neyshabur
5 years
I have an early stopping j.
@rrwilliams
Ryan Williams @rrwilliams.bsky.social
5 years
I have a joke about the computational complexity of nearest neighbors, but it takes too long to get to the point.
0
8
200
@bneyshabur
Behnam Neyshabur
3 years
Without thinking too much I said: "Send me 5 papers and tomorrow at the same time, ask any questions of any depth and difficulty from them and assess my skills." And he agreed! He sent me 4 papers and a book chapter and we decided to meet at the same time tomorrow. 9/
3
7
197
@bneyshabur
Behnam Neyshabur
4 years
Excited about trying Vision Transformer, Mixer or other new models on your data? Don't forget to train with SAM instead of SGD/ADAM or you might regret your decision! By switching to SAM: ViT and Mixer improve 5% & 11% on ImageNet; ViT and Mixer improve 10% & 15% on ImageNet-C. 1/4
Tweet media one
4
27
193
@bneyshabur
Behnam Neyshabur
3 years
If you are a CS graduate applicant this or next year that might be negatively affected by such "quick guide"s, feel free to book a time with me through @ml_collective office hours: Also, you might want to read the story of my own PhD application: 👇
@ameerrahmati
Amir Rahmati
4 years
Application review season is coming up! If you are a CS faculty trying to review Iranian applicants, here is a quick guide on how to gauge them: #AcademicChatter.
5
26
177
@bneyshabur
Behnam Neyshabur
4 months
In 2021, when I decided to let go of deep learning theory and focus on improving math and reasoning in LLMs, I was inspired by the idea that one day LLMs could develop a solid theory explaining why deep learning works—and explain it to me in a way I can understand. Not too far.
6
9
180
@bneyshabur
Behnam Neyshabur
2 years
Interested in Reasoning with Large Language Models? We are hiring! Internship: Full-Time Research Scientist: Full-Time Research Engineer: Learn more about Blueshift Team:
@bneyshabur
Behnam Neyshabur
3 years
Interested in Large Language Models? Stop by our 4 posters at #NeurIPS2022 on Tuesday. 👇
6
18
166
@bneyshabur
Behnam Neyshabur
1 year
I'm excited about this! Our team has been working really hard to improve Gemini 1.5 capabilities significantly on multiple fronts and in particular MATH/STEM! Please see the report here: .
@OriolVinyalsML
Oriol Vinyals
1 year
Today we have published our updated Gemini 1.5 Model Technical Report. As @JeffDean highlights, we have made significant progress in Gemini 1.5 Pro across all key benchmarks; TL;DR: 1.5 Pro > 1.0 Ultra, 1.5 Flash (our fastest model) ~= 1.0 Ultra. As a math undergrad, our drastic
Tweet media one
9
17
166
@bneyshabur
Behnam Neyshabur
6 years
Exhausted after presenting our poster on role of over-parametrization in generalization of neural nets at #ICLR2019 :-) This was a joint work with Zhiyuan Li, Srinadh Bhojanapalli, @ylecun and Nati Srebro.
Tweet media one
2
9
160
@bneyshabur
Behnam Neyshabur
3 years
I didn't sleep that night and spent the whole 23h reading the papers and preparing for the interview next day. I knew that was my moment and didn't want to miss it! 10/
1
1
160
@bneyshabur
Behnam Neyshabur
3 years
Fast forward to 2019, with a PhD from @TTIC_Connect, several well-known universities who wouldn't even seriously consider me for their PhD program gave me offers to become a tenure-track faculty, which I declined due to reasons that are outside of the scope of this tweet. 12/.
1
1
158
@bneyshabur
Behnam Neyshabur
3 years
My interview went perfectly. I was able to answer the questions about the papers and I even gave suggestions on future directions. Of course given my CV, Prof. needed more evidence to be convinced so we did a few interviews of this type. Eventually, I got an offer in Spring!. 11/.
1
2
156
@bneyshabur
Behnam Neyshabur
4 years
Come to our talks and posters at #ICLR2021 to discuss our findings on understanding and improving deep learning! Talks and posters are available now! Links to the talks, posters, papers and codes in the thread:. 1/7
Tweet media one
1
23
152
@bneyshabur
Behnam Neyshabur
2 years
Graduate degree in ML is overrated. So is having publications in top ML venues. One can accomplish a lot in this field without any of these. The truth is that you don’t need to cover a lot of background before you can do interesting things in ML. 2/10.
2
7
152
@bneyshabur
Behnam Neyshabur
1 year
6.9%-->91.1% on MATH. AI is definitely hitting a wall😏.
@JeffDean
Jeff Dean
1 year
One other thing in the updated Gemini 1.5 Pro report: we show how a research model that is a mathematics-specialized version of 1.5 Pro achieves a record score of 91.1% on the MATH benchmark (the SOTA just 3 years ago, in May, 2021 was 6.9%!).
9
5
153
@bneyshabur
Behnam Neyshabur
3 years
Instead of ignoring what I said and ending the interview, Prof. said: "How can you prove that you are hardworking and a fast learner?" That was when I knew I had the opportunity to design my own interview and prove myself! 8/
3
6
154
@bneyshabur
Behnam Neyshabur
11 months
Silver medal in International Math Olympiad! And we were so close to getting a gold medal! Congrats to the AlphaProof, AlphaGeometry and the informal proof teams. At this point, it’s very hard to predict where we get in years ahead of us!.
@GoogleDeepMind
Google DeepMind
11 months
We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈. It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵
3
11
148
@bneyshabur
Behnam Neyshabur
4 years
Today’s accomplishment: found the tallest tree in the world after some creek crossing and off trail hiking :-) Hyperion is astonishingly tall (116 meters / 381 feet)!
Tweet media one
Tweet media two
4
3
141
@bneyshabur
Behnam Neyshabur
3 years
In 2010 when I applied for PhD programs, I was a Master student in Iran with no published papers, undergrad GPA C and poor to mediocre English test scores. 2/.
1
3
136
@bneyshabur
Behnam Neyshabur
3 years
I have been hosting a weekly ML Collective Office Hour for the last 4 months and it has been a very positive experience. It is open to EVERYONE and it can be about ANYTHING (research, career, etc.).
@ml_collective
ML Collective
3 years
Do you know ML Collective has an "Office Hours" service? You can book 1:1 chats with researchers who kindly open up their calendars to serve the community. Current ongoing sessions are proudly hosted by @bneyshabur @AndreaMadotto @natolambert
Tweet media one
1
9
134
@bneyshabur
Behnam Neyshabur
2 years
Mental health issues are common: 40% of graduate students have moderate or severe depression and 40% have moderate or severe anxiety. Main factors are work-life imbalance and the power dynamic of your relationship with your advisor. Read more here:. 8/10
Tweet media one
5
4
126
@bneyshabur
Behnam Neyshabur
3 years
Interested in Large Language Models? Stop by our 4 posters at #NeurIPS2022 on Tuesday. 👇
1
9
110
@bneyshabur
Behnam Neyshabur
5 years
Oh, no! Just noticed I'm reviewer #2 in most of the @NeurIPS2020 papers I'm reviewing!.
6
2
120
@bneyshabur
Behnam Neyshabur
2 years
The number of PhD positions in ML has been increasing at an insane rate while it looks like the number of new ML jobs that require a PhD might decrease in the future. Instead, it looks like the industry is going to be mostly interested in ML engineers who can build things. 5/10.
3
6
117
@bneyshabur
Behnam Neyshabur
3 years
Here is what happens if you ask GPT-3 (text-davinci-v2) to predict the next token after "I have 4 apples and 5 oranges. Since 4 is more than " or even an easier one: "I have 5 apples and 4 oranges. Mathematical Fact: 4 is more than " (probabilities are shown)
Tweet media one
Tweet media two
8
10
120
@bneyshabur
Behnam Neyshabur
2 years
Acceptance of this paper to #ICLR2023 is particularly rewarding to me because it is a very successful example of what I was envisioning when I created the collaboration request form that is open to everyone as part of @ml_collective: 1/3
@kellerjordan0
Keller Jordan
3 years
Why don’t current model merging results generalize to standard ConvNets? And how can this be fixed? We answer these Qs and present a method that improves merged NN performance for any choice of norm layer. W/ @HanieSedghi @osaukh @rahiment @bneyshabur
Tweet media one
2
16
123
@bneyshabur
Behnam Neyshabur
4 years
Long thread at the risk of being judged: I just realized that in the last 6 years, 21 of my 24 papers have been accepted to top ML conf in their FIRST submission even though the majority of them were hastily-written borderline papers (not proud of this). How is this possible? 2/
1
15
117
@bneyshabur
Behnam Neyshabur
3 years
If you don't hear back, that is absolutely normal. At least you know you have tried. But if you do hear back and that leads to getting good advice, mentorship, future collaboration or even friendship down the road, send me an email and thank me for this advice :-) 2/2
2
0
108
@bneyshabur
Behnam Neyshabur
3 years
All US universities rejected my Application without even interviewing me, except @TTIC_Connect where a Prof. decided to interview me. @TTIC_Connect was less known back then and was recommended to me by my dear friend @ArashVahdat. 3/.
2
1
109
@bneyshabur
Behnam Neyshabur
3 years
What would you do if you were in his shoes? A student from Iran with a poor CV, poor English and unable to answer your basic questions claiming to be "hardworking" and "fast learner"? Would you even take that seriously? 7/
1
1
108
@bneyshabur
Behnam Neyshabur
3 years
At some point, I realized that he was disappointed and was about to end the interview. Instead of accepting my "fate", I decided not to give up. I told him: "Look, I know I have a terrible memory. But I am hardworking and a fast learner!" 6/
1
3
107
@bneyshabur
Behnam Neyshabur
2 years
The ML field continues to become more and more accessible every day. Everything you need to learn is available online. There is a lot of push to make ML methods/models open-source and reproducible. Many people are also producing useful educational content. 3/10
1
1
102
@bneyshabur
Behnam Neyshabur
4 years
2- Write the paper assuming the audience has a very short attention span! Many ML reviewers want to get a sense of the main results of the paper in 5 min so spend plenty of time on the abstract, first figure and contribution section. First impression matters a lot! 7/
3
7
105
@bneyshabur
Behnam Neyshabur
6 years
Looking back on papers published on generalization of deep networks, a paper published by @KDziugaite and @roydanroy about two years ago wins my "imaginary" test of time award: There is a lot of novelty in this work!
4
15
107
@bneyshabur
Behnam Neyshabur
3 years
The left one is generated by "A horse riding on back of an astronaut" and the right one is generated by "A horse riding on shoulders of an astronaut". So simply adding "on back of" or "on shoulders of" helps increase the chance of getting it right!.
3
2
102
@bneyshabur
Behnam Neyshabur
2 years
Sending ❤️ to @ilyasut and all amazing OpenAI colleagues. You didn't deserve to go through this. We are all part of the same small community and no matter what happens, we have each other's back. I'm sure OpenAI team continues to build amazing things wherever they are 💯.
@ilyasut
Ilya Sutskever
2 years
I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.
2
6
106
@bneyshabur
Behnam Neyshabur
3 years
🔥Internship Opportunity on Improving the Reasoning Capabilities of Massive Language Models🔥: solving challenging problems in areas such as mathematics, science, programming, algorithms, and planning. Please see the following link for more info:.
1
25
99
@bneyshabur
Behnam Neyshabur
2 years
I'm personally super excited about #ICLR2023. I have already booked my flight tickets, planned a guided trek to Nyungwe and Volcanoes National Parks in Rwanda, followed by #workation and safaris in South Africa and Kenya! Will share more details soon for those who are interested!.
@savvyRL
Rosanne Liu
2 years
Let's goooooo! ML conferences are experiencing an identity crisis — they are all the same. Same people, same papers, same talks. What's distinct about this year's ICLR is that the special location will present the most different demographic from all other ML confs. 🧶 1/5.
3
4
102
@bneyshabur
Behnam Neyshabur
3 years
During the interview, he asked me several questions. I wasn't able to answer the questions properly because they required remembering details from the courses. However, I generally have a terrible memory and I couldn't remember those required details. 5/.
1
1
97
@bneyshabur
Behnam Neyshabur
2 years
Learning things by yourself has its own issues but then, I’m not sure spending 5-6 years to get a PhD is the most efficient way to learn those skills. I think you don’t even learn many of the needed skills during a PhD. Current PhD programs are still lagging behind the field. 4/10
3
3
93
@bneyshabur
Behnam Neyshabur
4 years
We have a bold conjecture! We think "there is some truth to it" but "it is not true as stated". We state it as is, show our extensive experiments fall short of refuting it & we hope that people who find it exciting try to refute it and replace it with a better one :-) #science.
@rahiment
RahimEntezari
4 years
🆕The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks. Our conjecture: Taking permutations into account, there is likely no barrier in the linear interpolation between SGD solutions. w @HanieSedghi @osaukh @bneyshabur.1/10
Tweet media one
1
12
96
@bneyshabur
Behnam Neyshabur
3 years
My interview with the Prof. didn't go as I was hoping. When he called me, we had a hard time even communicating with each other partly because of my poor English listening skills. However, he kindly offered to resume the interview in Google Chat. 4/.
1
1
92
@bneyshabur
Behnam Neyshabur
3 years
A short thread including a few points regarding relationship between SAM, sharpness and generalization:. 1/.
@PreetumNakkiran
Preetum Nakkiran
3 years
@jeankaddour @jeremyphoward @CsabaSzepesvari Here’s a SAM author on flatness. SAM has the word “sharpness” in the title but beyond that, its connection to sharpness is poorly understood.
2
15
96
@bneyshabur
Behnam Neyshabur
4 years
Writing the paper: 1- Make your paper look similar to a typical ML paper. I can't emphasize this enough. Figures and tables should follow a similar style to what is usually seen in ML. So is everything else including the abstract, introduction, phrases, organization, etc. 6/
1
6
92
@bneyshabur
Behnam Neyshabur
2 years
I tried to focus on what I think is the most likely scenario but of course, the right decision is not the same for everyone. There is also a chance that everything works in your favor! Here's a slide from a talk I gave a couple months ago. 9/10
Tweet media one
1
2
88
@bneyshabur
Behnam Neyshabur
5 years
1st day of our 9-day backpacking trip. Highlights: a tricky creek-crossing, hail in a sunny day, seeing a caribou,. #alaska #backpacking #wrangell #glacier #notrail
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
94
@bneyshabur
Behnam Neyshabur
2 years
😅
Tweet media one
@ArashVahdat
Arash Vahdat
2 years
The worst idea ever! 😅
Tweet media one
3
1
88
@bneyshabur
Behnam Neyshabur
4 years
After several years of reviewing & AC work for @NeurIPSConf, @iclr_conf & @icmlconf, I have strong opinions about the reviewing system and some suggestions that many may not like or agree with. Summarizing my points in this thread (hastily written & NOT carefully considered):. 1/.
3
8
92
@bneyshabur
Behnam Neyshabur
3 years
To take this experiment further, I have now added a section called "Collaboration request (open to anyone)" to my website which gives specific instructions about sending me an effective collaboration request:. 4/6.
1
5
87
@bneyshabur
Behnam Neyshabur
4 years
3- Many reviewers love papers that have a combination of theory + experiments. If you are writing a theoretical paper, try to include some experiments and if you are writing an empirical paper, try to add a theoretical component. 8/.
2
4
88
@bneyshabur
Behnam Neyshabur
4 years
Climbing #Iztaccihautl (white woman) was such an incredible #mountaineering experience. A 5,230m (17,160ft) dormant #volcano located in #mexico close to #mexicocity and north of its twin, #popocatepetl, which is an active volcano (you can see an eruption in one of the pictures).
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
89
@bneyshabur
Behnam Neyshabur
5 years
Pictures I took during the flight that dropped us off in the backcountry to start our 9-day backpacking trip in Alaska. #alaska #backpacking #wrangell #glacier #notrail
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
84
@bneyshabur
Behnam Neyshabur
4 years
3- Explicitly ask each of the reviewers who gave you low scores to increase their scores at the end of your response but make sure your request is polite and respectful. Being explicit here makes a huge difference based on my experience. 12/
2
2
84
@bneyshabur
Behnam Neyshabur
4 months
No matter your industry, start experimenting with using foundation models in every aspect of your work and life today! Once you learn how to use them, you’ll be shocked by their impact! Adapt and thrive—evolution won’t wait.
3
5
88
@bneyshabur
Behnam Neyshabur
3 years
Never thought such an extreme level of joyfulness is achievable just by seeing the sun! The cost? Being hit by a rainstorm for a few days during our backpacking last week in a remote Arctic area (no trails, no humans but some grizzly bears). Almost all our stuff got wet 😄
Tweet media one
3
0
83
@bneyshabur
Behnam Neyshabur
2 years
There are many ways your PhD could go wrong, most of which are outside your control. The success rate is not very high and unfortunately luck plays a big role! 7/10
1
2
76
@bneyshabur
Behnam Neyshabur
1 year
Yes! Sergey Brin has been a core technical contributor showing up to work in office together with other Gemini team members! Other than his technical contributions, I have been amazed at how his presence has energized everyone💙.
@_sholtodouglas
Sholto Douglas
2 years
@0interestrates @nearcyan Name order was randomised (except for the first 6 names which spell out Gemini) - Sergey was in with us basically every day, often pairing!.
0
3
78
@bneyshabur
Behnam Neyshabur
4 years
4- Cite many papers! Try to do a good job in reviewing the relevant literature AND be generous by citing many papers and giving them credit for what they have done. That is, if you are not sure, it is safer to cite. 9/.
2
3
79
@bneyshabur
Behnam Neyshabur
4 years
Just finished 12 back-to-back 20-min meetings with 12 amazing people who signed up for this and I'm not even a bit tired. It was a very encouraging experience!.
@bneyshabur
Behnam Neyshabur
4 years
If you believe you are at a disadvantage in ML community (e.g. because of your race, gender, nationality, background or other circumstances) and need guidance & help, I'd love to meet you! Just pick a time from here: Please RETWEET to spread the word!
0
0
79
@bneyshabur
Behnam Neyshabur
3 years
🔥Opening in our team – Blueshift🔥 We are looking for a research engineer interested in extending the capabilities of large language models. Learn more about the role & apply here: Learn about our team: Please retweet :-) 🙏
2
24
77
@bneyshabur
Behnam Neyshabur
3 years
Attending #NeurIPS2022. DM me if you want to meet!
1
2
80
@bneyshabur
Behnam Neyshabur
4 years
At this point, I'm convinced that this cannot be explained by a combination of luck and quality of the papers. My belief is that the current system has lots of unnecessary and sometimes harmful biases which are #unfair to newcomers and anyone who is outside of the "norm". 3/
1
3
76
@bneyshabur
Behnam Neyshabur
4 years
Our team is looking for a full-time ML Engineer with prior industry experience excited about pushing the limits of large transformer models! Please see the job description and fill out this form if you are interested (or just retweet!):.
3
21
74
@bneyshabur
Behnam Neyshabur
4 years
4- Unless your paper is guaranteed to be accepted, write to the AC explaining how you view the situation. If it seems that a reviewer doesn't understand the paper or is unreasonable, let the AC know! As an AC, I pay a lot of attention to direct messages from the authors. 13/.
1
3
72
@bneyshabur
Behnam Neyshabur
1 year
I'm very excited about this release: Gemini 1.5 Pro - A highly capable multimodal model with a 10M(!!!) token context length!.
@JeffDean
Jeff Dean
1 year
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length. Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
Tweet media one
0
3
76
@bneyshabur
Behnam Neyshabur
5 years
This plane is going to drop us off in Wrangell-St. Elias’s backcountry for a 9-day backpacking trip where there is no trail or cell coverage. You can follow our location via this link which gets updated by our sat. communicator: #alaska #wrangell #backpacking
Tweet media one
2
0
69
@bneyshabur
Behnam Neyshabur
4 years
Think about all issues women deal with in our field. Add to this all restrictions Iranians face in the US! Want to support Iranian Women in our field? IranWiC has a great mentorship program and as one of its board members, I assure you every $1 donation translates to high impact!.
@iranwic
Iranian Women in Computing - IranWiC
4 years
We have started a giving campaign this October to support the mentorship efforts offered by IranWiC team to make a difference in the career path of Iranian women in computing and help them achieve their goals. To contribute, please visit:
Tweet media one
0
11
70
@bneyshabur
Behnam Neyshabur
4 years
Rebuttal: 1- In your response to reviewers, be very nice to all of them even those who have attacked you unfairly. Try to explain things from your viewpoint but be careful that your response should not put the reviewer in the defensive position. 10/
2
4
66
@bneyshabur
Behnam Neyshabur
2 years
Around the end of my PhD more people started to care about it & I think that helped me when I was on the job market. Looking back, it was a fantastic & totally unpredictable ride! Was it worth it? Totally! But hey, who can ignore the role of luck and survivorship bias. 15/n
2
0
66
@bneyshabur
Behnam Neyshabur
1 year
Ping me if you are attending ICLR and want to chat about reasoning with LLMs! #ICLR2024
3
3
68
@bneyshabur
Behnam Neyshabur
4 years
See our @googleai blog post on a new framework to study generalization based on an empirically verified conjecture that connects generalization to online optimization. This is a joint work with @PreetumNakkiran and @HanieSedghi.
@GoogleAI
Google AI
4 years
A core challenge in #DeepLearning is the disconnect between the theory of how models generalize and how they perform in practice. A new theoretical framework demonstrates how to understand model generalization through optimization behavior. Check it out at
1
2
68