Alex Alemi Profile
Alex Alemi

@alemi

Followers 1K · Following 90 · Media 4 · Statuses 65

Machine Learning Researcher

Kissimmee, FL
Joined January 2008
@Pavel_Izmailov
Pavel Izmailov
1 year
I am recruiting Ph.D. students for my new lab at @nyuniversity! Please apply if you want to work with me on reasoning, reinforcement learning, understanding generalization, and AI for science. Details on my website: https://t.co/d8uId2LC47. Please spread the word!
17
104
748
@alemi
Alex Alemi
8 months
Recently I've been playing around with a quarter-order-of-magnitude system for simple calculations. It gives better precision than single-sig-fig calculations using only four very intuitive symbols. https://t.co/BO9mLi8pLF
0
0
8
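A minimal sketch of how such a system could work, assuming the four symbols stand for the rounded quarter-decade mantissas 1, 1.8, 3.2, and 5.6 (the glyph choice here is my guess, not taken from the linked image):

```python
import math

# Hypothetical glyphs for the four quarter-decade steps:
# 10^0.00 ~ 1, 10^0.25 ~ 1.8, 10^0.50 ~ 3.2, 10^0.75 ~ 5.6.
QUARTER_SYMBOLS = ["1", "2", "3", "6"]

def to_qom(x: float) -> str:
    """Round a positive number to the nearest quarter order of magnitude."""
    q = round(4 * math.log10(x))       # nearest multiple of 1/4 in log10
    decade, quarter = divmod(q, 4)
    return f"{QUARTER_SYMBOLS[quarter]}e{decade}"

for x in [2.0, 42.0, 6.022e23]:
    print(x, "->", to_qom(x))          # 2e0, 3e1, 6e23
```

Each step is a factor of 10^(1/4) ≈ 1.78, so any positive number sits at most about 33% (in ratio) from a representable point.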
@alemi
Alex Alemi
1 year
If you miss the NYTimes needle, especially one that is statistically uniform ( https://t.co/uqLw9f69Sw ), you can use this page I whipped together: https://t.co/xQ5cFrtRSD to reason about the correlations between the swing states tonight as results come in.
0
1
18
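A toy version of the correlated-outcomes idea behind that page, with made-up numbers throughout (states, marginals, and correlation are all illustrative), sampled through a Gaussian copula:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
states = ["PA", "MI", "WI"]            # hypothetical example states
p_win = np.array([0.5, 0.5, 0.5])      # made-up marginal win probabilities
rho = 0.7                              # made-up pairwise correlation
cov = rho + (1.0 - rho) * np.eye(3)    # unit diagonal, rho off-diagonal

z = rng.multivariate_normal(np.zeros(3), cov, size=100_000)
wins = norm.cdf(z) < p_win             # correlated Bernoulli outcomes

# Conditioning on one state's result moves the others, which is the
# whole point of tracking correlations rather than independent needles:
cond = wins[wins[:, 0]].mean(axis=0)
print(dict(zip(states, cond.round(2))))
```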
@alemi
Alex Alemi
1 year
Why don't we measure probabilities in degrees? https://t.co/uqLw9f5C2Y
4
11
57
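One possible convention (my assumption, not necessarily the post's): identify a probability p with the angle theta satisfying p = sin²(theta), so 0° is impossible, 45° is a coin flip, and 90° is certain.

```python
import math

def to_degrees(p: float) -> float:
    """Angle theta (in degrees) with p = sin^2(theta)."""
    return math.degrees(math.asin(math.sqrt(p)))

def from_degrees(theta: float) -> float:
    return math.sin(math.radians(theta)) ** 2

for p in [0.01, 0.25, 0.5, 0.75, 0.99]:
    print(f"p = {p:4.2f}  ->  {to_degrees(p):5.1f} deg")
```

This mapping stretches out the extremes: moving from 99% (about 84.3°) to 99.9% (about 88.2°) is a visible change in degrees even though it is a tiny change in raw probability.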
@alemi
Alex Alemi
1 year
In which I try to make sense of most of machine learning:
5
41
294
@blester125
Brian Lester
2 years
Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks. Check out https://t.co/DRO2IbTFCg and help @hoonkp, @alemi, Jeffrey Pennington, @ada_rob, @jaschasd, @noahconst and me make Kevin’s dream a reality.
0
6
15
@noahconst
Noah Constant
2 years
Ever wonder why we don’t train LLMs over highly compressed text? Turns out it’s hard to make it work. Check out our paper for some progress that we’re hoping others can build on. https://t.co/mceqpUfZQo With @blester125, @hoonkp, @alemi, Jeffrey Pennington, @ada_rob, @jaschasd
arxiv.org
In this paper, we explore the idea of training large language models (LLMs) over highly compressed text. While standard subword tokenizers compress text by a small factor, neural text compressors...
2
10
76
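One fix the paper explores (as I understand it) is compressing in independently decodable chunks. Here is a rough sketch of that idea, simplified to fixed-length input spans rather than the paper's fixed compressed-bit windows:

```python
import zlib

def equal_info_windows(text: str, window_chars: int = 16) -> list[bytes]:
    """Compress text in independent chunks: resetting the compressor per
    window means each compressed span decodes on its own, instead of
    depending on every bit that came before it."""
    return [
        zlib.compress(text[i:i + window_chars].encode())
        for i in range(0, len(text), window_chars)
    ]

windows = equal_info_windows("the quick brown fox jumps over the lazy dog")
print([len(w) for w in windows])       # compressed size of each window
print(zlib.decompress(windows[1]))     # any window decodes independently
```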
@alemi
Alex Alemi
2 years
Each delivery service should use its own distinctive knock.
1
0
2
@alemi
Alex Alemi
3 years
PaLM, Google's 540-billion-parameter large language model, used 4.2 moles of flops to train. 4.2 moles!
0
0
9
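The arithmetic behind the joke, assuming the commonly cited ~2.5e24 training-FLOPs figure for PaLM:

```python
AVOGADRO = 6.022e23              # operations per "mole" of flops
palm_flops = 2.5e24              # PaLM's reported training compute, roughly
print(palm_flops / AVOGADRO)     # ~4.2 moles of floating-point operations
```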
@poolio
Ben Poole
3 years
Happy to announce DreamFusion, our new method for Text-to-3D! https://t.co/4xI2VHcoQW We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed! Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron #dreamfusion
128
1K
6K
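A heavily simplified sketch of the score-distillation idea described in the tweet: render from the 3D parameters, add noise, ask a frozen denoiser what noise it predicts, and push the parameters in the difference direction. The renderer and denoiser below are trivial stand-ins, not the real NeRF or text-to-image models.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(theta):                # stand-in differentiable renderer
    return theta                  # pretend the params *are* the image

def denoiser(x_noisy, t):         # stand-in frozen text-to-image model
    target = np.full_like(x_noisy, 0.5)  # "what the prompt looks like"
    return x_noisy - target       # predicted noise direction

theta = rng.normal(size=(8, 8))   # toy "NeRF parameters"
lr = 0.1
for step in range(200):
    t = rng.uniform(0.02, 0.98)
    eps = rng.normal(size=theta.shape)
    x_noisy = render(theta) + t * eps
    # SDS-style update: (predicted noise - injected noise) through the render
    grad = denoiser(x_noisy, t) - eps
    theta -= lr * grad

print(theta.mean())               # drifts toward the denoiser's target, 0.5
```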
@alemi
Alex Alemi
3 years
@dpkingma @poolio To accompany the colab, I've also written a blog post https://t.co/qvmp8pg1g6 attempting to make sense of the VDM Diffusion loss. In it, I try to motivate how the VDM diffusion loss is simply the joint KL between the forward and reverse process.
2
11
51
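In symbols, the identity the post motivates (standard ELBO algebra, sketched here with q the forward and p the reverse process):

```latex
D_{\mathrm{KL}}\big(q(x_{0:T}) \,\|\, p(x_{0:T})\big)
  = \mathbb{E}_{q}\!\left[\log \frac{q(x_{1:T}\mid x_0)}{p(x_{0:T})}\right]
    + \mathbb{E}_{q}\big[\log q(x_0)\big]
  = -\,\mathrm{ELBO} - \mathbb{H}\big[q(x_0)\big]
```

So minimizing the negative ELBO (the VDM loss) is the same as minimizing the joint KL, since the data entropy H[q(x_0)] is a constant the model cannot affect.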
@dpkingma
Durk Kingma
3 years
Want to understand and/or play with variational diffusion models? - See https://t.co/V1jP11fMmI for a simple stand-alone implementation and explanation. (Thanks @alemi and @poolio for making this)! - See https://t.co/kwlCncttBk for an even more basic implementation on 2D data.
1
63
327
@ziv_ravid
Ravid Shwartz Ziv
3 years
A pretty cool (and, I hope, also useful) paper on using pre-trained models to create highly informative priors for downstream tasks. Thanks to all the collaborators, it was a lot of fun!
@andrewgwils
Andrew Gordon Wilson
3 years
Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors. https://t.co/cglYGiLNeM w/@ziv_ravid, @micahgoldblum, @HosseinSouri8, @snymkpr, @Eiri1114, @ylecun 1/6
2
12
79
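The simplest version of the informative-prior idea, sketched below (the paper learns a richer, posterior-derived prior; this is just an isotropic Gaussian centered on the pre-trained weights, with made-up numbers):

```python
import numpy as np

w_pre = np.array([1.0, -2.0, 0.5])      # pretend pre-trained weights
w_task = np.array([1.5, -2.0, 0.0])     # pretend downstream optimum

def grad(w, prior_var=1.0):
    # d/dw [ task loss ]  +  d/dw [ -log N(w; w_pre, prior_var) ]:
    # the prior pulls toward the pre-trained weights instead of zero.
    return 2.0 * (w - w_task) + (w - w_pre) / prior_var

w = w_pre.copy()
for _ in range(200):                    # crude gradient descent
    w -= 0.05 * grad(w)

print(w)  # lands between the task optimum and the pre-trained weights
```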
@ethansdyer
Ethan Dyer
3 years
1/ Super excited to introduce #Minerva 🦉( https://t.co/UI7zV0IXlS). Minerva was trained on math and science found on the web and can solve many multi-step quantitative reasoning problems.
@alewkowycz
alewkowycz
3 years
Very excited to present Minerva🦉: a language model capable of solving mathematical questions using step-by-step natural language reasoning. Combining scale, data, and other ingredients dramatically improves performance on the STEM benchmarks MATH and MMLU-STEM. https://t.co/bQJOyMSCD4
29
519
3K
@Chitwan_Saharia
Chitwan Saharia
3 years
We are thrilled to announce Imagen, a text-to-image model with unprecedented photorealism and deep language understanding. Explore https://t.co/mSplg4FlsM and Imagen! A large rusted ship stuck in a frozen lake. Snowy mountains and beautiful sunset in the background. #imagen
57
297
2K
@alemi
Alex Alemi
4 years
you can verify with `echo -n "answer" | md5sum`
0
0
0
@alemi
Alex Alemi
4 years
here are the next few days' wordle answers as md5 hashes:
2022-01-11 = 0b18a3d7b9c43ff1750d2baa4606b8d0
2022-01-12 = 047fb90408a79f189d51cbcea168b1a5
2022-01-13 = ab3358313efb03210a1babfb372246f1
2022-01-14 = d821e448212defd91ac1e67f9653a34d
3
0
2
@samuel_stanton_
Samuel Stanton
4 years
We are presenting our paper "Does Knowledge Distillation Really Work?" at #NeurIPS2021 poster session 2 today - come check it out! Joint work with @Pavel_Izmailov, @polkirichenko, @alemi, and @andrewgwils. Poster: https://t.co/N4PlsxnpZE Paper: https://t.co/UNSIizi2GG
2
13
78
@venkvis
Venkat Viswanathan
4 years
Excited to kick-start a focused #SciML series on #ML meets info theory and statistical mechanics! Amazing speaker/session chair line-up: @alemi (@wellingmax), @pratikac (Karthik), @ShoYaida (@jaschasd), @yasamanbb (@SuryaGanguli) and Elena Agliari. Details at:
4
33
196
@polkirichenko
Polina Kirichenko
4 years
While most papers on knowledge distillation focus on student accuracy, we investigate the agreement between teacher and student networks. Turns out, it is very challenging to match the teacher (even on train data!), despite the student having enough capacity and lots of data.
@andrewgwils
Andrew Gordon Wilson
4 years
Does knowledge distillation really work? While distillation can improve student generalization, we show it is extremely difficult to achieve good agreement between student and teacher. https://t.co/VpK6Xy2q3S With @samscub, @Pavel_Izmailov, @polkirichenko, Alex Alemi. 1/10
3
15
114
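The distinction the thread draws, in miniature: accuracy compares the student to the labels, while agreement compares it to the teacher, and a student can match the teacher's accuracy while still disagreeing on many inputs. A toy simulation of that gap (all numbers synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)

# Teacher and student are each ~90% accurate, but err independently:
teacher = np.where(rng.random(1000) < 0.9, labels, rng.integers(0, 10, 1000))
student = np.where(rng.random(1000) < 0.9, labels, rng.integers(0, 10, 1000))

print("teacher acc:", (teacher == labels).mean())
print("student acc:", (student == labels).mean())
print("agreement:  ", (student == teacher).mean())
# Both hit roughly 90% accuracy, yet they miss on *different* examples,
# so teacher-student agreement comes out noticeably lower than either.
```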