Research scientist at FAIR NY + collab w/ Vector Institute. ❤️ Machine Learning + Information Theory. Previously: PhD at the University of Amsterdam, intern at DeepMind + MSRC.
I finally uploaded my PhD thesis
“A coding perspective on deep latent variable models”
() on Gscholar 🙃
It’s my ❤️ letter to the minimum description length principle for machine learning (+ pastel gradients 😋).
🚨 Internship opportunity 🚨
Are you interested in information theory + neural compression (+ representation learning, continual learning, mechanistic interpretability more broadly)? Want to work at FAIR NY with a flexible starting date in 2024? DM me + send a CV / link to your website
During my
@deepmind
internship supervised by
@deepspiker
, I have been working on improving the quality of Skype/Hangouts/Zoom calls with generative models.
Our paper (w
@FabioViola
) 👇
"Neural Communication Systems with Bandwidth-Limited Channel" () [1/3]
Dear neural compression enthusiasts,
there is a new
@PyTorch
-repo in town
Includes:
🔥neural image and video compression
🔥bits-back coders
🔥GPU entropy coders
We already work on extensions, feel invited to contribute ❤️❤️❤️
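For readers new to entropy coding: below is a minimal sketch of a range-variant ANS (rANS) coder in plain Python, the kind of coder such repos build on. This is a toy illustration, not the repo's implementation; it assumes integer symbol frequencies `freq[s]` that sum to `M`, with `cum` as their prefix sums.

```python
RANS_L = 1 << 16  # lower bound of the normalized state interval [L, 256*L)

def rans_encode(symbols, freq, cum, M):
    """Encode a list of symbol ids; freq sums to M, cum is its prefix sum."""
    x, stream = RANS_L, []
    for s in reversed(symbols):          # rANS pushes symbols in reverse
        x_max = ((RANS_L * 256) // M) * freq[s]
        while x >= x_max:                # stream out low bytes to keep x bounded
            stream.append(x & 0xFF)
            x >>= 8
        x = (x // freq[s]) * M + cum[s] + (x % freq[s])
    return x, stream                     # final state + renormalization bytes

def rans_decode(x, stream, freq, cum, M, n):
    """Recover n symbols from the final state and the byte stream."""
    out = []
    for _ in range(n):
        slot = x % M                     # which cumulative bucket are we in?
        s = next(t for t in range(len(freq)) if cum[t] <= slot < cum[t] + freq[t])
        x = freq[s] * (x // M) + slot - cum[s]
        while x < RANS_L:                # refill state from the byte stack
            x = (x << 8) | stream.pop()
        out.append(s)
    return out
```

Note the decoder pops the byte stream like a stack; that last-in-first-out behavior is what lets bits-back tricks compose naturally with ANS.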
🚨Internship application season is open y’all 🚨
If you are interested in information theory, neural compression, and continual learning; let’s learn and explore together @ FAIR New York.
I have one internship spot for 2022 with a flexible starting date. DM me.
Proud to show our work on differentiable graphical models in the Fourier domain for protein reconstruction and other projection methods:
Thanks to my amazing collaborators David Fleet,
@marcusabrubaker
,
@vdbergrianne
and
@wellingmax
. ❤️❤️❤️
Most data is processed by algorithms, yet compressors (e.g., JPEG) are designed for human eyes.
🤓Our fix: formalize lossy compression that ensures perfect downstream predictions
🔥1000x gains vs JPEG on ImageNet🔥
w. Ben Bloem-Reddy
@karen_ullrich
@cjmaddison
1/9
📢 Neural Compression Enthusiasts @
#Neurips2022
; Tue, Nov 29th, 3:30 pm, Room 282 inside the Convention Center.
Let’s meet, chat, and get inspired!
Hope to see ya there ❤️
@y0b1byte
@BahareFatemi
I am sorry to hear about your experience. I would like to add that we should start to understand grad-student depression as a systemic problem, not just an individual one. Consequently, we should also seek systemic change in how grad school works.
New paper accepted as long talk at ICML! We improve the compression capabilities of latent variable models.
TLDR; 👇🏻
w/ the fantastic
@YangjunR
@_dsevero
@_j_towns
@AliMakhzani
Arnaud Doucet, Ashish Khisti and all held together by
@cjmaddison
You want to compress data with a latent variable model, but bits-back achieves a suboptimal code length (the negative ELBO). We show how to break this barrier with asymptotically optimal coders: Monte Carlo Bits-Back (McBits, ). First authors:
@YangjunR
@karen_ullrich
@_dsevero
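For intuition on the "suboptimal code length" claim: here is a toy numeric check (made-up numbers, binary latent, not McBits itself) that the bits-back rate, the negative ELBO, exceeds the ideal rate -log2 p(x) by exactly KL(q(z|x) || p(z|x)).

```python
import math

# Toy latent variable model: z in {0, 1}, observing x = 1.
p_z = {0: 0.5, 1: 0.5}           # prior p(z)
p_x_given_z = {0: 0.9, 1: 0.2}   # likelihood p(x=1 | z)
q = {0: 0.7, 1: 0.3}             # an imperfect approximate posterior q(z | x=1)

p_x = sum(p_z[z] * p_x_given_z[z] for z in p_z)   # marginal p(x=1)

# Bits-back code length = negative ELBO, in bits
neg_elbo = sum(q[z] * (math.log2(q[z]) - math.log2(p_z[z] * p_x_given_z[z]))
               for z in q)
ideal = -math.log2(p_x)          # what an optimal coder for p(x) would pay

gap = neg_elbo - ideal           # equals KL(q || p(z|x)), always >= 0
print(neg_elbo, ideal, gap)
```

With an exact posterior the gap vanishes; with Monte Carlo extensions one can shrink it without ever computing the exact posterior.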
2.5 years ago I started working on cryo-electron microscopy data. My first paper, w/ David Fleet and
@WellingMax
, introduced a fast variational approximation for reconstructing unknown proteins.
@a_punjani
, also student of David, leads a company that is vital in visualizing
#COVID19
right now.
🤩👇
🤩 Tomorrow the
#ICML2023
workshops shall begin. 🎙️ Join me at 9 AM; I will discuss absolutely all there is to know about
#BitsBackCoding
@ the Structured Probabilistic Inference & Generative Modeling workshop.
#AI
#ML
#DataCompression
And you thought we couldn't make another bits back paper 😈
This time, we use bits back to strictly remove information 🤯 Specifically, we turn a sequence into a set by removing ordering information
👉🏻 Compress a dataset when all you've got is a sequence codec, e.g. an arithmetic coder.
We've found a way to save bits during compression by forgetting the order between examples in a dataset.
No machine learning required!
Authors:
@_j_towns
(eq contr) A. Khisti
@AliMakhzani
@karen_ullrich
1/6
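A back-of-the-envelope version of the saving (my own illustration, not the paper's numbers): a dataset of n distinct, exchangeable examples has n! equivalent orderings, so an order-free code can be shorter by up to log2(n!) bits.

```python
import math

def order_savings_bits(n: int) -> float:
    """log2(n!) bits, computed stably via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

# By Stirling's formula this grows roughly like n * log2(n / e),
# so for a 60k-example dataset the ordering carries on the order of 10^6 bits.
print(order_savings_bits(60000))
```

The catch, of course, is realizing these savings with a practical coder, which is where bits back comes in.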
Check out
@wellecks
' podcast the
@thesisreview
. He hosts conversations about the development of research ideas through researchers' PhD theses.
Great resource for undergrads and early PhD students.
Episode 28 of The Thesis Review:
Karen Ullrich (
@karen_ullrich
), "A Coding Perspective on Deep Latent Variable Models"
We discuss information theory & minimum description length, covering her PhD research on compression and communication.
Another one of
@3blue1brown
’s educational masterpieces. A great intro to what we understand as information. Will steal it for presentations and lectures for sure ❤️😍💗
I will be at the affinity groups poster session (!!) today. Really excited to meet and discuss with everyone there🤩 If you are interested in information theory and coding look for me👇 I can’t wait to meet and discuss your work, potential collaborations and/or a FAIR internship.
Best part: this method doesn't compete with your favorite codec, it only improves it!
🔥 7.6% reduction in the number of bits needed to compress BMNIST using the "Bits-back with ANS" neural codec, with only a 10% increase in compute time
Craystack code:
🌍, I am on a 🇪🇺 tour: I will be giving talks in Leipzig on Sep 7th, Vienna Sep 9th and Amsterdam Sep 16th + 20th.
Come by if you are interested in chatting about machine learning and information theory IRL.
more info 👇🏻
👇🏻We showed not just THAT but HOW known equivalence relationships reduce compression rates by orders of magnitude.
🔥 And got a spotlight at
#NeurIPS
🔥
Grateful to have worked w.
@yanndubs
, Ben Bloem-Reddy +
@cjmaddison
on this project ❤️
We released the code for our paper (now spotlight at
#NeurIPS2021
🥳):
Pretrained compressors are also on torch hub, use the following few lines to compress your image datasets (1000x gains vs JPEG on ImageNet):
Colab:
Point clouds are very flexible, but due to their irregular nature, learning on them is much more expensive than grid data.
We show how we can process them efficiently without accuracy drops by getting rid of their irregularity.
See how at
@TAGinDS
at
#icml2023
!
2. We propose a design to model missing information instead of ignoring it.
3. By introducing auxiliary latent variables in the decoder, we can sample more realistic messages. [3/3]
@maosbot
TBH I think this statement is tone-deaf. It ignores that we
👉🏻 are already massively contributing tools that enable marginalization at scale
👉🏻 often do not understand the consequences of our tools
👉🏻 traditionally do not give space to underrepresented groups in this community
Huge service to the community by
@julberner
: 1st open-source implementation of bits-back compression for diffusion models (= SOTA for lossless compression)❤️🔥🔥
📢📢New feature in
#NeuralCompression
repo: Bits-Back compression for diffusion models!
Compress image data 🖼️ using diffusion models at an effective rate close to the (negative) ELBO.
See:
Some context ⏩[1/4]
Did any of u try ELIMINATING ALL BAD LOCAL MINIMA (by
@jaschasd
+ Kenji Kawaguchi) for real? On what application? What is your experience?
Theoretical argument or actually useful? How did you tune the optimization for a and b?
We found 3 modelling choices relevant when the bandwidth of the noisy channel, aka weak wifi, varies.
1. Instead of separating the sub-tasks of compression (source coding) and error correction (channel coding), we propose to model both jointly. [2/3]
I made another tool for bit enthusiasts: Turn your float tensor to binary (and back) according to IEEE-754 standard (or any custom format)
It's differentiable 🤩 but not very efficient 😬. ❤️
#PyTorch
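A minimal non-differentiable sketch of the same idea in NumPy (my own illustration, not the tool's API): view float32 values as big-endian IEEE-754 bytes, then unpack them to individual bits.

```python
import numpy as np

def float_to_bits(x):
    """float32 array -> flat array of IEEE-754 bits (bit 0 is the sign bit)."""
    b = np.atleast_1d(np.asarray(x, dtype='>f4')).view(np.uint8)  # big-endian bytes
    return np.unpackbits(b)

def bits_to_float(bits):
    """Inverse of float_to_bits: repack bits into big-endian float32 values."""
    return np.packbits(bits).view('>f4')
```

The round trip is lossless; making it differentiable (as in the tweet's tool) would additionally need something like a straight-through estimator on top.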
This is no corporate event: no fancy food, no preparations, it’s just us in a room. Bring your own coffee (BYOC). Maybe I can hook up the stereo to my playlist; that’s the fanciest it will get. Also, no registration required.
Amsterdam, Sep 20th, 16:00 @ University of Amsterdam, Science Park LAB42
Hosted by Prof. Jan-Willem van de Meent
(might change, this one is still under construction)
@george_toderici
@mattmucklm
@_dsevero
It does open the door 🤩 and with that we may well blur the boundaries between probabilistic modelling and arithmetic coding. Very exciting, I think.
After many exciting discussions with
@SimoneLini
@mortendahlcs
and Hamish Ivey-Law, we wrote about the implications of AI and digital forgery for modern society and discussed plausible technical solutions:
And we did human-vs-machine experiments, e.g., which one is real?
@AIandMLonly
Having a PhD is actually not a requirement, but being in a PhD program is. Hope I did not sound rude; I just wanted to save myself from replying to many emails to clarify that <3
First blog post in 5 years! Don't cite the "No Free Lunch" theorem! cc
@betatim
@mrocklin
@hug_nicolas
This was probably the longest of the blog posts I was thinking about writing; maybe the next one won't take 5 years.
@mmbronstein
@adjiboussodieng
@_joaogui1
Doesn’t that very example show that the issue is not a binary 0 or 1? There are different levels of community (family, municipality, state, etc.) with different levels of solidarity (shared income, taxation, infrastructure, etc.). We can define the rules of this continuum however we see fit.
@Alii_saays
Preferably yes, but we can also arrange a remote option if visas are an issue (though not for all remote locations). Preference will not influence the hiring process.
@cjmaddison
Feels like a very big publicity stunt to me. Also, MSFT drawing up a CEO contract for Altman over the course of a weekend... don't know if I believe that.
@YoSiJo
@hen_drik
E.g., if I train my algorithm to distinguish cats from dogs and then show it a plate of spaghetti, it will tell me that it is 99% a cat and 1% a dog. A problem my research group in Amsterdam is working on intensively. 2/2
@maosbot
I work in data compression and I worry a lot about biases! I think we could be better at developing standardized tests for AI instead of presenting a novel method yet again.
What do you think can be done by us (concretely)?