
Ellis Brown (@_ellisbrown)
647 Followers · 5K Following · 27 Media · 358 Statuses
Intern @Meta FAIR, PhD Student @NYU_Courant w/ Profs @sainingxie @rob_fergus. Prev @Ai2Prior, @CarnegieMellon
NYC · Joined January 2016
Cambrian-1 🪼. Through a vision-centric lens, we study every aspect of building Multimodal LLMs except the LLMs themselves. As a byproduct, we achieve superior performance at the 8B, 13B, and 34B scales. 📄🌎🤗 (huggingface.co)
Introducing Cambrian-1, a fully open project from our group at NYU. The world doesn't need another MLLM to rival GPT-4V. Cambrian is unique as a vision-centric exploration & here's why I think it's time to shift focus from scaling LLMs to enhancing visual representations. 🧵 [1/n]
2 replies · 31 reposts · 132 likes
RT @lchen915: Self-Questioning Language Models: LLMs that learn to generate their own questions and answers via asymmetric self-play RL. T…
0 replies · 170 reposts · 0 likes
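The RT above is truncated, but the loop it names — asymmetric self-play in which one role proposes questions and another answers them — can be sketched in a few lines. Everything below (the toy arithmetic domain, the reward scheme, the function names) is a hypothetical illustration, not the paper's actual method:

```python
import random

random.seed(0)

def propose(difficulty: int) -> tuple[str, int]:
    # The "asker" role generates its own question: here, toy arithmetic.
    a = random.randint(1, 10 ** difficulty)
    b = random.randint(1, 10 ** difficulty)
    return f"{a}+{b}", a + b

def solve(question: str) -> int:
    # Stand-in for the "answerer" role; a real LLM would decode an answer.
    x, y = question.split("+")
    return int(x) + int(y) if random.random() > 0.2 else -1  # imperfect on purpose

difficulty, solver_return, proposer_return = 1, 0.0, 0.0
for _ in range(100):
    question, gold = propose(difficulty)
    correct = solve(question) == gold
    # Asymmetric rewards: the solver is paid for answering correctly, the
    # proposer for finding questions the solver still fails on.
    solver_return += 1.0 if correct else 0.0
    proposer_return += 0.0 if correct else 1.0
    # A real system would update both policies with RL; this sketch only
    # adapts difficulty to keep questions near the solver's frontier.
    difficulty = min(difficulty + 1, 6) if correct else max(difficulty - 1, 1)

print(f"solver return: {solver_return}, proposer return: {proposer_return}")
```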
RT @ShivamDuggal4: Compression is the heart of intelligence. From Occam to Kolmogorov—shorter programs = smarter representations. Meet KARL: K…
0 replies · 63 reposts · 0 likes
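The compression-as-intelligence framing has a concrete, checkable core: the compressed length of a string upper-bounds its Kolmogorov complexity (up to an encoder-dependent constant), so an off-the-shelf compressor can serve as a crude complexity proxy. A minimal sketch — using zlib is my choice here, not anything from KARL:

```python
import random
import zlib

def complexity_proxy(s: str) -> int:
    # Compressed length upper-bounds Kolmogorov complexity up to a constant:
    # the compressed bytes are a short "program" that reconstructs the string.
    return len(zlib.compress(s.encode()))

random.seed(0)
structured = "ab" * 500                                    # highly regular
noisy = "".join(random.choice("ab") for _ in range(1000))  # same length, no pattern

print(complexity_proxy(structured))  # small: a short program explains it
print(complexity_proxy(noisy))       # large: nearly incompressible
```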
impressive results! seems like an exciting route for inference-time scaling. also kudos for the intuitive explanations / visualizations — very accessible resources in the paper+blog for understanding how EBMs work 🙇‍♂️.
How can we unlock generalized reasoning? ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-forward) transformers and unlocks generalized reasoning/thinking on any modality/problem without rewards. TLDR:
- EBTs are the first model to outscale the…
2 replies · 0 reposts · 15 likes
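For readers new to energy-based models, the "thinking" that EBTs scale at inference time can be pictured as iterative refinement: a network assigns a scalar energy to (context, prediction) pairs, and prediction means descending that energy. The sketch below assumes a toy MLP energy and plain gradient-descent refinement; it illustrates the general EBM recipe, not the EBT architecture itself:

```python
import torch

# Toy energy network: maps a (context, candidate) pair to a scalar energy,
# where lower energy means "more compatible". Stand-in for a real EBT.
energy_net = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)

def think(context: torch.Tensor, steps: int = 32, lr: float = 0.1) -> torch.Tensor:
    # Inference = optimization: start from a random candidate and refine it
    # by gradient descent on the energy. More steps buys more "thinking".
    y = torch.randn(8, requires_grad=True)
    opt = torch.optim.SGD([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        energy = energy_net(torch.cat([context, y])).squeeze()
        energy.backward()
        opt.step()
    return y.detach()

prediction = think(torch.randn(8))
```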
RT @AlexiGlad: How can we unlock generalized reasoning? ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-…
0 replies · 255 reposts · 0 likes
RT @mihirp98: 1/ Maximizing confidence indeed improves reasoning. We worked with @ShashwatGoel7, @nikhilchandak29 @AmyPrb for the past 3 we…
0 replies · 13 reposts · 0 likes
RT @mattdeitke: Molmo won the Best Paper Honorable Mention award @CVPR! This work was a long journey over 1.5 years, from failing to get s…
0 replies · 17 reposts · 0 likes
RT @rob_fergus: 1/ Excited to share that I’m taking on the role of leading Fundamental AI Research (FAIR) at Meta. Huge thanks to Joelle fo…
0 replies · 23 reposts · 0 likes
RT @mattdeitke: I’m very excited to introduce Vy, the AI that sees and acts on your computer! It’s a first glimpse of what we’ve been worki…
0 replies · 18 reposts · 0 likes
RT @alexlioralexli: Excited to be presenting at #ICLR2025 at 10am today on how generative classifiers are much more robust to distribution…
0 replies · 7 reposts · 0 likes
RT @xichen_pan: We find training unified multimodal understanding and generation models is so easy, you do not need to tune MLLMs at all. M…
0 replies · 67 reposts · 0 likes
RT @mihirp98: 1/ Happy to share UniDisc - Unified Multimodal Discrete Diffusion – We train a 1.5 billion parameter transformer model from s…
0 replies · 113 reposts · 0 likes
RT @TongPetersb: Vision models have been smaller than language models; what if we scale them up? Introducing Web-SSL: A family of billion-…
0 replies · 86 reposts · 0 likes
RT @DavidJFan: Can visual SSL match CLIP on VQA? Yes! We show with controlled experiments that visual SSL can be competitive even on OCR/C…
0 replies · 95 reposts · 0 likes
RT @baifeng_shi: Next-gen vision pre-trained models shouldn’t be short-sighted. Humans can easily perceive 10K x 10K resolution. But today…
0 replies · 153 reposts · 0 likes
RT @codezakh: ✨ Introducing MutaGReP (Mutation-guided Grounded Repository Plan Search) - an approach that uses LLM-guided tree search to fi…
0 replies · 38 reposts · 0 likes
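The truncated line above gestures at the core mechanism: tree search over candidate plans, where an LLM proposes mutations and a grounding score ranks them. A generic best-first sketch, with every function (`mutate`, `grounding_score`) a hypothetical stand-in rather than MutaGReP's API:

```python
import heapq
import itertools

def mutate(plan: list[str]) -> list[list[str]]:
    # Stand-in for LLM-proposed mutations: extend the plan with variant steps.
    return [plan + [f"step-{len(plan)}{tag}"] for tag in "ab"]

def grounding_score(plan: list[str]) -> float:
    # Stand-in for scoring how well a plan's steps ground to repository symbols.
    return 2.0 * sum(step.endswith("a") for step in plan) - len(plan)

def best_first_search(root: list[str], budget: int = 20) -> list[str]:
    tie = itertools.count()  # tie-breaker so heapq never compares lists
    frontier = [(-grounding_score(root), next(tie), root)]
    best = root
    for _ in range(budget):
        if not frontier:
            break
        _, _, plan = heapq.heappop(frontier)  # expand the best-scoring plan
        if grounding_score(plan) > grounding_score(best):
            best = plan
        for child in mutate(plan):
            heapq.heappush(frontier, (-grounding_score(child), next(tie), child))
    return best

print(best_first_search(["step-0"]))
```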
RT @ma_nanye: Inference-time scaling for LLMs drastically improves the model's ability in many perspectives, but what about diffusion model…
0 replies · 91 reposts · 0 likes
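One concrete form of inference-time scaling for diffusion models is to spend compute searching over sampling randomness rather than only adding denoising steps: draw several candidates and keep the one a verifier scores highest. A best-of-N sketch with stand-in `denoise` and `verifier` functions (hypothetical, not this paper's exact search procedure):

```python
import torch

def denoise(noise: torch.Tensor) -> torch.Tensor:
    # Stand-in for a full reverse-diffusion pass from initial noise to a sample.
    return noise.tanh()

def verifier(sample: torch.Tensor) -> float:
    # Stand-in for a reward/quality model scoring the finished sample.
    return -sample.abs().mean().item()

def sample_with_search(n_candidates: int = 8) -> torch.Tensor:
    # Scale inference compute along a second axis: search over initial noises
    # and keep the best-scoring sample (best-of-N).
    candidates = [denoise(torch.randn(3, 64, 64)) for _ in range(n_candidates)]
    return max(candidates, key=verifier)

best = sample_with_search()
```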
RT @sainingxie: When I first saw diffusion models, I was blown away by how naturally they scale during inference: you train them with fixed…
0 replies · 70 reposts · 0 likes
RT @RezaeiKeivan: 🚨Preprint from internship at @allen_ai. 🤖We propose restorative unlearning: not just forgetting knowledge from specific d…
0 replies · 23 reposts · 0 likes