Prafull Sharma

@prafull7

Followers: 1K · Following: 4K · Media: 34 · Statuses: 352

World models, Computer Vision, Graphics, AI. PostDoc @MIT with Josh Tenenbaum and Phillip Isola. PhD @MIT with Bill Freeman and Fredo Durand. Undergrad @Stanford.

Cambridge, MA
Joined September 2010
@prafull7
Prafull Sharma
1 year
Graduated with a PhD in Computer Science @MIT! Grateful to my advisors and teachers who helped me learn and grow in this journey! Thanks to all my friends and family members for their support.
117
46
2K
@phillip_isola
Phillip Isola
6 days
Over the past year, my lab has been working on fleshing out theory/applications of the Platonic Representation Hypothesis. Today I want to share two new works on this topic: Eliciting higher alignment: https://t.co/KY4fjNeCBd Unpaired rep learning: https://t.co/vJTMoyJj5J 1/9
9
115
669
@LanceYing42
Lance Ying
3 months
A hallmark of human intelligence is the capacity for rapid adaptation, solving new problems quickly under novel and unfamiliar conditions. How can we build machines to do so? In our new preprint, we propose that any general intelligence system must have an adaptive world model,
14
104
506
@phillip_isola
Phillip Isola
4 months
Our computer vision textbook is now available for free online here: https://t.co/ERy2Spc7c2 We are working on adding some interactive components like search and (beta) integration with LLMs. Hope this is useful and feel free to submit Github issues to help us improve the text!
visionbook.mit.edu
35
620
3K
@ChengTim0708
Ta-Ying Cheng
4 months
Imagine a Van Gogh-style teapot turning into glass with one simple slider🎨 Introducing MARBLE, material edits made by simply changing the CLIP embedding! πŸ”— https://t.co/VOHGwUGFVZ πŸ‘ Internship project with @prafull7, @markb_boss , @jampani_varun at @StabilityAI
1
5
25
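The slider-based edit described above can be pictured as moving an image's CLIP embedding along a material-attribute direction. A minimal numpy sketch of that idea (the function name, the linear-edit form, and the re-normalization step are assumptions for illustration, not MARBLE's actual API):

```python
import numpy as np

def edit_material_embedding(clip_embed, direction, slider):
    """Shift an image's CLIP embedding along a material direction.

    clip_embed: (d,) unit-norm embedding of the source image
    direction:  (d,) unit-norm direction associated with a material
                attribute (e.g. glossiness), found offline
    slider:     scalar controlling edit strength
    """
    edited = clip_embed + slider * direction
    return edited / np.linalg.norm(edited)  # re-project to the unit sphere

rng = np.random.default_rng(0)
e = rng.normal(size=512); e /= np.linalg.norm(e)   # stand-in image embedding
d = rng.normal(size=512); d /= np.linalg.norm(d)   # stand-in material direction
out = edit_material_embedding(e, d, 0.5)
print(out.shape)  # (512,)
```

The edited embedding would then condition a generative model to re-render the object with the new material.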
@hyojinbahng
Hyojin Bahng
4 months
Image-text alignment is hard β€” especially as multimodal data gets more detailed. Most methods rely on human labels or proprietary feedback (e.g., GPT-4V). We introduce: 1. CycleReward: a new alignment metric focused on detailed captions, trained without human supervision. 2.
4
38
197
@akarshkumar0101
Akarsh Kumar
5 months
Excited to share our position paper on the Fractured Entangled Representation (FER) Hypothesis! We hypothesize that the standard paradigm of training networks today β€” while producing impressive benchmark results β€” is still failing to create a well-organized internal
@kenneth0stanley
Kenneth Stanley
5 months
Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. The idea stems from a little-known
5
38
246
@Sa_9810
Shaden
6 months
Excited to share our ICLR 2025 paper, I-Con, a unifying framework that ties together 23 methods across representation learning, from self-supervised learning to dimensionality reduction and clustering. Website: https://t.co/QD6OciHzmt A thread 🧡 1/n
1
24
92
@jxbz
Jeremy Bernstein
7 months
I just wrote my first blog post in four years! It is called "Deriving Muon". It covers the theory that led to Muon and how, for me, Muon is a meaningful example of theory leading practice in deep learning (1/11)
13
135
1K
@vincesitzmann
Vincent Sitzmann
8 months
We wrote a new video diffusion paper! @kiwhansong0 and @BoyuanChen0 and co-authors did absolutely amazing work here. Apart from really working, the method of "variable-length history guidance" is really cool and based on some deep truths about sequence generative modeling....
@BoyuanChen0
Boyuan Chen
8 months
Announcing Diffusion Forcing Transformer (DFoT), our new video diffusion algorithm that generates ultra-long videos of 800+ frames. DFoT enables History Guidance, a simple add-on to any existing video diffusion models for a quality boost. Website: https://t.co/wdZ19yCgjJ (1/7)
3
13
124
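History Guidance, as described, is an add-on applied at sampling time. One way to picture it is a classifier-free-guidance-style combination of the denoiser's history-conditional and history-free predictions; this sketch is a simplified guess at the mechanism, not the paper's exact formulation:

```python
import numpy as np

def history_guidance(eps_cond, eps_uncond, w):
    """CFG-style combination: push the denoiser's prediction toward
    its history-conditional output with guidance weight w.
    w = 1 recovers the plain conditional prediction."""
    return eps_uncond + w * (eps_cond - eps_uncond)

eps_c = np.array([1.0, 2.0])   # prediction conditioned on history frames
eps_u = np.array([0.5, 1.0])   # prediction with history dropped
g = history_guidance(eps_c, eps_u, 2.0)
print(g)  # [1.5 3. ]
```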
@karpathy
Andrej Karpathy
9 months
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper
1K
3K
30K
@phillip_isola
Phillip Isola
10 months
As a kid I was fascinated by the Search for Extraterrestrial Intelligence (SETI). Now we live in an era when it's becoming meaningful to search for "extraterrestrial life" not just in our universe but in simulated universes as well. This project provides new tools toward that dream:
@SakanaAILabs
Sakana AI
10 months
Introducing ASAL: Automating the Search for Artificial Life with Foundation Models https://t.co/uUq63UNrjv Artificial Life (ALife) research holds key insights that can transform and accelerate progress in AI. By speeding up ALife discovery with AI, we accelerate our
4
19
215
@ShivamDuggal4
Shivam Duggal
11 months
Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (e.g. OpenAI o1) adjust compute budgets based on the input. Since different images demand different processing & memory, how can we enable vision systems to be adaptive? 🧡
10
67
482
@vin_agarwal
Vin Agarwal
1 year
Had a lot of fun working on this. Stay tuned for more research on how human listeners reverse engineer the physics of the world using the sounds they hear
@JoshHMcDermott
Josh McDermott
1 year
We just wrote a primer on how the physics of sound constrains auditory perception: https://t.co/NLgb4Q1ixj Covers sound propagation and object interactions, and touches on their relevance to music and film. I enjoyed working on this with @vin_agarwal and James Traer.
1
4
30
@shobsund
Shobhita Sundaram
1 year
What happens when models see the world as humans do? In our #NeurIPS2024 paper we show that aligning to human perceptual preferences can *improve* general-purpose representations! πŸ“: https://t.co/IPfJUos2O5 🌐: https://t.co/RWjqXmfUiy πŸ’»: https://t.co/XsoJ2cbYDA (1/n)
8
85
454
@csail_alliances
MIT CSAIL Alliances
1 year
.@MIT_CSAIL PhD student Marianne Rakic's most recent project, Tyche, is a medical image segmentation model that aims at generalizing to new tasks & capturing uncertainty in medical images. Learn more about Marianne and her recent projects:
cap.csail.mit.edu
0
4
7
@_akhaliq
AK
1 year
Sakana AI announces The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery discuss: https://t.co/JsqbVgcLHz One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new
8
84
381
@jxbz
Jeremy Bernstein
1 year
New paper and pip package: modula: "Scalable Optimization in the Modular Norm" πŸ“¦ https://t.co/ztWVPShp1p πŸ“ https://t.co/UnVL9iY8kB We re-wrote the @pytorch module tree so that training automatically scales across width and depth.
8
37
176
@GoogleAI
Google AI
1 year
Presenting a novel approach that harnesses generative text-to-image models to enable users to precisely edit specific material properties (like roughness and transparency) of objects in images while retaining their original shape. Learn more β†’ https://t.co/hF9nkgj3WP
27
102
403
@BoyuanChen0
Boyuan Chen
1 year
Introducing Diffusion Forcing, which unifies next-token prediction (eg LLMs) and full-seq. diffusion (eg SORA)! It offers improved performance & new sampling strategies in vision and robotics, such as stable, infinite video generation, better diffusion planning, and more! (1/8)
12
214
1K
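The unification claimed above rests on giving every token its own noise level during training. A minimal sketch of that corruption step (the uniform noise schedule and variable names are simplifying assumptions, not the paper's exact parameterization):

```python
import numpy as np

def diffusion_forcing_corrupt(tokens, rng):
    """Assign each token an independently sampled noise level k_t, then mix
    signal and Gaussian noise accordingly. k_t = 0 leaves a token clean
    (next-token-style context); identical k_t across all tokens recovers
    ordinary full-sequence diffusion."""
    T, d = tokens.shape
    k = rng.uniform(0.0, 1.0, size=T)          # per-token noise level
    eps = rng.normal(size=(T, d))
    noisy = np.sqrt(1.0 - k)[:, None] * tokens + np.sqrt(k)[:, None] * eps
    return noisy, k

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                   # 8 tokens, 16-dim each
noisy, k = diffusion_forcing_corrupt(x, rng)
print(noisy.shape, k.shape)  # (8, 16) (8,)
```

At sampling time, varying which tokens are kept near-clean is what enables strategies like stable rollout of long videos.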