Shraman Pramanick Profile
Shraman Pramanick

@Shramanpramani2

Followers: 489 · Following: 848 · Media: 8 · Statuses: 96

PostDoc @AIatMeta Ph.D. @JohnsHopkins | Interned @AIatMeta FAIR, GenAI, @google GDM | Multimodal LLMs

Baltimore, MD
Joined October 2016
@KevinQHLin
Kevin Lin
15 days
I’m deeply saddened and frustrated to hear that my friend @Shramanpramani2 ( https://t.co/eoQLDVM1tJ) has been affected by the recent layoffs at Meta — such a pity, especially for a fresh PhD with so much potential. I’ve had the pleasure of working with Shraman since 2023, a
@Shramanpramani2
Shraman Pramanick
19 days
My role at Meta's SAM team (MSL, previously at FAIR Perception) has been impacted within 3 months of joining after PhD. If you work with multimodal LLMs for grounding or complex reasoning, or have a long-term vision of unified understanding and generation, let's talk. I am on
3
4
68
@Shramanpramani2
Shraman Pramanick
19 days
My role at Meta's SAM team (MSL, previously at FAIR Perception) has been impacted within 3 months of joining after PhD. If you work with multimodal LLMs for grounding or complex reasoning, or have a long-term vision of unified understanding and generation, let's talk. I am on
@cuijiaxun
Jiaxun Cui 🐿️
20 days
Meta has gone crazy on the squid game! Many new PhD NGs are deactivated today (I am also impacted🥲 happy to chat)
27
27
343
@NeurIPSConf
NeurIPS Conference
8 months
NeurIPS 2025 is soliciting self-nominations for reviewers and ACs. Please read our blog post for details on the eligibility criteria and the process to self-nominate:
4
29
127
@nagsayan112358
Sayan Nag (সায়ন নাগ)
10 months
🚀 Internship Opportunity at #AdobeResearch🚀 Looking for PhD interns for Summer 2025! Interested in exploring the intersection of multimodal LLMs, diffusion models, etc? 📩 Send me a DM with your CV, website, and GScholar profile. #GenerativeAI
1
1
5
@liuzhuang1234
Zhuang Liu
11 months
How far is an LLM from not only understanding but also generating visually? Not very far! Introducing MetaMorph---a multimodal understanding and generation model. In MetaMorph, understanding and generation benefit each other. Only a moderate amount of generation data is needed to elicit
24
136
726
@Shramanpramani2
Shraman Pramanick
11 months
I am at NeurIPS 2024 in Vancouver. I'll be presenting SPIQA on Wednesday at the AM Poster Session, Booth #3700! 📜 arXiv: https://t.co/UoqZ82ibvy 🗄️SPIQA dataset: https://t.co/JvIQibFDGI 👨‍💻 github: https://t.co/Ns4KqVXIAG In this work, we have done a comprehensive analysis of
1
0
8
@imisra_
Ishan Misra
1 year
So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio. One of the most exciting projects I got to tech lead during my time at Meta!
@AIatMeta
AI at Meta
1 year
🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in
40
71
890
@irena_gao
Irena Gao
1 year
Applying to Stanford's CS PhD program? Current graduate students are running a SoP + CV feedback program for URM applicants (broadly defined). Apply to SASP by Oct. 25! Info:
cs.stanford.edu
3
96
451
@Shramanpramani2
Shraman Pramanick
1 year
SPIQA is accepted by #NeurIPS2024 D&B! 😃 In this work, we have done a comprehensive analysis of various strong multimodal LLMs for understanding a wide range of scientific figures and tables, including schematic diagrams, charts, plots, visualizations, etc. Check out our paper,
@Shramanpramani2
Shraman Pramanick
1 year
✨Can multimodal LLMs effectively answer questions in the context of long scientific research papers by thoroughly analyzing the entire text, complex figures, tables, and captions? Our recent project, SPIQA, initiates an exploration into this question by developing the first
0
0
10
@_amirbar
Amir Bar
1 year
great example of how to stick to your research agenda despite temporary distractions.
@liuzhuang1234
Zhuang Liu
1 year
Paper is rejected, but a followup paper that completely depends on the rejected paper is accepted #NeurIPS
0
2
10
@karpathy
Andrej Karpathy
1 year
Programming is changing so fast... I'm trying VS Code Cursor + Sonnet 3.5 instead of GitHub Copilot again and I think it's now a net win. Just empirically, over the last few days most of my "programming" is now writing English (prompting and then reviewing and editing the
526
2K
18K
@_akhaliq
AK
1 year
Salesforce presents xGen-MM (BLIP-3) A Family of Open Large Multimodal Models discuss: https://t.co/e056zqI1Oo This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated
7
75
307
@javilopen
Javi Lopez ⛩️
1 year
Generative AI 3 years ago VS today.
180
1K
14K
@alliseeisgold
Jordan Burroughs
1 year
GIVE VINESH SILVER! 🥈
3K
27K
101K
@Shramanpramani2
Shraman Pramanick
1 year
This work was done during my Student Researcher tenure at @Google. I cannot thank my rock-star host Subhashini Venugopalan and my Ph.D. advisor Professor Rama Chellappa enough.
0
0
2
@Shramanpramani2
Shraman Pramanick
1 year
Limitations and Future Prospects: SPIQA consists only of papers from Computer Science. Extending SPIQA to encompass other scientific domains remains a future prospect. 7/7 🧶
1
0
2
@Shramanpramani2
Shraman Pramanick
1 year
Well-written related-work sections are often undervalued in the review process. In our paper, we provide an extensive comparison of SPIQA with all existing scientific question answering datasets. 6/7 🧶
1
0
1
@Shramanpramani2
Shraman Pramanick
1 year
Our proposed CoT evaluation prompt guides the models through step-by-step reasoning, which often results in better responses. For instance, GPT-4 Vision shows an increase of 6.70, 1.73, and 2.98 in L3Score when using CoT prompts compared to direct QA. Similar improvements are
1
0
1
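A minimal sketch of what a direct-QA prompt versus a CoT-style evaluation prompt could look like for figure-based questions; the template wording and the build_prompt helper below are illustrative assumptions, not the actual SPIQA prompts.

```python
# Illustrative sketch only: these templates and the helper are assumptions,
# not the actual SPIQA evaluation prompts.

DIRECT_QA_PROMPT = (
    "You are given a figure from a scientific paper along with its caption.\n"
    "Caption: {caption}\n"
    "Question: {question}\n"
    "Give a concise answer."
)

COT_QA_PROMPT = (
    "You are given a figure from a scientific paper along with its caption.\n"
    "Caption: {caption}\n"
    "Question: {question}\n"
    "First describe what the figure shows, then reason step by step about\n"
    "how it relates to the question, and finally state a concise answer."
)

def build_prompt(caption: str, question: str, use_cot: bool = True) -> str:
    """Fill the chosen (hypothetical) template with a caption and question."""
    template = COT_QA_PROMPT if use_cot else DIRECT_QA_PROMPT
    return template.format(caption=caption, question=question)

# Example usage with placeholder text.
print(build_prompt("Figure 3: Accuracy of different models on the test set.",
                   "Which model performs best on the test set?"))
```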
@Shramanpramani2
Shraman Pramanick
1 year
We fine-tune InstructBLIP and LLaVA 1.5 and obtain massive improvements of 28 and 26 points in L3Score on average over the three test sets of SPIQA compared to the corresponding zero-shot models. These fine-tuned models perform almost as well as Gemini Pro Vision, a powerful
1
0
1