
Jacob Zhiyuan Fang
@Jacob292020
424 Followers · 1K Following · 36 Media · 395 Statuses
Research Scientist @ TikTok/Bytedance. Video Generation.
Mountain View, CA
Joined April 2018
🚨🚨We are hosting the "Frontier Topics in Generative AI" seminar series at @ASU. This series delves into the cutting edge of GenAI, exploring key areas like large language models, text-to-image, video generation, and more. We have our first speaker this week.
2 replies · 4 reposts · 14 likes
ByteDance announces "Diffusion Model with Perceptual Loss." Paper page: https://t.co/mBYXqlsLul Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality…
12 replies · 127 reposts · 795 likes
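The MSE-vs-perceptual-loss contrast above can be sketched in a few lines. This is a toy illustration, not the paper's code: the random-projection "feature extractor" and all names are my own stand-ins. A standard diffusion model regresses the added noise with a raw MSE, while a perceptual-style objective compares predictions in a learned feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse_loss(pred_noise, target_noise):
    # Standard diffusion objective: mean squared error on the
    # predicted noise, computed directly in pixel/latent space.
    return float(np.mean((pred_noise - target_noise) ** 2))

def perceptual_loss(pred_noise, target_noise, features):
    # Perceptual-style objective: compare the two signals after
    # mapping them through a feature extractor instead of raw space.
    return float(np.mean((features(pred_noise) - features(target_noise)) ** 2))

# Toy "feature extractor": a fixed random linear projection standing
# in for the learned network a real perceptual loss would use.
W = rng.normal(size=(16, 16))
features = lambda x: x @ W

pred = rng.normal(size=(4, 16))
target = rng.normal(size=(4, 16))
print(mse_loss(pred, target))
print(perceptual_loss(pred, target, features))
```

The only moving part is *where* the squared error is measured; swapping the projection for a pretrained network's activations gives the usual perceptual loss.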
Adversarial learning + Bayesian CNNs = ⬆️ on single-source domain generalization tasks. Joint work with Sheng Cheng and @trgokhale from @ApgAsu. ArXiv: https://t.co/rVQEnR5dFi To appear at 📢 @ICCVConference. PS: I still think this paper could be summarized in just three sentences... 🤠
2 replies · 3 reposts · 16 likes
🎓 Defended my thesis today! 🌟 Big shout out to my advisor Dr. Hanghang Tong, and my thesis committee: Dr. Jiawei Han, Dr. Ross Maciejewski, and Dr. Han Zhao (@hanzhao_ml)! 🙌 Thank you to all my friends, collaborators, and family who supported me on this journey.
7 replies · 1 repost · 126 likes
1. https://t.co/Nc4RG1r9hP. We introduce a new multimodal-query retrieval benchmark with an end-to-end multimodal retriever. The ReMuQ dataset is available online: https://t.co/CaXh3hxlfH. This is a collaboration with @Jacob292020 @trgokhale @Yezhou_Yang @chittabaral
1 reply · 2 reposts · 6 likes
🎉Thrilled to share that our paper "World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models" was selected for the outstanding paper award at #ACL2023NLP! Thanks @aclmeeting :-) Let's take grounding seriously in VLMs because... 🧵[1/n]
11 replies · 14 reposts · 166 likes
"WOUAF"🐺 modifies generative models with each user's unique digital fingerprint, imprinting an identifier onto the generated content. 🐺 incorporates the fingerprint via fine-tuning of a T2I model (Stable Diffusion) and demonstrates near-perfect attribution accuracy with minimal impact on quality.
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models @ChanghoonKim3, Kyle Min, @patelmaitreya, Sheng Cheng, @Yezhou_Yang tl;dr: an augmentation-robust user signature embedded into SD outputs; #kornia used for data aug. https://t.co/jwD2EDb7B1
1 reply · 3 reposts · 9 likes
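A rough sketch of the weight-modulation idea, as a toy construction of my own rather than the authors' code: each user's fingerprint is mapped to per-channel scales that modulate the generator's weights, so every user effectively samples from a slightly different model, and outputs can later be attributed to a fingerprint. The projection matrix and the 0.1 scale factor below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def modulate(W, fingerprint, proj):
    # Map the binary fingerprint to one multiplicative scale per
    # output channel, then modulate the weight matrix with it
    # (the core of the weight-modulation idea, greatly simplified).
    scale = 1.0 + 0.1 * (proj @ fingerprint)   # shape: (out_channels,)
    return W * scale[:, None]

out_ch, in_ch, fp_bits = 8, 4, 32
W = rng.normal(size=(out_ch, in_ch))          # shared base weights
proj = rng.normal(size=(out_ch, fp_bits))     # mapping (learned in practice, random here)

user_a = rng.integers(0, 2, fp_bits)          # two distinct user fingerprints
user_b = rng.integers(0, 2, fp_bits)

Wa, Wb = modulate(W, user_a, proj), modulate(W, user_b, proj)
# Different fingerprints yield measurably different generator weights,
# which is what makes outputs attributable to a specific user.
print(np.abs(Wa - Wb).max())
```

In the real system the modulation is trained jointly with a decoder network that recovers the fingerprint from (possibly augmented) generated images; here the point is only the per-user weight perturbation.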
Tired of fine-tuning image generation models on each subject you care to generate? Today, we release SuTI, a zero-shot subject-driven text-to-image generator that operates fully in-context without tuning. One SuTI model is all you need! Website: https://t.co/xheWkQjOr8
14 replies · 45 reposts · 214 likes
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis abs: https://t.co/34UcWcIUax project page: https://t.co/QGx0vIcs9S github: https://t.co/5H738ujk8j
12 replies · 95 reposts · 350 likes
For anyone who isn't already aware of it, Tiled VAE is a way to create giant (4k+) images in automatic1111 without any kind of visible seams or lots of complicated steps. Info in comments. #AIArt
#StableDiffusion2 / #StableDiffusion
4 replies · 34 reposts · 331 likes
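The trick behind seam-free tiled decoding can be illustrated in 1-D. This is a simplified sketch of the general idea, not the actual Tiled VAE extension: decode overlapping tiles and crossfade the overlaps with linear ramps, so the blend weights sum to one everywhere and no tile boundary is visible.

```python
import numpy as np

def decode_tiled(latent, decode, tile=64, overlap=16):
    # Decode a long 1-D latent in overlapping tiles, blending the
    # overlaps with linear crossfade ramps so no seams appear.
    n = len(latent)
    out = np.zeros(n)
    total_w = np.zeros(n)
    step = tile - overlap
    starts = list(range(0, n - tile + 1, step))
    if starts[-1] + tile < n:            # cover any leftover tail
        starts.append(n - tile)
    for s in starts:
        chunk = decode(latent[s:s + tile])
        w = np.ones(tile)
        ramp = np.linspace(0.0, 1.0, overlap)
        if s > 0:                        # fade in over the left overlap
            w[:overlap] = ramp
        if s + tile < n:                 # fade out over the right overlap
            w[-overlap:] = ramp[::-1]
        out[s:s + tile] += chunk * w
        total_w[s:s + tile] += w
    return out / total_w

latent = np.sin(np.linspace(0, 10, 112))
# With an identity "decoder", tiled output must equal direct output:
print(np.allclose(decode_tiled(latent, lambda z: z), latent))
```

The real extension does the same thing with 2-D latent tiles fed through the VAE decoder, which also caps peak VRAM at one tile's worth instead of the whole 4K image.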
📽️Text-to-Video? It could revolutionize entertainment as we know it. Here's Phenaki, a model that can synthesize realistic videos from text prompt sequences. More examples below ↓
25 replies · 301 reposts · 2K likes
OpenAI just dropped a prototype of 3D DALLE (called “Point-E”) 👀. It isn’t as good as Google’s DreamFusion, but blazing fast! Like ~600x faster to generate 😮. 2D DALLE has already turned the creative world upside down. How will 3D DALLE disrupt games, VR, metaverse, …? 🤯
42 replies · 426 reposts · 2K likes
ODRUM for #CVPR2023 🍻😄 This time we booked a full day ✌️
0 replies · 1 repost · 5 likes
Objaverse: A Universe of Annotated 3D Objects 800K+ 3D models with descriptive captions arxiv: https://t.co/kWWxDCSSKL website: https://t.co/BMHqJwrUmd
2 replies · 18 reposts · 161 likes
✨ Introducing Imagine 3D: a new way to create 3D with text! Our mission is to build the next generation of 3D, and Imagine will be a big part of it. Imagine is in early access today, and as we improve it we will bring it to everyone. https://t.co/VIdilw7kpa
123 replies · 640 reposts · 3K likes
I am on the academic job market this year. I study data mining, machine learning and trustworthy AI, especially for graphs and multimedia with applications to network science, healthcare, cybersecurity and social good. Please find out more information at
2 replies · 20 reposts · 49 likes
We are excited to announce the release of Stable Diffusion Version 2! Stable Diffusion V1 changed the nature of open source AI & spawned hundreds of other innovations all over the world. We hope V2 also provides many new possibilities! Link → https://t.co/QOSSmSRKpG
133 replies · 2K reposts · 8K likes
Excited to share our #EMNLP2022 paper, "CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering" — w/ @trgokhale, @cbaral, and @Yezhou_Yang. Arxiv: https://t.co/NMsHzgy138 Demo: https://t.co/6tb7YffBSk 🧵1/
1 reply · 5 reposts · 13 likes
We desperately need to build a diffusion model to do this. Please cite this tweet when it's on arXiv.
0 replies · 0 reposts · 3 likes
Played with optimizing Neural Atlases through Stable Diffusion. So much fun! Here are a few examples of video edits: @RafailFridman @DanahYatim
8 replies · 73 reposts · 370 likes