
Jacob Zhiyuan Fang
@Jacob292020
424 Followers · 1K Following · 36 Media · 395 Statuses
Research Scientist @ TikTok/Bytedance. Video Generation.
Mountain View, CA
Joined April 2018
🚨🚨We are hosting the "Frontier Topics in Generative AI" seminar series at @ASU. This series delves into the cutting edge of GenAI, exploring key areas like large language models, text-to-image, video generation, and more. We have our first speaker this week.
2 replies · 4 reposts · 14 likes
ByteDance announces "Diffusion Model with Perceptual Loss." Paper page: https://t.co/mBYXqlsLul Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality…
12 replies · 127 reposts · 795 likes
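The MSE-vs-perceptual-loss contrast above can be sketched in a few lines. This is a toy illustration, not the paper's code: the random-projection "feature extractor" and all names are my own stand-ins. A standard diffusion model regresses the added noise with a raw MSE, while a perceptual-style objective compares predictions in a learned feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse_loss(pred_noise, target_noise):
    # Standard diffusion objective: mean squared error on the
    # predicted noise, computed directly in pixel/latent space.
    return float(np.mean((pred_noise - target_noise) ** 2))

def perceptual_loss(pred_noise, target_noise, features):
    # Perceptual-style objective: compare the two signals after
    # mapping them through a feature extractor instead of raw space.
    return float(np.mean((features(pred_noise) - features(target_noise)) ** 2))

# Toy "feature extractor": a fixed random linear projection standing
# in for the learned network a real perceptual loss would use.
W = rng.normal(size=(16, 16))
features = lambda x: x @ W

pred = rng.normal(size=(4, 16))
target = rng.normal(size=(4, 16))
print(mse_loss(pred, target))
print(perceptual_loss(pred, target, features))
```

The only moving part is *where* the squared error is measured; swapping the projection for a pretrained network's activations gives the usual perceptual loss.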
Adversarial learning + Bayesian CNNs = ⬆️ on single-source domain generalization tasks. Joint work with Sheng Cheng and @trgokhale from @ApgAsu. ArXiv: https://t.co/rVQEnR5dFi To appear at 📢 @ICCVConference. PS: I still think this paper could be summarized in just three sentences... 🤠
2 replies · 3 reposts · 16 likes
🎓 Defended my thesis today! 🌟 Big shout out to my advisor Dr. Hanghang Tong, and my thesis committee: Dr. Jiawei Han, Dr. Ross Maciejewski, and Dr. Han Zhao (@hanzhao_ml)! 🙌 Thank you to all my friends, collaborators, and family who supported me on this journey.
7 replies · 1 repost · 126 likes
1. https://t.co/Nc4RG1r9hP. We introduce a new multimodal-query retrieval benchmark with an end-to-end multimodal retriever. The ReMuQ dataset is available online: https://t.co/CaXh3hxlfH. This is a collaboration with @Jacob292020 @trgokhale @Yezhou_Yang @chittabaral
1 reply · 2 reposts · 6 likes
🎉Thrilled to share that our paper "World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models" was selected for the outstanding paper award at #ACL2023NLP! Thanks @aclmeeting :-) Let's take grounding seriously in VLMs because... 🧵[1/n]
11 replies · 14 reposts · 166 likes
"WOUAF"🐺 modifies generative models with each user's unique digital fingerprint, imprinting an identifier onto the generated content. 🐺 incorporates the fingerprint via fine-tuning of a T2I model (Stable Diffusion) and demonstrates near-perfect attribution accuracy with minimal impact on quality.
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models @ChanghoonKim3, Kyle Min, @patelmaitreya, Sheng Cheng, @Yezhou_Yang tl;dr: an augmentation-robust user signature embedded into SD outputs; #kornia used for data aug. https://t.co/jwD2EDb7B1
1 reply · 3 reposts · 9 likes
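A rough sketch of the weight-modulation idea, as a toy construction of my own rather than the authors' code: each user's fingerprint is mapped to per-channel scales that modulate the generator's weights, so every user effectively samples from a slightly different model, and outputs can later be attributed to a fingerprint. The projection matrix and the 0.1 scale factor below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def modulate(W, fingerprint, proj):
    # Map the binary fingerprint to one multiplicative scale per
    # output channel, then modulate the weight matrix with it
    # (the core of the weight-modulation idea, greatly simplified).
    scale = 1.0 + 0.1 * (proj @ fingerprint)   # shape: (out_channels,)
    return W * scale[:, None]

out_ch, in_ch, fp_bits = 8, 4, 32
W = rng.normal(size=(out_ch, in_ch))          # shared base weights
proj = rng.normal(size=(out_ch, fp_bits))     # mapping (learned in practice, random here)

user_a = rng.integers(0, 2, fp_bits)          # two distinct user fingerprints
user_b = rng.integers(0, 2, fp_bits)

Wa, Wb = modulate(W, user_a, proj), modulate(W, user_b, proj)
# Different fingerprints yield measurably different generator weights,
# which is what makes outputs attributable to a specific user.
print(np.abs(Wa - Wb).max())
```

In the real system the modulation is trained jointly with a decoder network that recovers the fingerprint from (possibly augmented) generated images; here the point is only the per-user weight perturbation.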
Tired of fine-tuning image generation models on each subject you care to generate? Today, we release SuTI, a zero-shot subject-driven text-to-image generator that operates fully in-context without tuning. One SuTI model is all you need! Website: https://t.co/xheWkQjOr8
14 replies · 45 reposts · 214 likes
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis abs: https://t.co/34UcWcIUax project page: https://t.co/QGx0vIcs9S github: https://t.co/5H738ujk8j
12 replies · 95 reposts · 350 likes
For anyone who isn't already aware of it, Tiled VAE is a way to create giant (4k+) images in automatic1111 without any kind of visible seams or lots of complicated steps. Info in comments. #AIArt
#StableDiffusion2 / #StableDiffusion
4 replies · 34 reposts · 331 likes
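The trick behind seam-free tiled decoding can be illustrated in 1-D. This is a simplified sketch of the general idea, not the actual Tiled VAE extension: decode overlapping tiles and crossfade the overlaps with linear ramps, so the blend weights sum to one everywhere and no tile boundary is visible.

```python
import numpy as np

def decode_tiled(latent, decode, tile=64, overlap=16):
    # Decode a long 1-D latent in overlapping tiles, blending the
    # overlaps with linear crossfade ramps so no seams appear.
    n = len(latent)
    out = np.zeros(n)
    total_w = np.zeros(n)
    step = tile - overlap
    starts = list(range(0, n - tile + 1, step))
    if starts[-1] + tile < n:            # cover any leftover tail
        starts.append(n - tile)
    for s in starts:
        chunk = decode(latent[s:s + tile])
        w = np.ones(tile)
        ramp = np.linspace(0.0, 1.0, overlap)
        if s > 0:                        # fade in over the left overlap
            w[:overlap] = ramp
        if s + tile < n:                 # fade out over the right overlap
            w[-overlap:] = ramp[::-1]
        out[s:s + tile] += chunk * w
        total_w[s:s + tile] += w
    return out / total_w

latent = np.sin(np.linspace(0, 10, 112))
# With an identity "decoder", tiled output must equal direct output:
print(np.allclose(decode_tiled(latent, lambda z: z), latent))
```

The real extension does the same thing with 2-D latent tiles fed through the VAE decoder, which also caps peak VRAM at one tile's worth instead of the whole 4K image.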
📽️Text-to-Video? It could revolutionize entertainment as we know it. Here's Phenaki, a model that can synthesize realistic videos from text prompt sequences. More examples below ↓
25 replies · 301 reposts · 2K likes
OpenAI just dropped a prototype of 3D DALLE (called “Point-E”) 👀. It isn’t as good as Google’s DreamFusion, but blazing fast! Like ~600x faster to generate 😮. 2D DALLE has already turned the creative world upside down. How will 3D DALLE disrupt games, VR, metaverse, …? 🤯
42 replies · 426 reposts · 2K likes
ODRUM for #CVPR2023 🍻😄 This time we booked a full day ✌️
0 replies · 1 repost · 5 likes
Objaverse: A Universe of Annotated 3D Objects 800K+ 3D models with descriptive captions arxiv: https://t.co/kWWxDCSSKL website: https://t.co/BMHqJwrUmd
2 replies · 18 reposts · 161 likes
✨ Introducing Imagine 3D: a new way to create 3D with text! Our mission is to build the next generation of 3D, and Imagine will be a big part of it. Imagine is in early access today, and as we improve it we will bring it to everyone. https://t.co/VIdilw7kpa
123 replies · 640 reposts · 3K likes
I am on the academic job market this year. I study data mining, machine learning and trustworthy AI, especially for graphs and multimedia with applications to network science, healthcare, cybersecurity and social good. Please find out more information at
2 replies · 20 reposts · 49 likes
We are excited to announce the release of Stable Diffusion Version 2! Stable Diffusion V1 changed the nature of open source AI & spawned hundreds of other innovations all over the world. We hope V2 also provides many new possibilities! Link → https://t.co/QOSSmSRKpG
133 replies · 2K reposts · 8K likes
Excited to share our #EMNLP2022 paper, "CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering" — w/ @trgokhale, @cbaral, and @Yezhou_Yang. Arxiv: https://t.co/NMsHzgy138 Demo: https://t.co/6tb7YffBSk 🧵1/
1 reply · 5 reposts · 13 likes
We desperately need to build a diffusion model to do this. Please cite this tweet when it's on arXiv.
0 replies · 0 reposts · 3 likes
Played with optimizing Neural Atlases through Stable Diffusion. So much fun! Here are a few examples of video edits: @RafailFridman @DanahYatim
8 replies · 73 reposts · 370 likes