Jacob Zhiyuan Fang Profile
Jacob Zhiyuan Fang

@Jacob292020

Followers
424
Following
1K
Media
36
Statuses
395

Research Scientist @ TikTok/Bytedance. Video Generation.

Mountain View, CA
Joined April 2018
@chengshengcs
Cheng Sheng
2 years
🚨🚨We are hosting the "Frontier Topics in Generative AI" Seminar Series at @ASU. This series delves into the cutting edge of GenAI, exploring key areas like large language models, text-to-image, video generation, and more. We have our first speaker this week.
2
4
14
@_akhaliq
AK
2 years
ByteDance announces Diffusion Model with Perceptual Loss paper page: https://t.co/mBYXqlsLul Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality,
12
127
795
@prof_yz
'YZ' Yezhou Yang (杨叶舟)
2 years
Adversarial learning + Bayesian CNNs = ⬆️single-source domain generalization tasks. Joint work with Sheng Cheng, @trgokhale, and from @ApgAsu ArXiv: https://t.co/rVQEnR5dFi To 📢 @ICCVConference PS: I still think this paper could be with just three sentences... 🤠
2
3
16
@jiank_uiuc
Jian Kang
2 years
🎓 Defended my thesis today! 🌟 Big shout out to my advisor Dr. Hanghang Tong, and my thesis committee: Dr. Jiawei Han, Dr. Ross Maciejewski and Dr. Han Zhao (@hanzhao_ml)! 🙌 Thank you all to my friends, collaborators and family who supported me in this journey.
7
1
126
@manluo12
manluo
2 years
1. https://t.co/Nc4RG1r9hP. We introduce a new multimodal-query retrieval benchmark with an end-to-end multimodal retriever. The ReMuQ dataset is available online: https://t.co/CaXh3hxlfH. It is a collaboration with @Jacob292020 @trgokhale @Yezhou_Yang @chittabaral
github.com
a multimodal retrieval dataset. Contribute to luomancs/ReMuQ development by creating an account on GitHub.
1
2
6
@ziqiao_ma
Martin Ziqiao Ma
2 years
🎉Thrilled to share that our paper "World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models" was selected for the outstanding paper award at #ACL2023NLP! Thanks @aclmeeting :-) Let's take grounding seriously in VLMs because... 🧵[1/n]
11
14
166
@prof_yz
'YZ' Yezhou Yang (杨叶舟)
2 years
"WOUAF"🐺modifies generative models by each user's unique digital fingerprint, imprinting an identifier onto the resultant content. 🐺incorporates fine-tuning into T2I (Stable Diffusion Model) and demonstrates near-perfect attribution accuracy with a minimal impact on quality.
@kornia_foss
Kornia
2 years
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models @ChanghoonKim3, Kyle Min, @patelmaitreya, Sheng Cheng, @Yezhou_Yang tl;dr: augmentation-robust user signature to SD results #kornia used for data aug. https://t.co/jwD2EDb7B1
1
3
9
@WenhuChen
Wenhu Chen
3 years
Tired of fine-tuning image generation models on each subject you care to generate? Today, we release SuTI, a zero-shot subject-driven text-to-image generator that operates fully in-context without tuning. One SuTI model is all you need! Website: https://t.co/xheWkQjOr8
14
45
214
@_akhaliq
AK
3 years
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis abs: https://t.co/34UcWcIUax project page: https://t.co/QGx0vIcs9S github: https://t.co/5H738ujk8j
12
95
350
@DiffusionPics
Stable Diffusion 🎨 AI Art
3 years
For anyone who isn't already aware of it, Tiled VAE is a way to create giant (4k+) images in automatic1111 without any kind of visible seams or lots of complicated steps. Info in comments. #AIArt #StableDiffusion2 / #StableDiffusion
4
34
331
@AiBreakfast
AI Breakfast
3 years
📽️Text-to-Video? It could revolutionize entertainment as we know it. Here's Phenaki, a model that can synthesize realistic videos from text prompt sequences. More examples below ↓
25
301
2K
@DrJimFan
Jim Fan
3 years
OpenAI just dropped a prototype of 3D DALLE (called “Point-E”) 👀. It isn’t as good as Google’s DreamFusion, but blazing fast! Like ~600x faster to generate 😮. 2D DALLE has already turned the creative world upside down. How will 3D DALLE disrupt games, VR, metaverse, …? 🤯
42
426
2K
@Jacob292020
Jacob Zhiyuan Fang
3 years
ODRUM for #CVPR2023 🍻😄 This time we booked a complete day✌️
0
1
5
@sstj389
Stefan Stojanov
3 years
Objaverse: A Universe of Annotated 3D Objects 800K+ 3D models with descriptive captions arxiv: https://t.co/kWWxDCSSKL website: https://t.co/BMHqJwrUmd
2
18
161
@LumaLabsAI
Luma AI
3 years
✨ Introducing Imagine 3D: a new way to create 3D with text! Our mission is to build the next generation of 3D and Imagine will be a big part of it. Today Imagine is in early access and as we improve we will bring it to everyone https://t.co/VIdilw7kpa
123
640
3K
@jiank_uiuc
Jian Kang
3 years
I am on the academic job market this year. I study data mining, machine learning and trustworthy AI, especially for graphs and multimedia with applications to network science, healthcare, cybersecurity and social good. Please find out more information at
2
20
49
@StabilityAI
Stability AI
3 years
We are excited to announce the release of Stable Diffusion Version 2! Stable Diffusion V1 changed the nature of open source AI & spawned hundreds of other innovations all over the world. We hope V2 also provides many new possibilities! Link → https://t.co/QOSSmSRKpG
133
2K
8K
@patelmaitreya
Patel Maitreya
3 years
Excited to share our #EMNLP2022 paper, “CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering” — w/ @trgokhale , @cbaral , and @Yezhou_Yang . Arxiv: https://t.co/NMsHzgy138 Demo: https://t.co/6tb7YffBSk 🧵1/
1
5
13
@Jacob292020
Jacob Zhiyuan Fang
3 years
We desperately need to build a diffusion model to do this. Please cite this twit when it's on Arxiv.
0
0
3
@omerbartal
Omer Bar Tal
3 years
Played with optimizing Neural Atlases through Stable Diffusion. So much fun! Here are a few examples of video edits: @RafailFridman @DanahYatim
8
73
370