
Jack Saunders
@jack_r_saunders
Followers: 690 · Following: 313 · Media: 136 · Statuses: 321
Talking about everything to do with Facial Avatars | PhD Student | Founder of @realsyncai
Bath, UK
Joined February 2020
🤖 🗞️ Free AI Generated Digital Humans Newsletter -> https://t.co/VQJODKPgX3 Like everyone else, I've been struggling to keep up with the sheer volume of papers and news in the Digital Human space. Over the past few weeks, I've developed an agentic (ish) AI pipeline to find and
👩‍🦰 Want to reconstruct hair from your Gaussian Avatars? Here's a way. HairGS: Hair Strand Reconstruction based on 3D Gaussian Splatting TLDR: A Gaussian Avatar is fit to images with an additional angular loss for hair geometry. These Gaussians are then converted into strands
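An angular loss of the kind this summary describes can be sketched as below: encourage each Gaussian's dominant axis to align with a target hair-growth direction. This is a hedged illustration only; the exact formulation in the HairGS paper may differ, and `angular_loss` is a hypothetical name.

```python
import numpy as np

def angular_loss(axes, directions):
    """Penalise misalignment between Gaussian principal axes and hair directions.

    axes, directions: (N, 3) arrays of unit vectors.
    """
    cos = np.sum(axes * directions, axis=-1)
    # Hair strands have no front/back, so penalise only the unsigned angle.
    return np.mean(1.0 - np.abs(cos))

# First pair perfectly aligned, second pair orthogonal.
a = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
d = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(angular_loss(a, d))  # 0.5
```

Taking the absolute value of the cosine is a common choice for orientation (as opposed to direction) targets, since a strand flipped 180° is geometrically identical.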
The paper is here: https://t.co/DCEr0hBwI2, and they have a GitHub repo: https://t.co/Gnb9BcJBeN. No word yet on whether they will open-source the code. What do you think?
🗄️ A data-filtering strategy tied to their pretraining stages. At lower resolutions, more data is kept to learn diverse prompts and motion, while more is discarded at high resolution to prioritise video quality.
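The idea of loosening or tightening a quality filter with training resolution can be sketched in a few lines. The threshold values and clip fields here are illustrative assumptions, not numbers from the Waver report.

```python
# Hypothetical sketch of resolution-dependent data filtering: keep more
# (noisier) data at low resolution for diversity, filter harder at high
# resolution for visual quality.

def quality_threshold(resolution: int) -> float:
    """Return a minimum quality score; stricter as resolution grows."""
    thresholds = {192: 0.2, 480: 0.5, 720: 0.7}  # illustrative values
    return thresholds.get(resolution, 0.7)

def filter_clips(clips, resolution):
    thresh = quality_threshold(resolution)
    return [c for c in clips if c["quality"] >= thresh]

clips = [
    {"id": 0, "quality": 0.3},
    {"id": 1, "quality": 0.6},
    {"id": 2, "quality": 0.9},
]
print(len(filter_clips(clips, 192)))  # 3 — low res keeps everything
print(len(filter_clips(clips, 720)))  # 1 — high res keeps only the best
```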
🚿 A hybrid-stream architecture that mixes dual-stream blocks (which process the modalities separately except at self-attention) and single-stream blocks (which process both together) in the transformer.
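The dual-stream vs. single-stream distinction can be illustrated with a toy numpy attention layer. This is a minimal sketch of the general pattern (as also used in models like SD3/Flux), not Waver's actual implementation; all weights and shapes are made up.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 16
text = rng.normal(size=(4, d))   # 4 text tokens
video = rng.normal(size=(8, d))  # 8 video tokens

# Dual-stream block: each modality has its own projection weights;
# the streams only interact inside joint self-attention.
W_t, W_v = rng.normal(size=(d, d)), rng.normal(size=(d, d))
joint_in = np.vstack([text @ W_t, video @ W_v])
joint_out = attention(joint_in, joint_in, joint_in)
text_out, video_out = joint_out[:4], joint_out[4:]  # split back into streams

# Single-stream block: one shared projection processes both together.
W = rng.normal(size=(d, d))
x = np.vstack([text, video]) @ W
out = attention(x, x, x)

print(text_out.shape, video_out.shape, out.shape)
```

Dual-stream blocks keep modality-specific parameters (useful early, when text and video statistics differ), while single-stream blocks share everything and are cheaper per parameter.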
2️⃣ A two-stage model: the first stage generates video at 480p or 720p, while the second refines it to 1080p. Both are rectified-flow transformers.
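Rectified-flow sampling itself is just Euler integration of a learned velocity field along (near-)straight paths from noise to data. The sketch below uses a stand-in "model" (a zero target, so the sampler contracts noise toward the origin) purely to show the sampling loop shape; it is not Waver's network.

```python
import numpy as np

def velocity(x, t):
    """Stand-in for a learned model predicting v(x, t) ≈ x1 - x0.

    Here the target x1 is fixed to zeros for illustration.
    """
    target = np.zeros_like(x)
    return target - x

def sample(x0, steps=50):
    """Euler-integrate the flow from noise x0 over t ∈ [0, 1]."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity(x, t)  # one Euler step along the flow
    return x

x0 = np.random.default_rng(0).normal(size=(4,))
x1 = sample(x0)
print(np.abs(x1).max() < np.abs(x0).max())  # samples moved toward the target
```

The appeal of rectified flow is that near-straight trajectories tolerate large Euler steps, so few sampling steps are needed, which matters for a heavy two-stage video pipeline.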
🎶 ByteDance have just released the technical report for their newest video foundation model, Waver 1.0. Some highlights 👇
📦 Looking for a dataset to train your talking head models? Here's one option! TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis TLDR: A curated and filtered talking head dataset consisting of 1,244 hours of HD video from nearly 8,000 speakers.
Drowning in new AI Avatar papers? I built a newsletter that uses AI to summarise them. 📩 Here’s what I learned: 🤖 Agents ≠ always better than workflows ⁉️ Prompts fail, error handling saves you 🧑‍💻 APIs are your best friend Full write-up here 👉
pub.towardsai.net
One of the hardest parts of being a researcher in the current world of AI is keeping up with the insane volume of new work that seems to…
🔥 Facial Puppetry just got a serious upgrade with a new open-source ❗ model. FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers TLDR: This is a DiT based model that uses an implicit representation of facial
CVI beta launch week feat. 📚Knowledge Base (RAG) Conversations are now powered by your knowledge. Upload docs, link your content, and let CVI answer with speed & precision. Just 30ms response time. 15x faster than other RAG solutions on the market. See it in action ⬇️
📖 It's always great to get some open-source datasets! Here's one for relighting and novel view synthesis 👇 HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis TLDR: A 21-person dataset where each person does 3 poses and is captured from 40
🏎️ We're starting to see some real-time diffusion models for Avatar generation! RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer TLDR: This DiT model is able to perform audio-to-talking head synthesis in real-time. The key is the use of a very
🎶 👗 ByteDance are making amazing progress for Virtual Try-On! DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework. TLDR: This work overcomes the reliance on hard-to-find paired (video, garment) data. By using a
📣 Accepted Paper 📣 I'm happy to share that our work DEAD: Data Efficient Audiovisual Dubbing using Neural Rendering Priors (previously Dubbing for Everyone) has been accepted to BMVC25! TLDR: We achieve high-quality dubbing with just 4 seconds of personalised data. We separate
🛫 A double whammy from Microsoft this week, with another synthetics-based paper. This time, you can lift a video stream to 3D with free-viewpoint rendering that tracks the viewer! VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction TLDR: A UNet-based model
🧑‍🦳 👩‍🦳 Meta's real-time Codec Avatars now have hair control. Imagine being able to customise your hairstyle in VR! HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars TLDR: Trains two separate hyper-networks, one for the face and one for the hair. The data to do
⏱️ Turn yourself into a 3D Avatar in real-time with StreamME from Adobe and the University of Rochester (code coming soon) StreamME: Simplify 3D Gaussian Avatar within Live Stream TLDR: This work speeds up Gaussian reconstruction using motion-aware anchor points to prevent the
Want to get up to three papers like this directly to your inbox and summarised by AI each day? You can sign up for free here:
realsyncai.com
Leading Avatar and Digital Human Consultant Dr. Jack Saunders. Expert lip-sync consultancy and Avatar development guidance.
🔥 Microsoft once again showing off the power of synthetic data for human-centric computer vision! Showing how you can use it for: - 🧐 Depth Prediction - ↗️ Normal Estimation - 👤 Background Segmentation Synthetic data has pixel-perfect label accuracy and is nearly