Jack Saunders

@jack_r_saunders

Followers: 690 · Following: 313 · Media: 136 · Statuses: 321

Talking about everything to do with Facial Avatars | PhD Student | Founder of @realsyncai

Bath, UK
Joined February 2020
@jack_r_saunders
Jack Saunders
2 months
🤖 🗞️ Free AI Generated Digital Humans Newsletter -> https://t.co/VQJODKPgX3 Like everyone else, I've been struggling to keep up with the sheer volume of papers and news in the Digital Human space. Over the past few weeks, I've developed an agentic (ish) AI pipeline to find and
@jack_r_saunders
Jack Saunders
1 day
👩‍🦰 Want to reconstruct hair from your Gaussian Avatars? Here's a way. HairGS: Hair Strand Reconstruction based on 3D Gaussian Splatting TLDR: A Gaussian Avatar is fit to images with an additional angular loss for hair geometry. These Gaussians are then converted into strands
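The angular loss mentioned above can be sketched roughly as an alignment penalty between each Gaussian's principal axis and a reference hair-strand direction. This is a toy numpy illustration of that idea, not HairGS's actual loss; the function name and the use of 1 − |cos θ| are assumptions.

```python
import numpy as np

def angular_loss(gaussian_dirs, hair_dirs):
    """Mean angular misalignment between each Gaussian's principal
    axis and a reference hair-strand direction (both unit vectors).

    Uses 1 - |cos(theta)| so that flipped directions are not
    penalised -- hair strands have no canonical orientation.
    """
    cos = np.abs(np.sum(gaussian_dirs * hair_dirs, axis=-1))
    return float(np.mean(1.0 - cos))

# Perfectly aligned (or exactly flipped) directions give zero loss.
d = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
print(angular_loss(d, -d))  # → 0.0
```

Minimising a term like this during fitting nudges the anisotropic Gaussians to line up with plausible strand directions, which is what makes the later conversion to strands possible.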
@jack_r_saunders
Jack Saunders
20 days
The paper is here: https://t.co/DCEr0hBwI2, and they have a GitHub repo: https://t.co/Gnb9BcJBeN. No word yet on whether they will open-source the code. What do you think?
@jack_r_saunders
Jack Saunders
20 days
🗄️ A data filtering strategy aligned with their pretraining stages: at lower resolutions, more data is kept to learn diverse prompts and motion, while more is discarded at higher resolutions to maximise video quality.
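The filtering idea above boils down to a resolution-dependent keep fraction. Here's a hypothetical sketch of such a schedule; the thresholds, the quality-score field, and the function names are all made up for illustration, not taken from the Waver report.

```python
def keep_fraction(resolution):
    """Hypothetical keep schedule: retain more clips at low
    resolution (diverse prompts and motion), fewer at high
    resolution (strict visual-quality bar)."""
    schedule = {480: 0.8, 720: 0.5, 1080: 0.2}
    return schedule[resolution]

def filter_clips(clips, resolution):
    """clips: list of (clip_id, quality_score) pairs.
    Keep the top fraction of clips by quality score."""
    k = max(1, int(len(clips) * keep_fraction(resolution)))
    return sorted(clips, key=lambda c: c[1], reverse=True)[:k]

clips = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.2), ("e", 0.6)]
print([c[0] for c in filter_clips(clips, 1080)])  # → ['a']
```

At 1080p only the single best clip of the five survives, while at 480p four of them would be kept, mirroring the diversity-then-quality trade-off.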
@jack_r_saunders
Jack Saunders
20 days
🚿 A hybrid stream architecture using both dual-stream (processes the modalities separately except at self-attention) and single-stream (processes both together) blocks in the transformer.
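The dual-stream vs single-stream distinction can be sketched in a few lines of numpy. This is only a structural illustration of the idea, not Waver's actual block design: the projections, shapes, and the bare single-head attention are all placeholder assumptions.

```python
import numpy as np

def attention(x):
    """Minimal single-head scaled dot-product self-attention."""
    w = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(w - w.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ x

def dual_stream_block(text, video, W_text, W_video):
    """Dual-stream: each modality keeps its own projection weights;
    the streams only interact inside one joint self-attention over
    the concatenated token sequence."""
    x = np.concatenate([text @ W_text, video @ W_video], axis=0)
    y = attention(x)
    return y[: len(text)], y[len(text):]

def single_stream_block(text, video, W):
    """Single-stream: one shared projection, one joint sequence."""
    x = np.concatenate([text, video], axis=0) @ W
    y = attention(x)
    return y[: len(text)], y[len(text):]

rng = np.random.default_rng(0)
text, video = rng.normal(size=(4, 8)), rng.normal(size=(16, 8))
W_t, W_v, W = (rng.normal(size=(8, 8)) for _ in range(3))
t_out, v_out = dual_stream_block(text, video, W_t, W_v)
print(t_out.shape, v_out.shape)  # → (4, 8) (16, 8)
```

The usual motivation is that dual-stream blocks let each modality keep specialised parameters early on, while single-stream blocks are cheaper and fuse the modalities fully later in the stack.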
@jack_r_saunders
Jack Saunders
20 days
2️⃣ A two-stage model: the first stage generates video at 480p or 720p, while the second refines it to 1080p. Both are rectified flow transformers.
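Rectified flow sampling, in its simplest form, is just Euler integration of a learned velocity field from noise (t = 0) to data (t = 1). This is a generic toy sketch of that sampler, standing in for what the trained transformers would do; the function names and the toy velocity field are assumptions.

```python
import numpy as np

def rectified_flow_sample(velocity, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data)
    with simple Euler steps -- the basic sampler for a rectified
    flow model. `velocity` stands in for the trained network."""
    x, dt = x0.copy(), 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x

# Toy velocity field: v(x, t) = target - x0 is a constant pointing
# along the straight noise-to-data path, so Euler integration
# recovers the target exactly.
x0 = np.zeros(4)
target = np.array([1.0, 2.0, 3.0, 4.0])
v = lambda x, t: target - x0
print(rectified_flow_sample(v, x0))  # → [1. 2. 3. 4.]
```

The appeal of rectified flows is exactly these near-straight trajectories, which is why few Euler steps suffice and why they suit a cascaded generate-then-refine pipeline.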
@jack_r_saunders
Jack Saunders
20 days
🎶 ByteDance have just released the technical report for their newest video foundation model, Waver 1.0. Some highlights 👇
@jack_r_saunders
Jack Saunders
21 days
📦 Looking for a dataset to train your talking head models? Here's one option! TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis TLDR: A curated and filtered talking head dataset consisting of 1244 hours of HD video for nearly 8000 speakers.
@jack_r_saunders
Jack Saunders
24 days
Drowning in new AI Avatar papers? I built a newsletter that uses AI to summarise them. 📩 Here’s what I learned: 🤖Agents ≠ always better than workflows ⁉️Prompts fail, error handling saves you 🧑‍💻APIs are your best friend Full write-up here 👉
pub.towardsai.net
One of the hardest parts of being a researcher in the current world of AI is keeping up with the insane volume of new work that seems to…
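The "error handling saves you" lesson above can be illustrated with a generic retry wrapper around any flaky API call. This is a hypothetical sketch, not code from the newsletter pipeline; the function names and backoff values are made up.

```python
import time

def call_with_retries(fn, attempts=3, backoff=1.0):
    """Hypothetical wrapper for a flaky API call: retry with
    exponential backoff instead of letting one bad response
    kill the whole pipeline run."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(backoff * 2 ** i)

# A call that fails twice with transient errors, then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("transient API error")
    return "summary"

print(call_with_retries(flaky, backoff=0.01))  # → summary
```

The same pattern applies whether the failure is a rate limit, a timeout, or an LLM returning malformed output: catch it at the call site, retry a bounded number of times, and only then fail loudly.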
@jack_r_saunders
Jack Saunders
27 days
🔥 Facial Puppetry just got a serious upgrade with a new open-source ❗ model. FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers TLDR: This is a DiT based model that uses an implicit representation of facial
@heytavus
Tavus
29 days
CVI beta launch week feat. 📚Knowledge Base (RAG) Conversations are now powered by your knowledge. Upload docs, link your content, and let CVI answer with speed & precision. Just 30ms response time. 15x faster than other RAG solutions on the market. See it in action ⬇️
@jack_r_saunders
Jack Saunders
29 days
📖 It's always great to get some open-source datasets! Here's one for relighting and novel view synthesis 👇 HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis TLDR: A 21-person dataset where each person does 3 poses and is captured from 40
@jack_r_saunders
Jack Saunders
1 month
🏎️ We're starting to see some real-time diffusion models for Avatar generation! RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer TLDR: This DiT model is able to perform audio-to-talking head synthesis in real-time. The key is the use of a very
@jack_r_saunders
Jack Saunders
1 month
🎶 👗 ByteDance are making amazing progress for Virtual Try-On! DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework. TLDR: This work overcomes the reliance on hard-to-find paired (video, garment) data. By using a
@jack_r_saunders
Jack Saunders
1 month
📣 Accepted Paper 📣 I'm happy to share that our work DEAD: Data Efficient Audiovisual Dubbing using Neural Rendering Priors (previously Dubbing for Everyone) has been accepted to BMVC25! TLDR: We achieve high-quality dubbing with just 4 seconds of personalised data. We separate
@jack_r_saunders
Jack Saunders
1 month
🛫 A double whammy from Microsoft this week, with another synthetics-based paper. This time, you can lift a video stream to 3D with free-viewpoint rendering that tracks the viewer! VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction TLDR: A UNET-based model
@jack_r_saunders
Jack Saunders
2 months
🧑‍🦳 👩‍🦳 Meta's real-time Codec Avatars now have hair control. Imagine being able to customise your hairstyle in VR! HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars TLDR: Trains two separate hyper-networks, one for the face and one for the hair. The data to do
@jack_r_saunders
Jack Saunders
2 months
⏱️ Turn yourself into a 3D Avatar in real-time with StreamME from Adobe and the University of Rochester (code coming soon) StreamME: Simplify 3D Gaussian Avatar within Live Stream TLDR: This work speeds up Gaussian reconstruction using motion-aware anchor points to prevent the
@jack_r_saunders
Jack Saunders
2 months
Want up to three papers like this delivered straight to your inbox and summarised by AI each day? You can sign up for free here:
realsyncai.com
Leading Avatar Consultant and Digital Human Consultant Dr. Jack Saunders. Expert lip sync consultancy and Avatar development guidance.
@jack_r_saunders
Jack Saunders
2 months
🔥 Microsoft once again showing off the power of synthetic data for human-centric computer vision! Showing how you can use it for: - 🧐 Depth Prediction - ↗️ Normal Estimation - 👤 Background Segmentation Synthetic data has pixel-perfect label accuracy and is nearly