Jack Saunders

@jack_r_saunders

Followers: 690 · Following: 313 · Media: 136 · Statuses: 321

Talking about everything to do with Facial Avatars | PhD Student | Founder of @realsyncai

Bath, UK
Joined February 2020
@jack_r_saunders
Jack Saunders
2 months
🤖 🗞️ Free AI Generated Digital Humans Newsletter -> https://t.co/VQJODKPgX3 Like everyone else, I've been struggling to keep up with the sheer volume of papers and news in the Digital Human space. Over the past few weeks, I've developed an agentic (ish) AI pipeline to find and
@jack_r_saunders
Jack Saunders
1 day
👩‍🦰 Want to reconstruct hair from your Gaussian Avatars? Here's a way. HairGS: Hair Strand Reconstruction based on 3D Gaussian Splatting TLDR: A Gaussian Avatar is fit to images with an additional angular loss for hair geometry. These Gaussians are then converted into strands
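The angular loss mentioned above can be sketched roughly as an alignment penalty between each Gaussian's principal axis and a reference hair-strand direction. This is a toy numpy illustration of that idea, not HairGS's actual loss; the function name and the use of 1 − |cos θ| are assumptions.

```python
import numpy as np

def angular_loss(gaussian_dirs, hair_dirs):
    """Mean angular misalignment between each Gaussian's principal
    axis and a reference hair-strand direction (both unit vectors).

    Uses 1 - |cos(theta)| so that flipped directions are not
    penalised -- hair strands have no canonical orientation.
    """
    cos = np.abs(np.sum(gaussian_dirs * hair_dirs, axis=-1))
    return float(np.mean(1.0 - cos))

# Perfectly aligned (or exactly flipped) directions give zero loss.
d = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
print(angular_loss(d, -d))  # → 0.0
```

Minimising a term like this during fitting nudges the anisotropic Gaussians to line up with plausible strand directions, which is what makes the later conversion to strands possible.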
@jack_r_saunders
Jack Saunders
20 days
The paper is here: https://t.co/DCEr0hBwI2, and they have a GitHub repo: https://t.co/Gnb9BcJBeN. No word yet on whether they will open-source the code. What do you think?
@jack_r_saunders
Jack Saunders
20 days
🗄️ A data filtering strategy aligned with their pretraining stages: at lower resolutions, more data is kept to learn diverse prompts and motion, while more is discarded at higher resolutions to maximise video quality.
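The filtering idea above boils down to a resolution-dependent keep fraction. Here's a hypothetical sketch of such a schedule; the thresholds, the quality-score field, and the function names are all made up for illustration, not taken from the Waver report.

```python
def keep_fraction(resolution):
    """Hypothetical keep schedule: retain more clips at low
    resolution (diverse prompts and motion), fewer at high
    resolution (strict visual-quality bar)."""
    schedule = {480: 0.8, 720: 0.5, 1080: 0.2}
    return schedule[resolution]

def filter_clips(clips, resolution):
    """clips: list of (clip_id, quality_score) pairs.
    Keep the top fraction of clips by quality score."""
    k = max(1, int(len(clips) * keep_fraction(resolution)))
    return sorted(clips, key=lambda c: c[1], reverse=True)[:k]

clips = [("a", 0.9), ("b", 0.4), ("c", 0.7), ("d", 0.2), ("e", 0.6)]
print([c[0] for c in filter_clips(clips, 1080)])  # → ['a']
```

At 1080p only the single best clip of the five survives, while at 480p four of them would be kept, mirroring the diversity-then-quality trade-off.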
@jack_r_saunders
Jack Saunders
20 days
🚿 A hybrid stream architecture using both dual-stream (processes the modalities separately except at self-attention) and single-stream (processes both together) blocks in the transformer.
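The dual-stream vs single-stream distinction can be sketched in a few lines of numpy. This is only a structural illustration of the idea, not Waver's actual block design: the projections, shapes, and the bare single-head attention are all placeholder assumptions.

```python
import numpy as np

def attention(x):
    """Minimal single-head scaled dot-product self-attention."""
    w = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(w - w.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ x

def dual_stream_block(text, video, W_text, W_video):
    """Dual-stream: each modality keeps its own projection weights;
    the streams only interact inside one joint self-attention over
    the concatenated token sequence."""
    x = np.concatenate([text @ W_text, video @ W_video], axis=0)
    y = attention(x)
    return y[: len(text)], y[len(text):]

def single_stream_block(text, video, W):
    """Single-stream: one shared projection, one joint sequence."""
    x = np.concatenate([text, video], axis=0) @ W
    y = attention(x)
    return y[: len(text)], y[len(text):]

rng = np.random.default_rng(0)
text, video = rng.normal(size=(4, 8)), rng.normal(size=(16, 8))
W_t, W_v, W = (rng.normal(size=(8, 8)) for _ in range(3))
t_out, v_out = dual_stream_block(text, video, W_t, W_v)
print(t_out.shape, v_out.shape)  # → (4, 8) (16, 8)
```

The usual motivation is that dual-stream blocks let each modality keep specialised parameters early on, while single-stream blocks are cheaper and fuse the modalities fully later in the stack.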
@jack_r_saunders
Jack Saunders
20 days
2️⃣ A two-stage model: the first stage generates video at 480p or 720p, while the second refines it to 1080p. Both are rectified flow transformers.
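Rectified flow sampling, in its simplest form, is just Euler integration of a learned velocity field from noise (t = 0) to data (t = 1). This is a generic toy sketch of that sampler, standing in for what the trained transformers would do; the function names and the toy velocity field are assumptions.

```python
import numpy as np

def rectified_flow_sample(velocity, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data)
    with simple Euler steps -- the basic sampler for a rectified
    flow model. `velocity` stands in for the trained network."""
    x, dt = x0.copy(), 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x

# Toy velocity field: v(x, t) = target - x0 is a constant pointing
# along the straight noise-to-data path, so Euler integration
# recovers the target exactly.
x0 = np.zeros(4)
target = np.array([1.0, 2.0, 3.0, 4.0])
v = lambda x, t: target - x0
print(rectified_flow_sample(v, x0))  # → [1. 2. 3. 4.]
```

The appeal of rectified flows is exactly these near-straight trajectories, which is why few Euler steps suffice and why they suit a cascaded generate-then-refine pipeline.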
@jack_r_saunders
Jack Saunders
20 days
🎶 ByteDance have just released the technical report for their newest video foundation model, Waver 1.0. Some highlights 👇
@jack_r_saunders
Jack Saunders
21 days
📦 Looking for a dataset to train your talking head models? Here's one option! TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis TLDR: A curated and filtered talking head dataset consisting of 1244 hours of HD video for nearly 8000 speakers.
@jack_r_saunders
Jack Saunders
24 days
Drowning in new AI Avatar papers? I built a newsletter that uses AI to summarise them. 📩 Here’s what I learned: 🤖Agents ≠ always better than workflows ⁉️Prompts fail, error handling saves you 🧑‍💻APIs are your best friend Full write-up here 👉
pub.towardsai.net
One of the hardest parts of being a researcher in the current world of AI is keeping up with the insane volume of new work that seems to…
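The "error handling saves you" lesson above can be illustrated with a generic retry wrapper around any flaky API call. This is a hypothetical sketch, not code from the newsletter pipeline; the function names and backoff values are made up.

```python
import time

def call_with_retries(fn, attempts=3, backoff=1.0):
    """Hypothetical wrapper for a flaky API call: retry with
    exponential backoff instead of letting one bad response
    kill the whole pipeline run."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(backoff * 2 ** i)

# A call that fails twice with transient errors, then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("transient API error")
    return "summary"

print(call_with_retries(flaky, backoff=0.01))  # → summary
```

The same pattern applies whether the failure is a rate limit, a timeout, or an LLM returning malformed output: catch it at the call site, retry a bounded number of times, and only then fail loudly.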
@jack_r_saunders
Jack Saunders
27 days
🔥 Facial Puppetry just got a serious upgrade with a new open-source ❗ model. FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers TLDR: This is a DiT based model that uses an implicit representation of facial
@heytavus
Tavus
29 days
CVI beta launch week feat. 📚Knowledge Base (RAG) Conversations are now powered by your knowledge. Upload docs, link your content, and let CVI answer with speed & precision. Just 30ms response time. 15x faster than other RAG solutions on the market. See it in action ⬇️
@jack_r_saunders
Jack Saunders
29 days
📖 It's always great to get some open-source datasets! Here's one for relighting and novel view synthesis 👇 HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis TLDR: A 21-person dataset where each person does 3 poses and is captured from 40
@jack_r_saunders
Jack Saunders
1 month
🏎️ We're starting to see some real-time diffusion models for Avatar generation! RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer TLDR: This DiT model is able to perform audio-to-talking head synthesis in real-time. The key is the use of a very
@jack_r_saunders
Jack Saunders
1 month
🎶 👗 ByteDance are making amazing progress for Virtual Try-On! DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework. TLDR: This work overcomes the reliance on hard-to-find paired (video, garment) data. By using a
@jack_r_saunders
Jack Saunders
1 month
📣 Accepted Paper 📣 I'm happy to share that our work DEAD: Data Efficient Audiovisual Dubbing using Neural Rendering Priors (previously Dubbing for Everyone) has been accepted to BMVC25! TLDR: We achieve high-quality dubbing with just 4 seconds of personalised data. We separate
@jack_r_saunders
Jack Saunders
1 month
🛫 A double whammy from Microsoft this week, with another synthetics-based paper. This time, you can lift a video stream to 3D with free-viewpoint rendering that tracks the viewer! VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction TLDR: A UNET-based model
@jack_r_saunders
Jack Saunders
2 months
🧑‍🦳 👩‍🦳 Meta's real-time Codec Avatars now have hair control. Imagine being able to customise your hairstyle in VR! HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars TLDR: Trains two separate hyper-networks, one for the face and one for the hair. The data to do
@jack_r_saunders
Jack Saunders
2 months
⏱️ Turn yourself into a 3D Avatar in real-time with StreamME from Adobe and the University of Rochester (code coming soon) StreamME: Simplify 3D Gaussian Avatar within Live Stream TLDR: This work speeds up Gaussian reconstruction using motion-aware anchor points to prevent the
@jack_r_saunders
Jack Saunders
2 months
Want up to three papers like this delivered straight to your inbox and summarised by AI each day? You can sign up for free here:
realsyncai.com
Leading Avatar Consultant and Digital Human Consultant Dr. Jack Saunders. Expert lip sync consultancy and Avatar development guidance.
@jack_r_saunders
Jack Saunders
2 months
🔥 Microsoft once again showing off the power of synthetic data for human-centric computer vision! Showing how you can use it for: - 🧐 Depth Prediction - ↗️ Normal Estimation - 👤 Background Segmentation Synthetic data has pixel-perfect label accuracy and is nearly