DailyPapers
@HuggingPapers
10K Followers · 13 Following · 1K Media · 3K Statuses
Tweeting interesting papers submitted at https://t.co/rXX8x0HzXV. Submit your own at https://t.co/QhbJKXBd4Q, and link models/datasets/demos to it!
Joined March 2025
Discover LongVT on Hugging Face! Paper: https://t.co/ZE4FjbRNxB Models & Data Collection: https://t.co/3uiMckxIgk Try the demo:
0 replies · 0 reposts · 0 likes
LongVT: Incentivizing "Thinking with Long Videos"
This new framework for LMMs uses Multimodal Chain-of-Tool-Thought, enabling global-to-local reasoning in long videos via native video cropping to tackle hallucinations and outperform baselines.
1 reply · 2 reposts · 1 like
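The global-to-local loop described above is easy to picture in code. Below is a minimal runnable sketch of the idea, where sample_frames, ask_lmm, and crop_video are hypothetical stubs (none of them are the LongVT API): the model reasons over a coarse view of the video and requests high-fps crops of the segments it wants to inspect more closely.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of LongVT-style global-to-local reasoning; every name
# here is a stand-in, not part of the actual release.

@dataclass
class ToolCall:
    start_s: float
    end_s: float

@dataclass
class Reply:
    answer: str
    tool_call: Optional[ToolCall] = None

def sample_frames(video_path, fps):            # stub: coarse global view
    return f"{video_path}@{fps}fps"

def crop_video(video_path, start, end, fps):   # stub: high-fps local crop
    return f"{video_path}[{start:.0f}s-{end:.0f}s]@{fps}fps"

def ask_lmm(question, context, force_answer=False) -> Reply:
    # stub: a real LMM would either answer or emit a crop request here
    if force_answer or len(context) > 2:
        return Reply(answer="(answer grounded in the local crops)")
    return Reply(answer="", tool_call=ToolCall(start_s=60.0, end_s=90.0))

def answer_about_long_video(video_path: str, question: str, max_rounds: int = 4) -> str:
    context = [sample_frames(video_path, fps=0.5)]       # start from a coarse view
    for _ in range(max_rounds):
        reply = ask_lmm(question, context)               # "think" over current evidence
        if reply.tool_call is None:                      # confident enough: stop
            return reply.answer
        context.append(crop_video(video_path,            # zoom into the requested window
                                  reply.tool_call.start_s,
                                  reply.tool_call.end_s, fps=2.0))
    return ask_lmm(question, context, force_answer=True).answer

print(answer_about_long_video("lecture.mp4", "When does the speaker show the chart?"))
```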
Z-Image's paper is out on @HuggingPapers 🔥
Z-Image 🔥 new image generation model from @Ali_TongyiLab
Model: https://t.co/n4pdkJuG16
Demo: https://t.co/WLVKyJul0k
✨ 6B - Apache 2.0
✨ 8-step, sub-second generation on H800; runs on 16GB GPUs
✨ Photorealistic quality (the film poster demo is amazing 🤯)
✨ English &
4 replies · 9 reposts · 69 likes
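If the checkpoint is exposed as a standard diffusers pipeline, the 8-step generation would look roughly like the sketch below. Both the repo id and the diffusers compatibility are assumptions here; check the model card.

```python
# Hedged sketch: assumes Z-Image-Turbo loads through diffusers' generic
# DiffusionPipeline and that "Tongyi-MAI/Z-Image-Turbo" is the repo id --
# verify both against the Hugging Face model card before running.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",        # assumed repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,            # in case the pipeline class is custom
).to("cuda")

image = pipe(
    prompt="a retro sci-fi film poster, photorealistic",
    num_inference_steps=8,             # the 8-step setting from the tweet
).images[0]
image.save("z_image_sample.png")
```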
Dive into self-verifiable math reasoning! DeepSeekMath-V2 achieves a near-perfect 118/120 on Putnam 2024 by using an LLM-based verifier to iteratively improve proofs. Paper: https://t.co/pP5WiYRSC0 Model:
0 replies · 0 reposts · 3 likes
DeepSeek-AI just dropped DeepSeekMath-V2 on Hugging Face
This new self-verifying mathematical reasoning model generates and validates its own proofs, achieving gold-level scores on Olympiad-level benchmarks like IMO 2025 and CMO 2024.
4 replies · 0 reposts · 8 likes
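The generate-verify-refine pattern behind that Putnam score is straightforward to sketch. Here is a hypothetical loop where prove and verify are stubs (the release's actual interface will differ): the verifier scores each draft, and its critique is fed back into the next generation.

```python
# Hypothetical sketch of self-verified proof refinement, in the spirit of
# DeepSeekMath-V2's generator + LLM-based verifier; prove/verify are stubs.

def prove(problem: str, feedback: str = "") -> str:
    return f"proof of {problem!r} (used feedback: {bool(feedback)})"   # stub generator

def verify(problem: str, proof: str) -> tuple[float, str]:
    return 0.9, "step 3 needs more justification"                      # stub verifier

def solve(problem: str, rounds: int = 4, threshold: float = 0.95) -> str:
    proof = prove(problem)
    for _ in range(rounds):
        score, critique = verify(problem, proof)   # verifier scores the draft
        if score >= threshold:                     # verifier satisfied: stop
            break
        proof = prove(problem, critique)           # regenerate with the critique
    return proof

print(solve("Putnam 2024 B6"))
```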
This groundbreaking model allows new languages to be added with only a few examples. It achieves SOTA performance, making speech tech more inclusive for communities worldwide. Try the demo: https://t.co/JMDs8ebEqF Get the model:
0 replies · 2 reposts · 5 likes
Meta just released Omnilingual ASR on Hugging Face
It's an open-source speech recognition system supporting over 1,600 languages, including hundreds never before covered by ASR technology.
5 replies · 4 reposts · 27 likes
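Omnilingual ASR ships with its own inference stack, so the snippet below is only a rough illustration of how an HF-hosted ASR checkpoint is typically invoked through transformers; the model id is a placeholder, not a confirmed repo.

```python
# Rough illustration only: the transformers ASR pipeline API is real, but the
# model id is a placeholder -- Omnilingual ASR's official inference path lives
# in Meta's own package, so follow the model card for the supported usage.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="facebook/omnilingual-asr",   # placeholder repo id
)
print(asr("clip_in_a_low_resource_language.wav")["text"])
```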
Experience the next generation of conditional AI models with ViBT!
Try the interactive demo: https://t.co/CICxVpqQ01
Grab the 20B model: https://t.co/5vwjsgN9Qs
Read the full paper:
0 replies · 0 reposts · 4 likes
ViBT: The First Vision Bridge Transformer at 20B Parameters
This groundbreaking framework pioneers data-to-data translation, directly modeling trajectories for conditional image & video generation. It's incredibly efficient, up to 4x faster, handling complex tasks with ease.
2 replies · 3 reposts · 37 likes
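"Directly modeling trajectories" for data-to-data translation is in the family of bridge matching: rather than denoising from pure noise, the model learns the velocity along an interpolation from the conditioning sample to its paired target. A generic training-step sketch of that idea, not ViBT's exact formulation:

```python
# Generic bridge-style objective (NOT ViBT's actual loss): interpolate between
# a source sample and its paired target, then regress the velocity field.
import torch

def bridge_loss(model, source, target):
    t = torch.rand(source.shape[0], 1, 1, 1, device=source.device)  # per-sample time
    x_t = (1.0 - t) * source + t * target     # point on the straight source->target path
    velocity = target - source                # ground-truth direction of travel
    pred = model(x_t, t.flatten())            # model predicts the velocity at (x_t, t)
    return torch.mean((pred - velocity) ** 2)

# toy usage with a stand-in model
model = lambda x, t: torch.zeros_like(x)
src, tgt = torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32)
print(bridge_loss(model, src, tgt))
```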
DeepSeek-V3.2 introduces Sparse Attention for long-context tasks and excels in agentic tool-use. It achieved gold medals in the 2025 IMO & IOI. Explore the model: https://t.co/bYOxe3y5xc Read the technical report:
0 replies · 0 reposts · 4 likes
DeepSeek AI just unveiled DeepSeek-V3.2 on Hugging Face
This new model combines high computational efficiency with superior reasoning and agent performance, with a special variant surpassing GPT-5.
1 reply · 5 reposts · 12 likes
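The core of sparse attention is that each query attends to only a small, selected subset of keys, which is what keeps long-context cost down. Here is a generic top-k illustration in PyTorch; DeepSeek's attention uses a learned indexer and fused kernels, so this is only the shape of the idea. Note that this toy version still materializes the dense score matrix; a real implementation selects keys cheaply before computing full attention.

```python
# Generic top-k sparse attention sketch -- illustrates the "attend to a
# selected subset of keys" idea, NOT DeepSeek-V3.2's actual implementation.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_keep=64):
    # q, k, v: (batch, seq, dim)
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5  # full scores (B, Sq, Sk)
    keep = scores.topk(k_keep, dim=-1).indices             # top-k keys per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, keep, 0.0)                           # 0 where kept, -inf elsewhere
    attn = F.softmax(scores + mask, dim=-1)                # softmax over kept keys only
    return attn @ v

q = k = v = torch.randn(1, 512, 64)
print(topk_sparse_attention(q, k, v).shape)  # torch.Size([1, 512, 64])
```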
Project page: https://t.co/kRLi5jpd7z Code: https://t.co/CRcaAOsbzm Read the full paper on Hugging Face:
0 replies · 0 reposts · 3 likes
Architecture Decoupling is Not All You Need
Researchers explore why decoupling unified multimodal models works. Their key insight: it pushes models toward task-specific interaction patterns. They propose Attention Interaction Alignment (AIA), a new loss to get similar benefits.
1 reply · 1 repost · 22 likes
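Very loosely, an attention-alignment loss of this kind penalizes divergence between the unified model's attention maps and reference task-specific interaction patterns. The sketch below is heavily hypothetical; the paper's actual AIA formulation may differ.

```python
# Heavily hypothetical sketch in the spirit of AIA: KL between a unified
# model's attention maps and reference task-specific patterns, per layer.
# The paper's actual loss may be defined quite differently.
import torch
import torch.nn.functional as F

def attention_alignment_loss(attn_maps, reference_maps):
    # both: lists of (batch, heads, seq, seq) attention tensors, one per layer
    losses = [
        F.kl_div(a.clamp_min(1e-8).log(), r, reduction="batchmean")
        for a, r in zip(attn_maps, reference_maps)
    ]
    return torch.stack(losses).mean()

# toy usage
a = torch.softmax(torch.randn(2, 4, 8, 8), dim=-1)
r = torch.softmax(torch.randn(2, 4, 8, 8), dim=-1)
print(attention_alignment_loss([a], [r]))
```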
Apple just released STARFlow on Hugging Face. A brand-new model from the tech giant, opening doors to unexplored possibilities in AI development. https://t.co/uj9HrreXRV
20 replies · 75 reposts · 792 likes
Dive into the details of AnyTalker! This breakthrough in multi-person video generation offers unmatched lip sync, visual quality, & natural interactivity across diverse inputs. Paper: https://t.co/UBO7bNVdX2 Model (1.3B):
0 replies · 0 reposts · 4 likes
Say hello to AnyTalker for dynamic multi-person videos from HKUST-C4G & Video Rebirth! Unveiling an innovative framework for scalable multi-person talking video generation. AnyTalker drives arbitrary identities with natural interactivity, all while being data-efficient.
1 reply · 3 reposts · 23 likes
Experience a thinking-editing-reflection loop for superior image quality & instruction-following. ReasonEdit-S (based on Step1X-Edit) boosts ImgEdit by +4.3%, GEdit by +4.7%, and KRIS by +8.2%! Model: https://t.co/qGDCTfhhy2 Demo:
0 replies · 1 repost · 7 likes
"REASONEDIT: Towards Reasoning-Enhanced Image Editing Models" just dropped! This new framework leverages MLLM thinking & reflection to interpret abstract instructions and iteratively refine image edits, pushing the boundaries of what's possible in generative AI.
2 replies · 9 reposts · 33 likes
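The thinking-editing-reflection loop maps to a simple control flow. A hypothetical sketch where plan_edit, apply_edit, and reflect are stand-ins (not the ReasonEdit API): the MLLM plans, the editor executes, and the reflection critique drives the next round.

```python
# Hypothetical thinking-editing-reflection loop in the spirit of ReasonEdit;
# plan_edit / apply_edit / reflect are stubs, not the released API.

def plan_edit(image, instruction, critique=""):
    return f"plan({instruction!r}, used critique: {bool(critique)})"  # stub: MLLM "thinking"

def apply_edit(image, plan):
    return f"{image}+edited"                                          # stub: editing model

def reflect(image, edited, instruction):
    ok = edited.count("+edited") >= 2                                 # stub: MLLM judging
    return ok, "object color still wrong"

def reason_edit(image, instruction, max_rounds=3):
    critique = ""
    edited = image
    for _ in range(max_rounds):
        plan = plan_edit(image, instruction, critique)       # think
        edited = apply_edit(image, plan)                     # edit
        ok, critique = reflect(image, edited, instruction)   # reflect
        if ok:
            break
        image = edited                                       # refine the latest attempt
    return edited

print(reason_edit("photo.png", "make the car red"))
```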
Experience Z-Image-Turbo: sub-second inference (8 steps!) on <16GB VRAM. Get the paper, model & demo on @HuggingFace:
📄 Paper: https://t.co/clGPAj0Ygh
💾 Model: https://t.co/kQcMCrBywa
✨ Demo:
1 reply · 0 reposts · 7 likes