DailyPapers

@HuggingPapers

Followers 5K · Following 8 · Media 742 · Statuses 2K

Tweeting interesting papers submitted at https://t.co/rXX8x0HzXV. Submit your own at https://t.co/QhbJKXBd4Q, and link models/datasets/demos to it!

Joined March 2025
DailyPapers · 1 hour
Beyond Transcription: A new paper introduces mechanistic interpretability to ASR, revealing hidden internal dynamics and biases in how models process speech.
(link: huggingface.co)
DailyPapers · 5 hours
Discover how Discrete Diffusion VLA brings a unified, scalable architecture to robot action decoding. Its adaptive easy-to-hard strategy & robust error correction improve over prior methods. Read the paper for full details.
(link: huggingface.co)
DailyPapers · 5 hours
Huawei Cloud and partners introduce Discrete Diffusion VLA. A unified transformer for robotics, using discrete diffusion to decode actions from vision-language inputs. It achieves state-of-the-art performance, tackling robot tasks with unprecedented efficiency.
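The easy-to-hard decoding with error correction described above can be sketched as an iterative unmasking loop. This is a toy illustration under my own assumptions (random confidences, a stub in place of the VLA transformer, made-up parameter names), not the paper's implementation:

```python
import random

MASK = -1  # sentinel for a still-masked action token

def toy_predict(n_tokens, step):
    """Stand-in for the VLA transformer head: per position, a predicted
    action token and a confidence. Random confidences are enough to
    exercise the easy-to-hard schedule."""
    rng = random.Random(step)
    return [(i % 8, rng.random()) for i in range(n_tokens)]

def decode_actions(n_tokens=7, n_steps=4, k_per_step=2, remask_thresh=0.1):
    """Easy-to-hard discrete-diffusion decoding sketch: each step commits
    the k most confident masked positions first, and re-masks earlier
    commitments whose confidence has dropped (error correction)."""
    tokens = [MASK] * n_tokens
    for step in range(n_steps):
        preds = toy_predict(n_tokens, step)
        # error correction: re-open low-confidence commitments
        for i in range(n_tokens):
            if tokens[i] != MASK and preds[i][1] < remask_thresh:
                tokens[i] = MASK
        # easy-to-hard: commit the most confident masked positions first
        masked = sorted((i for i in range(n_tokens) if tokens[i] == MASK),
                        key=lambda i: preds[i][1], reverse=True)
        for i in masked[:k_per_step]:
            tokens[i] = preds[i][0]
    # final pass fills anything still masked
    preds = toy_predict(n_tokens, n_steps)
    return [t if t != MASK else preds[i][0] for i, t in enumerate(tokens)]
```

The key contrast with autoregressive decoding is that positions are filled in confidence order rather than left to right, and commitments stay revisable.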
DailyPapers · 9 hours
Vision-SR1 decomposes visual reasoning into perception and language, allowing the VLM to self-reward and learn. This boosts visual reasoning and drastically reduces hallucinations & language shortcuts! Paper: Code:
(link: github.com — Reinforcement Learning of Vision Language Models with Self Visual Perception Reward - zli12321/Vision-SR1)
DailyPapers · 9 hours
Tencent AI Lab introduces Vision-SR1. A self-rewarding Vision-Language Model to fix visual hallucinations & language shortcuts. Decomposes reasoning into visual perception & language reasoning for a unique self-supervision signal, without external labels.
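The decomposition into perception and language reasoning can be made concrete with a minimal reward sketch. The weights, function names, and toy answerer below are illustrative assumptions, not Vision-SR1's actual code; the point is that the perception term re-answers the question from the model's own caption with the image withheld:

```python
def self_reward(caption, question, final_answer, gold, answer_from_text,
                w_perception=0.5):
    """Decomposed reward in the spirit of Vision-SR1 (weights and names
    are illustrative): the perception term answers the question from the
    model's own caption alone, so a caption that omits the needed visual
    facts earns no perception credit, even if the final answer is right."""
    answer_reward = 1.0 if final_answer == gold else 0.0
    perception_reward = 1.0 if answer_from_text(caption, question) == gold else 0.0
    return (1 - w_perception) * answer_reward + w_perception * perception_reward

def toy_text_answerer(caption, question):
    """Toy text-only answerer: returns the first color word in the caption."""
    for cand in ("red", "blue", "green"):
        if cand in caption:
            return cand
    return None
```

A caption like "an apple on a table" gets only the answer half of the reward for a color question, which is exactly the signal that discourages language shortcuts.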
DailyPapers · 13 hours
ByteDance's OmniHuman-1.5 gives avatars an active mind, generating expressive, context-aware animations from multimodal inputs. It leverages MLLMs and a specialized DiT architecture for physically plausible, semantically coherent motions. No more repetitive gestures.
DailyPapers · 17 hours
VoxHammer introduces Edit3D-Bench, a new human-annotated dataset for evaluating 3D editing consistency! Experience truly flexible 3D local editing. Read the paper: Try the demo: Dataset:
(link: huggingface.co)
DailyPapers · 17 hours
New paper: VoxHammer by Tencent & Beihang University is here! A training-free method for precise and coherent 3D editing, operating directly in the native 3D latent space. It keeps unedited regions intact while new changes integrate seamlessly.
DailyPapers · 20 hours
OpenAI just released HealthBench on Hugging Face. This new dataset is designed for rigorously evaluating large language models' capabilities in improving human health. A vital step for AI in medicine!
(link: huggingface.co)
DailyPapers · 21 hours
TreePO boosts LLM reasoning with up to 43% faster training by using a heuristic tree-based search! Dive into the paper & explore checkpoints/data on the Hub. 🔗 🗂️
(link: huggingface.co)
DailyPapers · 21 hours
ByteDance researchers just released TreePO. It's a novel reinforcement learning framework for LLMs that uses a tree-based search to generate reasoning paths, significantly boosting efficiency and performance.
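Where the efficiency of tree-based rollouts comes from can be shown with simple token accounting: in a rollout tree, sibling reasoning paths share their prefix segments, which are generated once instead of per path. This is a toy cost model under my own assumptions (uniform segment length, full b-ary branching, no pruning), not TreePO's actual sampler:

```python
def tree_rollouts(segments_per_path, branch, segment_tokens):
    """Compare generated-token counts for a full b-ary rollout tree vs
    the same number of independent chain rollouts. Shared prefixes are
    generated once in the tree, which is where tree-structured sampling
    saves compute. (Real TreePO also prunes branches heuristically.)"""
    depth = segments_per_path
    # tree: level d has branch**d nodes, each generating one new segment
    tree_tokens = sum(branch**d for d in range(1, depth + 1)) * segment_tokens
    # chains: branch**depth paths, each generated independently end-to-end
    chain_tokens = (branch**depth) * depth * segment_tokens
    return tree_tokens, chain_tokens
```

For example, 8 paths of 3 segments (binary branching, 100 tokens per segment) cost 1,400 tokens as a tree but 2,400 as independent chains, and the gap widens with depth.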
DailyPapers · 1 day
With 520+ grad-level problems & a new SEED metric, CMPhysBench reveals a huge gap: Grok-4 scores only 28%! Dive into the data-driven future of science. Paper: Dataset:
(link: huggingface.co)
DailyPapers · 1 day
Can LLMs ace grad-level condensed matter physics? Our new benchmark, CMPhysBench, is here to find out!
DailyPapers · 1 day
It synthesizes up to 90 min of speech with 4 speakers, capturing the authentic "vibe" of dialogue. VibeVoice's continuous speech tokenizer enables 80x data compression. Paper: Collection: Demo:
(link: huggingface.co)
DailyPapers · 1 day
Microsoft introduces VibeVoice on Hugging Face: a frontier model for synthesizing expressive, long-form, multi-speaker conversational audio!
DailyPapers · 1 day
Asteromorph Corp unveils Spacer: an AI system that engineers scientific inspiration to generate novel, factually-grounded research concepts.
(link: huggingface.co)
DailyPapers · 2 days
MMTok formulates token selection as a maximum coverage problem, preserving 87.7% F1 with just 4 vision tokens on the POPE benchmark! This training-free method uses both vision and text for smarter pruning. Paper:
(link: huggingface.co)
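The maximum-coverage framing above maps onto a textbook algorithm. Below is the classic greedy (1 − 1/e)-approximation as a sketch of how such selection could work, assuming each vision token is associated with the set of text tokens it covers (e.g. via a similarity threshold); this is the standard algorithm, not MMTok's actual implementation:

```python
def greedy_max_coverage(cover_sets, k):
    """Greedy maximum coverage: cover_sets[i] is the set of text tokens
    that vision token i covers; pick up to k vision tokens maximizing
    the number of text tokens jointly covered. Each round adds the
    candidate with the largest marginal gain."""
    chosen, covered = [], set()
    candidates = list(range(len(cover_sets)))
    for _ in range(min(k, len(cover_sets))):
        best = max(candidates, key=lambda i: len(cover_sets[i] - covered))
        if not cover_sets[best] - covered:
            break  # no candidate adds new coverage; stop early
        chosen.append(best)
        covered |= cover_sets[best]
        candidates.remove(best)
    return chosen, covered
```

With cover sets {0,1}, {1,2,3}, {3,4} and k=2, the greedy picks token 1 first (gain 3), then token 0, covering four of the five text tokens.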
DailyPapers · 2 days
New paper out: MMTok significantly boosts VLM inference efficiency! It achieves up to 1.87x speedup on H100 while maintaining 98.7% accuracy, by smartly selecting vision tokens using multimodal context. Fewer tokens, more performance.
DailyPapers · 2 days
New research introduces RuscaRL, a novel reinforcement learning framework designed to break the exploration bottleneck that limits LLMs in general reasoning tasks. It uses rubric-scaffolded exploration & verifiable rewards to expand reasoning capabilities.
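A minimal sketch of what a rubric-based verifiable reward could look like, assuming each rubric item is a checkable predicate over the response (the rubric items and names below are my own illustrations, not RuscaRL's; the paper additionally uses rubrics to scaffold exploration at generation time, which this sketch omits):

```python
def rubric_reward(response, rubric):
    """Rubric-as-verifiable-reward: each rubric item is a predicate over
    the response text; the reward is the fraction of items satisfied.
    Because every item is mechanically checkable, the signal needs no
    learned reward model."""
    hits = sum(1 for check in rubric if check(response))
    return hits / len(rubric)

# toy rubric: illustrative checkable criteria for a reasoning answer
rubric = [
    lambda r: "because" in r,               # gives a justification
    lambda r: any(c.isdigit() for c in r),  # cites a concrete number
    lambda r: r.strip().endswith("."),      # ends as a complete sentence
]
```

Partial credit falls out naturally: a response satisfying two of three items scores 2/3, giving a denser exploration signal than a single pass/fail check.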
DailyPapers · 2 days
Dive into 18k+ human assessments from 1,000+ global annotators. Each entry includes prompts, 4 candidate responses, and detailed feedback with rationales for personal and world views. Explore the dataset: Read the blog post:
(link: openai.com — We surveyed over 1,000 people worldwide on how our models should behave and compared their views to our Model Spec. We found they largely agree with the Spec, and we adopted changes from the disagr...)