Labelbox

@labelbox

Followers: 3K
Following: 589
Media: 28
Statuses: 240

High-quality frontier data for leading AI teams.

San Francisco, CA
Joined January 2018
@labelbox
Labelbox
16 days
Essential weekend reading. The Scaling Era: an oral history of AI by @dwarkesh_sp and thank you @stripepress!
4
18
217
@jack
jack
25 days
this is great
@dwarkesh_sp
Dwarkesh Patel
26 days
The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 - Why self
101
300
5K
@labelbox
Labelbox
26 days
Link to the full episode:
1
1
6
@labelbox
Labelbox
26 days
Highly recommend tuning into @dwarkesh_sp's episode today with @karpathy. They dive deep into why RL is so information-sparse and what that means for realizing the decade of agents. A few highlights that stood out: - “RL is terrible; it’s just that everything else is much
25
167
2K
@labelbox
Labelbox
2 months
Thrilled to be featured in Dwarkesh’s latest episode with Richard Sutton, widely regarded as the father of reinforcement learning and 2024 Turing Award winner. As Richard explains, we’re entering the Era of Experience, where training AI means creating environments that capture
13
90
739
@labelbox
Labelbox
2 months
See more of Dwarkesh’s visit and get in touch to learn how Labelbox delivers large-scale, high-fidelity data collection to advance next-gen robotics.
labelbox.com
Discover how we partner with researchers to fuel the next wave of AI advancements, powered by experts in post-training and model evaluation.
0
0
5
@labelbox
Labelbox
2 months
As his latest guest, @svlevine (co-founder of @physical_int) predicts, robots could be running households entirely autonomously by 2030.
1
1
8
@labelbox
Labelbox
2 months
We recently invited @dwarkesh_sp to stop by our SF robotics lab. World-class podcaster, rookie robotics intern.
19
180
2K
@dwarkesh_sp
Dwarkesh Patel
2 months
.@svlevine is one of the world's leading robotics researchers (and co-founder of @physical_int). He thinks fully autonomous robots are much closer than people realize - when I pushed him on a prediction, he said 5 years to robots that can autonomously run your household. The
21
121
970
@labelbox
Labelbox
2 months
If you’re a Dwarkesh fan, check out the landing page and follow along. This is just the beginning of something special.
0
0
4
@labelbox
Labelbox
2 months
We’ve always admired how @dwarkesh_sp sparks conversations with top thinkers in AI, academia, and tech. Now we’re teaming up to connect with a community that shares our mission of pushing the limits of what’s possible in AI. The first episode together with one of his most
32
127
1K
@labelbox
Labelbox
3 months
We’ll continue evaluating frontier models on more constraint domains and reporting as the gap between leading AI capabilities closes. Check out our blog post for more info!
labelbox.com
0
0
3
@labelbox
Labelbox
3 months
Lessons learned: Constraint interactions, not just rules, limit performance, and success on synthetic tasks doesn’t always transfer to real-world cases. We observe that high constraint densities tend to also expose weaknesses, and analyzing failures helps guide targeted
1
1
3
@labelbox
Labelbox
3 months
Our initial findings show that no current model maintains consistent feasibility under real-world, high-complexity scenarios. On synthetic stress tests, o3 demonstrates the highest feasibility, closely followed by GPT-5. In a domain-grounded data center migration benchmark, GPT-5
1
0
2
@labelbox
Labelbox
3 months
We tested whether leading models could generate schedules on RCPSP (resource-constrained project scheduling problems) that meet all constraints and remain consistent as complexity increases. To do this, we varied task difficulty across hundreds of levels and applied realistic
1
0
2
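To make "meets all constraints" concrete for an RCPSP schedule, here is a minimal sketch of a feasibility check. It is an illustration only, not the ConstraintBench harness: the `Task` class, the `is_feasible` function, the integer time periods, and the single "crew" resource in the usage lines are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record for illustration."""
    name: str
    duration: int              # length in whole time periods
    demand: dict               # resource name -> units held while running
    predecessors: tuple = ()   # names of tasks that must finish first


def is_feasible(tasks, start, capacity):
    """Return True if `start` (task name -> start period) satisfies
    precedence and per-period resource capacity for every task."""
    by_name = {t.name: t for t in tasks}

    # Every task needs a non-negative start time.
    if any(t.name not in start or start[t.name] < 0 for t in tasks):
        return False

    # Precedence: a task may not start before each predecessor finishes.
    for t in tasks:
        for p in t.predecessors:
            if start[p] + by_name[p].duration > start[t.name]:
                return False

    # Capacity: in each period, total demand must fit within capacity.
    horizon = max((start[t.name] + t.duration for t in tasks), default=0)
    for period in range(horizon):
        used = {}
        for t in tasks:
            if start[t.name] <= period < start[t.name] + t.duration:
                for resource, units in t.demand.items():
                    used[resource] = used.get(resource, 0) + units
        if any(units > capacity.get(r, 0) for r, units in used.items()):
            return False
    return True


# Usage: three tasks sharing a crew of two.
tasks = [
    Task("a", 2, {"crew": 1}),
    Task("b", 3, {"crew": 2}, predecessors=("a",)),
    Task("c", 2, {"crew": 2}),
]
capacity = {"crew": 2}

ok_schedule = {"a": 0, "b": 4, "c": 2}        # no overlap beyond capacity
overlap_schedule = {"a": 0, "b": 2, "c": 2}   # b and c need 4 crew at once
early_schedule = {"a": 0, "b": 1, "c": 4}     # b starts before a finishes
```

A checker like this only verifies a proposed schedule; finding one, or an optimal one, is the hard part of the problem.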
@labelbox
Labelbox
3 months
Introducing ConstraintBench: a new benchmark for evaluating LLM reasoning on realistic resource-constrained project scheduling problems (RCPSP), a well-known NP-complete challenge. It tackles some of the toughest planning challenges (such as project management, construction,
10
62
523
@labelbox
Labelbox
4 months
As AI advances, so do the human skills required to shape and align it. Full report: https://t.co/pmiGSriNU6
0
0
4
@labelbox
Labelbox
4 months
Grok-4 just landed on our Complex Reasoning leaderboard, and it’s impressive💥
- Math: 81.8%
- Pure Math: 84.8%
- Applied Math: 79.9%
- CS: 75.4%
- Reasoning: 77.8%
- Aggregate: 80.7%
See how it stacks up:
labelbox.com
The Labelbox complex reasoning leaderboard rigorously assesses top AI models against some of the most demanding tasks available today.
11
2
12
@labelbox
Labelbox
4 months
obligatory AI company SF billboard
3
1
13