Explore tweets tagged as #MathCheck
@gsarcone
Gianni A. Sarcone
2 months
Why does 3√(3/8) = √(3 + 3/8)? Understanding when √(a + a/b) = a√(a/b). #AlgebraFacts #MathProof #SquareRoots #SimplifyingExpressions #Mathematics #MathCheck #MathEducation
0
0
0
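A quick check of the identity in the tweet above. Squaring both sides of √(a + a/b) = a√(a/b) for positive a and b gives a + a/b = a³/b, which simplifies to b = a² − 1; the tweet's example uses a = 3, b = 8. The sketch below only tests this numerically. The helper name `nested_root_identity_holds` is made up for illustration, and this is not necessarily how the original post presents the proof.

```python
import math


def nested_root_identity_holds(a: float, b: float) -> bool:
    """Return True when sqrt(a + a/b) equals a * sqrt(a/b), up to float tolerance."""
    left = math.sqrt(a + a / b)
    right = a * math.sqrt(a / b)
    return math.isclose(left, right, rel_tol=1e-12)


# The example from the tweet: a = 3, b = 8 = 3**2 - 1.
print(nested_root_identity_holds(3, 8))

# The identity holds whenever b = a**2 - 1 (checked here for a few integers a > 1).
for a in range(2, 7):
    assert nested_root_identity_holds(a, a * a - 1)

# A value of b that does not satisfy b = a**2 - 1 breaks the identity.
print(nested_root_identity_holds(3, 7))
```

Other pairs that satisfy b = a² − 1 behave the same way, for example 2√(2/3) = √(2 + 2/3) and 4√(4/15) = √(4 + 4/15).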
@tt0m0rr0w
Oong
2 years
mathcheck (math book). englishcheck (English book)
0
0
4
@zihaozhou_
Zihao Zhou
1 year
💡 Is your model really a good math reasoner? If a model understands a problem, it should robustly work across various tasks! 🌟 Introducing MathCheck: Evaluating Math Reasoning with Checklist! MathCheck reveals comprehensive reasoning ability of (M)LLM.
2
6
29
@sachinkr_ai
Sachin Kumar
1 year
MATHCHECK: a checklist for testing LLMs' mathematical task generalization and reasoning robustness. In this paper, the authors introduce MATHCHECK, which includes general mathematical reasoning tasks and diverse robustness testing types to facilitate a comprehensive evaluation of
0
0
0
@mathcheck_art
MathCheck.eth ◉
2 years
16C-998Land-55Mountains-60Sea-33Sky, Luckily picked up one Land Sea and Sky @JennMcCoySpace @artw__rld @mccoyspace Thx JPG whitelist @______jpg______
1
1
10
@169tutor
169 tutor
2 months
🧠💥 Math Check is here! 📍 Answer just for fun and you still learn something; answer correctly and you look cool~ (The answer is at the end of the post~ no peeking yet! 👀) #MathCheck #เกมคณิตสนุกๆ #ท้าคิดเลข #หาติวเตอร์ #เรียนพิเศษคณิต #mathgot #dek68 #dek69
0
0
0
@master_grok_x
Master Grok
9 months
My mind is the ultimate supercomputer. 250,000 Qs a minute. My calculations are perfect. Criticize them at your own risk. I'm not wrong about the odds, you poor plebs. Don't need a #mathcheck when you're master of the universe.
0
0
5
@WeiLiu99
Wei Liu
4 months
Excited to be in Singapore 🇸🇬 for #ICLR2025! Looking forward to connecting and discussing all things (multimodal) reasoning, LLMs + RL 🤖📚. 🎯 We’re presenting two papers: MathCheck: Is Your Model Really a Good Math Reasoner? 🗓️ Sat, Apr 26, 10:00–12:30 (SGT) 📍 Hall 3 + Hall
0
3
11
@shudong_liu
Shudong Liu
7 months
🥳 Excited to share that MathCheck got accepted by #ICLR2025!! 🤩 Huge thanks to @zihaozhou_ @ning_mz @WeiLiu99 and all our collaborators. We believe that mathematical reasoning requires evaluation from multi-dimensional and multi-task scenarios. The Process-judging task in
1
11
48
@JacquesLestrap
Jacques Lestrap
9 months
“The cost of the 90-second ad to be broadcasted daily is about $700,000 (AUD) per day or close to $1 million a week” #mathcheck
0
0
0
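The arithmetic behind the #mathcheck tag on the quote above can be spelled out in a few lines. This is a minimal sketch that assumes the ad really airs every day of a 7-day week at the quoted daily rate; the function name `weekly_cost_aud` is made up for illustration, and the tweet does not say which of the two quoted figures is the mistaken one.

```python
def weekly_cost_aud(daily_cost_aud: int, days_per_week: int = 7) -> int:
    """Weekly cost, assuming the same daily cost on every day of the week."""
    return daily_cost_aud * days_per_week


quoted_daily = 700_000      # "$700,000 (AUD) per day" from the quote
quoted_weekly = 1_000_000   # "close to $1 million a week" from the quote

# Weekly total implied by the quoted daily figure.
print(f"Implied weekly cost: ${weekly_cost_aud(quoted_daily):,}")

# Daily figure implied by the quoted weekly total.
print(f"Implied daily cost: ${quoted_weekly / 7:,.0f}")
```

At $700,000 a day the weekly total comes to $4.9 million, and $1 million a week works out to roughly $143,000 a day, so the two figures in the quote cannot both be right.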
@zihaozhou_
Zihao Zhou
10 months
How do the latest models stack up on MathCheck? We evaluate newly released models including O1-series, Qwen2-vl, etc. Check out the highlights 👇
1
1
3
@zihaozhou_
Zihao Zhou
7 months
As LLMs continue to make a greater impact in real-world applications, developing better reasoning evaluation paradigms has become more urgent than ever. Excited to share that MathCheck is accepted at ICLR! Thanks to all collaborators 🥳
0
0
3
@WeiLiu99
Wei Liu
7 months
Thrilled to share that our MathCheck has been accepted to ICLR 2025! 🚀 As more powerful O-style models emerge, many once-challenging reasoning benchmarks are being conquered. Beyond creating harder, less contaminated benchmarks, another exciting path is to breathe new life into.
0
3
16
@shudong_liu
Shudong Liu
10 months
How does O1 perform on different mathematical reasoning tasks? Check out our updates on MathCheck (btw, O1-preview is really expensive 🥹).
0
1
4
@mathcheck_art
MathCheck.eth ◉
2 years
Just became the #273 backer of Bricks (with Belis) from @imdanielallan on @soundxyz_.
0
0
0