Jakub Macina Profile
Jakub Macina

@dmacjam

Followers
217
Following
190
Media
15
Statuses
104

AI/ML Scientist, mountain biker

Zurich
Joined June 2013
@dmacjam
Jakub Macina
28 days
Paper:
0
0
0
@dmacjam
Jakub Macina
28 days
TutorRL-7B-think: TutorRL-7B: Github:
1
0
2
@dmacjam
Jakub Macina
28 days
AI alignment for tutoring 🎓 We use full online RL with conversation-level rewards, not just single-turn signals like DPO. Did the student actually learn by the end? Using GRPO, the model learns real teaching strategies like when to hint or when to correct. Explore models below ⤵️
@rohanpaul_ai
Rohan Paul
1 month
This paper introduces an online reinforcement learning framework using simulated student-tutor interactions. It trains LLMs to prioritize guiding students pedagogically instead of simply revealing solutions, aligning models with better teaching methods. This helps students
1
3
12
@dmacjam
Jakub Macina
4 months
🔥 Try it now! Run MathTutorBench locally with your own models or submit them to our leaderboard. Open-source! 👉 @ndaheim_ @idohakimi @ Manu Kapur @IGurevych @mrinmayasachan @ETH_AI_Center
0
0
2
@dmacjam
Jakub Macina
4 months
🤔 More knowledge ≠ better teaching? Subject expertise does not always correlate with effective teaching; instead, pedagogy and subject knowledge may present a trade-off.
0
0
3
@dmacjam
Jakub Macina
4 months
🎯 How do we measure teaching quality? We train a reward model that scores open-ended teacher responses and accurately distinguishes expert-level from novice teaching.
0
0
1
@dmacjam
Jakub Macina
4 months
📚 Teaching is more than just knowing the answer. Our benchmark goes beyond testing solving ability, evaluating three essential teaching skills (expertise, student understanding, pedagogical ability) across seven diverse tasks using a curated collection of datasets and metrics.
0
0
1
@dmacjam
Jakub Macina
4 months
🚀 How well can LLMs teach? Evaluating LLMs for education is key to making real progress, yet we lack a reliable and simple benchmark. Introducing MathTutorBench, an open-source benchmark designed to assess holistic tutoring capabilities in AI.
4
3
8
@dmacjam
Jakub Macina
8 months
I've been recognized as an Outstanding Reviewer for #EMNLP2024! 🚀🚀 Contributing to the community is always rewarding and every review is an opportunity to learn and grow.
@emnlpmeeting
EMNLP 2025
8 months
We're kicking off the awards session at #EMNLP2024 by announcing our (many) **Outstanding Reviewers**!
0
0
13
@dmacjam
Jakub Macina
8 months
Mistakes are key learning opportunities! 🧑‍🎓 Can LLMs help students learn from them through dialog? 💬 While they often struggle to diagnose student errors when generating responses directly, adding a verification step ✅ could make a difference. #EMNLP2024
@UKPLab
UKP Lab
8 months
Can LLMs help students learn from mistakes? Models struggle to spot student errors, but a verification step could help. More below! 🧵 (1/9) #EMNLP2024 📰
0
3
18
@dmacjam
Jakub Macina
11 months
Chat with Junling about our work on generating and evaluating the quality of multi-turn teacher-student conversations grounded in textbooks. #ACL2024 #ACL2024NLP
@JunlingWang1999
Junling Wang
11 months
Excited to share that our paper, "Book2Dial: Generating Teacher-Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots," has been accepted at ACL 2024 Findings! We will go to Bangkok to attend ACL 2024!
1
0
7
@dmacjam
Jakub Macina
2 years
Excited to be at #EMNLP2023 in Singapore to present our paper.
@UKPLab
UKP Lab
2 years
We are happy to announce the release of 🧮 MathDial, a dataset of one-to-one teacher-student tutoring dialogues grounded in multi-step math reasoning problems. Discover more about our latest #EMNLP2023 Findings paper in this 🧵 (1/8). #dialogue #NLProcessing #MathWordProblems
0
1
20
@dmacjam
Jakub Macina
2 years
Consider joining fellowship programs at ETH AI Center.
@ETH_AI_Center
ETH AI Center
2 years
⏰ Deadline Approaching! 🚀 Apply by Nov 22 to become an ETH AI Center #PhD or #Postdoc! We're hosting an EXTRA Online Q&A Session. 📅 Monday, Nov 20, 16:15 - 17:00 CET. Zoom link on our website 🔗 Don't miss this chance to get the info you need!
0
0
2
@dmacjam
Jakub Macina
2 years
So many interesting ideas and discussions at #LaunchAIXSummit2023 @ETH_AI_Center.
@JuliaChatain
Julia Chatain
2 years
Had a great time at the X+AI summit last week! Many exciting and inspiring questions, looking forward to the next steps. Thanks @dmacjam for the invitation and organization!
0
0
9
@dmacjam
Jakub Macina
2 years
RT @arkrause: Doctoral and Postdoc Fellowships at the @ETH_AI_Center! Applications accepted until November 22 2023.
0
26
0
@dmacjam
Jakub Macina
2 years
Interested in AI for Education and how LLMs can improve education? Come and join us at the #LaunchAIXSummit2023 workshop tomorrow (10:30)! @ETH_AI_Center
2
3
16
@dmacjam
Jakub Macina
2 years
LLMs get better at reasoning, but can they act as tutors? Turns out, they're quick to spill the answers. 🤖 Check out the new 🧮 MathDial dataset, built with input from teachers and roleplaying students.
0
5
18
@dmacjam
Jakub Macina
2 years
Link to the preprint:
0
0
0
@dmacjam
Jakub Macina
2 years
RT @wangchunshu: Tired of the output length limit of ChatGPT? Try RecurrentGPT, a language simulacra of the recurrence mechanism in RNNs. Y…
0
41
0
@dmacjam
Jakub Macina
2 years
RT @ETH_AI_Center: Celebrating Jakub Macina's remarkable achievement as he secures a coveted spot on Forbes' 30 Under 30 list in Science an…
0
3
0