
arlo_son
@gson_AI
Followers
186
Following
2K
Media
24
Statuses
267
Undergraduate @ Yonsei. UIC Economics.
Joined February 2023
#NLProc AI Co-Scientists 🤖 can generate ideas, but can they spot mistakes? (Not yet! 🚫) In my recent paper, we introduce SPOT, a dataset of STEM manuscripts (math, materials science, chemistry, physics, etc.) annotated with real errors. SOTA models like o3, gemini-2.5-pro
4
38
162
RT @BlancheMinerva: People are really eager to use AIs "to accelerate science" (whatever that means). Designing meaningful tests tailored t…
0
9
0
Last but not least, I’d like to thank all coauthors for their help 👍👍👍. @jiwoohong98 @Void13950782 @hazel_heejeong @cartinoe__5930 @sngwonlim @jinyeop_song @GoncaloSPaulo @YoungjaeYu3 @stella.
0
0
8
I'll be presenting KMMLU, currently the most-used Korean benchmark among Korean big tech companies, with @seungonekim today at 2pm!
0
2
11
RT @seungonekim: @naaclmeeting I'll also be presenting our KMMLU paper with @gson_AI! It is one of the most widely adopted benchmarks used…
0
1
0
RT @TrelisResearch: + GRPO is Poor and for the GPU-Rich + *A specific GRPO vs SFT video will be out next w…
0
58
0
RT @lifan__yuan: lessons learned: (1) *capable* (small) base models are good enough to start rl, where (2) reasoning patterns *tailored to…
0
14
0
RT @TheTuringPost: 10 Free Comprehensive Datasets for Supervised Fine-Tuning:
▪️ Awesome ChatGPT Prompts
▪️ FineWeb from @huggingface
▪️ F…
0
30
0
RT @seungonekim: #NLProc Just because GPT-4o is 17 times more expensive than GPT-4o-mini, does that mean it generates synthetic data 17 ti…
0
52
0