Kyle Lo

@kylelostat

Followers: 3K · Following: 3K · Media: 51 · Statuses: 647

#nlproc #hci research scientist @allen_ai, co-lead of data for OLMo w/ @soldni, he/him, find me on 👉🏻https://t.co/5Hm9cx3Urz🧋

Seattle, WA
Joined January 2019
@kylelostat
Kyle Lo
9 days
"Out of 13,048 reviewers. only 69 were deemed highly irresponsible. and enforcement was applied solely in those cases. These reviewers were contacted multiple times. as well as being personally contacted by the area chairs and senior area chairs, but still failed to fulfill them".
@emnlpmeeting
EMNLP 2025
9 days
This year, EMNLP desk rejected approximately 100 papers. For more insight into the process, and potential future changes, please see this blog post from the PCs: @c_christodoulop @Tanmoy_Chak @VioletNPeng.
@kylelostat
Kyle Lo
10 days
my favorite figure from work by @heinemandavidj. if you're frustrated by LM evals, not knowing if results are real or noise, it's useful to decompose sources of variance:
🐠 is there enough spread between compared models (signal)?
🐟 do scores vary among intermediate ckpts (noise)?
@heinemandavidj
David Heineman
10 days
(2/6) Consider these training curves: 150M, 300M and 1B param models on 25 pretraining corpora. Many benchmarks can separate models but are too noisy, and vice versa! 😧 We want ⭐ low noise and high signal ⭐: *both* low variance during training and a high spread of scores.
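To make the signal/noise framing above concrete, here's a minimal sketch; it is not the authors' code, and the score values and the max-minus-min definition of spread are illustrative assumptions.

```python
# Rough sketch of the decomposition described above (not the authors' code):
# "signal" = spread of final benchmark scores across the compared models
# "noise"  = score variation across one model's late intermediate checkpoints
import statistics

def signal(final_scores):
    """Spread between compared models: range of their final scores."""
    return max(final_scores) - min(final_scores)

def noise(checkpoint_scores):
    """Within-run wobble: std dev of scores over late checkpoints of one model."""
    return statistics.stdev(checkpoint_scores)

# Hypothetical numbers, for illustration only.
final_scores = [0.42, 0.47, 0.55]        # e.g. 150M / 300M / 1B models on one benchmark
late_ckpts = [0.54, 0.56, 0.55, 0.53]    # one model's scores over its last few checkpoints

snr = signal(final_scores) / noise(late_ckpts)
print(f"signal={signal(final_scores):.3f} noise={noise(late_ckpts):.3f} SNR={snr:.1f}")
```

Under this framing, a benchmark is only useful for separating models if the spread between them comfortably exceeds the checkpoint-to-checkpoint wobble.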
@kylelostat
Kyle Lo
15 days
huge thx to @nsf @nvidia for supporting our work on fully open AI model science & development 🤩.
@allen_ai
Ai2
15 days
With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
@kylelostat
Kyle Lo
28 days
thx for all the feedback from the OSS community! our olmOCR lead @jakepoznanski shipped a new model fixing a lotta issues + some more optimization for better throughput. have fun converting PDFs!
@allen_ai
Ai2
28 days
📝 olmOCR v0.2.1 has arrived with new models! Our open‑source OCR engine now reads tougher docs with greater precision, and it's still 100% open. 👇
@kylelostat
Kyle Lo
1 month
RT @cmalaviya11: People at #ACL2025, come drop by our poster today & chat with me about how context matters for reliable language model eva….
@kylelostat
Kyle Lo
1 month
RT @tongshuangwu: We all agree that AI models/agents should augment humans instead of replace us in many cases. But how do we pick when to….
@kylelostat
Kyle Lo
1 month
issues w preference LM benchmarks:
🐡 data contains cases where the "bad" response is just as good as the chosen one
🐟 model rankings can feel off (claude ranks lower than expected)
led by @cmalaviya11 (TACL 2025), we study underspecified queries & their detrimental effect on model evals.
@allen_ai
Ai2
1 month
In our new paper, “Contextualized Evaluations: Judging Language Model Responses to Underspecified Queries,” we find that adding just a bit of missing context can reorder model leaderboards—and surface hidden biases. 🧵👇
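Rough illustration of the underspecification problem (a hypothetical sketch, not the paper's pipeline; the judge prompt wording, query, and responses are invented): attach follow-up question/answer context to the query before a pairwise judge sees it, and the "better" response can flip.

```python
# Hypothetical sketch: contextualize an underspecified query with follow-up
# question/answer pairs before showing it to a pairwise judge.
def build_judge_prompt(query, response_a, response_b, context_qa=None):
    context = ""
    if context_qa:
        qa_lines = "\n".join(f"- {q} {a}" for q, a in context_qa)
        context = f"Context about the user:\n{qa_lines}\n\n"
    return (
        f"{context}Query: {query}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "Which response better serves this user? Answer A or B."
    )

# Underspecified query: which response is "better" depends on who is asking.
prompt = build_judge_prompt(
    query="What's a good book about space?",
    response_a="A graduate-level astrophysics textbook.",
    response_b="An illustrated kids' guide to the solar system.",
    context_qa=[("Who is the reader?", "My 7-year-old.")],
)
print(prompt)
```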
@kylelostat
Kyle Lo
1 month
presenting olmOCR at the poster session (2:15pm, 211 West) for the #codeml workshop at #icml2025!
🐟 fully open source OCR, comparable or better than frontier VLMs
🐠 all weights, data, code free & public
🐡 new benchmark of OCR "unit tests" on diverse PDFs & challenging OCR cases
@allen_ai
Ai2
2 months
New updates for olmOCR, our fully open toolkit for transforming documents (PDFs & images) into clean markdown. We released:
1️⃣ New benchmark for fair comparison of OCR engines and APIs
2️⃣ Improved inference that is faster and cheaper to run
3️⃣ Docker image for easy deployment
@kylelostat
Kyle Lo
1 month
RT @_awettig: Presenting two posters at ICML over the next two days:
- Both at 11am - 1:30pm
- Both about how to improve pre-training with…
@kylelostat
Kyle Lo
2 months
will be at #icml2025, lemme kno if you wanna chat about OLMo pretraining data curation, evaluation, data mixing, etc! 👋 find us at the poster sess on 📅 Wed 7/16 @ 11am ⏲️ to learn about WebOrganizer, distilling web data taxonomies into small models & using them for LM data mixing!
@_awettig
Alex Wettig
6 months
🤔 Ever wondered how prevalent some type of web content is during LM pre-training? In our new paper, we propose WebOrganizer, which *constructs domains* based on the topic and format of CommonCrawl web pages 🌐. Key takeaway: domains help us curate better pre-training data! 🧵/N
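Sketch of the domain-then-mix idea, with completely made-up classifier rules and weights (WebOrganizer itself distills learned topic/format classifiers into small models rather than using keyword rules): tag each page with a (topic, format) domain, then sample the pretraining mixture according to per-domain weights.

```python
# Hypothetical sketch: classify pages into (topic, format) domains, then
# sample a pretraining mixture with per-domain weights.
import random
from collections import defaultdict

def classify(doc):
    """Stand-in for a small distilled domain classifier; real ones are learned."""
    topic = "science" if "experiment" in doc.lower() else "other"
    fmt = "tutorial" if "how to" in doc.lower() else "prose"
    return topic, fmt

def sample_mixture(docs, domain_weights, k):
    """Sample k documents, preferring domains with higher mixing weight."""
    by_domain = defaultdict(list)
    for doc in docs:
        by_domain[classify(doc)].append(doc)
    domains = list(by_domain)
    weights = [domain_weights.get(dom, 0.01) for dom in domains]
    picked_domains = random.choices(domains, weights=weights, k=k)
    return [random.choice(by_domain[dom]) for dom in picked_domains]

docs = ["How to run an experiment at home", "Local news roundup", "How to bake bread"]
weights = {("science", "tutorial"): 0.6, ("other", "tutorial"): 0.3, ("other", "prose"): 0.1}
print(sample_mixture(docs, weights, k=5))
```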
@kylelostat
Kyle Lo
2 months
we developed the benchmark independently so there's no dev/test leakage, and even so, results show olmOCR often produces higher quality output than even proprietary OCR tools & is way cheaper + runs locally as well! our team will be at #ICML2025, come find me, @jakepoznanski and @soldni there.
@kylelostat
Kyle Lo
2 months
the benchmark is based on thousands of "unit tests": instead of fuzzy matching a model-generated table against a gold reference table, we define Pass/Fail tests like "the cell to the left of the cell containing 0.001 should contain 1.96"
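The "unit test" idea in a minimal sketch (not the benchmark's actual code; the table and its parsed list-of-rows format are made up for illustration):

```python
# Hypothetical sketch of one Pass/Fail OCR "unit test": instead of fuzzy-matching
# a whole predicted table against a gold table, check one relational fact.
def cell_left_of(table, anchor):
    """Return the cell immediately to the left of the cell containing `anchor`."""
    for row in table:
        for j, cell in enumerate(row):
            if cell.strip() == anchor and j > 0:
                return row[j - 1].strip()
    return None

def test_left_neighbor(table):
    # Pass/Fail: the cell left of the cell containing "0.001" must contain "1.96".
    return cell_left_of(table, "0.001") == "1.96"

# Made-up model output, parsed into rows of cells.
predicted = [
    ["coef", "z", "p-value"],
    ["0.42", "1.96", "0.001"],
]
print("PASS" if test_left_neighbor(predicted) else "FAIL")
```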
@kylelostat
Kyle Lo
2 months
excited to release our new benchmark for OCR addressing 3 eval challenges:
🐟 coverage of many types of docs (born digital vs old scans, pages w tiny fonts, etc)
🐡 coverage of many different OCR targets (e.g. equations, tables, etc)
🐠 apples-to-apples comparison across systems
@allen_ai
Ai2
2 months
New updates for olmOCR, our fully open toolkit for transforming documents (PDFs & images) into clean markdown. We released:
1️⃣ New benchmark for fair comparison of OCR engines and APIs
2️⃣ Improved inference that is faster and cheaper to run
3️⃣ Docker image for easy deployment
@kylelostat
Kyle Lo
3 months
excited to win 🏆 this award for our work on molmo & pixmo, showing the value of high-quality data curation for VLMs! recalling when we released at the same time as Llama 3.2 😆 huge kudos to @mattdeitke, chris clark & @anikembhavi for their leadership on this project!
@mattdeitke
Matt Deitke
3 months
Molmo won the Best Paper Honorable Mention award @CVPR! This work was a long journey over 1.5 years, from failing to get strong performance with massive-scale, low-quality data, to focusing on modest-scale, extremely high-quality data! Proud to see what it became. #CVPR2025
@kylelostat
Kyle Lo
3 months
RT @tyleraromero: Thrilled to announce I've joined the incredible team at @allen_ai! I'll be working on language modeling!
@kylelostat
Kyle Lo
3 months
RT @finbarrtimbers: excited to announce that I’ve joined the Allen institute, where I’ll be working on RL for LLMs.
@kylelostat
Kyle Lo
4 months
great work from philippe as always ☺️ agree w the view that reliability is absolutely key.
@PhilippeLaban
Philippe Laban
4 months
🆕 paper: LLMs Get Lost in Multi-Turn Conversation. In real life, people don't speak in perfect prompts. So we simulate multi-turn conversations: less lab-like, more like real use. We find that LLMs get lost in conversation. 👀 What does that mean? 🧵 1/N 📄
@kylelostat
Kyle Lo
4 months
lookin for strong data ppl to make tokens, eat snacks & drink boba w us ⌨️🍿🧋.
@natolambert
Nathan Lambert
4 months
Who would make you really excited if they joined Ai2? We're always looking to hire people who seem like obvious strong fits.
@kylelostat
Kyle Lo
4 months
we released OLMo 2 1B, showing again how well our OLMo 2 pretrain & post-train recipe works! Our small 1B model is comparable to or better than other top open-weights-only alternatives while maintaining fully open data, code & intermediate checkpoints!
@allen_ai
Ai2
4 months
We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.
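For anyone who wants to poke at the release, a minimal sketch with Hugging Face transformers; the model id "allenai/OLMo-2-0425-1B" and the version requirement are assumptions on my part, so check the official release notes for the exact identifier.

```python
# Sketch: load and sample from OLMo 2 1B via transformers.
# Assumptions: model id "allenai/OLMo-2-0425-1B" and a transformers
# release recent enough to include OLMo 2 support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0425-1B"  # assumed id; verify against the release notes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```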
@kylelostat
Kyle Lo
4 months
outstanding paper award for our AI in Education work!
🐟 dataset of natural images of student solutions to K-12 math problems from an online teaching platform
🐠 annotations (dense captions, VQA pairs) by teachers to eval VLMs
chat w leads @samibaral144 @lucy3_li at #NAACL2025 🤩
@naaclmeeting
NAACL HLT 2027
4 months
🟢 Announcing the #NAACL2025 Award Winners! The Best Paper and Best Theme Paper winners will present at our closing session.