
Kyle Lo
@kylelostat
Followers
3K
Following
3K
Media
51
Statuses
647
#nlproc #hci research scientist @allen_ai, co-lead of data for OLMo w/ @soldni, he/him, find me on 👉🏻https://t.co/5Hm9cx3Urz🧋
Seattle, WA
Joined January 2019
"Out of 13,048 reviewers, only 69 were deemed highly irresponsible, and enforcement was applied solely in those cases. These reviewers were contacted multiple times, including personally by the area chairs and senior area chairs, but still failed to fulfill them."
This year, EMNLP desk rejected approximately 100 papers. For more insight into the process, and potential future changes, please see this blog post from the PCs: @c_christodoulop @Tanmoy_Chak @VioletNPeng.
my favorite figure from work by @heinemandavidj. if you're frustrated by LM evals, not knowing if results are real or noise, it's useful to decompose sources of variance:
🐠 is there enough spread between compared models (signal)
🐟 do scores vary among intermediate ckpts (noise)
(2/6) Consider these training curves: 150M, 300M and 1B param models on 25 pretraining corpora. Many benchmarks can separate models but are too noisy, and vice versa! 😧 We want ⭐ low noise and high signal ⭐: *both* low variance during training and a high spread of scores.
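The decomposition above can be sketched numerically. A minimal sketch with made-up scores (all numbers hypothetical): signal is the spread of final scores across the compared models, noise is the score wobble across one model's intermediate checkpoints, and a usable benchmark needs the first to dwarf the second.

```python
from statistics import pstdev

# Hypothetical final benchmark scores for three model sizes
# (signal: how far apart are the models we want to compare?)
final_scores = {"150M": 0.42, "300M": 0.48, "1B": 0.61}

# Hypothetical scores for the 1B model over its last few intermediate
# checkpoints (noise: how much does the score jitter during training?)
checkpoint_scores = [0.59, 0.62, 0.60, 0.61, 0.63]

signal = max(final_scores.values()) - min(final_scores.values())
noise = pstdev(checkpoint_scores)
snr = signal / noise  # higher = benchmark separates models beyond its own jitter

print(f"signal={signal:.3f} noise={noise:.3f} snr={snr:.1f}")
```

Note that `pstdev` treats the checkpoint list as a full population; with only a handful of checkpoints, the sample standard deviation (`stdev`) would give a slightly more conservative (larger) noise estimate.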
thx for all the feedback from the OSS community! our olmOCR lead @jakepoznanski shipped a new model fixing a lotta issues + some more optimization for better throughput. have fun converting PDFs!
📝 olmOCR v0.2.1 has arrived with new models! Our open-source OCR engine now reads tougher docs with greater precision, and it’s still 100% open. 👇
RT @cmalaviya11: People at #ACL2025, come drop by our poster today & chat with me about how context matters for reliable language model eva….
RT @tongshuangwu: We all agree that AI models/agents should augment humans instead of replace us in many cases. But how do we pick when to….
issues w preference LM benchmarks:
🐡 data contains cases where the "bad" response is just as good as the chosen one
🐟 model rankings can feel off (claude ranks lower than expected)
led by @cmalaviya11 (TACL 2025), we study underspecified queries & their detrimental effect on model evals.
In our new paper, “Contextualized Evaluations: Judging Language Model Responses to Underspecified Queries,” we find that adding just a bit of missing context can reorder model leaderboards—and surface hidden biases. 🧵👇
presenting olmOCR at the poster session (2:15pm, 211 West) for the #codeml workshop at #icml2025!
🐟 fully open source OCR, comparable or better than frontier VLMs
🐠 all weights, data, code free & public
🐡 new benchmark of OCR "unit tests" on diverse PDFs & challenging OCR cases
New updates for olmOCR, our fully open toolkit for transforming documents (PDFs & images) into clean markdown. We released:
1️⃣ New benchmark for fair comparison of OCR engines and APIs
2️⃣ Improved inference that is faster and cheaper to run
3️⃣ Docker image for easy deployment
RT @_awettig: Presenting two posters at ICML over the next two days:.- Both at 11am - 1:30pm.- Both about how to improve pre-training with….
will be at #icml2025, lemme know if you wanna chat about OLMo pretraining data curation, evaluation, data mixing, etc! 👋
find us at the poster session on 📅 Wed 7/16 @ 11am ⏲️ to learn about WebOrganizer, distilling web data taxonomies into small models & using them for LM data mixing!
🤔 Ever wondered how prevalent some type of web content is during LM pre-training? In our new paper, we propose WebOrganizer, which *constructs domains* based on the topic and format of CommonCrawl web pages 🌐. Key takeaway: domains help us curate better pre-training data! 🧵/N
we developed the benchmark independently, so no dev/test leakage, and even so, results show olmOCR often produces higher quality output than even proprietary OCR tools & is way cheaper + local as well! our team will be at #ICML2025, come find me, @jakepoznanski and @soldni there.
excited to release our new benchmark for OCR addressing 3 eval challenges:
🐟 coverage of many types of docs (born digital vs old scans, pages w tiny fonts, etc)
🐡 coverage of many different OCR targets (e.g. equations, tables, etc)
🐠 apples-to-apples comparison across systems
excited to win 🏆 this award for our work on molmo & pixmo, showing the value of high-quality data curation for VLMs! recalling when we released at the same time as Llama 3.2 😆 huge kudos to @mattdeitke, chris clark & @anikembhavi for their leadership on this project!
Molmo won the Best Paper Honorable Mention award @CVPR! This work was a long journey over 1.5 years, from failing to get strong performance with massive-scale, low-quality data, to focusing on modest-scale, extremely high-quality data! Proud to see what it became. #CVPR2025
RT @tyleraromero: Thrilled to announce I've joined the incredible team at @allen_ai! I'll be working on language modeling!
RT @finbarrtimbers: excited to announce that I’ve joined the Allen institute, where I’ll be working on RL for LLMs.
great work from philippe as always ☺️ agree w the view that reliability is absolutely key.
🆕 paper: LLMs Get Lost in Multi-Turn Conversation. In real life, people don’t speak in perfect prompts. So we simulate multi-turn conversations: less lab-like, more like real use. We find that LLMs get lost in conversation. 👀 What does that mean? 🧵 1/N 📄
we released OLMo 2 1B, showing again how well our OLMo 2 pretrain & post-train recipe works! Our small 1B model is comparable or better than other top open-weights-only alternatives while maintaining fully open data, code & intermediate checkpoints!
We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.
outstanding paper award for our AI in Education work!
🐟 dataset of natural images of student solutions to K-12 math problems from an online teaching platform
🐠 annotations (dense captions, VQA pairs) by teachers to eval VLMs
chat w leads @samibaral144 @lucy3_li at #NAACL2025 🤩
🟢 Announcing the #NAACL2025 Award Winners! The Best Paper and Best Theme Paper winners will present at our closing session.