
Stephen Roller
@stephenroller
Followers 5K · Following 23K · Media 147 · Statuses 7K
MoTS @thinkymachines. Previously pre-training @googledeepmind, @character_ai, and @aiatmeta.
NYC
Joined February 2008
After spending billions of dollars of compute, GPT-5 learned that the most effective use of its token budget is to give itself a little pep talk every time it figures something out. Maybe you should do the same.
46 replies · 104 reposts · 3K likes
GPUs are expensive, and setting up the infrastructure to make them work for you properly is complex, making experimentation on cutting-edge models challenging for researchers and ML practitioners. Providing high-quality research tooling is one of the most effective ways to …
40 replies · 127 reposts · 2K likes
Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!
222 replies · 767 reposts · 6K likes
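The tweet describes the core pattern behind Tinker: the training loop is ordinary Python running on your laptop, while the heavy compute runs on managed distributed GPUs. Below is a minimal sketch of that pattern using a mock client; the class and method names are made-up placeholders, not the actual Tinker API.

```python
# Illustrative sketch only: a mock client standing in for a remote fine-tuning
# service. None of these names come from the actual Tinker API.
from dataclasses import dataclass
import random


@dataclass
class MockFineTuningClient:
    model: str

    def forward_backward(self, batch):
        # A real service would run the forward/backward pass on distributed
        # GPUs and return the loss; here we just fake a number.
        return random.random()

    def optim_step(self, lr: float):
        # A real service would apply the optimizer update server-side.
        pass


client = MockFineTuningClient(model="some-open-model")  # hypothetical model name
batches = [["example text"] * 8 for _ in range(3)]      # toy "batches"

for step, batch in enumerate(batches):
    loss = client.forward_backward(batch)  # heavy compute would happen remotely
    client.optim_step(lr=1e-4)             # optimizer state lives with the service
    print(f"step {step}: loss {loss:.4f}")
```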
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.
81 replies · 555 reposts · 3K likes
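For context on what LoRA changes relative to full fine-tuning: the pretrained weights are frozen and a low-rank additive update is learned, so only a small fraction of parameters are trainable. A minimal PyTorch sketch of the idea follows; it is not the experimental setup or code from the post.

```python
# Minimal sketch of the LoRA idea: freeze the pretrained weight W and learn a
# low-rank update scaled by alpha/r. Not the configuration used in the post.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)           # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))      # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        # base output plus the low-rank correction x A^T B^T
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


layer = LoRALinear(512, 512, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable} vs. full fine-tuning: {512 * 512}")
```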
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.
118 replies · 463 reposts · 3K likes
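A toy illustration of the general idea of co-designing the optimizer with a constraint on the weight matrices: after each gradient step, project the weights back onto a constraint set (here, unit-norm rows). This only conveys the flavor of optimizing on a manifold; it is not the Modular Manifolds construction from the post.

```python
# Toy "constrained optimization" loop: plain gradient steps followed by a
# projection of each row of W back onto the unit sphere. Assumed example,
# not the method described in the Modular Manifolds post.
import torch

W = torch.randn(64, 128, requires_grad=True)
x = torch.randn(256, 128)
target = torch.randn(256, 64)
lr = 0.1

for step in range(100):
    loss = ((x @ W.T - target) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        W -= lr * W.grad                                   # unconstrained gradient step
        W /= W.norm(dim=1, keepdim=True).clamp_min(1e-8)   # project rows back onto the sphere
        W.grad = None
    if step % 25 == 0:
        print(f"step {step}: loss {loss.item():.4f}")
```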
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference.” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to …
240 replies · 1K reposts · 8K likes
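One piece of intuition behind nondeterministic LLM inference is that floating-point reduction order matters: addition is not associative, so two kernels (or batch sizes) that reduce in different orders can produce bitwise-different results. A small NumPy illustration of that general effect, not code from the post:

```python
# Floating-point addition is not associative: summing the same float32 values
# in two different orders usually does not give bit-identical results.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000).astype(np.float32)

s_flat = x.sum()                                     # one reduction order
s_chunked = x.reshape(1000, 1000).sum(axis=1).sum()  # a different reduction order

print(s_flat, s_chunked, bool(s_flat == s_chunked))  # typically differs in the low bits
```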
🗣️Job alert: Our Brain and AI team at FAIR (@AIatMeta) is looking for a software engineer with experience in 3D rendering in the browser: https://t.co/UneZ0WFxIX Please RT 🙏
4 replies · 22 reposts · 143 likes
The undocumented XID errors just taste better. More fresh.
1 reply · 1 repost · 16 likes
There’s lots wrong with the OPT models and I don’t recommend using them today. It’s just that the widely repeated explanation for their quantization behavior doesn’t actually seem explanatory.
0 replies · 0 reposts · 4 likes
There’s a line of critique/reviewer feedback in the quantization literature that the OPT models are too easy to quantize because they’re undertrained; but all scales were trained on the same 300B tokens, making the 6.7B and smaller models overtrained by Chinchilla estimates.
2 replies · 0 reposts · 8 likes
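The arithmetic behind “overtrained by Chinchilla estimates,” using the rough rule of thumb of about 20 training tokens per parameter from the Chinchilla paper (the exact threshold depends on which fit you use):

```python
# Rough back-of-the-envelope check: at ~20 tokens per parameter, a 6.7B model
# is compute-optimal at ~134B tokens, well under OPT's 300B-token budget.
CHINCHILLA_TOKENS_PER_PARAM = 20   # common rule of thumb, not an exact constant
opt_training_tokens = 300e9        # every OPT scale was trained on ~300B tokens

params = 6.7e9
chinchilla_optimal = CHINCHILLA_TOKENS_PER_PARAM * params

print(f"Chinchilla-optimal for 6.7B params ≈ {chinchilla_optimal / 1e9:.0f}B tokens")
print(f"OPT-6.7B saw {opt_training_tokens / 1e9:.0f}B tokens, "
      f"about {opt_training_tokens / chinchilla_optimal:.1f}x that budget")
```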
We are moving incredibly fast. Come light up GPUs with us.
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're …
12 replies · 12 reposts · 345 likes
Based on current administration policies, China will have an influx of returning talent and an accelerated advantage in research investment. You need to be both sinophobic and irrational to expect the US to continue as the global scientific powerhouse with these policies.
0 replies · 1 repost · 19 likes
Revoking visas of Chinese students studying in critical fields like AI and Robotics is incredibly short-sighted and harmful to America’s long term prosperity. We want the best from every country to work for team America
The U.S. will begin revoking visas of Chinese students, including those with connections to the Chinese Communist Party or studying in critical fields.
19 replies · 24 reposts · 400 likes
The war on science in the US is already affecting private sector research like AlphaFold. Bears repeating but the private sector builds on top of things created by academic research for the public good. This hurts everyone.
13 replies · 105 reposts · 511 likes
American funding for hard sciences has fallen by two-thirds this year. In physics, researchers are receiving 15% of what they did last year. What the fuck are we doing?
376 replies · 474 reposts · 6K likes