
Andrew Cohen
@andrew_e_cohen
Followers: 211 · Following: 726 · Media: 0 · Statuses: 134
Reinforcement learning @AIatMeta | Previously #mlagents
Joined December 2019
RT @soniajoseph_: Our paper Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video received an Oral at the Mec…
RT @qqyuzu: d1: to grow in reasoning, masked diffusion language models go beyond supervised learning, we meet RL!
RT @MaitrixOrg: Long CoT (O1/R1) style reasoning has gained popularity recently. Rather than directly generating the solution, it exhibi…
RT @michiyasunaga: Introducing Multimodal RewardBench: A holistic, human-annotated benchmark for evaluating VLM reward models or judges…
RT @jaseweston: New paper & dataset! NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions. - Synthesizes 2.8M challen…
RT @tydsh: Great to see that many of our previous works on sparse self-attention (StreamingLLM, H2O, MagicPIG) are mentioned in this great…
RT @tydsh: Our new work Spectral Journey shows a surprising finding: when a 2-layer Transformer is learned to predi…
arxiv.org
Decoder-only transformers lead to a step-change in capability of large language models. However, opinions are mixed as to whether they are really planning or reasoning. A path to making progress...
RT @fly51fly: [LG] Spectral Journey: How Transformers Predict the Shortest Path. A Cohen, A Gromov, K Yang, Y Tian [Meta] (2025). https://t.c…
RT @jaseweston: Introducing CoCoMix - a LLM pretraining framework that predicts concepts and mixes them into its hidden state to improv…
RT @qqyuzu: Widely accepted: the longer CoT the better perf - in TEXT space. What happens in LATENT space? We use latent discrete tokens to…
RT @tydsh: Our Coconut work (learning continuous latent CoT) has now been open-sourced. Welcome to play with it:
github.com
Training Large Language Model to Reason in a Continuous Latent Space - facebookresearch/coconut
RT @aramHmarkosyan: We're excited to open-source LeanUniverse! A package that simplifies building consistent #Lean4 training datasets from…
github.com
LeanUniverse: A Library for Consistent and Scalable Lean4 Dataset Management - facebookresearch/LeanUniverse
RT @michiyasunaga: Introducing ALMA: Alignment with Minimal Annotation. Idea: Conventional LLM alignment (post-tr…
RT @gh_marjan: Everyone's talking about synthetic data generation, but what's the recipe for scaling it without model collapse? Meet AL…
RT @Ahmad_Al_Dahle: Introducing Llama 3.3, a new 70B model that delivers the performance of our 405B model but is easier & more cost-effic…
RT @ArmenAgha: Say hello to our new company Perceptron AI. Foundation models transformed the digital realm, now it's time for the physica…
RT @AkshatS07: 1/n I'm excited to share our new venture, Perceptron AI. With the advancements we have made with AI in the digital world, …
RT @qqyuzu: Introducing Dualformer: a new model that integrates fast and slow thinking! By learning with randomized reasoning traces, Dualf…